Thyroid_Disease

Status: active. Format: ARFF. License: Attribution (CC BY). Visibility: public. Uploaded 31-05-2024 by Iwo Godzwon.
Description: The "Thyroid_Diff.csv" dataset is a collection of clinical data on thyroid disease. Its attributes range from patient demographics (age, gender) to clinical findings (smoking history, radiotherapy history, thyroid function, physical examination findings). It also records the presence of adenopathy, pathology findings, focality of the disease, and risk categorization. The TNM classification describes the size and extent of the tumor (T), the presence of cancer in nearby lymph nodes (N), and metastasis (M), which together contribute to the staging of the disease. The clinical response to treatment and the recurrence status are also recorded, which supports outcomes analysis.

Attribute Description:
- Age: Numeric, represents the age of the patient.
- Gender: Categorical, 'M' for male, 'F' for female.
- Smoking: Binary, 'Yes' if the patient has a history of smoking, 'No' otherwise.
- Hx Smoking: Binary, indicating a historical record of smoking.
- Hx Radiotherapy: Binary, indicates if the patient has undergone radiotherapy (the column name is spelled "Hx Radiothreapy" in the feature list).
- Thyroid Function: Categorical, reports the thyroid's functional state.
- Physical Examination: Categorical, describes findings from physical examination.
- Adenopathy: Categorical, indicates whether adenopathy is present (the feature list reports 6 distinct values rather than a plain 'Yes'/'No' flag).
- Pathology: Categorical, type of thyroid pathology diagnosed.
- Focality: Categorical, 'Multi-Focal' or 'Uni-Focal' disease spread.
- Risk: Categorical, assessed risk level ('Low', 'Intermediate', 'High').
- T, N, M: Staging parameters as per the TNM classification.
- Stage: Categorical, stage of the disease.
- Response: Categorical, patient's response to treatment.
- Recurred: Binary, 'Yes' if the disease has recurred, 'No' otherwise.

Use Case: This dataset is aimed at researchers and clinicians working on thyroid disease. Its attributes support analysis of the relationships between demographic factors, lifestyle factors (such as smoking), clinical findings, and treatment outcomes. It can also serve as a resource for predictive modeling of disease progression, recurrence, and response to therapy. Machine learning applications can use it to develop models that predict patient outcomes, help guide treatment plans, and assess risk factors for disease recurrence or poor treatment response.
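
As a minimal sketch of the kind of outcome analysis described above, the snippet below computes the share of recurrences per risk category using only the Python standard library. The rows in `SAMPLE_CSV` are hypothetical and are not taken from the dataset, and the helper name `recurrence_rate_by` is an assumption introduced for illustration; only the column names `Risk` and `Recurred` come from this page.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; NOT real records from the dataset.
SAMPLE_CSV = """Age,Gender,Risk,Recurred
27,F,Low,No
62,M,High,Yes
45,F,Intermediate,No
51,M,High,Yes
38,F,Low,No
70,F,Intermediate,Yes
"""


def recurrence_rate_by(rows, column):
    """Return {category: fraction of rows with Recurred == 'Yes'} for one column."""
    totals = Counter()
    recurred = Counter()
    for row in rows:
        key = row[column]
        totals[key] += 1
        if row["Recurred"] == "Yes":
            recurred[key] += 1
    return {key: recurred[key] / totals[key] for key in totals}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = recurrence_rate_by(rows, "Risk")
for risk, rate in sorted(rates.items()):
    print(f"{risk}: {rate:.2f}")
```

To run this against the real data, the in-memory sample would be replaced by an open file handle on a local CSV export of the dataset; the same helper can then be called with other categorical columns, such as `Focality` or `Stage`.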

17 features

- Age: numeric, 65 unique values, 0 missing
- Gender: nominal, 2 unique values, 0 missing
- Smoking: nominal, 2 unique values, 0 missing
- Hx Smoking: nominal, 2 unique values, 0 missing
- Hx Radiothreapy: nominal, 2 unique values, 0 missing
- Thyroid Function: nominal, 5 unique values, 0 missing
- Physical Examination: nominal, 5 unique values, 0 missing
- Adenopathy: nominal, 6 unique values, 0 missing
- Pathology: nominal, 4 unique values, 0 missing
- Focality: nominal, 2 unique values, 0 missing
- Risk: nominal, 3 unique values, 0 missing
- T: string, 7 unique values, 0 missing
- N: nominal, 3 unique values, 0 missing
- M: string, 2 unique values, 0 missing
- Stage: nominal, 5 unique values, 0 missing
- Response: nominal, 4 unique values, 0 missing
- Recurred: nominal, 2 unique values, 0 missing
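
A per-feature summary like the one above (unique values and missing values per column) can be recomputed from a CSV export. The following is a rough sketch using only the standard library. The sample rows are hypothetical, `summarize_columns` is a name introduced here for illustration, and treating `?` or an empty string as missing is an assumption (ARFF files conventionally use `?` for missing values).

```python
import csv
import io

# Hypothetical rows for illustration only; NOT real records from the dataset.
SAMPLE_CSV = """Age,Gender,Focality
34,F,Uni-Focal
34,M,Multi-Focal
58,F,?
"""


def summarize_columns(rows, missing_tokens=("", "?")):
    """Count distinct non-missing values and missing values for each column."""
    summary = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        present = [v for v in values if v not in missing_tokens]
        summary[name] = {
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_columns(rows)
for name, stats in summary.items():
    print(f"{name}: {stats['unique']} unique values, {stats['missing']} missing")
```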

19 properties

- Number of instances (rows) of the dataset: 383
- Number of attributes (columns) of the dataset: 17
- Number of distinct values of the target attribute (if it is nominal): (no value listed)
- Number of missing values in the dataset: 0
- Number of instances with at least one value missing: 0
- Number of numeric attributes: 1
- Number of nominal attributes: 14
- Number of binary attributes: 6
- Percentage of binary attributes: 35.29
- Percentage of instances having missing values: 0
- Percentage of missing values: 0
- Average class difference between consecutive instances: (no value listed)
- Percentage of numeric attributes: 5.88
- Number of attributes divided by the number of instances: 0.04
- Percentage of nominal attributes: 82.35
- Percentage of instances belonging to the most frequent class: (no value listed)
- Number of instances belonging to the most frequent class: (no value listed)
- Percentage of instances belonging to the least frequent class: (no value listed)
- Number of instances belonging to the least frequent class: (no value listed)
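
The percentage and ratio properties follow directly from the counts reported on this page (383 instances, 17 attributes, 1 numeric, 14 nominal, 6 binary). The short check below reproduces them. The variable names and the `pct` helper are introduced here for illustration, and rounding to two decimals is an assumption based on how the values are displayed.

```python
# Counts as reported in the property list on this page.
n_instances = 383
n_attributes = 17
n_numeric = 1
n_nominal = 14
n_binary = 6


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


pct_binary = pct(n_binary, n_attributes)
pct_numeric = pct(n_numeric, n_attributes)
pct_nominal = pct(n_nominal, n_attributes)
attrs_per_instance = round(n_attributes / n_instances, 2)

print(pct_binary, pct_numeric, pct_nominal, attrs_per_instance)
```

The two attributes not covered by the numeric and nominal counts are `T` and `M`, which the feature list reports as strings.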

0 tasks
