Parkinson_Dataset

Status: active. Format: ARFF. License: Public Domain (CC0). Visibility: public. Uploaded 31-05-2024 by Iwo Godzwon.
Description: This dataset, named "parkinsons data.csv", encompasses a collection of voice measurement data from individuals, some of whom have Parkinson's disease. It includes a diverse range of voice signal attributes aimed at assisting in the early diagnosis and tracking of Parkinson's disease through non-invasive methods.

The dataset contains several columns such as 'MDVP:Fo(Hz)', 'MDVP:Fhi(Hz)', 'MDVP:Flo(Hz)', signifying the voice frequency measurements, and others like 'MDVP:Jitter(%)', 'MDVP:Shimmer', 'NHR', 'HNR', relaying the variation in voice frequency and amplitude. 'Status' is a binary indicator where '1' denotes the presence and '0' the absence of Parkinson's disease. Additional metrics relevant to voice disorders are included, covering various aspects of voice quality and dynamics, such as 'RPDE', 'DFA', 'spread1', 'spread2', 'D2', and 'PPE', offering a comprehensive set of features for analysis.

Attribute Description:
1. MDVP:Fo(Hz): Average vocal fundamental frequency.
2. MDVP:Fhi(Hz): Maximum vocal fundamental frequency.
3. MDVP:Flo(Hz): Minimum vocal fundamental frequency.
4. MDVP:Jitter(%), MDVP:Jitter(Abs), MDVP:RAP, MDVP:PPQ, Jitter:DDP: Various measures of variation in frequency.
5. MDVP:Shimmer, MDVP:Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, MDVP:APQ, Shimmer:DDA: Different measures of variation in amplitude.
6. NHR, HNR: Ratios depicting noise components in the voice.
7. Status: Binary status for the presence of Parkinson's disease.
8. RPDE, DFA, spread1, spread2, D2, PPE: Nonlinear dynamical measurements.

Use Case: This dataset can serve multiple purposes ranging from academic research in biomedical voice signal processing to the practical development of diagnostic tools for early detection of Parkinson's disease. It can be utilized by data scientists and researchers to devise machine learning models capable of distinguishing between healthy individuals and those affected by Parkinson's disease based on voice measurements alone. Furthermore, the dataset can contribute to enhancing our understanding of how Parkinson's disease impacts voice characteristics, aiding in the development of new therapies and treatments.
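The attribute families in the description above can be written down as a small mapping, which is a convenient starting point for selecting model inputs. The sketch below is illustrative only: the group names (`fundamental_frequency` and so on) and the helper `feature_columns` are hypothetical, while the column names are copied from this page. The target column is spelled `status` in lowercase, as it appears in the feature list.

```python
# Column families taken from the attribute description on this page.
# The group names are illustrative labels, not part of the dataset.
FEATURE_GROUPS = {
    "fundamental_frequency": ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)"],
    "frequency_variation": [
        "MDVP:Jitter(%)", "MDVP:Jitter(Abs)", "MDVP:RAP", "MDVP:PPQ", "Jitter:DDP",
    ],
    "amplitude_variation": [
        "MDVP:Shimmer", "MDVP:Shimmer(dB)", "Shimmer:APQ3",
        "Shimmer:APQ5", "MDVP:APQ", "Shimmer:DDA",
    ],
    "noise_ratios": ["NHR", "HNR"],
    "nonlinear_dynamics": ["RPDE", "DFA", "spread1", "spread2", "D2", "PPE"],
}

ID_COLUMN = "name"        # recording identifier, not a model input
TARGET_COLUMN = "status"  # 1 = Parkinson's disease present, 0 = absent


def feature_columns(groups=FEATURE_GROUPS):
    """Flatten the groups into one ordered list of candidate input columns."""
    return [column for columns in groups.values() for column in columns]


if __name__ == "__main__":
    inputs = feature_columns()
    # 22 voice measurements + identifier + target = the 24 columns listed on this page.
    print(len(inputs), len(inputs) + 2)
```

Keeping the identifier and the target out of the input list avoids accidentally training on `name`, which is unique per row.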

24 features

Feature            Type     Unique values  Missing
name               string   195            0
MDVP:Fo(Hz)        numeric  195            0
MDVP:Fhi(Hz)       numeric  195            0
MDVP:Flo(Hz)       numeric  195            0
MDVP:Jitter(%)     numeric  173            0
MDVP:Jitter(Abs)   nominal  19             0
MDVP:RAP           numeric  155            0
MDVP:PPQ           numeric  165            0
Jitter:DDP         numeric  180            0
MDVP:Shimmer       numeric  188            0
MDVP:Shimmer(dB)   numeric  149            0
Shimmer:APQ3       numeric  184            0
Shimmer:APQ5       numeric  189            0
MDVP:APQ           numeric  189            0
Shimmer:DDA        numeric  189            0
NHR                numeric  185            0
HNR                numeric  195            0
status             nominal  2              0
RPDE               numeric  195            0
DFA                numeric  195            0
spread1            numeric  195            0
spread2            numeric  194            0
D2                 numeric  195            0
PPE                numeric  195            0
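In the feature list above, `MDVP:Jitter(Abs)` is typed as nominal (19 unique values) even though the description treats it as a frequency-variation measurement, so it may need converting to a number before modelling. The following is a minimal sketch of one way to do that, assuming each record arrives as a dict of strings and that the nominal labels are numeric strings (this page does not show them). The helper name `coerce_record` is hypothetical.

```python
def coerce_record(raw):
    """Convert one record (dict: column name -> string) into typed values.

    - "name" stays a string (record identifier).
    - "status" becomes an int (nominal with two values, 0 or 1).
    - every other column, including the nominal "MDVP:Jitter(Abs)",
      is parsed as a float. This ASSUMES its labels are numeric strings;
      float() raises ValueError if they are not.
    """
    typed = {}
    for column, value in raw.items():
        if column == "name":
            typed[column] = str(value)
        elif column == "status":
            typed[column] = int(value)
        else:
            typed[column] = float(value)
    return typed


if __name__ == "__main__":
    # Placeholder values for illustration only, not taken from the dataset.
    example = {"name": "example_recording", "status": "1", "MDVP:Jitter(Abs)": "7e-05"}
    print(coerce_record(example))
```

Letting `float()` raise on an unexpected label is deliberate here: a silent fallback would hide a mismatch between the assumed and the actual encoding.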

19 properties

Number of instances (rows) of the dataset: 195
Number of attributes (columns) of the dataset: 24
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 21
Number of nominal attributes: 2
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 1
Percentage of binary attributes: 4.17
Percentage of instances having missing values: 0
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.12
Percentage of numeric attributes: 87.5
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 8.33
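The ratio and percentage properties above follow directly from the listed counts. As a quick consistency check, the sketch below recomputes them from the counts given on this page; the variable and function names are illustrative, and rounding to two decimals is an assumption chosen to match the displayed values.

```python
# Counts as listed on this page.
n_instances = 195
n_attributes = 24
n_numeric = 21
n_nominal = 2
n_binary = 1


def percentage(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


derived = {
    "attributes_per_instance": round(n_attributes / n_instances, 2),
    "pct_numeric": percentage(n_numeric, n_attributes),
    "pct_nominal": percentage(n_nominal, n_attributes),
    "pct_binary": percentage(n_binary, n_attributes),
}

# The one attribute that is neither numeric nor nominal is the string column `name`.
n_other = n_attributes - n_numeric - n_nominal

if __name__ == "__main__":
    print(derived)
    print(n_other)
```

The recomputed figures agree with the listed 0.12, 87.5, 8.33, and 4.17.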

0 tasks
