Diabetes_Dataset

Status: active. Format: ARFF. License: Public Domain (CC0). Visibility: public. Uploaded 02-06-2024 by Iwo Godzwon.


Description: The "diabetes.csv" dataset is a medical dataset for evaluating machine learning models that predict diabetes from diagnostic measurements. It records clinical parameters for each patient and can serve as a basis for diabetes prediction research and healthcare analytics.

Attribute Description:
1. Pregnancies: Number of times pregnant (Sample Values: 3, 8, 2, 3, 1)
2. Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test (Sample Values: 155, 87, 87, 84, 113)
3. BloodPressure: Diastolic blood pressure (mm Hg) (Sample Values: 68, 0, 70, 80, 80)
4. SkinThickness: Triceps skin fold thickness (mm) (Sample Values: 0, 27, 39, 23, 32)
5. Insulin: 2-hour serum insulin (mu U/ml) (Sample Values: 105, 0, 110, 325, 194)
6. BMI: Body mass index (weight in kg/(height in m)^2) (Sample Values: 29.7, 38.5, 27.6, 0.0, 27.4)
7. DiabetesPedigreeFunction: Diabetes pedigree function (Sample Values: 0.466, 0.283, 0.252, 0.19, 0.355)
8. Age: Age (years) (Sample Values: 25, 33, 34, 46, 28)
9. Outcome: Class variable (0 or 1), where 1 represents the presence of diabetes and 0 represents its absence (Sample Values: 0, 1, 1, 0, 1)

Use Case: This dataset is useful for researchers, data scientists, and healthcare professionals who develop and validate predictive models for diabetes. It supports analyses ranging from basic correlations between variables to machine learning models that predict diabetes occurrence from patient data. It also suits teaching in medical data analysis and applied machine learning.
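As a rough illustration of the attribute layout described above, the sketch below parses CSV text into typed records using only the Python standard library. Several things here are assumptions rather than facts from the page: that "diabetes.csv" has a header row with exactly these column names, that the i-th sample value of each attribute belongs to the same patient, and the helper name `load_rows`, which is purely illustrative.

```python
import csv
import io

# Column names as listed in the attribute description.
INT_COLUMNS = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
               "Insulin", "Age", "Outcome"]
FLOAT_COLUMNS = ["BMI", "DiabetesPedigreeFunction"]

# Illustrative rows only: the i-th sample value of each attribute is assumed
# to belong to the same patient, which the description does not confirm.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
3,155,68,0,105,29.7,0.466,25,0
8,87,0,27,0,38.5,0.283,33,1
2,87,70,39,110,27.6,0.252,34,1
3,84,80,23,325,0.0,0.19,46,0
1,113,80,32,194,27.4,0.355,28,1
"""


def load_rows(text):
    """Parse CSV text (header row assumed) into dicts with numeric values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {name: int(record[name]) for name in INT_COLUMNS}
        row.update({name: float(record[name]) for name in FLOAT_COLUMNS})
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
print(len(rows), rows[0]["Glucose"], rows[3]["BMI"])
```

To read a real file instead, the same function could be given the contents of the file, for example `load_rows(open("diabetes.csv").read())`, provided the header matches.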

9 features

Pregnancies (numeric): 17 unique values, 0 missing
Glucose (numeric): 136 unique values, 0 missing
BloodPressure (numeric): 47 unique values, 0 missing
SkinThickness (numeric): 51 unique values, 0 missing
Insulin (numeric): 186 unique values, 0 missing
BMI (numeric): 248 unique values, 0 missing
DiabetesPedigreeFunction (numeric): 517 unique values, 0 missing
Age (numeric): 52 unique values, 0 missing
Outcome (numeric): 2 unique values, 0 missing
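
Every feature reports 0 missing values, yet the sample values in the description include zeros for BloodPressure, SkinThickness, Insulin, and BMI, which are implausible readings for a living patient. One possible reading is that zero stands for "not recorded". The page does not say so, so the sketch below treats that as an explicit assumption; the column list and the helper names `count_zeros` and `mask_zeros` are illustrative.

```python
# Columns where a literal zero is physiologically implausible. Treating zero
# as "not recorded" is an assumption, not something the dataset page states.
ZERO_AS_MISSING = ("Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI")


def count_zeros(rows, columns=ZERO_AS_MISSING):
    """Return {column: number of rows whose value is exactly zero}."""
    return {col: sum(1 for row in rows if row[col] == 0) for col in columns}


def mask_zeros(rows, columns=ZERO_AS_MISSING):
    """Return copies of the rows with suspicious zeros replaced by None."""
    return [
        {key: (None if key in columns and value == 0 else value)
         for key, value in row.items()}
        for row in rows
    ]


# Hypothetical rows assembled from the sample values in the description.
rows = [
    {"Glucose": 155, "BloodPressure": 68, "SkinThickness": 0,
     "Insulin": 105, "BMI": 29.7, "Outcome": 0},
    {"Glucose": 87, "BloodPressure": 0, "SkinThickness": 27,
     "Insulin": 0, "BMI": 38.5, "Outcome": 1},
    {"Glucose": 84, "BloodPressure": 80, "SkinThickness": 23,
     "Insulin": 325, "BMI": 0.0, "Outcome": 0},
]

zero_counts = count_zeros(rows)
masked = mask_zeros(rows)
print(zero_counts)
print(masked[1])
```

Outcome is deliberately left out of the masked columns, since 0 is a valid class label there. Pregnancies is left out as well, because zero pregnancies is a plausible value.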

19 properties

Number of instances (rows) of the dataset: 768
Number of attributes (columns) of the dataset: 9
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 9
Number of nominal attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: not reported
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 100
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
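
The ratio and percentage properties follow from the counts by simple arithmetic, for example 9 attributes / 768 instances ≈ 0.01. The sketch below recomputes them and also shows how the class-frequency properties, which have no value on the page, could be derived from the Outcome column. The function names are illustrative, and the labels passed to `class_balance` are only the five Outcome sample values from the description, not the real class counts.

```python
from collections import Counter


def dataset_properties(n_instances, n_numeric, n_nominal):
    """Derive ratio and percentage properties from basic counts."""
    n_attributes = n_numeric + n_nominal
    return {
        "attributes_per_instance": n_attributes / n_instances,
        "percent_numeric": 100.0 * n_numeric / n_attributes,
        "percent_nominal": 100.0 * n_nominal / n_attributes,
    }


def class_balance(labels):
    """Return (majority_count, majority_pct, minority_count, minority_pct)."""
    counts = Counter(labels).most_common()  # sorted from most to least frequent
    total = len(labels)
    major = counts[0][1]
    minor = counts[-1][1]
    return major, 100.0 * major / total, minor, 100.0 * minor / total


# Counts taken from the properties listed on the page.
props = dataset_properties(n_instances=768, n_numeric=9, n_nominal=0)
print(round(props["attributes_per_instance"], 2))  # 9 / 768 rounds to 0.01
print(props["percent_numeric"], props["percent_nominal"])

# Hypothetical labels: the five Outcome sample values, not the full column.
print(class_balance([0, 1, 1, 0, 1]))
```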

0 tasks
