Diabetes(scikit-learn)

Status: active. Format: ARFF. License: BSD (from scikit-learn). Visibility: public. Uploaded 05-10-2022 by louis geiler.
.. _diabetes_dataset:

Diabetes dataset
----------------

Ten baseline variables (age, sex, body mass index, average blood
pressure, and six blood serum measurements) were obtained for each of
n = 442 diabetes patients, as well as the response of interest, a
quantitative measure of disease progression one year after baseline.

Data Set Characteristics:

:Number of Instances: 442

:Number of Attributes: First 10 columns are numeric predictive values

:Target: Column 11 is a quantitative measure of disease progression one year after baseline

:Attribute Information:
    - age     age in years
    - sex
    - bmi     body mass index
    - bp      average blood pressure
    - s1      tc, total serum cholesterol
    - s2      ldl, low-density lipoproteins
    - s3      hdl, high-density lipoproteins
    - s4      tch, total cholesterol / HDL
    - s5      ltg, possibly log of serum triglycerides level
    - s6      glu, blood sugar level

Note: Each of these 10 feature variables has been mean centered and
scaled by the standard deviation times the square root of `n_samples`
(i.e. the sum of squares of each column totals 1).

Source URL:
https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html

For more information see:
Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani (2004)
"Least Angle Regression," Annals of Statistics (with discussion), 407-499.
(https://web.stanford.edu/~hastie/Papers/LARS/LeastAngle_2002.pdf)
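The scaling described in the note above can be illustrated with a minimal sketch. The helper name ``center_and_scale`` and the sample values are hypothetical and are not taken from the dataset. The sketch also assumes the population standard deviation (dividing by ``n_samples``), which is the reading under which the sum of squares of a column comes out as 1.

```python
import math


def center_and_scale(column):
    """Mean-center a column, then divide by std * sqrt(n_samples).

    Hypothetical helper for illustration; assumes the population
    standard deviation (divide by n, not n - 1).
    """
    n = len(column)
    mean = sum(column) / n
    centered = [x - mean for x in column]
    std = math.sqrt(sum(c * c for c in centered) / n)
    scale = std * math.sqrt(n)
    return [c / scale for c in centered]


# Illustrative values only; these are not rows from the diabetes data.
raw_column = [59.0, 48.0, 72.0, 24.0, 50.0]
scaled = center_and_scale(raw_column)

column_sum = sum(scaled)                      # close to 0 after centering
sum_of_squares = sum(v * v for v in scaled)   # close to 1 after scaling

print(column_sum, sum_of_squares)
```

Because ``std * sqrt(n_samples)`` equals the square root of the summed squared deviations, dividing by it gives each column unit Euclidean norm, which is why the scaled feature values in this dataset are small in magnitude.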

11 features

============== ======= ============= ==============
Feature        Type    Unique values Missing values
============== ======= ============= ==============
class (target) numeric 214           0
age            numeric 58            0
sex            numeric 2             0
bmi            numeric 163           0
bp             numeric 100           0
s1             numeric 141           0
s2             numeric 302           0
s3             numeric 63            0
s4             numeric 66            0
s5             numeric 184           0
s6             numeric 56            0
============== ======= ============= ==============

19 properties

======================================================================== ======
Property                                                                 Value
======================================================================== ======
Number of instances (rows) of the dataset.                               442
Number of attributes (columns) of the dataset.                           11
Number of distinct values of the target attribute (if it is nominal).    0
Number of missing values in the dataset.                                 0
Number of instances with at least one value missing.                     0
Number of numeric attributes.                                            11
Number of nominal attributes.                                            0
Percentage of instances belonging to the least frequent class.
Number of instances belonging to the least frequent class.
Number of binary attributes.                                             0
Percentage of binary attributes.                                         0
Percentage of instances having missing values.                           0
Average class difference between consecutive instances.                  -84.64
Percentage of missing values.                                            0
Number of attributes divided by the number of instances.                 0.02
Percentage of numeric attributes.                                        100
Percentage of instances belonging to the most frequent class.
Percentage of nominal attributes.                                        0
Number of instances belonging to the most frequent class.
======================================================================== ======
