Data
OpenML
adult
Status: active
Format: ARFF
Visibility: public
Uploaded 31-05-2022 by Mine Gazioglu
Tags: Computational Universe, Life Science
Predict whether income exceeds $50K/yr based on census data. Also known as the Census Income dataset. The train and test sets are combined. Null values, originally represented with a question mark, are replaced with NA. 52 duplicate rows were found and dropped.
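The cleaning steps described above could be reproduced roughly as follows. This is a minimal sketch, assuming pandas and a hypothetical four-row sample with a subset of the columns; it is not the uploader's actual script, and the sample values are illustrative only.

```python
import io

import pandas as pd

# Hypothetical sample in the style of the raw census files (not real records).
raw = io.StringIO(
    "age,workclass,occupation,class\n"
    "39,State-gov,Adm-clerical,<=50K\n"
    "54,?,?,>50K\n"
    "39,State-gov,Adm-clerical,<=50K\n"
    "28,Private,?,<=50K\n"
)

# na_values="?" makes pandas read question marks as missing values (NaN).
df = pd.read_csv(raw, na_values="?")

# Drop exact duplicate rows, keeping the first occurrence of each.
deduped = df.drop_duplicates().reset_index(drop=True)

n_dropped = len(df) - len(deduped)
missing_per_column = deduped.isna().sum()

print(f"dropped {n_dropped} duplicate row(s)")
print(missing_per_column)
```

Reading the question marks as missing before deduplicating means two rows that differ only in how a null was written would still be treated as duplicates; whether the original preprocessing ran in that order is an assumption.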
15 features
class (target): nominal, 2 unique values, 0 missing
age: numeric, 74 unique values, 0 missing
workclass: nominal, 8 unique values, 2795 missing
fnlwgt: numeric, 28523 unique values, 0 missing
education: nominal, 16 unique values, 0 missing
education_num: numeric, 16 unique values, 0 missing
marital_status: nominal, 7 unique values, 0 missing
occupation: nominal, 14 unique values, 2805 missing
relationship: nominal, 6 unique values, 0 missing
race: nominal, 5 unique values, 0 missing
sex: nominal, 2 unique values, 0 missing
capital_gain: numeric, 123 unique values, 0 missing
capital_loss: numeric, 99 unique values, 0 missing
hours_per_week: numeric, 96 unique values, 0 missing
native_country: nominal, 41 unique values, 856 missing
19 properties
NumberOfInstances: 48790. Number of instances (rows) of the dataset.
NumberOfFeatures: 15. Number of attributes (columns) of the dataset.
NumberOfClasses: 2. Number of distinct values of the target attribute (if it is nominal).
NumberOfMissingValues: 6456. Number of missing values in the dataset.
NumberOfInstancesWithMissingValues: 3615. Number of instances with at least one value missing.
NumberOfNumericFeatures: 6. Number of numeric attributes.
NumberOfSymbolicFeatures: 9. Number of nominal attributes.
MinorityClassPercentage: 23.94. Percentage of instances belonging to the least frequent class.
MinorityClassSize: 11681. Number of instances belonging to the least frequent class.
NumberOfBinaryFeatures: 2. Number of binary attributes.
PercentageOfBinaryFeatures: 13.33. Percentage of binary attributes.
PercentageOfInstancesWithMissingValues: 7.41. Percentage of instances having missing values.
AutoCorrelation: 0.63. Average class difference between consecutive instances.
PercentageOfMissingValues: 0.88. Percentage of missing values.
Dimensionality: 0. Number of attributes divided by the number of instances.
PercentageOfNumericFeatures: 40. Percentage of numeric attributes.
PercentageOfSymbolicFeatures: 60. Percentage of nominal attributes.
MajorityClassPercentage: 76.06. Percentage of instances belonging to the most frequent class.
MajorityClassSize: 37109. Number of instances belonging to the most frequent class.
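The percentage properties above can be re-derived from the raw counts on this page. The sketch below is a consistency check, not OpenML's own computation: the formulas are assumptions read off the property descriptions, in particular that PercentageOfMissingValues is taken over all cells (instances times features). The helper `pct` is a hypothetical name introduced for illustration.

```python
# Counts copied from the property list on this page.
n_instances = 48790
n_features = 15
n_missing_values = 6456
n_instances_with_missing = 3615
minority_class_size = 11681
majority_class_size = 37109
n_binary_features = 2
n_numeric_features = 6
n_symbolic_features = 9


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


derived = {
    "MinorityClassPercentage": pct(minority_class_size, n_instances),
    "MajorityClassPercentage": pct(majority_class_size, n_instances),
    "PercentageOfInstancesWithMissingValues": pct(n_instances_with_missing, n_instances),
    # Assumed denominator: every cell in the table (rows x columns).
    "PercentageOfMissingValues": pct(n_missing_values, n_instances * n_features),
    "PercentageOfBinaryFeatures": pct(n_binary_features, n_features),
    "PercentageOfNumericFeatures": pct(n_numeric_features, n_features),
    "PercentageOfSymbolicFeatures": pct(n_symbolic_features, n_features),
}

# Attributes divided by instances, left unrounded.
dimensionality = n_features / n_instances

# The two class sizes should account for every instance.
classes_sum_ok = minority_class_size + majority_class_size == n_instances

for name, value in derived.items():
    print(f"{name}: {value}")
print(f"Dimensionality: {dimensionality:.6f}")
```

Under these assumptions the derived percentages agree with the listed values, and the unrounded dimensionality is about 0.0003, which is consistent with the page showing 0 after rounding.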