OpenML


Income_Adult_Predictor

Status: active. Format: ARFF. License: Public Domain (CC0). Visibility: public. Uploaded 31-05-2024 by Iwo Godzwon.


Description: The adult.csv dataset is a collection of socio-economic data on adult individuals. Its attributes cover demographics, education, employment, and income indicators. The dataset is designed to offer insight into factors influencing income levels, providing a foundation for socio-economic analysis, labor market studies, and educational outcome research.

Attribute Description:
- age: An individual's age. Sample values are integers ranging from 23 to 58.
- workclass: The type of employing sector. Examples include 'State-gov', 'Federal-gov', and 'Private'; unspecified categories are represented as '?'.
- fnlwgt: Final weight. This number reflects the number of people the census believes the entry represents. Sample values range from 107302 to 261012.
- education: The highest level of education attained by an individual. Sample categories include 'Bachelors' and 'HS-grad'.
- education.num: A numerical representation of the highest education attained. Sample values range from 9 to 13.
- marital.status: Marital status of the individual, e.g., 'Married-civ-spouse', 'Separated', 'Never-married'.
- occupation: The individual's occupation, e.g., 'Prof-specialty', 'Transport-moving', 'Exec-managerial'.
- relationship: The individual's role in the family, such as 'Wife', 'Husband', 'Not-in-family'.
- race: Race of the individual, with examples including 'Black' and 'White'.
- sex: The sex of the individual, either 'Male' or 'Female'.
- capital.gain: Capital gains recorded. Sample entries are uniformly 0.
- capital.loss: Capital losses recorded. Sample values are consistently 0.
- hours.per.week: Number of hours worked per week. Sample values include 20, 35, and 40.
- native.country: Country of origin. All sample individuals are from 'United-States'.
- income: Income category, either '<=50K' or '>50K'.

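The attribute list above mixes numeric columns with categorical ones and uses '?' for unspecified categories. The following is a minimal sketch of how such a file might be parsed with Python's standard `csv` module, mapping '?' to `None` and casting a few numeric columns to integers. The rows in `SAMPLE_CSV` are hypothetical, assembled from the sample values quoted in the description rather than copied from the file, and the names `load_rows` and `NUMERIC_COLUMNS` are illustrative assumptions.

```python
import csv
import io

# Hypothetical rows assembled from sample values quoted in the description;
# they are illustrative only and not copied from the actual file.
SAMPLE_CSV = """age,workclass,education,education.num,hours.per.week,native.country,income
23,Private,HS-grad,9,40,United-States,<=50K
58,?,Bachelors,13,20,United-States,>50K
35,State-gov,Bachelors,13,35,United-States,<=50K
"""

# Subset of the numeric attributes, chosen for this illustration.
NUMERIC_COLUMNS = {"age", "education.num", "hours.per.week"}


def load_rows(text):
    """Parse CSV text, mapping '?' to None and casting numeric columns to int."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, value in record.items():
            if value == "?":
                row[name] = None  # treat '?' as a missing value
            elif name in NUMERIC_COLUMNS:
                row[name] = int(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
```

For the real file, the same function could be applied to the text read from adult.csv, with `NUMERIC_COLUMNS` extended to the remaining numeric attributes.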
Use Case: The adult.csv dataset is pivotal for studies focusing on income disparity, employment trends, the impact of education on earnings, and demographic analysis. Researchers and policymakers can leverage this dataset to understand the dynamics of the labor market, identify educational or skill gaps, and develop targeted social welfare programs. Moreover, it serves as a valuable dataset for machine learning projects aimed at predicting income levels based on a wide range of socio-economic factors.
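For the income-prediction use case, a common first step is to measure how well a trivial model does, so that later models have a reference point. The sketch below computes a majority-class baseline: it always predicts the most frequent income category and reports the resulting accuracy. The `labels` list is hypothetical and says nothing about the real class balance, and `majority_baseline` is an illustrative helper name.

```python
from collections import Counter


def majority_baseline(labels):
    """Return (most common label, accuracy of always predicting it)."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


# Hypothetical labels for illustration only; not taken from the dataset.
labels = ["<=50K", "<=50K", ">50K", "<=50K"]
label, accuracy = majority_baseline(labels)
```

Any classifier trained on the other 14 attributes would need to beat this accuracy on held-out data to be considered useful.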

15 features

Feature	Type	Unique values	Missing values
age	numeric	73	0
workclass	nominal	8	1836
fnlwgt	numeric	21648	0
education	nominal	16	0
education.num	numeric	16	0
marital.status	nominal	7	0
occupation	nominal	14	1843
relationship	nominal	6	0
race	nominal	5	0
sex	nominal	2	0
capital.gain	numeric	119	0
capital.loss	numeric	92	0
hours.per.week	numeric	94	0
native.country	nominal	41	583
income	nominal	2	0
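
The feature table reports missing values only for workclass, occupation, and native.country. Assuming those counts correspond to entries recorded as '?', a per-column tally like the one below could be used to check them against a local copy of the data. The rows here are hypothetical and `count_missing` is an illustrative helper, so this is a sketch rather than a reproduction of how OpenML computes its statistics.

```python
from collections import Counter


def count_missing(rows, missing_token="?"):
    """Count, per column, how many rows hold None or the missing-value token."""
    missing = Counter()
    for row in rows:
        for name, value in row.items():
            if value is None or value == missing_token:
                missing[name] += 1
    return missing


# Hypothetical rows for illustration only; not taken from the dataset.
rows = [
    {"workclass": "Private", "occupation": "Exec-managerial", "native.country": "United-States"},
    {"workclass": "?", "occupation": "?", "native.country": "United-States"},
    {"workclass": "State-gov", "occupation": "Prof-specialty", "native.country": "?"},
]
missing = count_missing(rows)
```

Columns with no missing entries are absent from the counter and read as 0 when looked up.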