cps88wages

active, ARFF, publicly available, visibility: public, uploaded 22-12-2022 by Sebastian Fischer
  • Tags: Life Science, Machine Learning, study_353
Data Description

The data are a sample of males from the March 1988 Current Population Survey (CPS). The March CPS contains information on the previous year's wages, schooling, industry, and occupation. The sample consists of men aged 18 to 70 with positive annual income greater than 50 dollars in 1992 who are neither self-employed nor working without pay. Wages are deflated by the deflator of Personal Consumption Expenditure for 1992. The data contain 28,155 observations on 7 variables characterizing the individuals. The goal is to estimate the wage from the information about working individuals.

Attribute Description

1. *wage* - target feature
2. *education* - years of schooling
3. *experience* - years of potential work experience
4. *ethnicity* - race ("cauc", "afam")
5. *smsa* - whether living in an SMSA (Standard Metropolitan Statistical Area) ("no", "yes")
6. *region* - living region ("northeast", "midwest", "south", "west")
7. *parttime* - whether working part-time ("no", "yes")
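The attribute description can be turned into a small schema check. The following is a minimal sketch in Python, not part of the dataset or of any OpenML tooling: the helper name `validate_row` and the example record are hypothetical, and only the column names and nominal levels are taken from the description.

```python
# Allowed levels for the nominal attributes, copied from the attribute description.
NOMINAL_LEVELS = {
    "ethnicity": {"cauc", "afam"},
    "smsa": {"no", "yes"},
    "region": {"northeast", "midwest", "south", "west"},
    "parttime": {"no", "yes"},
}
# Attributes the description lists as numeric.
NUMERIC_COLUMNS = ("wage", "education", "experience")


def validate_row(row):
    """Return a list of problems found in one record (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []
    for name in NUMERIC_COLUMNS:
        value = row.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
    for name, levels in NOMINAL_LEVELS.items():
        value = row.get(name)
        if value not in levels:
            problems.append(f"{name}: {value!r} not in {sorted(levels)}")
    return problems


# Made-up record for illustration; it is NOT taken from the dataset.
example = {
    "wage": 500.0,
    "education": 12,
    "experience": 10,
    "ethnicity": "cauc",
    "smsa": "yes",
    "region": "south",
    "parttime": "no",
}
print(validate_row(example))
```

A record with an unlisted level, for example `region="north"`, would produce one entry in the returned list.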

7 features

wage (target): numeric, 5970 unique values, 0 missing
education: numeric, 19 unique values, 0 missing
experience: numeric, 67 unique values, 0 missing
ethnicity: nominal, 2 unique values, 0 missing
smsa: nominal, 2 unique values, 0 missing
region: nominal, 4 unique values, 0 missing
parttime: nominal, 2 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 28155
Number of attributes (columns) of the dataset: 7
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 3
Number of nominal attributes: 4
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 3
Percentage of binary attributes: 42.86
Percentage of instances having missing values: 0
Average class difference between consecutive instances: -389.44
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 42.86
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 57.14
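The attribute-count properties follow from the feature list by simple arithmetic. The sketch below recomputes them in Python from the types and unique-value counts shown on this page. Treating "binary" as a nominal attribute with exactly two unique values is an assumption, as is the reading that the attribute-to-instance ratio appears as 0 because of rounding.

```python
# Feature types and unique-value counts as listed on this page.
FEATURES = {
    "wage": ("numeric", 5970),
    "education": ("numeric", 19),
    "experience": ("numeric", 67),
    "ethnicity": ("nominal", 2),
    "smsa": ("nominal", 2),
    "region": ("nominal", 4),
    "parttime": ("nominal", 2),
}
N_INSTANCES = 28155

n_attributes = len(FEATURES)
n_numeric = sum(1 for kind, _ in FEATURES.values() if kind == "numeric")
n_nominal = sum(1 for kind, _ in FEATURES.values() if kind == "nominal")
# Assumption: "binary" means a nominal attribute with exactly two levels.
n_binary = sum(
    1 for kind, n_unique in FEATURES.values() if kind == "nominal" and n_unique == 2
)

pct_numeric = round(100 * n_numeric / n_attributes, 2)
pct_nominal = round(100 * n_nominal / n_attributes, 2)
pct_binary = round(100 * n_binary / n_attributes, 2)

# Attributes divided by instances; a very small ratio that rounds to 0.
dimensionality = n_attributes / N_INSTANCES

print(n_numeric, n_nominal, n_binary)
print(pct_numeric, pct_nominal, pct_binary)
print(dimensionality)
```

Under these assumptions the counts (3 numeric, 4 nominal, 3 binary) and the percentages (42.86, 57.14, 42.86) agree with the values listed above.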

1 task

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: wage
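As an illustration of what a 10-fold cross-validation over the 28,155 rows involves, here is a minimal pure-Python sketch of the index splitting. It does not reproduce the exact splits of the task above: the function name `k_fold_indices`, the shuffling, and the seed are assumptions made for this example.

```python
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper: rows are shuffled once, then dealt into k folds,
    and each fold serves as the test set exactly once.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 28155 rows split into 10 folds: five folds of 2816 rows and five of 2815.
splits = list(k_fold_indices(28155, k=10))
print(sorted(len(test) for _, test in splits))
```

In each of the ten iterations a model would be fitted on the training rows to predict `wage` and evaluated on the held-out fold.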