health_insurance

active ARFF GPL (>= 2) Visibility: public Uploaded 22-12-2022 by Sebastian Fischer
  • Demographics Geography study_353
Data Description

The dataset is a cross-sectional study conducted in the United States in 1993. It covers health insurance and the hours worked by wives. Each instance describes one married woman.

Attribute Description

1. *whrswk* - hours worked per week by the wife (target feature)
2. *hhi* - whether the wife is covered by her husband's health insurance
3. *whi* - whether the wife has health insurance through her own job
4. *hhi2* - whether the husband has health insurance through his own job
5. *education* - a factor with levels "<9years", "9-11years", "12years", "13-15years", "16years", ">16years"
6. *race* - "white", "black", "other"
7. *hispanic* - "yes" or "no"
8. *experience* - years of potential work experience
9. *kidslt6* - number of kids under the age of 6
10. *kids618* - number of kids 6-18 years old
11. *husby* - husband's income in thousands of dollars
12. *region* - one of "other", "northcentral", "south", "west"
13. *wght* - sampling weight (should be ignored)
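The column roles in the attribute description can be written down as a small schema. The following is a minimal sketch in plain Python, not part of the dataset or of any OpenML API: the names `ALL_COLUMNS`, `NOMINAL_LEVELS`, `predictor_columns` and `invalid_levels` are hypothetical helpers, and the example record at the end is made up for illustration. Levels for *hhi*, *whi* and *hhi2* are left out because the description does not list them.

```python
# Column names in the order given by the attribute description.
ALL_COLUMNS = [
    "whrswk", "hhi", "whi", "hhi2", "education", "race", "hispanic",
    "experience", "kidslt6", "kids618", "husby", "region", "wght",
]

TARGET = "whrswk"      # hours worked per week by the wife
IGNORED = {"wght"}     # sampling weight, flagged "should be ignored"

# Levels copied from the description. hhi, whi and hhi2 are omitted
# because their level names are not stated there.
NOMINAL_LEVELS = {
    "education": ["<9years", "9-11years", "12years",
                  "13-15years", "16years", ">16years"],
    "race": ["white", "black", "other"],
    "hispanic": ["yes", "no"],
    "region": ["other", "northcentral", "south", "west"],
}


def predictor_columns():
    """Return every column except the target and the ignored weight."""
    return [c for c in ALL_COLUMNS if c != TARGET and c not in IGNORED]


def invalid_levels(record):
    """Return {column: value} for nominal values outside the listed levels."""
    return {
        col: value
        for col, value in record.items()
        if col in NOMINAL_LEVELS and value not in NOMINAL_LEVELS[col]
    }


# Made-up record, used only to exercise the check.
example = {"race": "white", "region": "east", "hispanic": "no"}
print(len(predictor_columns()))
print(invalid_levels(example))
```

Dropping the target and the sampling weight leaves 11 predictor columns, and the check flags `"east"` because it is not one of the four listed regions.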

12 features

whrswk (target): numeric, 75 unique values, 0 missing
hhi: nominal, 2 unique values, 0 missing
whi: nominal, 2 unique values, 0 missing
hhi2: nominal, 2 unique values, 0 missing
education: nominal, 6 unique values, 0 missing
race: nominal, 3 unique values, 0 missing
hispanic: nominal, 2 unique values, 0 missing
experience: numeric, 100 unique values, 0 missing
kidslt6: numeric, 6 unique values, 0 missing
kids618: numeric, 9 unique values, 0 missing
husby: numeric, 2540 unique values, 0 missing
region: nominal, 4 unique values, 0 missing
wght (ignore): numeric, 13875 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 22272
Number of attributes (columns) of the dataset: 12
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 5
Number of nominal attributes: 7
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 4
Percentage of binary attributes: 33.33
Percentage of instances having missing values: 0
Average class difference between consecutive instances: -18.49
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 41.67
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 58.33
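The percentage properties follow directly from the attribute counts. The short check below is a sketch that assumes the percentages are computed over the 12 counted attributes and rounded to two decimals; the helper `pct` is a hypothetical name, not an OpenML function.

```python
# Counts taken from the property list.
n_instances = 22272
n_attributes = 12
n_numeric = 5
n_nominal = 7
n_binary = 4


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


pct_numeric = pct(n_numeric, n_attributes)
pct_nominal = pct(n_nominal, n_attributes)
pct_binary = pct(n_binary, n_attributes)

# Attributes per instance; small enough that it rounds to 0 at two decimals.
dimensionality = n_attributes / n_instances

print(pct_numeric, pct_nominal, pct_binary, round(dimensionality, 2))
```

Under these assumptions the computed values agree with the listed 41.67, 58.33 and 33.33, and the attributes-per-instance ratio (about 0.0005) rounds to the listed 0.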

1 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: whrswk
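To give a sense of what 10-fold cross-validation means for a dataset of this size, the sketch below computes the test-fold sizes for 22272 rows. It assumes the common convention of giving the leftover rows to the first folds; the fold assignment actually used by the task may differ, and `fold_sizes` is a hypothetical helper.

```python
def fold_sizes(n_samples, n_folds):
    """Split n_samples into n_folds sizes that differ by at most one row."""
    base, extra = divmod(n_samples, n_folds)
    # Assumption: the first `extra` folds each take one additional row.
    return [base + 1 if i < extra else base for i in range(n_folds)]


sizes = fold_sizes(22272, 10)
print(sizes)

# In each round one fold is held out for testing and the rest is used for training.
train_sizes = [22272 - s for s in sizes]
print(train_sizes)
```

Each round therefore tests on roughly a tenth of the rows and trains on the remaining nine tenths, with *whrswk* as the value to predict.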