medical_cost

Status: active
Format: ARFF
License: Database: Open Database, Contents: Database Contents
Visibility: public
Uploaded: 23-07-2024 by Bruno Belucci Teixeira
From original source:
-----
Context

Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. As far as I can tell, Packt Publishing does not make its datasets available online unless you buy the book and create a user account, which can be a problem if you are checking the book out from the library or borrowing it from a friend. All of these datasets are in the public domain but needed some cleaning up and recoding to match the format in the book.

Content

Columns
- age: age of the primary beneficiary
- sex: gender of the insurance contractor (female, male)
- bmi: body mass index, an objective index of body weight relative to height (kg / m^2), computed as weight divided by height squared; it indicates whether a weight is relatively high or low for a given height, ideally 18.5 to 24.9
- children: number of children covered by health insurance / number of dependents
- smoker: smoking status
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)
- charges: individual medical costs billed by health insurance
-----
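The bmi column description above gives the unit (kg / m^2) and an ideal range of 18.5 to 24.9. As a minimal sketch of that arithmetic, the helper names `bmi` and `in_ideal_range` below are hypothetical and are not part of the dataset or of any library; the dataset itself stores only the resulting bmi value, not weight or height.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def in_ideal_range(value: float, low: float = 18.5, high: float = 24.9) -> bool:
    """Check a BMI value against the 18.5 to 24.9 range quoted in the description."""
    return low <= value <= high


# Illustrative inputs only; these numbers are not taken from the dataset.
example = bmi(70.0, 1.75)
print(round(example, 2), in_ideal_range(example))
```

Keeping the range bounds as parameters makes it easy to swap in a different reference range if one is preferred.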

7 features

Feature            Type      Unique values   Missing
charges (target)   numeric   1337            0
age                nominal   47              0
sex                nominal   2               0
bmi                numeric   548             0
children           numeric   6               0
smoker             nominal   2               0
region             nominal   4               0

19 properties

Number of instances (rows) of the dataset: 1338
Number of attributes (columns) of the dataset: 7
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 3
Number of nominal attributes: 4
Average class difference between consecutive instances: -12304.18
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 42.86
Percentage of nominal attributes: 57.14
Percentage of instances belonging to the most frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 2
Percentage of binary attributes: 28.57
Percentage of instances having missing values: 0
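Several of the properties above are simple ratios of the listed counts. The sketch below recomputes them from the counts on this page, assuming the page rounds to two decimal places; the variable and function names are illustrative only.

```python
# Counts as listed on this page.
n_instances = 1338
n_attributes = 7
n_numeric = 3
n_nominal = 4
n_binary = 2


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Number of attributes divided by the number of instances.
dimensionality = round(n_attributes / n_instances, 2)

pct_numeric = percentage(n_numeric, n_attributes)
pct_nominal = percentage(n_nominal, n_attributes)
pct_binary = percentage(n_binary, n_attributes)

print(dimensionality, pct_numeric, pct_nominal, pct_binary)
```

Under that rounding assumption, the results match the listed values 0.01, 42.86, 57.14 and 28.57.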

1 task

0 runs - estimation_procedure: 10-fold cross-validation - target_feature: charges
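To illustrate what a 10-fold cross-validation procedure does with the 1338 rows of this dataset, here is a minimal standard-library sketch. It is an assumption-laden illustration, not the split used by the task itself: the function name `k_fold_indices`, the shuffling, and the seed are all hypothetical choices.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_rows: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) row-index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # hypothetical seed, for repeatability only
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(1338, k=10))
print([len(test) for _, test in splits])
```

Each row lands in exactly one test fold, so a model evaluated this way is scored once on every instance while being trained on the remaining nine folds.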