WorkersCompensation

Status: active. Format: ARFF. License: Public Domain (CC0). Visibility: public. Uploaded 11-05-2021 by Michael Mayer.


This dataset describes 100,000 realistic, synthetically generated workers' compensation insurance claims. Along with the ultimate financial loss, each claim is described by the initial case estimate, the date of the accident, the reporting date, a text description of the accident, and demographic information about the worker. The dataset was kindly created and provided by Colin Priest. While similar, it is not identical to the dataset used in www.kaggle.com/c/actuarial-loss-estimation.

14 features

UltimateIncurredClaimCost (target): numeric, 47199 unique values, 0 missing
ClaimNumber (row identifier): string, 100000 unique values, 0 missing
DateTimeOfAccident: string, 55797 unique values, 0 missing
DateReported: string, 4963 unique values, 0 missing
Age: numeric, 64 unique values, 0 missing
Gender: string, 2 unique values, 0 missing
MaritalStatus: string, 3 unique values, 0 missing
DependentChildren: numeric, 7 unique values, 0 missing
DependentsOther: numeric, 3 unique values, 0 missing
WeeklyPay: numeric, 1778 unique values, 0 missing
PartTimeFullTime: string, 2 unique values, 0 missing
HoursWorkedPerWeek: numeric, 109 unique values, 0 missing
DaysWorkedPerWeek: numeric, 7 unique values, 0 missing
ClaimDescription: string, 97219 unique values, 0 missing
InitialCaseEstimate: numeric, 4379 unique values, 0 missing
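
The two date columns are stored as strings, so a derived quantity such as the reporting delay needs them parsed first. The following is a minimal sketch, not part of the dataset page: it assumes the strings are in ISO 8601 form (the page does not state the format), the `Claim` class and `reporting_delay_days` helper are hypothetical names, and the example row is made up for illustration rather than taken from the data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Claim:
    """A subset of the listed columns, typed as the page reports them."""
    claim_number: str             # ClaimNumber (row identifier)
    date_time_of_accident: str    # DateTimeOfAccident (string)
    date_reported: str            # DateReported (string)
    initial_case_estimate: float  # InitialCaseEstimate (numeric)
    ultimate_cost: float          # UltimateIncurredClaimCost (target, numeric)


def reporting_delay_days(claim: Claim) -> int:
    """Whole calendar days between accident and report.

    ASSUMPTION: both strings are ISO 8601; adjust the parsing if the
    actual file uses another format.
    """
    accident = datetime.fromisoformat(claim.date_time_of_accident)
    reported = datetime.fromisoformat(claim.date_reported)
    return (reported.date() - accident.date()).days


# Made-up example row, NOT taken from the dataset.
example = Claim("EX0001", "2005-01-13T09:00:00", "2005-01-24T00:00:00", 5000.0, 6100.0)
delay = reporting_delay_days(example)
```

Comparing calendar dates rather than full timestamps keeps the delay in whole days even when the report carries no time of day.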

19 properties

Number of instances (rows) of the dataset: 100000
Number of attributes (columns) of the dataset: 14
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 8
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 57.14
Percentage of instances belonging to the most frequent class: (no value reported)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value reported)
Percentage of instances belonging to the least frequent class: (no value reported)
Number of instances belonging to the least frequent class: (no value reported)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: -20926.77
Percentage of missing values: 0
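
Two of the listed properties follow directly from the counts and can be checked by hand. This short sketch uses only the figures reported on the page; the variable names are illustrative, and the two-decimal rounding is an assumption about how the page displays values.

```python
n_instances = 100_000  # rows, as reported
n_attributes = 14      # columns, as reported
n_numeric = 8          # numeric attributes, as reported

# Percentage of numeric attributes: 8 of 14.
pct_numeric = round(100 * n_numeric / n_attributes, 2)

# Attributes divided by instances; too small to survive two-decimal rounding.
dimensionality = n_attributes / n_instances
dimensionality_shown = round(dimensionality, 2)
```

Here `pct_numeric` matches the reported 57.14, and the ratio of 0.00014 rounds to the reported 0.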

0 tasks
