insurance_dataset

Status: active. Format: ARFF. Publicly available (visibility: public). Uploaded 31-01-2022 by Oleksandr Zadorozhnyi.


Dataset description
Insurance is a Bayesian network for evaluating car insurance risks.

Format of the dataset
The insurance data set contains the following 27 variables:
GoodStudent (good student): a two-level factor with levels False and True.
Age (age): a three-level factor with levels Adolescent, Adult and Senior.
SocioEcon (socio-economic status): a four-level factor with levels Prole, Middle, UpperMiddle and Wealthy.
RiskAversion (risk aversion): a four-level factor with levels Psychopath, Adventurous, Normal and Cautious.
VehicleYear (vehicle age): a two-level factor with levels Current and older.
ThisCarDam (damage to this car): a four-level factor with levels None, Mild, Moderate and Severe.
RuggedAuto (ruggedness of the car): a three-level factor with levels EggShell, Football and Tank.
Accident (severity of the accident): a four-level factor with levels None, Mild, Moderate and Severe.
MakeModel (car's model): a five-level factor with levels SportsCar, Economy, FamilySedan, Luxury and SuperLuxury.
DrivQuality (driving quality): a three-level factor with levels Poor, Normal and Excellent.
Mileage (mileage): a four-level factor with levels FiveThou, TwentyThou, FiftyThou and Domino.
Antilock (ABS): a two-level factor with levels False and True.
DrivingSkill (driving skill): a three-level factor with levels SubStandard, Normal and Expert.
SeniorTrain (senior training): a two-level factor with levels False and True.
ThisCarCost (costs for the insured car): a four-level factor with levels Thousand, TenThou, HundredThou and Million.
Theft (theft): a two-level factor with levels False and True.
CarValue (value of the car): a five-level factor with levels FiveThou, TenThou, TwentyThou, FiftyThou and Million.
HomeBase (neighbourhood type): a four-level factor with levels Secure, City, Suburb and Rural.
AntiTheft (anti-theft system): a two-level factor with levels False and True.
PropCost (ratio of the cost for the two cars): a four-level factor with levels Thousand, TenThou, HundredThou and Million.
OtherCarCost (costs for the other car): a four-level factor with levels Thousand, TenThou, HundredThou and Million.
OtherCar (other cars involved in the accident): a two-level factor with levels False and True.
MedCost (cost of the medical treatment): a four-level factor with levels Thousand, TenThou, HundredThou and Million.
Cushioning (cushioning): a four-level factor with levels Poor, Fair, Good and Excellent.
Airbag (airbag): a two-level factor with levels False and True.
ILiCost (inspection cost): a four-level factor with levels Thousand, TenThou, HundredThou and Million.
DrivHist (driving history): a three-level factor with levels Zero, One and Many.

Source
Binder J, Koller D, Russell S, Kanazawa K (1997). "Adaptive Probabilistic Networks with Hidden Variables". Machine Learning, 29(2-3):213-244.
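
The variable list in the description can be turned into a small lookup table for sanity-checking records. The following is a minimal sketch, not part of the dataset itself: the names `LEVELS` and `invalid_fields` are hypothetical, level spellings are copied from the description as printed, and values are assumed to arrive as strings.

```python
# Allowed levels per variable, copied from the dataset description.
# Spellings follow the description as printed (an assumption about the file).
COST = ["Thousand", "TenThou", "HundredThou", "Million"]
BOOL = ["False", "True"]
DAMAGE = ["None", "Mild", "Moderate", "Severe"]

LEVELS = {
    "GoodStudent": BOOL,
    "Age": ["Adolescent", "Adult", "Senior"],
    "SocioEcon": ["Prole", "Middle", "UpperMiddle", "Wealthy"],
    "RiskAversion": ["Psychopath", "Adventurous", "Normal", "Cautious"],
    "VehicleYear": ["Current", "older"],
    "ThisCarDam": DAMAGE,
    "RuggedAuto": ["EggShell", "Football", "Tank"],
    "Accident": DAMAGE,
    "MakeModel": ["SportsCar", "Economy", "FamilySedan", "Luxury", "SuperLuxury"],
    "DrivQuality": ["Poor", "Normal", "Excellent"],
    "Mileage": ["FiveThou", "TwentyThou", "FiftyThou", "Domino"],
    "Antilock": BOOL,
    "DrivingSkill": ["SubStandard", "Normal", "Expert"],
    "SeniorTrain": BOOL,
    "ThisCarCost": COST,
    "Theft": BOOL,
    "CarValue": ["FiveThou", "TenThou", "TwentyThou", "FiftyThou", "Million"],
    "HomeBase": ["Secure", "City", "Suburb", "Rural"],
    "AntiTheft": BOOL,
    "PropCost": COST,
    "OtherCarCost": COST,
    "OtherCar": BOOL,
    "MedCost": COST,
    "Cushioning": ["Poor", "Fair", "Good", "Excellent"],
    "Airbag": BOOL,
    "ILiCost": COST,
    "DrivHist": ["Zero", "One", "Many"],
}


def invalid_fields(row):
    """Return (name, value) pairs whose name or value is not listed in LEVELS."""
    return [(k, v) for k, v in row.items() if v not in LEVELS.get(k, ())]


# A partial record is enough for the check; only the keys present are tested.
example = {"Age": "Adult", "Mileage": "Domino", "MakeModel": "Van"}
print(invalid_fields(example))  # [('MakeModel', 'Van')]
```

The check is case-sensitive, so a file that spells a level differently from the description would be flagged rather than silently accepted.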

27 features

Feature        Type     Unique values  Missing
GoodStudent    nominal  2              0
Age            nominal  3              0
SocioEcon      nominal  4              0
RiskAversion   nominal  4              0
VehicleYear    nominal  2              0
ThisCarDam     nominal  4              0
RuggedAuto     nominal  3              0
Accident       nominal  4              0
MakeModel      nominal  5              0
DrivQuality    nominal  3              0
Mileage        nominal  4              0
Antilock       nominal  2              0
DrivingSkill   nominal  3              0
SeniorTrain    nominal  2              0
ThisCarCost    nominal  4              0
Theft          nominal  2              0
CarValue       nominal  5              0
HomeBase       nominal  4              0
AntiTheft      nominal  2              0
PropCost       nominal  4              0
OtherCarCost   nominal  3              0
OtherCar       nominal  2              0
MedCost        nominal  4              0
Cushioning     nominal  4              0
Airbag         nominal  2              0
ILiCost        nominal  4              0
DrivHist       nominal  3              0
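
The feature list can be cross-checked against the number of levels the description declares for each variable. The sketch below is illustrative only: the list names are hypothetical, the declared counts are read off the description ("two-level", "three-level", and so on), and the observed counts are the unique-value counts from the feature list.

```python
NAMES = [
    "GoodStudent", "Age", "SocioEcon", "RiskAversion", "VehicleYear",
    "ThisCarDam", "RuggedAuto", "Accident", "MakeModel", "DrivQuality",
    "Mileage", "Antilock", "DrivingSkill", "SeniorTrain", "ThisCarCost",
    "Theft", "CarValue", "HomeBase", "AntiTheft", "PropCost",
    "OtherCarCost", "OtherCar", "MedCost", "Cushioning", "Airbag",
    "ILiCost", "DrivHist",
]
# Number of levels declared in the dataset description, in the same order.
DECLARED = [2, 3, 4, 4, 2, 4, 3, 4, 5, 3, 4, 2, 3, 2, 4, 2, 5, 4, 2, 4, 4, 2, 4, 4, 2, 4, 3]
# Unique values reported in the feature list, in the same order.
OBSERVED = [2, 3, 4, 4, 2, 4, 3, 4, 5, 3, 4, 2, 3, 2, 4, 2, 5, 4, 2, 4, 3, 2, 4, 4, 2, 4, 3]

# Collect every feature whose observed count differs from the declared one.
mismatches = [
    (name, declared, observed)
    for name, declared, observed in zip(NAMES, DECLARED, OBSERVED)
    if declared != observed
]
print(mismatches)  # [('OtherCarCost', 4, 3)]
```

Only OtherCarCost differs: the description declares four levels, while the feature list reports three unique values. One possible reading is that one declared level never occurs in the uploaded rows, but the page does not say so.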

19 properties

Number of instances (rows) of the dataset: 20000
Number of attributes (columns) of the dataset: 27
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 0
Number of nominal attributes: 27
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 0
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 100
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 8
Percentage of binary attributes: 29.63
Percentage of instances having missing values: 0
Average class difference between consecutive instances: not reported
Percentage of missing values: 0
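
Several of the properties above follow directly from the feature list. As a rough check, the sketch below recomputes them from the per-feature unique-value counts; the variable names are hypothetical, and treating "binary" as "exactly two unique values" is an assumption about how the page defines it.

```python
# Unique-value counts per feature, in the order of the feature list.
UNIQUE_COUNTS = [2, 3, 4, 4, 2, 4, 3, 4, 5, 3, 4, 2, 3, 2,
                 4, 2, 5, 4, 2, 4, 3, 2, 4, 4, 2, 4, 3]
N_INSTANCES = 20000  # number of rows reported in the properties

n_attributes = len(UNIQUE_COUNTS)
# Assumption: a binary attribute is one with exactly two unique values.
n_binary = sum(1 for count in UNIQUE_COUNTS if count == 2)
pct_binary = round(100 * n_binary / n_attributes, 2)
# Attributes divided by instances.
dimensionality = n_attributes / N_INSTANCES

print(n_attributes, n_binary, pct_binary)  # 27 8 29.63
print(dimensionality)
```

The recomputed counts agree with the reported 27 attributes, 8 binary attributes and 29.63 percent. The attribute-to-instance ratio is 27/20000, about 0.00135, which the page shows as 0, presumably after rounding.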

0 tasks