

Status: active. Format: ARFF. License: GPL 2. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.

Context

In the dataset freMTPL2freq, risk features and claim numbers were collected for 677,991 motor third-party liability policies (observed over one year).

Content

freMTPL2freq contains 11 columns (+IDpol):

IDpol: The policy ID (used to link with the claims dataset).
ClaimNb: Number of claims during the exposure period.
Exposure: The exposure period.
Area: The area code.
VehPower: The power of the car (ordered categorical).
VehAge: The vehicle age, in years.
DrivAge: The driver age, in years (in France, people can drive a car at 18).
BonusMalus: Bonus/malus, between 50 and 350: below 100 means bonus, above 100 means malus in France.
VehBrand: The car brand (unknown categories).
VehGas: The car gas, Diesel or regular.
Density: The density of inhabitants (number of inhabitants per km²) in the city the driver of the car lives in.
Region: The policy regions in France (based on a standard French classification).

Acknowledgements

Source: R-Package CASDatasets, Version 1.0-6 (2016) by Christophe Dutang [aut, cre], Arthur Charpentier [ctb].

Inspiration

The Swiss Actuarial Society's data science tutorials are built on the original dataset (see above). This copy enables the use of notebooks (kernels) to further study this interesting topic.
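Because ClaimNb counts claims during the exposure period, the two columns are often combined into a claim frequency: total claims divided by total exposure. The following is a minimal sketch of that calculation, not part of the dataset documentation. The `Policy` class, the `claim_frequency` helper, and the toy rows are hypothetical and made up for illustration, and the sketch assumes Exposure is expressed in years.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Policy:
    """Hypothetical container for three of the documented columns."""
    id_pol: int       # IDpol
    claim_nb: int     # ClaimNb
    exposure: float   # Exposure (assumed here to be in years)


def claim_frequency(policies: Iterable[Policy]) -> float:
    """Return total claims divided by total exposure."""
    rows = list(policies)
    total_exposure = sum(p.exposure for p in rows)
    if total_exposure <= 0:
        raise ValueError("total exposure must be positive")
    return sum(p.claim_nb for p in rows) / total_exposure


# Made-up rows for illustration only; they are NOT taken from freMTPL2freq.
toy = [
    Policy(id_pol=1, claim_nb=0, exposure=0.5),
    Policy(id_pol=2, claim_nb=1, exposure=1.0),
    Policy(id_pol=3, claim_nb=0, exposure=0.25),
    Policy(id_pol=4, claim_nb=1, exposure=0.25),
]

print(claim_frequency(toy))
```

Dividing the sums, rather than averaging per-policy ratios, keeps short-exposure policies from dominating the estimate.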

11 features

IDpol (ignore): numeric, 678013 unique values, 0 missing
ClaimNb: numeric, 11 unique values, 0 missing
Exposure: numeric, 187 unique values, 0 missing
Area: string, 6 unique values, 0 missing
VehPower: numeric, 12 unique values, 0 missing
VehAge: numeric, 78 unique values, 0 missing
DrivAge: numeric, 83 unique values, 0 missing
BonusMalus: numeric, 115 unique values, 0 missing
VehBrand: string, 11 unique values, 0 missing
VehGas: string, 2 unique values, 0 missing
Density: numeric, 1607 unique values, 0 missing
Region: string, 22 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset.
Number of attributes (columns) of the dataset.
Number of distinct values of the target attribute (if it is nominal).
Number of missing values in the dataset.
Number of instances with at least one value missing.
Number of numeric attributes.
Number of nominal attributes.
Percentage of binary attributes.
Percentage of instances having missing values.
Average class difference between consecutive instances.
Percentage of missing values.
Number of attributes divided by the number of instances.
Percentage of numeric attributes.
Percentage of instances belonging to the most frequent class.
Percentage of nominal attributes.
Number of instances belonging to the most frequent class.
Percentage of instances belonging to the least frequent class.
Number of instances belonging to the least frequent class.
Number of binary attributes.
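
Some of the properties above are plain counts and ratios over the feature listing, so they can be sketched by hand. The example below is an illustration under stated assumptions, not how the site computes its values: the `FEATURE_TYPES` mapping is transcribed from the feature listing on this page, IDpol is excluded because it is marked "(ignore)", and the `summarize` helper is hypothetical. The number of instances is left as a caller-supplied argument.

```python
from typing import Dict, Optional

# Types transcribed from the feature listing on this page.
# Assumption: IDpol is left out because the listing marks it "(ignore)".
FEATURE_TYPES: Dict[str, str] = {
    "ClaimNb": "numeric",
    "Exposure": "numeric",
    "Area": "string",
    "VehPower": "numeric",
    "VehAge": "numeric",
    "DrivAge": "numeric",
    "BonusMalus": "numeric",
    "VehBrand": "string",
    "VehGas": "string",
    "Density": "numeric",
    "Region": "string",
}


def summarize(types: Dict[str, str],
              n_instances: Optional[int] = None) -> Dict[str, float]:
    """Compute a few count and ratio properties from a name-to-type mapping."""
    n_attributes = len(types)
    n_numeric = sum(1 for t in types.values() if t == "numeric")
    summary: Dict[str, float] = {
        "n_attributes": n_attributes,
        "n_numeric": n_numeric,
        "pct_numeric": 100.0 * n_numeric / n_attributes,
    }
    if n_instances:
        # "Number of attributes divided by the number of instances."
        summary["attributes_per_instance"] = n_attributes / n_instances
    return summary


summary = summarize(FEATURE_TYPES)
print(summary)
```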

0 tasks
