traffic_violations_100k


Status: active; Format: ARFF; License: CC0: Public Domain; Visibility: public; Uploaded 24-03-2022 by Dustin Carrion
  • Computer Systems Machine Learning
Context
This dataset contains information on all electronic traffic violations issued in the County. Any information that could be used to uniquely identify the vehicle, the vehicle owner, or the officer issuing the violation is not published.

Acknowledgements
Source of the original dataset: https://catalog.data.gov/dataset/traffic-violations-56dda

Inspiration
Performing even basic tasks on the original dataset was too time-consuming. Thus, I shuffled it and took the first 100k rows.
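The "shuffle, then take the first 100k rows" step can be sketched as follows. The description does not say which tool was used or whether a seed was fixed, so this is only a minimal illustration using the Python standard library. The helper name `shuffled_head`, the `seed` parameter, and the toy rows are assumptions made for the example, not part of the original workflow.

```python
import random

def shuffled_head(rows, n, seed=0):
    """Return the first n rows of a shuffled copy of rows (hypothetical helper)."""
    shuffled = list(rows)                   # copy so the input order is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle, so the result is reproducible
    return shuffled[:n]

# Toy stand-in for the original table; the real data has far more rows and 28 columns.
rows = [{"id": i} for i in range(1000)]
sample = shuffled_head(rows, 100)
print(len(sample))
```

Shuffling before truncating gives a uniform random sample without replacement, so no row appears twice in the subset.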

28 features

Unnamed:_0 (numeric): 100000 unique values, 0 missing
date (string): 1647 unique values, 0 missing
time (string): 1440 unique values, 0 missing
description (string): 2808 unique values, 0 missing
location (string): 30652 unique values, 0 missing
latitude (numeric): 46805 unique values, 0 missing
longitude (numeric): 48070 unique values, 0 missing
accident (string): 1 unique value, 0 missing
belts (string): 2 unique values, 0 missing
personal_injury (string): 2 unique values, 0 missing
property_damage (string): 2 unique values, 0 missing
fatal (string): 2 unique values, 0 missing
commercial_license (string): 2 unique values, 0 missing
hazmat (string): 2 unique values, 0 missing
commercial_vehicle (string): 2 unique values, 0 missing
alcohol (string): 2 unique values, 0 missing
work_zone (string): 2 unique values, 0 missing
vehicletype (string): 23 unique values, 0 missing
year (numeric): 91 unique values, 746 missing
make (string): 752 unique values, 0 missing
model (string): 3497 unique values, 12 missing
color (string): 26 unique values, 1076 missing
violation_type (string): 3 unique values, 0 missing
contributed_to_accident (string): 2 unique values, 0 missing
race (string): 6 unique values, 0 missing
gender (string): 3 unique values, 0 missing
driver_state (string): 56 unique values, 0 missing
dl_state (string): 61 unique values, 151 missing

19 properties

Number of instances (rows) of the dataset: 100000
Number of attributes (columns) of the dataset: 28
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 1985
Number of instances with at least one value missing: 1231
Number of numeric attributes: 4
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 14.29
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 1.23
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 0.07
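The derived properties can be cross-checked against the raw counts reported on this page. The sketch below is illustrative only: the variable names are made up for the example, and every input number is copied from the feature and property listings above.

```python
# Counts taken from the listing above; variable names are illustrative.
n_instances = 100_000
n_features = 28
n_numeric = 4            # Unnamed:_0, latitude, longitude, year
rows_with_missing = 1231
missing_per_feature = {"year": 746, "model": 12, "color": 1076, "dl_state": 151}

# Total missing cells is the sum of the per-feature missing counts.
total_missing = sum(missing_per_feature.values())

# Percentages, rounded to two decimals as on the page.
pct_numeric = round(100 * n_numeric / n_features, 2)
pct_rows_missing = round(100 * rows_with_missing / n_instances, 2)
pct_missing_values = round(100 * total_missing / (n_instances * n_features), 2)
dimensionality = round(n_features / n_instances, 2)

print(total_missing)        # 1985
print(pct_numeric)          # 14.29
print(pct_rows_missing)     # 1.23
print(pct_missing_values)   # 0.07
print(dimensionality)       # 0.0
```

The ratio of attributes to instances is 0.00028 before rounding, which is why it shows as 0.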

0 tasks
