sf-police-incidents

in_preparation ARFF Publicly available Visibility: public Uploaded 16-10-2019 by Janek Thomas

Problem validating uploaded description file: XML does not correspond to XSD schema.
Error Element '{http://openml.org/openml}nominal_value': [facet 'minLength'] The value has a length of '0'; this underruns the allowed minimum length of '1'. on line 119 column 0.
Error Element '{http://openml.org/openml}nominal_value': '' is not a valid value of the atomic type '{http://openml.org/openml}basic_latin256'. on line 119 column 0.
Incident reports from the San Francisco Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018 from [https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-Historical-2003/tmnf-yvry]. For a description of all variables, check out the homepage of the data provider. The original data was published under the ODC Public Domain Dedication and Licence (PDDL) [https://opendatacommons.org/licenses/pddl/1.0/].

The binary variable 'ViolentCrime' was created as the target. A row is labeled as a 'ViolentCrime' if 'Category' %in% c('ASSAULT', 'ROBBERY', 'SEX OFFENSES, FORCIBLE', 'KIDNAPPING') | 'Descript' %in% c('GRAND THEFT PURSESNATCH', 'ATTEMPTED GRAND THEFT PURSESNATCH').

The additional date and time features 'Hour', 'DayOfWeek', 'Month', and 'Year' were created. The original variables 'Category', 'Descript', 'Date', 'Time', 'Resolution', 'Location', and 'PdId' were removed from the dataset. The one record with a missing value, which occurred in the variable 'PdDistrict', was also removed.

Using this dataset for machine learning was inspired by Nina Zumel's blog post [http://www.win-vector.com/blog/2012/07/modeling-trick-impact-coding-of-categorical-variables-with-many-levels/].

Note that an incident spans multiple rows in the dataset when the crime belongs to more than one 'Category'. Such rows share the same value of the ID variable 'IncidntNum' (ignored by default).
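The preprocessing described above can be sketched in Python as follows. This is an illustrative reconstruction from the description, not the uploader's actual script (which appears to have been written in R). The function names, the row-as-dict representation, and the raw 'Date' and 'Time' formats ('MM/DD/YYYY' and 'HH:MM') are assumptions; the category, description, and column names come from the description itself.

```python
from datetime import datetime

# Values and column names taken from the dataset description.
VIOLENT_CATEGORIES = {"ASSAULT", "ROBBERY", "SEX OFFENSES, FORCIBLE", "KIDNAPPING"}
VIOLENT_DESCRIPTIONS = {"GRAND THEFT PURSESNATCH", "ATTEMPTED GRAND THEFT PURSESNATCH"}
REMOVED_COLUMNS = {"Category", "Descript", "Date", "Time", "Resolution", "Location", "PdId"}

# Fixed English names, so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def is_violent_crime(category, descript):
    """Apply the 'ViolentCrime' rule: category OR description matches."""
    return category in VIOLENT_CATEGORIES or descript in VIOLENT_DESCRIPTIONS


def time_features(date_str, time_str):
    """Derive 'Hour', 'DayOfWeek', 'Month', and 'Year'.

    Assumption: dates look like '01/31/2015' and times like '23:45'.
    The description does not state the raw formats.
    """
    stamp = datetime.strptime(f"{date_str} {time_str}", "%m/%d/%Y %H:%M")
    return {
        "Hour": stamp.hour,
        "DayOfWeek": DAY_NAMES[stamp.weekday()],
        "Month": stamp.month,
        "Year": stamp.year,
    }


def prepare_row(raw):
    """Turn one raw incident row (a dict) into a row of the derived dataset."""
    row = {key: value for key, value in raw.items() if key not in REMOVED_COLUMNS}
    row.update(time_features(raw["Date"], raw["Time"]))
    row["ViolentCrime"] = is_violent_crime(raw["Category"], raw["Descript"])
    return row
```

Because the rule is a logical OR, a row whose 'Category' is not in the violent list is still labeled positive when its 'Descript' is one of the two purse-snatching values.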

0 features

Data features are not analyzed yet.

0 properties

Data properties are not analyzed yet.

8 tasks

0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering