Data
Filter results by:
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs0 likes0 downloads0 reach0 impact
50789 instances - 20 features - 3 classes - 154107 missing values
This is the same data as version 5 (OpenML ID = 1220) with '_id' features coded as nominal factor variables.
0 runs0 likes0 downloads0 reach0 impact
39948 instances - 12 features - 2 classes - 0 missing values
Dataset KDD98 challenge: https://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html The goal is to estimate the return from a direct mailing in order to maximize donation profits. This dataset…
0 runs0 likes1 downloads1 reach9 impact
82318 instances - 478 features - 2 classes - 2399311 missing values
Incident reports from the San Franciso Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018. from…
0 runs0 likes0 downloads0 reach0 impact
538638 instances - 7 features - 2 classes - 0 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs1 likes1 downloads2 reach9 impact
70340 instances - 21 features - 3 classes - 2288 missing values
Training dataset of the 'Porto Seguros Safe Driver Prediction' Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
2 runs0 likes0 downloads0 reach12 impact
595212 instances - 38 features - 2 classes - 846458 missing values
Hourly particulate matter air polution data of Great Britain for the year 2017, provided by Ricardo Energy and Environment on behalf of the UK Department for Environment, Food and Rural Affairs…
0 runs0 likes0 downloads0 reach0 impact
394299 instances - 10 features - 0 classes - 0 missing values
Trip Record Data provided by the New York City Taxi and Limousine Commission (TLC) [http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml]. The dataset includes TLC trips of the green line in…
0 runs0 likes0 downloads0 reach0 impact
581835 instances - 15 features - 0 classes - 0 missing values
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community…
0 runs0 likes0 downloads0 reach0 impact
52358 instances - 8 features - 0 classes - 650 missing values
Airlines Dataset Inspired in the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure. For this…
0 runs0 likes0 downloads0 reach0 impact
26969 instances - 8 features - 2 classes - 0 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017. For this version, the task was downsampled to 0.5 percent. Some…
0 runs0 likes0 downloads0 reach0 impact
27327 instances - 18 features - 0 classes - 657 missing values
130k wine reviews with variety, location, winery, price, and description. Downloaded from Kaggle [https://www.kaggle.com/zynicide/wine-reviews/home] on 29.10.2018. The original data was scraped from…
0 runs0 likes1 downloads1 reach9 impact
129971 instances - 13 features - 0 classes - 204752 missing values