Data
Wikidata with top-474 most frequent types and ingoing/outgoing properties as features
0 runs0 likes0 downloads0 reach0 impact
19254100 instances - 2331 features - classes - 0 missing values
Airlines Departure Delay Prediction (Regression). Original data can be found at: http://www.transtats.bts.gov This is a processed version of the original data, designed to predict departure delay (in…
0 runs0 likes0 downloads0 reach0 impact
10000000 instances - 10 features - 0 classes - 0 missing values
Dataset Description Story View the ReadMe file in my Github repo for this project. Check out all the info on my portfolio's webpage for this project. As I write this, I'm a Data Science student. To…
0 runs0 likes0 downloads0 reach0 impact
9534417 instances - 14 features - classes - 0 missing values
Coal mining requires working in hazardous conditions. Miners in an underground coal mine can face several threats, such as methane explosions or rock bursts. To provide protection for people…
0 runs0 likes0 downloads0 reach0 impact
9199930 instances - 34 features - classes - 0 missing values
Context and Content The COVID-19 case surveillance system database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of…
0 runs0 likes0 downloads0 reach0 impact
8405079 instances - 11 features - classes - 9543526 missing values
Context Daily price information for stocks, aggregated into one big file. Content Data was pulled using an api and contains general price information for all stocks that are tradable. Fields include…
0 runs0 likes0 downloads0 reach0 impact
7801920 instances - 7 features - classes - 0 missing values
## Guess which points belong to signal track [COMET](http://comet.kek.jp/Introduction.html) is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs0 likes0 downloads0 reach0 impact
7619400 instances - 6 features - 0 classes - 0 missing values
## Guess which points belong to signal track [COMET](http://comet.kek.jp/Introduction.html) is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs0 likes0 downloads0 reach0 impact
7619400 instances - 6 features - 0 classes - 0 missing values
## Guess which points belong to signal track [COMET](http://comet.kek.jp/Introduction.html) is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs0 likes1 downloads1 reach11 impact
7619400 instances - 6 features - 0 classes - 0 missing values
Description This is a countrywide weather events dataset that includes 6.3 million events, and covers 49 states of the United States. Examples of weather events are rain, snow, storm, and freezing…
0 runs0 likes0 downloads0 reach0 impact
7479165 instances - 14 features - classes - 73797 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017.
0 runs0 likes0 downloads0 reach0 impact
5465575 instances - 15 features - 0 classes - 132617 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) Data set for KDD Cup 1999 Modified by TunedIT (converted to ARFF format)…
4 runs0 likes0 downloads0 reach0 impact
4898431 instances - 42 features - 23 classes - 0 missing values
INTRUSION DETECTOR LEARNING Software to detect network intrusions protects a computer network from unauthorized users, including perhaps insiders. The intrusion detector learning task is to build a…
0 runs0 likes0 downloads0 reach0 impact
4898431 instances - 42 features - 23 classes - 0 missing values
Context Traffic data collected from the several Wavetronix radar sensors deployed by the City of Austin. Dataset is augmented with geo coordinates from sensor location dataset. Source:…
0 runs0 likes0 downloads0 reach0 impact
4603861 instances - 12 features - classes - 0 missing values
Context Inspired by the New York City Taxi Trip Duration playground, I created a dataset using the publicly available data from this link. Citi Bike is a bike-sharing service available in New York…
0 runs0 likes0 downloads0 reach0 impact
4500000 instances - 8 features - classes - 0 missing values
The BoT-IoT dataset was created by designing a realistic network environment in the Cyber Range Lab of The center of UNSW Canberra Cyber. The environment incorporates a combination of normal and…
0 runs0 likes0 downloads0 reach0 impact
3668522 instances - 45 features - 0 classes - 0 missing values
What is it? This dataset is a record of 3.5 million+ US domestic flights from 1990 to 2009. It has been taken from the OpenFlights website, which has a huge database of different travelling mediums…
0 runs0 likes0 downloads0 reach0 impact
3606803 instances - 15 features - classes - 27522 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs0 likes0 downloads0 reach0 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
Overview This dataset contains 3 million Sudoku puzzles and their solutions. The level of difficulty varies -- some can be solved easily by a beginner, while others will challenge experienced solvers.…
0 runs0 likes0 downloads0 reach0 impact
3000000 instances - 4 features - 0 classes - 0 missing values
Experiment data obtained by running random configurations of xgboost through mlr on 118 different classification tasks from openml. Parameter descriptions:…
0 runs0 likes0 downloads0 reach0 impact
2955210 instances - 21 features - classes - 7051006 missing values
BitcoinHeist Ransomware Dataset Akcora, C.G., Li, Y., Gel, Y.R. and Kantarcioglu, M., 2019. BitcoinHeist. Topological Data Analysis for Ransomware Detection on the Bitcoin Blockchain. IJCAI-PRICAI…
0 runs0 likes0 downloads0 reach0 impact
2916697 instances - 10 features - 29 classes - 0 missing values
DBpedia with top-474 most frequent YAGO types HMC dataset for type prediction. Ingoing and outgoing properties as features
0 runs0 likes0 downloads0 reach0 impact
2886305 instances - 2401 features - classes - 0 missing values
P2P Lending I concatenated historical loans from both Prosper and Lending Club 2013 - 2018. Currently only the summary of the loan (terms, origination date, loan amount, status, etc) are up but…
0 runs0 likes0 downloads0 reach0 impact
2875146 instances - 18 features - classes - 863078 missing values
0 runs0 likes0 downloads0 reach0 impact
2845342 instances - 46 features - classes - 3414349 missing values
Description This is a countrywide car accident dataset, which covers 49 states of the USA. The accident data are collected from February 2016 to Dec 2020, using two APIs that provide streaming traffic…
0 runs0 likes0 downloads0 reach0 impact
2845342 instances - 46 features - classes - 3414349 missing values
Context The objective of this dataset is to create a chess engine through machine learning. In this first part we will first predict the pieces to be moved depending on the position of the chessboard…
0 runs0 likes0 downloads0 reach0 impact
2632753 instances - 66 features - classes - 0 missing values
Context This data was gathered from the NSE website for the past three months. I am posting it here so people can analyse it and gather meaningful insights from it. Example - Probability of Stock…
0 runs0 likes0 downloads0 reach0 impact
2533210 instances - 16 features - classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with random changes from the chaotic to the non-chaotic regime at different noise levels.
0 runs0 likes0 downloads0 reach0 impact
2493200 instances - 26 features - classes - 0 missing values
Context This is historical data on cryptocurrency trading for the period from 2016-01-01 to 2021-02-21. If you enjoy this dataset please upvote so I can see it is popular and I need to update it.…
0 runs0 likes0 downloads0 reach0 impact
2382643 instances - 17 features - classes - 4862194 missing values
Context Google PlayStore App analytics. (1.1 Million + App Data) Source: https://github.com/gauthamp10/Google-Playstore-Dataset Content I've collected the data with the help of Python and Scrapy…
0 runs0 likes0 downloads0 reach0 impact
2312944 instances - 24 features - classes - 1355506 missing values
Incident reports from the San Francisco Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018. from…
0 runs0 likes0 downloads0 reach0 impact
2215023 instances - 9 features - 2 classes - 0 missing values
This dataset is now updated annually here. Context This dataset contains the salary, pay rate, and total compensation of every New York City employee. In this dataset this information is provided for…
0 runs0 likes0 downloads0 reach0 impact
2194488 instances - 16 features - classes - 1398151 missing values
Context This data set was created to help Kaggle users in the New York City Taxi Trip Duration competition. New features were generated using the Wolfram Mathematica system. Hope that this data set will…
0 runs0 likes0 downloads0 reach0 impact
2083778 instances - 24 features - classes - 6810 missing values
Context Amazon.com is one of the largest electronic commerce and cloud computing companies. Just a few Amazon-related facts: They lost 4.8 million in August 2013, when their website went down for 40…
0 runs0 likes0 downloads0 reach0 impact
2023070 instances - 4 features - classes - 0 missing values
Balanced version of click prediction data
36 runs0 likes0 downloads0 reach0 impact
1997410 instances - 12 features - 2 classes - 0 missing values
UserID
0 runs0 likes0 downloads0 reach0 impact
1974675 instances - 10 features - classes - 1974675 missing values
web services evaluations in this table
0 runs0 likes0 downloads0 reach0 impact
1974675 instances - 10 features - classes - 1974675 missing values
Context This dataset deals with pollution in the U.S. Pollution in the U.S. has been well documented by the U.S. EPA but it is a pain to download all the data and arrange them in a format that…
0 runs0 likes0 downloads0 reach0 impact
1746661 instances - 29 features - classes - 1746230 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs0 likes0 downloads0 reach0 impact
1692056 instances - 26 features - classes - 9275842 missing values
The ACSIncome dataset is one of five datasets created by Ding et al. as an improved alternative to the popular UCI Adult dataset. The authors compiled data from the American Community Survey (ACS)…
0 runs0 likes0 downloads0 reach0 impact
1664500 instances - 12 features - classes - 0 missing values
The ACSIncome dataset is one of five datasets created by Ding et al. as an improved alternative to the popular UCI Adult dataset. The authors compiled data from the American Community Survey (ACS)…
0 runs0 likes0 downloads0 reach0 impact
1664500 instances - 12 features - classes - 0 missing values
The ACSIncome dataset is one of five datasets created by Ding et al. as an improved alternative to the popular UCI Adult dataset. The authors compiled data from the American Community Survey (ACS)…
0 runs0 likes0 downloads0 reach0 impact
1664500 instances - 12 features - 0 classes - 0 missing values
Context This dataset is created for the prediction of future New York Housing Price based on the past 17 years of record. Content Please check the details under the column description.…
0 runs0 likes0 downloads0 reach0 impact
1600202 instances - 10 features - classes - 40496 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach0 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs0 likes0 downloads0 reach0 impact
1578154 instances - 43 features - 4 classes - 8006541 missing values
Context I made a scraper for Stack Overflow and created a dataset. Hope it will be useful for your task. Content Contains 1 CSV file with the following columns: question_vote_count: Number of…
0 runs0 likes0 downloads0 reach0 impact
1544049 instances - 4 features - classes - 0 missing values
Data on predicting clicks on ads in a search engine.
0 runs0 likes0 downloads0 reach0 impact
1496391 instances - 10 features - 2 classes - 0 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs0 likes0 downloads0 reach0 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs0 likes0 downloads0 reach0 impact
1455525 instances - 73 features - 10 classes - 0 missing values
Context It's the preprocessed train data from the Quora Insincere Questions competition 2018. The original train data is preprocessed to remove stop words, numbers, punctuation, common words and converted…
0 runs0 likes0 downloads0 reach0 impact
1306122 instances - 4 features - classes - 1 missing values
Embedding of molecule bonds in the HIV inhibitors dataset
0 runs0 likes0 downloads0 reach0 impact
1151940 instances - 30 features - classes - 0 missing values
The dataset was reproduced following instructions from this paper: https://arxiv.org/pdf/2108.04884.pdf. The data originates from ACS PUMS.
0 runs0 likes0 downloads0 reach0 impact
1138289 instances - 19 features - classes - 0 missing values
ACSPublicCoverage dataset reproduced from this paper: https://arxiv.org/pdf/2108.04884.pdf.
0 runs0 likes0 downloads0 reach0 impact
1138289 instances - 20 features - 0 classes - 0 missing values
Context I have gathered this dataset over the course of 8 years and put a lot of effort in it (see soccerverse.com). If you use the data for any kind of project, please drop me a line or ping me on…
0 runs0 likes0 downloads0 reach0 impact
1078214 instances - 17 features - classes - 4031 missing values
No data.
253 runs0 likes0 downloads0 reach0 impact
1076790 instances - 30 features - 2 classes - 7275 missing values
Embedding of atoms for the HIV inhibitors dataset
0 runs0 likes0 downloads0 reach0 impact
1069964 instances - 30 features - classes - 0 missing values
Context This Online Retail II data set contains all the transactions occurring for a UK-based and registered, non-store online retail between 01/12/2009 and 09/12/2011. The company mainly sells unique…
0 runs0 likes0 downloads0 reach0 impact
1067371 instances - 8 features - classes - 247481 missing values
Source: Charles Gaydon. This data contains only 5 variables: Productcode, Warehouse, ProductCategory, Date, Order_demand. I showed that it is possible, with trivial models, to lower the mean average…
0 runs0 likes0 downloads0 reach0 impact
1048575 instances - 5 features - classes - 11239 missing values
The pandemic context brings new challenges to cities. This fantastic resource was created by Google with aggregated, anonymized sets of data from users who have turned on the Location History setting…
0 runs0 likes0 downloads0 reach0 impact
1048575 instances - 14 features - classes - 5573689 missing values
Dataset from the LIBSVM data repository.
1 runs0 likes0 downloads0 reach0 impact
1025010 instances - 11 features - 0 classes - 0 missing values
This is the poker dataset, retrieved 2013-11-14 from the libSVM site. Additional to the preprocessing done there (see LibSVM site for details), this dataset was created as follows: -join test and…
23 runs0 likes0 downloads0 reach0 impact
1025010 instances - 11 features - 2 classes - 0 missing values
* Abstract: Purpose is to predict poker hands * Source - Creators: Robert Cattral (cattral '@' gmail.com) Franz Oppacher (oppacher '@' scs.carleton.ca) Carleton University, Department of Computer…
1 runs0 likes0 downloads0 reach0 impact
1025009 instances - 11 features - 10 classes - 0 missing values
* Abstract: 9-class version of the poker-hand dataset, with the minority class removed.
1 runs0 likes0 downloads0 reach0 impact
1025000 instances - 11 features - 9 classes - 0 missing values
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.…
0 runs0 likes0 downloads0 reach0 impact
1022616 instances - 6 features - 0 classes - 0 missing values
Traffic violations followed the invention of the automobile: the first traffic ticket in the United States was allegedly given to a New York City cab driver on May 20, 1899, for going at the breakneck…
0 runs0 likes0 downloads0 reach0 impact
1018634 instances - 34 features - classes - 328559 missing values
No data.
28 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
32 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
37 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
33 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs0 likes0 downloads0 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
32 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
33 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
28 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
9 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
10 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
9 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
10 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
6 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
6 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
30 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values