OpenML
Subsampling of the dataset madeline (41144) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset madeline (41144) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset madeline (41144) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset madeline (41144) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset philippine (41145) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset arcene (41157) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset arcene (41157) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset arcene (41157) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset philippine (41145) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset philippine (41145) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset philippine (41145) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset philippine (41145) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset ada (41156) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 49 features - 2 classes - 0 missing values
Subsampling of the dataset ada (41156) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 49 features - 2 classes - 0 missing values
Subsampling of the dataset ada (41156) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 49 features - 2 classes - 0 missing values
Subsampling of the dataset ada (41156) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 49 features - 2 classes - 0 missing values
Subsampling of the dataset ada (41156) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 49 features - 2 classes - 0 missing values
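The subsampling entries above cut off the `subsample` source code, so the original implementation is not visible here. The following is only a minimal sketch of what a seeded, unstratified subsample with row and column caps could look like. The function name `subsample_sketch`, its signature, and the use of Python's `random` module are assumptions for illustration, not the author's code. The caps mirror what the listing shows: the arcene entries keep 100 instances despite `nrows=2000`, and the ada entries keep 49 features despite `ncols=100`. The `nclasses` argument and stratification are left out of this sketch.

```python
import random
from typing import List, Sequence, Tuple


def subsample_sketch(
    rows: Sequence[Sequence[float]],
    target: Sequence[int],
    seed: int,
    nrows: int = 2000,
    ncols: int = 100,
) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical unstratified subsample: cap the rows and feature columns.

    This is an illustrative stand-in, not the truncated `subsample` method
    referenced in the dataset descriptions.
    """
    rng = random.Random(seed)  # same seed -> same selection
    n_total = len(rows)
    n_features = len(rows[0]) if n_total else 0

    # Draw without replacement; if the data is already within the cap,
    # every row (or column) is kept.
    row_idx = sorted(rng.sample(range(n_total), min(nrows, n_total)))
    col_idx = sorted(rng.sample(range(n_features), min(ncols, n_features)))

    # The target is subsampled by row only, so it stays aligned with X.
    X = [[rows[i][j] for j in col_idx] for i in row_idx]
    y = [target[i] for i in row_idx]
    return X, y
```

Under these assumptions, calling the function with different seeds on the same source data would give different subsets of the same shape, which is consistent with several entries per dataset differing only in `seed`.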
This dataset contains drugs and their potency. The features of each drug are FCFP 1024-bit molecular fingerprints, which were generated from SMILES strings. They were obtained using the Pipeline Pilot…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5742 instances - 1025 features - 0 classes - 0 missing values
The dataset includes New York City Taxi and Limousine Commission (TLC) trips of the green line in December 2016. All trips are paid with a credit card leaving some tip. The variable 'tip_amount' was…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
581835 instances - 19 features - 0 classes - 0 missing values
The dataset contains 3,107 observations on U.S. county votes cast in the 1980 presidential election. Given population, education levels in the population, number of owned houses, incomes and…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3107 instances - 7 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
85 instances - 22284 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
72 instances - 7130 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
72 instances - 7130 features - 3 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
72 instances - 12583 features - 3 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
253 instances - 15155 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
102 instances - 12601 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
72 instances - 7130 features - 4 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
203 instances - 12601 features - 5 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
66 instances - 4027 features - 3 classes - 12269 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
62 instances - 2001 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
77 instances - 5470 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
187 instances - 19994 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
83 instances - 2309 features - 4 classes - 0 missing values
daily bike dataset
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
731 instances - 13 features - 606 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
96 instances - 4027 features - 9 classes - 19667 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
96 instances - 4027 features - 11 classes - 19667 missing values
Context: A collection of tweets (in Dutch) and features, gathered in April 2022 using the Twitter API. A small portion of the tweets are annotated by volunteer annotators. The main task is to identify…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
451200 instances - 20 features - 0 classes - 0 missing values
bnlearn Bayesian Network Repository reference: [URL](https://www.bnlearn.com/bnrepository/discrete-medium.html#alarm) - Number of nodes: 37 - Number of arcs: 46 - Number of parameters: 509 - Average…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 37 features - classes - 0 missing values
bnlearn Bayesian Network Repository reference: [URL](https://www.bnlearn.com/bnrepository/discrete-medium.html#alarm) - Number of nodes: 37 - Number of arcs: 46 - Number of parameters: 509 - Average…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 37 features - classes - 0 missing values
daily bike dataset
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
731 instances - 13 features - 606 classes - 0 missing values
A copy of PLK_Mini dataset from Meta album Set0
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3440 instances - 3 features - 86 classes - 3440 missing values
This motor third-party liability (MTPL) pricing dataset describes 1 million insurance policies and their corresponding claim counts, see Mayer, M., Meier, D. and Wuthrich, M.V. (2023) SHAP for Actuaries:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 7 features - 0 classes - 0 missing values
bnlearn Bayesian Network Repository reference: [URL](https://www.bnlearn.com/bnrepository/discrete-medium.html#alarm) - Number of nodes: 37 - Number of arcs: 46 - Number of parameters: 509 - Average…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 37 features - classes - 0 missing values
bnlearn Bayesian Network Repository reference: [URL](https://www.bnlearn.com/bnrepository/discrete-medium.html#alarm) - Number of nodes: 37 - Number of arcs: 46 - Number of parameters: 509 - Average…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 37 features - classes - 0 missing values
bnlearn Bayesian Network Repository reference: [URL](https://www.bnlearn.com/bnrepository/discrete-medium.html#alarm) - Number of nodes: 37 - Number of arcs: 46 - Number of parameters: 509 - Average…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 37 features - classes - 0 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5465575 instances - 15 features - 0 classes - 132617 missing values
Data from https://doi.org/10.5281/zenodo.269636
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4758 instances - 39 features - classes - 0 missing values
#study_1
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
944 instances - 17 features - classes - 0 missing values
Twenty-two observations of the dwarf planet Ceres as observed by Giuseppe Piazzi and published in the September edition of Monatlicher Correspondenz in 1801. These were the measurements used by Gauss…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
22 instances - 9 features - classes - 17 missing values
Gold medal winning pace in minutes per kilometer for the men's Olympic marathon, from the first Games in 1896 until 2016.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
28 instances - 2 features - classes - 0 missing values
Two-colour spotted cDNA array data set from a series of experiments to identify which genes in yeast are cell-cycle regulated.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6178 instances - 82 features - classes - 59017 missing values
The QSAR biodegradation dataset was built in the Milano Chemometrics and QSAR Research Group. The research leading to these results has received funding from the European Community's Seventh Framework…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1055 instances - 41 features - 2 classes - 0 missing values
Context: I came across this data by chance and could not pass it by. I hope that this dataset will help anyone who is interested in food nutrition values. Content: This dataset contains nutrition…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
8789 instances - 77 features - classes - 1590 missing values
The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
20000 instances - 42 features - classes - 0 missing values
This dataset is composed of demographic data about 5000 people and their sushi preferences.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 136 features - classes - 0 missing values
The dataset contains 15 classes of 24 instances each, where each class refers to a hand movement type in Libras.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
360 instances - 105 features - classes - 0 missing values
This dataset includes demographic data for users who have rated the top 15 most-rated movies, ranked based on a star rating system.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
260 instances - 79 features - classes - 0 missing values
This dataset includes demographic data for users who have rated the top 15 most-rated movies, ranked based on a star rating system.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
260 instances - 79 features - classes - 0 missing values
A brief description of your dataset.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3 instances - 3 features - 3 classes - 0 missing values
The original Titanic dataset, describing the survival status of individual passengers on the Titanic. The Titanic data does not contain information from the crew, but it does contain actual ages of…
0 runs - 3 likes - 45 downloads - 48 reach - 12 impact
1309 instances - 14 features - 2 classes - 3855 missing values
The dataset consists of measurements of fetal heart rate and uterine contraction features on cardiotocograms classified by expert obstetricians.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2126 instances - 33 features - classes - 0 missing values
will be updated
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2111 instances - 17 features - classes - 0 missing values
This dataset aims to distinguish seven different types of dry beans, taking into account features such as form, shape, type, and structure according to the market situation.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
13611 instances - 23 features - classes - 0 missing values
Test dataset to see upload.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73503 instances - 4 features - 2 classes - 0 missing values
Predicting forest cover ...
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
18182 instances - 14 features - 0 classes - 2 missing values
Predicting forest cover ...
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
18182 instances - 14 features - 0 classes - 2 missing values
The task for this dataset is to predict the cellular localization sites of proteins.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1484 instances - 18 features - classes - 0 missing values
Predicting forest cover ...
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73503 instances - 4 features - 2 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73503 instances - 4 features - classes - 0 missing values
The "Cookbook Reviews" data set is an extensive collection that includes a range of information about user interactions and recipe reviews. It contains important details like the recipe name, where it stands in…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
18182 instances - 14 features - 0 classes - 2 missing values
This data set consists of the marks secured by the students in various subjects.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 8 features - 2 classes - 0 missing values
This dataset contains 340 instances concerning the frequencies of seven types of algae populations in different environments.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
316 instances - 25 features - classes - 0 missing values
amphibians
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
189 instances - 16 features - 0 classes - 0 missing values
fake dataset without any value
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73503 instances - 4 features - classes - 0 missing values
Anonymized dataset of churn and uplift modeling from a series of marketing campaigns in 2020 by a telecom company.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
11896 instances - 180 features - 0 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
35717 instances - 4 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 1 downloads - 1 reach - 17 impact
100 instances - 10001 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 19 impact
3153 instances - 971 features - 2 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
0 runs - 0 likes - 2 downloads - 2 reach - 18 impact
416188 instances - 61 features - 355 classes - 0 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
50789 instances - 20 features - 3 classes - 154107 missing values
INTRUSION DETECTOR LEARNING: Software to detect network intrusions protects a computer network from unauthorized users, including perhaps insiders. The intrusion detector learning task is to build a…
0 runs - 1 likes - 0 downloads - 1 reach - 3 impact
4898431 instances - 42 features - 23 classes - 0 missing values
Training dataset of the "Porto Seguro's Safe Driver Prediction" Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
595212 instances - 58 features - 2 classes - 846458 missing values
Incident reports from the San Francisco Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018. from…
0 runs - 1 likes - 0 downloads - 1 reach - 1 impact
2215023 instances - 9 features - 2 classes - 0 missing values
This is the same data as version 5 (OpenML ID = 1220) with '_id' features coded as nominal factor variables.
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
39948 instances - 12 features - 2 classes - 0 missing values
Context: This is a Glass Identification Data Set from UCI. It contains 10 attributes including id. The response is glass type (discrete, 7 values). Content: Attribute Information: Id number: 1 to 214…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
214 instances - 10 features - classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 15 features - 2 classes - 68 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 17 features - 0 classes - 76 missing values
Myocardial infarction complications Database
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1649 instances - 105 features - 2 classes - 0 missing values
Myocardial infarction complications Database
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
30 instances - 6 features - 0 classes - 0 missing values
Myocardial infarction complications Database
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
30 instances - 6 features - 0 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
14 instances - 5 features - 2 classes - 0 missing values
Rossmann Store Sales from Kaggle with some pre-processing
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
804056 instances - 18 features - 0 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 29 features - 0 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 11 features - 0 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 10 features - 0 classes - 34 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 9 features - 2 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 10 features - 0 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 15 features - 0 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 19 features - 0 classes - 3 missing values