OpenML
Context Electronic dance music (EDM) is a genre where thousands of new songs are released every week. The list of EDM subgenres considered is long, but it also evolves according to trends and musical…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2900 instances - 94 features - classes - 0 missing values
University of Sao Paulo, School of Art, Sciences and Humanities, Sao Paulo, SP, Brazil ### LIBRAS Movement Database LIBRAS, acronym of the Portuguese name "LIngua BRAsileira de Sinais", is the…
5 runs - 0 likes - 0 downloads - 0 reach - 0 impact
360 instances - 91 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 91 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
515345 instances - 91 features - 0 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 0 downloads - 0 reach - 0 impact
88 instances - 91 features - 4 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
129 runs - 0 likes - 0 downloads - 0 reach - 0 impact
117 instances - 91 features - 3 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
130 runs - 0 likes - 0 downloads - 0 reach - 0 impact
164 instances - 91 features - 5 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 0 downloads - 0 reach - 0 impact
47 instances - 91 features - 5 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 0 downloads - 0 reach - 0 impact
47 instances - 91 features - 4 classes - 0 missing values
Nomao collects data about places (name, phone, localization...) from many sources. Deduplication consists in detecting what data refer to the same place. Instances in the dataset compare 2 spots. The…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
34465 instances - 119 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
360 instances - 105 features - classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
156 instances - 91 features - 2 classes - 0 missing values
The dataset contains 15 classes of 24 instances each, where each class refers to a hand movement type in LIBRAS.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
360 instances - 105 features - classes - 0 missing values
1. Data set title: Nomao Data Set 2. Abstract: Nomao collects data about places (name, phone, localization...) from many sources. Deduplication consists in detecting what data refer to the same place.…
67704 runs - 0 likes - 16 downloads - 16 reach - 30 impact
34465 instances - 119 features - 2 classes - 0 missing values
Context All credit for this database goes to Tim Sevenhuysen of OraclesElixir.com. I'm just uploading it here because I want to see what you guys do with this dataset before Worlds! I'm super hyped!…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
67980 instances - 103 features - classes - 1527593 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9822 instances - 86 features - 0 classes - 0 missing values
Context This data is the result of using neural networks and reinforcement learning to simulate the board game "Machi Koro". Here is the source code for the AI and simulation:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
614584 instances - 86 features - classes - 0 missing values
This dataset covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1984 instances - 84 features - classes - 3029 missing values
This dataset covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1984 instances - 84 features - classes - 3029 missing values
This dataset covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1984 instances - 84 features - classes - 3029 missing values
This dataset covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1984 instances - 84 features - classes - 3029 missing values
The data contains information on 21263 superconductors. The first 81 columns contain extracted features and the 82nd column contains the critical temperature which is used as the target variable. The…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
21263 instances - 82 features - 0 classes - 0 missing values
Two colour spotted cDNA array data set of a series of experiments to identify which genes in Yeast are cell cycle regulated.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6178 instances - 82 features - classes - 59017 missing values
Dataset contains data on 21263 superconductors and their relevant features including the critical temperature. The goal here is to predict the latter based on the features extracted. All features…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
21263 instances - 82 features - 0 classes - 0 missing values
This is the training set of the COIL 2000 challenge as used by Huang et al. (2020). > Huang, X., Khetan, A., Cvitkovic, M., & Karnin, Z. (2020). > Tabtransformer: Tabular data modeling using…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5822 instances - 86 features - 0 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
156 instances - 81 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
156 instances - 81 features - 2 classes - 0 missing values
DATA
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3119345 instances - 88 features - classes - 24532528 missing values
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
21263 instances - 80 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
21263 instances - 80 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
21263 instances - 80 features - 0 classes - 0 missing values
Subsampling of the dataset KDDCup09_appetency (1111) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 92 features - 2 classes - 121455 missing values
Abstract: This data-set contains examples of buzz events from two different social networks: Twitter, and Tom's Hardware, a forum network focusing on new technology with more conservative dynamics.…
2 runs - 0 likes - 0 downloads - 0 reach - 0 impact
583250 instances - 78 features - 0 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Supply Chain Management datasets are derived from the Trading Agent Competition in Supply…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
8966 instances - 77 features - classes - 0 missing values
Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning. The…
9545 runs - 0 likes - 0 downloads - 0 reach - 21 impact
1080 instances - 82 features - 8 classes - 1396 missing values
Subsampling of the dataset KDDCup09_appetency (1111) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 93 features - 2 classes - 127027 missing values
Subsampling of the dataset KDDCup09_appetency (1111) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 94 features - 2 classes - 118767 missing values
Context Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1080 instances - 81 features - classes - 1396 missing values
Subsampling of the dataset nomao (1486) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 78 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
758 runs - 0 likes - 11 downloads - 11 reach - 15 impact
2000 instances - 77 features - 2 classes - 0 missing values
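The binarization described in the entry above can be illustrated with a minimal sketch. It assumes that every class other than the majority class is relabeled as negative ('N'); the listing's description is truncated at that point, so the negative label and the function name `binarize_by_majority` are assumptions made for illustration, not the procedure actually used to build the dataset.

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as `positive` and all others as `negative`.

    The 'N' label for the remaining classes is an assumption; the source
    description only confirms that the majority class becomes 'P'.
    """
    if not labels:
        return []
    # most_common(1) returns [(label, count)] for the most frequent label;
    # ties are resolved in favor of the label seen first.
    majority, _ = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


# Hypothetical multi-class target: "a" is the majority class.
example = ["a", "b", "a", "c", "a", "b"]
binarized = binarize_by_majority(example)
```

Applied to a multi-class target column, this yields a two-class nominal target of the same length as the original.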
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38439 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 77 features - 10 classes - 0 missing values
No data.
290 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
Subsampling of the dataset nomao (1486) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset nomao (1486) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset nomao (1486) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
Subsampling of the dataset nomao (1486) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 101 features - 2 classes - 0 missing values
No data.
405 runs - 0 likes - 0 downloads - 0 reach - 0 impact
45164 instances - 75 features - 11 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
144 instances - 77 features - 0 classes - 0 missing values
Subsampling of the dataset KDDCup09_appetency (1111) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 93 features - 2 classes - 125883 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
593 instances - 78 features - classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9125 instances - 72 features - classes - 3264 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
593 instances - 78 features - classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
593 instances - 78 features - 2 classes - 0 missing values
Subsampling of the dataset KDDCup09_appetency (1111) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 94 features - 2 classes - 124468 missing values
Subsampling of the dataset ozone-level-8hr (1487) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample(…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 73 features - 2 classes - 0 missing values
Subsampling of the dataset ozone-level-8hr (1487) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample(…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 73 features - 2 classes - 0 missing values
Subsampling of the dataset ozone-level-8hr (1487) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample(…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 73 features - 2 classes - 0 missing values
Subsampling of the dataset ozone-level-8hr (1487) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample(…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 73 features - 2 classes - 0 missing values
Subsampling of the dataset ozone-level-8hr (1487) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample(…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 73 features - 2 classes - 0 missing values
Forecasting skewed biased stochastic ozone days: analyses, solutions and beyond, Knowledge and Information Systems, Vol. 14, No. 3, 2008. 1. Abstract: Two ground ozone level data sets are included in…
188264 runs - 1 likes - 20 downloads - 21 reach - 30 impact
2534 instances - 73 features - 2 classes - 0 missing values
Home Credit Default Risk Main Table > Huang, X., Khetan, A., Cvitkovic, M., & Karnin, Z. (2020). > Tabtransformer: Tabular data modeling using contextual embeddings. > arXiv preprint…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
307511 instances - 121 features - 2 classes - 9152465 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
801 runs - 0 likes - 0 downloads - 0 reach - 0 impact
841 instances - 71 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
841 instances - 74 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
841 instances - 74 features - classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
29026 runs - 0 likes - 8 downloads - 8 reach - 36 impact
841 instances - 71 features - 4 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
98 instances - 69 features - classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
92 instances - 69 features - classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
97 instances - 69 features - classes - 0 missing values
The price column is now an integer. autoHorse dataset
15 runs - 0 likes - 0 downloads - 0 reach - 0 impact
201 instances - 69 features - 0 classes - 0 missing values
Fixed dataset for autoHorse.csv I suggest...
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
201 instances - 69 features - 186 classes - 0 missing values
This dataset combines records from the MLCQ dataset with metrics extracted using the PMD Tool and the Understand tool, to determine whether a file contains code smells. Please note that the records…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
86467 instances - 67 features - 0 classes - 2852906 missing values
This dataset combines records from the MLCQ dataset with metrics extracted using the PMD Tool and the Understand tool, to determine whether a file contains code smells. Please note that the records…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
83943 instances - 67 features - 0 classes - 2801627 missing values
Hello Hello
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
44690 instances - 77 features - classes - 0 missing values
The experiments were carried out with a group of 30 volunteers within an age bracket of 19-48 years. They performed a protocol of activities composed of six basic activities: three static postures…
83 runs - 0 likes - 0 downloads - 0 reach - 0 impact
180 instances - 68 features - 6 classes - 0 missing values
No description available
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
66469 instances - 66 features - 0 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38885 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 65 features - 10 classes - 0 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
36118 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5620 instances - 65 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
794 runs - 0 likes - 9 downloads - 9 reach - 15 impact
2000 instances - 65 features - 2 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143811 runs - 1 likes - 17 downloads - 18 reach - 419 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143764 runs - 1 likes - 40 downloads - 41 reach - 417 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143332 runs - 2 likes - 67 downloads - 69 reach - 419 impact
1599 instances - 65 features - 100 classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5910 instances - 65 features - classes - 4666 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10503 instances - 65 features - classes - 9888 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9792 instances - 65 features - classes - 8776 missing values
The dataset is about bankruptcy prediction of Polish companies. The data was collected from Emerging Markets Information Service (EMIS, [Web Link]), which is a database containing information on…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7027 instances - 65 features - classes - 5835 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10173 instances - 65 features - classes - 12157 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
765 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5620 instances - 65 features - 2 classes - 0 missing values
This dataset includes demographic data for users who have rated the top 15 most-rated movies, ranked based on a star rating system.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
260 instances - 79 features - classes - 0 missing values
This dataset includes demographic data for users who have rated the top 15 most-rated movies, ranked based on a star rating system.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
260 instances - 79 features - classes - 0 missing values
No data.
52 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
CD4 count prediction date
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16484 instances - 62 features - classes - 0 missing values
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project and the BL89/2017-IST-ID grant. In this dataset, we present usability (SUS), workload…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
31 instances - 62 features - classes - 0 missing values
Coronavirus Country Profiles We built 207 country profiles which allow you to explore the statistics on the coronavirus pandemic for every country in the world. In a fast-evolving pandemic it is not a…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
170646 instances - 66 features - classes - 5082293 missing values
See [https://github.com/slds-lmu/paper_2023_ci_for_ge](https://github.com/slds-lmu/paper_2023_ci_for_ge) for a description.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5100000 instances - 63 features - 2 classes - 0 missing values
No data.
882 runs - 0 likes - 0 downloads - 0 reach - 0 impact
71 instances - 63 features - 6 classes - 0 missing values
No data.
948 runs - 0 likes - 0 downloads - 0 reach - 0 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
949 runs - 0 likes - 0 downloads - 0 reach - 0 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
996 runs - 0 likes - 0 downloads - 0 reach - 0 impact
74 instances - 63 features - 4 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: scaled to [-1,1]
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3175 instances - 61 features - 0 classes - 0 missing values