OpenML
No data.
63 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
63 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
63 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
60 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets for benchmarking gene-expression-oriented machine learning algorithms. They can be used for estimation of different quality…
59 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1545 instances - 10937 features - 2 classes - 0 missing values
The dataset and this description are made available at http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html. Normalized handwritten digits, automatically scanned from envelopes by the U.S. Postal…
57 runs, 0 likes, 1 download, 1 reach, 11 impact
9298 instances - 257 features - 10 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: C1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
54 runs, 0 likes, 0 downloads, 0 reach, 0 impact
28626 instances - 4 features - 5 classes - 0 missing values
Short Summary: Lists estimates of the percentage of body fat determined by underwater weighing and various body circumference measurements for 252 men. Classroom use of this data set: This data set…
54 runs, 0 likes, 0 downloads, 0 reach, 0 impact
252 instances - 15 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
53 runs, 0 likes, 0 downloads, 0 reach, 0 impact
92 instances - 6 features - 0 classes - 26 missing values
This dataset contains a set of face images taken between April 1992 and April 1994 at AT&T Laboratories Cambridge. As described on the original website: There are ten different images of each of 40…
53 runs, 0 likes, 0 downloads, 0 reach, 0 impact
400 instances - 4097 features - 40 classes - 0 missing values
The Sheffield (previously UMIST) Face Database consists of 564 images of 20 individuals (mixed race/gender/appearance). Each individual is shown in a range of poses from profile to frontal views -…
53 runs, 0 likes, 1 download, 1 reach, 16 impact
575 instances - 10305 features - 20 classes - 0 missing values
simple engine data
52 runs, 0 likes, 0 downloads, 0 reach, 0 impact
383 instances - 6 features - 3 classes - 0 missing values
SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor…
52 runs, 0 likes, 0 downloads, 0 reach, 0 impact
99289 instances - 3073 features - 10 classes - 0 missing values
No data.
52 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
52 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
51 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
51 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
50 runs, 0 likes, 0 downloads, 0 reach, 0 impact
95 instances - 8 features - 5 classes - 9 missing values
No data.
50 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 18 features - 22 classes - 0 missing values
No data.
50 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
50 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 61 features - 2 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
48 runs, 0 likes, 0 downloads, 0 reach, 0 impact
159 instances - 61360 features - 2 classes - 0 missing values
Donor: David W. Aha (aha@ics.uci.edu) This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one…
48 runs, 0 likes, 0 downloads, 0 reach, 0 impact
303 instances - 14 features - 0 classes - 6 missing values
No data.
48 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
47 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
45 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
44 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 13 features - 11 classes - 0 missing values
No data.
44 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
43 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 45 features - 2 classes - 0 missing values
In human civilisation, languages evolved first, and then came scripts. The Devanagari script is one of the oldest scripts of India, having evolved from the ancient Brahmi script. It came to be adopted…
43 runs, 2 likes, 8 downloads, 10 reach, 15 impact
92000 instances - 1025 features - 46 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
41 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1340 instances - 17 features - 3 classes - 20 missing values
Datasets for `Pattern Recognition and Neural Networks' by B.D. Ripley, Cambridge University Press (1996), ISBN 0-521-46086-7. The…
41 runs, 0 likes, 0 downloads, 0 reach, 0 impact
27 instances - 3 features - 4 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-subjects * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set…
40 runs, 0 likes, 0 downloads, 0 reach, 0 impact
64 instances - 243 features - 2 classes - 0 missing values
CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image…
40 runs, 0 likes, 0 downloads, 0 reach, 0 impact
13000 instances - 27649 features - 10 classes - 0 missing values
No data.
37 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
[PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/tokyo1) This is Performance Co-Pilot (PCP) data for the Tokyo server at Silicon Graphics International…
37 runs, 0 likes, 1 download, 1 reach, 22 impact
959 instances - 45 features - 2 classes - 0 missing values
Balanced version of click prediction data
36 runs, 0 likes, 15 downloads, 15 reach, 13 impact
1997410 instances - 12 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
35 runs, 0 likes, 0 downloads, 0 reach, 0 impact
23 instances - 5 features - 3 classes - 0 missing values
PMLB version of the Titanic dataset, which only uses 3 features. See version 1 for the complete version: https://www.openml.org/d/40945
35 runs, 0 likes, 2 downloads, 2 reach, 23 impact
2201 instances - 4 features - 2 classes - 0 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
34 runs, 0 likes, 0 downloads, 0 reach, 0 impact
105 instances - 12 features - 6 classes - 61 missing values
No data.
34 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
analcatdata_fraud-pmlb
34 runs, 0 likes, 0 downloads, 0 reach, 0 impact
42 instances - 12 features - 2 classes - 0 missing values
No data.
33 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
33 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
57 instances - 11 features - 5 classes - 1 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell, Department of Statistics, University of Florida. These data are derived from three sources, described below. As…
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
67 instances - 16 features - 5 classes - 0 missing values
The AAUP dataset for the ASA Statistical Graphics Section's 1995 Data Analysis Exposition contains information on faculty salaries for 1161 American colleges and universities. The data may be obtained…
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1161 instances - 15 features - 4 classes - 256 missing values
No data.
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
MyExampleIris
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
150 instances - 5 features - 3 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
2800 instances - 27 features - 5 classes - 0 missing values
1. Title: meta-data 2. Sources: (a) Creator: LIACC - University of Porto R.Campo Alegre 823 4150 PORTO (b) Donor: P.B.Brazdil or J.Gama Tel.: +351 600 1672 LIACC, University of Porto Fax.: +351 600…
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
528 instances - 22 features - 0 classes - 504 missing values
flare-pmlb
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1066 instances - 11 features - 2 classes - 0 missing values
cleve-pmlb
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
303 instances - 14 features - 2 classes - 0 missing values
parity5-pmlb
32 runs, 0 likes, 0 downloads, 0 reach, 0 impact
32 instances - 6 features - 2 classes - 0 missing values
Pittsburgh bridges This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
105 instances - 12 features - 6 classes - 61 missing values
Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics, 7(173), 2006. This…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
488565 instances - 9 features - 2 classes - 0 missing values
No data.
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
Abstract: The data set is composed of 60 chorales (5665 events) by J.S. Bach (1685-1750). Each event of each chorale is labelled using 1 among 101 chord labels and described through 14 features.…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
5665 instances - 17 features - 102 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs, 0 likes, 1 download, 1 reach, 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs, 0 likes, 1 download, 1 reach, 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
corral-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
160 instances - 7 features - 2 classes - 0 missing values
ecoli-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
327 instances - 8 features - 5 classes - 0 missing values
car-evaluation-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1728 instances - 22 features - 4 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
476 instances - 169 features - 2 classes - 0 missing values
Derived from the Musk dataset: https://www.openml.org/d/1116
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
6598 instances - 169 features - 2 classes - 0 missing values
led24-pmlb
31 runs, 0 likes, 2 downloads, 2 reach, 22 impact
3200 instances - 25 features - 10 classes - 0 missing values
led7-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
3200 instances - 8 features - 10 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs, 0 likes, 1 download, 1 reach, 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs, 0 likes, 1 download, 1 reach, 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
calendarDOW-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
399 instances - 33 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
2800 instances - 27 features - 5 classes - 0 missing values
This directory contains Thyroid datasets. "ann-train.data" contains 3772 learning examples and "ann-test.data" contains 3428 testing examples. I have obtained this data from…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
3772 instances - 22 features - 3 classes - 0 missing values
The data was collected retrospectively at Wroclaw Thoracic Surgery Centre for patients who underwent major lung resections for primary lung cancer in the years 2007 - 2011. The Centre is associated…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
470 instances - 17 features - 2 classes - 0 missing values
#modelage
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
202 instances - 20 features - 2 classes - 17 missing values
Dataset used by Buntine and Niblett (1992). Composed of 10 features, one of which is irrelevant. The target is a disjunctive normal form formula over the nine other attributes, with additional…
31 runs, 0 likes, 0 downloads, 0 reach, 22 impact
973 instances - 10 features - 2 classes - 0 missing values
cars1-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
392 instances - 8 features - 3 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24-hour period. -- Each instance represents captured features…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1066 instances - 13 features - 6 classes - 0 missing values
threeOf9-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
512 instances - 10 features - 2 classes - 0 missing values
wine-quality-red-pmlb
31 runs, 1 like, 7 downloads, 8 reach, 23 impact
1599 instances - 12 features - 6 classes - 0 missing values
parity5_plus_5-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 22 impact
1124 instances - 11 features - 2 classes - 0 missing values
allbp-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
3772 instances - 30 features - 4 classes - 0 missing values
analcatdata_happiness-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
60 instances - 4 features - 3 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs, 0 likes, 1 download, 1 reach, 22 impact
1324 instances - 11 features - 2 classes - 0 missing values
mux6-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
128 instances - 7 features - 2 classes - 0 missing values
new-thyroid-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
215 instances - 6 features - 3 classes - 0 missing values
Relevant Information: -- The database contains 3 potential classes, one for the number of times a certain type of solar flare occurred in a 24-hour period. -- Each instance represents captured features…
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
315 instances - 13 features - 5 classes - 0 missing values
cleveland-nominal-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
303 instances - 8 features - 5 classes - 0 missing values
dis-pmlb
31 runs, 0 likes, 0 downloads, 0 reach, 0 impact
3772 instances - 30 features - 2 classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
A copy of the data set proposed in: S. M. Weiss, and C. A. Kulikowski, Computer Systems That Learn (1991).
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
106 instances - 8 features - classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values