OpenML
Filter results by:
No data.
66 runs0 likes0 downloads0 reach0 impact
1000000 instances - 14 features - 5 classes - 0 missing values
No data.
334 runs0 likes0 downloads0 reach0 impact
1000000 instances - 33 features - 2 classes - 0 missing values
No data.
70 runs0 likes0 downloads0 reach0 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
68 runs0 likes0 downloads0 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
69 runs0 likes0 downloads0 reach0 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
356 runs0 likes0 downloads0 reach0 impact
131072 instances - 17 features - 2 classes - 0 missing values
No data.
65 runs0 likes0 downloads0 reach0 impact
1000000 instances - 30 features - 4 classes - 0 missing values
No data.
230 runs0 likes0 downloads0 reach0 impact
1000000 instances - 35 features - 2 classes - 0 missing values
No data.
63 runs0 likes0 downloads0 reach0 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
65 runs0 likes0 downloads0 reach0 impact
1000000 instances - 18 features - 7 classes - 0 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs0 likes0 downloads0 reach0 impact
1455525 instances - 73 features - 10 classes - 0 missing values
Normalized version of the Forest Covertype dataset (see version 1), so that the numerical values are between 0 and 1. Contains the forest cover type for 30 x 30 meter cells obtained from US Forest…
342 runs0 likes0 downloads0 reach0 impact
581012 instances - 55 features - 7 classes - 0 missing values
The dataset (originally named ELEC2) contains 45,312 instances dated from 7 May 1996 to 5 December 1998. Each example of the dataset refers to a period of 30 minutes, i.e. there are 48 instances for…
107182 runs0 likes0 downloads0 reach0 impact
45312 instances - 9 features - 2 classes - 0 missing values
No data.
310 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 2 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
304 runs0 likes0 downloads0 reach0 impact
1000000 instances - 25 features - 10 classes - 0 missing values
Normalized version of the pokerhand data set. Automated file upload of pokerhand-normalized.arff
314 runs0 likes0 downloads0 reach0 impact
829201 instances - 11 features - 10 classes - 0 missing values
No data.
298 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
305 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
308 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
307 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
309 runs0 likes0 downloads0 reach0 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
328 runs0 likes0 downloads0 reach0 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
330 runs0 likes0 downloads0 reach0 impact
1000000 instances - 4 features - 2 classes - 0 missing values
1. Title: Lung Cancer Data 2. Source Information: - Data was published in : Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the…
1238 runs0 likes0 downloads0 reach0 impact
32 instances - 57 features - 3 classes - 5 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs0 likes0 downloads0 reach0 impact
106 instances - 58 features - 2 classes - 0 missing values
Citation Request: This primary tumor domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
1261 runs0 likes0 downloads0 reach0 impact
339 instances - 18 features - 21 classes - 225 missing values
# Space Shuttle Autolanding Domain NASA: Mr. Roger Burke's autolander design team ##### Past Usage: (several, it appears) Example: Michie,D. (1988). The Fifth Generation's Unbridged Gap. In Rolf…
1466 runs0 likes0 downloads0 reach0 impact
15 instances - 7 features - 2 classes - 26 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs0 likes0 downloads0 reach0 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service…
216 runs0 likes0 downloads0 reach0 impact
110393 instances - 55 features - 7 classes - 0 missing values
No data.
2198 runs1 likes17 downloads18 reach10 impact
1484 instances - 9 features - 10 classes - 0 missing values
The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood. The aim is to…
30180 runs0 likes0 downloads0 reach0 impact
6430 instances - 37 features - 6 classes - 0 missing values
1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and Fisheries, Tasmania…
34899 runs0 likes0 downloads0 reach0 impact
4177 instances - 9 features - 28 classes - 0 missing values
Classify a chess game based on the position of the white king, the white rook and the black king.
1777 runs0 likes0 downloads0 reach0 impact
28056 instances - 7 features - 18 classes - 0 missing values
Database of baseball players and play statistics, including 'Games_played', 'At_bats', 'Runs', 'Hits', 'Doubles', 'Triples', 'Home_runs', 'RBIs', 'Walks', 'Strikeouts', 'Batting_average',…
795 runs0 likes0 downloads0 reach0 impact
1340 instances - 17 features - 3 classes - 20 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
1187 runs0 likes0 downloads0 reach0 impact
412 instances - 9 features - 7 classes - 96 missing values
1. Title of Database: Wine recognition data Updated Sept 21, 1998 by C.Blake : Added attribute information 2. Sources: (a) Forina, M. et al, PARVUS - An Extendible Package for Data Exploration,…
1192 runs0 likes0 downloads0 reach0 impact
178 instances - 14 features - 3 classes - 0 missing values
The objective was to determine which seedlots in a species are best for soil conservation in seasonally dry hill country. Determination is found by measurement of height, diameter by height, survival,…
27620 runs0 likes12 downloads12 reach11 impact
736 instances - 20 features - 5 classes - 448 missing values
This is data set is concerned with the forward kinematics of an 8 link robot arm. Among the existing variants of this data set we have used the variant 8nm, which is known to be highly non-linear and…
23 runs0 likes0 downloads0 reach0 impact
8192 instances - 9 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
7 runs0 likes0 downloads0 reach0 impact
61 instances - 3 features - 0 classes - 0 missing values
1. Title: Wisconsin Prognostic Breast Cancer (WPBC) 2. Source Information a) Creators: Dr. William H. Wolberg, General Surgery Dept., University of Wisconsin, Clinical Sciences Center, Madison, WI…
5 runs0 likes0 downloads0 reach0 impact
194 instances - 33 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs0 likes0 downloads0 reach0 impact
52 instances - 3 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) SUMMARY: Data from an experiment on the affects of machine adjustments on the time to count bolts. Data appear as the STATS (Issue 10) Challenge. DATA:…
4 runs0 likes0 downloads0 reach0 impact
40 instances - 7 features - 0 classes - 0 missing values
Donor: David W. Aha (aha@ics.uci.edu) This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one…
48 runs0 likes0 downloads0 reach0 impact
303 instances - 14 features - 0 classes - 6 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating,; (c) its normalized losses in use as…
11 runs0 likes0 downloads0 reach0 impact
159 instances - 16 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Identifier attribute deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based…
2 runs0 likes0 downloads0 reach0 impact
398 instances - 8 features - 0 classes - 6 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs0 likes0 downloads0 reach0 impact
8192 instances - 22 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling the ailerons of a F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
7 runs0 likes0 downloads0 reach0 impact
9517 instances - 7 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Identifier attribute deleted. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! NAME: Sexual activity and the lifespan of male fruitflies TYPE: Designed (almost factorial)…
4 runs0 likes0 downloads0 reach0 impact
125 instances - 5 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Case number deleted. X treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs0 likes0 downloads0 reach0 impact
418 instances - 19 features - 0 classes - 1239 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
5 runs0 likes0 downloads0 reach0 impact
15000 instances - 49 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Horsepower treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs0 likes0 downloads0 reach0 impact
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Identification code deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based…
4 runs0 likes0 downloads0 reach0 impact
189 instances - 10 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Cholesterol treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
175 runs0 likes0 downloads0 reach0 impact
303 instances - 14 features - 0 classes - 6 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D.…
0 runs0 likes0 downloads0 reach0 impact
62 instances - 8 features - 0 classes - 12 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology is described in detail in: - King, Ross .D., Hurst,…
5 runs0 likes0 downloads0 reach0 impact
186 instances - 61 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! All nominal attributes and instances with missing values are deleted. Price treated as the class attribute. As used by…
2 runs0 likes0 downloads0 reach0 impact
159 instances - 16 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) This is the data set called `DETROIT' in the book `Subset selection in regression' by Alan J. Miller published in the Chapman & Hall series of monographs…
2 runs0 likes0 downloads0 reach0 impact
13 instances - 14 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs0 likes0 downloads0 reach0 impact
2178 instances - 4 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) These data are those collected in a cloud-seeding experiment in Tasmania between mid-1964 and January 1971. Their analysis, using regression techniques…
66 runs0 likes0 downloads0 reach0 impact
108 instances - 6 features - 0 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) The infamous Longley data, "An appraisal of least-squares programs from the point of view of the user", JASA, 62(1967) p819-841. Variables are: Number of…
3 runs0 likes0 downloads0 reach0 impact
16 instances - 7 features - 0 classes - 0 missing values
This data set concerns the study of the factors affecting patterns of insulin-dependent diabetes mellitus in children. The objective is to investigate the dependence of the level of serum C-peptide on…
2 runs0 likes0 downloads0 reach0 impact
43 instances - 3 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Case number deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning…
10 runs0 likes0 downloads0 reach0 impact
195 instances - 11 features - 0 classes - 2 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Points scored per minute is being treated as…
2 runs0 likes0 downloads0 reach0 impact
96 instances - 5 features - 0 classes - 0 missing values
This is an artificial data set described in Breiman et al. (1984,p.238) (with variance 1 instead of 2). Generate the values of the 10 attributes independently using the following probabilities: P(X_1…
2 runs0 likes0 downloads0 reach0 impact
40768 instances - 11 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling a F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
5 runs0 likes0 downloads0 reach0 impact
16599 instances - 19 features - 0 classes - 0 missing values
The task consists of Learning Quantitative Structure Activity Relationships (QSARs). The Inhibition of Dihydrofolate Reductase by Pyrimidines.The data are described in: King, Ross .D., Muggleton,…
6 runs0 likes0 downloads0 reach0 impact
74 instances - 28 features - 0 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs0 likes0 downloads0 reach0 impact
22784 instances - 9 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Survival treated as the class attribute As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
12 runs0 likes0 downloads0 reach0 impact
130 instances - 10 features - 0 classes - 97 missing values
This is a dataset obtained from the StatLib repository. Here is the included description: The data provided are daily stock prices from January 1988 through October 1991, for ten aerospace companies.…
5 runs0 likes0 downloads0 reach0 impact
950 instances - 10 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs0 likes0 downloads0 reach0 impact
286 instances - 10 features - 0 classes - 9 missing values
This is a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datastets in this family . In this repository we…
2 runs0 likes0 downloads0 reach0 impact
8192 instances - 9 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Gasoline comnsumption is being treated as…
2 runs0 likes0 downloads0 reach0 impact
27 instances - 5 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
5 runs0 likes0 downloads0 reach0 impact
8192 instances - 13 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electicity usage is being treated as the…
4 runs0 likes0 downloads0 reach0 impact
55 instances - 3 features - 0 classes - 0 missing values
As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems.…
2 runs0 likes0 downloads0 reach0 impact
200 instances - 11 features - 0 classes - 0 missing values
The problem concerns Relative CPU Performance Data. More information can be obtained in the UCI Machine Learning repository (http://www.ics.uci.edu/~mlearn/MLSummary.html). The used attributes are :…
2 runs0 likes0 downloads0 reach0 impact
209 instances - 7 features - 0 classes - 0 missing values
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
10 runs0 likes0 downloads0 reach0 impact
294 instances - 14 features - 0 classes - 782 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Weight treated as the class attribute. Identifier deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs0 likes0 downloads0 reach0 impact
158 instances - 8 features - 0 classes - 87 missing values
No data.
206 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
67 runs0 likes0 downloads0 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
332 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
311 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
65 runs0 likes0 downloads0 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
310 runs0 likes0 downloads0 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
290 runs0 likes0 downloads0 reach0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
867 runs0 likes0 downloads0 reach0 impact
39366 instances - 10 features - 2 classes - 0 missing values
No data.
52 runs0 likes0 downloads0 reach0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
306 runs0 likes0 downloads0 reach0 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
52 runs0 likes0 downloads0 reach0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
965 runs0 likes0 downloads0 reach0 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
163 runs0 likes0 downloads0 reach0 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
68 runs0 likes0 downloads0 reach0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
326 runs0 likes0 downloads0 reach0 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
315 runs0 likes0 downloads0 reach0 impact
295245 instances - 11 features - 5 classes - 0 missing values
No data.
225 runs0 likes0 downloads0 reach0 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
293 runs0 likes0 downloads0 reach0 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
65 runs0 likes0 downloads0 reach0 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
309 runs0 likes0 downloads0 reach0 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
296 runs0 likes0 downloads0 reach0 impact
1000000 instances - 61 features - 2 classes - 0 missing values