Data
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix:
```
      Good  Bad (predicted)
Good     0    1 (actual)
Bad      5    0
```
It is worse…
506307 runs - 28 likes - 311 downloads - 339 reach - 34 impact
1000 instances - 21 features - 2 classes - 0 missing values
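The cost matrix in the credit-risk entry above can be applied to a set of predictions as in the following minimal sketch. It assumes rows are actual classes and columns are predicted classes, as the matrix labels indicate; the function name `total_cost` and the lowercase label strings are hypothetical and not part of the dataset.

```python
# Cost matrix from the description: rows = actual class, columns = predicted class.
# The label strings "good" and "bad" are illustrative assumptions.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # actual good, predicted bad
    ("bad", "good"): 5,   # actual bad, predicted good
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


if __name__ == "__main__":
    actual = ["good", "bad", "bad", "good"]
    predicted = ["good", "good", "bad", "bad"]
    print(total_cost(actual, predicted))
```

Under this matrix, labeling a bad risk as good costs five times as much as the opposite mistake, so a classifier tuned for plain accuracy may not be the one with the lowest total cost.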
This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every other participant of the…
28210 runs - 19 likes - 169 downloads - 188 reach - 36 impact
8378 instances - 121 features - 2 classes - 18372 missing values
Data taken from the Blood Transfusion Service Center in Hsin-Chu City in Taiwan -- this is a classification problem. To demonstrate the RFMTC marketing model (a modified version of RFM), this study…
468688 runs - 6 likes - 101 downloads - 107 reach - 46 impact
748 instances - 5 features - 2 classes - 0 missing values
The satellite dataset comprises features extracted from satellite observations. In particular, each image was taken under four different light wavelengths, two in visible light (green and red) and…
2074 runs - 3 likes - 70 downloads - 73 reach - 33 impact
5100 instances - 37 features - 2 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143332 runs - 2 likes - 67 downloads - 69 reach - 419 impact
1599 instances - 65 features - 100 classes - 0 missing values
A dataset of steel plates' faults, classified into 7 different types. The goal was to train machine learning for automatic pattern recognition. The dataset consists of 27 features describing each…
277767 runs - 2 likes - 52 downloads - 54 reach - 26 impact
1941 instances - 34 features - 2 classes - 0 missing values
The original Titanic dataset, describing the survival status of individual passengers on the Titanic. The Titanic data does not contain information from the crew, but it does contain actual ages of…
0 runs - 3 likes - 45 downloads - 48 reach - 12 impact
1309 instances - 14 features - 2 classes - 3855 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143764 runs - 1 likes - 40 downloads - 41 reach - 417 impact
1600 instances - 65 features - 100 classes - 0 missing values
Author: Volker Lohweg (University of Applied Sciences, Ostwestfalen-Lippe) Source: [UCI](https://archive.ics.uci.edu/ml/datasets/banknote+authentication) - 2012 Please cite:…
138170 runs - 6 likes - 40 downloads - 46 reach - 34 impact
1372 instances - 5 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
394951 runs - 3 likes - 34 downloads - 37 reach - 39 impact
601 instances - 7 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from flight software for an earth-orbiting satellite. Data comes from McCabe and Halstead feature extractors of source code. These features…
149998 runs - 0 likes - 27 downloads - 27 reach - 28 impact
1109 instances - 22 features - 2 classes - 0 missing values
QSAR biodegradation Data Set * Abstract: Data set containing values for 41 attributes (molecular descriptors) used to classify 1055 chemicals into 2 classes (ready and not ready biodegradable). *…
267861 runs - 1 likes - 25 downloads - 26 reach - 30 impact
1055 instances - 42 features - 2 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
34883 runs - 0 likes - 23 downloads - 23 reach - 14 impact
2000 instances - 48 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38885 runs - 0 likes - 20 downloads - 20 reach - 14 impact
2000 instances - 65 features - 10 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from flight software for an earth-orbiting satellite. Data comes from McCabe and Halstead feature extractors of source code. These features…
146026 runs - 1 likes - 18 downloads - 19 reach - 27 impact
1563 instances - 38 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from flight software for an earth-orbiting satellite. Data comes from McCabe and Halstead feature extractors of source code. These features…
115699 runs - 0 likes - 17 downloads - 17 reach - 28 impact
1458 instances - 38 features - 2 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143811 runs - 1 likes - 17 downloads - 18 reach - 419 impact
1600 instances - 65 features - 100 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-1000 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of…
11010 runs - 0 likes - 16 downloads - 16 reach - 47 impact
1000 instances - 41 features - 8 classes - 0 missing values
####1. Summary This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working…
20418 runs - 0 likes - 14 downloads - 14 reach - 18 impact
5500 instances - 41 features - 11 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38439 runs - 0 likes - 13 downloads - 13 reach - 15 impact
2000 instances - 77 features - 10 classes - 0 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x7…
9760 runs - 0 likes - 12 downloads - 12 reach - 26 impact
67557 instances - 43 features - 3 classes - 0 missing values
This database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp.…
7180 runs - 0 likes - 11 downloads - 11 reach - 24 impact
1728 instances - 7 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
758 runs - 0 likes - 11 downloads - 11 reach - 15 impact
2000 instances - 77 features - 2 classes - 0 missing values
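The majority-class relabeling described in the entry above can be sketched as follows. The description is truncated after the positive label, so the negative label `'N'` used here is an assumption, as is the function name `binarize_by_majority`; this is an illustration, not the procedure used to build the dataset.

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as `positive` and every other class as `negative`.

    The negative label is an assumption; the source description is cut off
    before it states how the remaining classes are labeled.
    """
    if not labels:
        return []
    # most_common(1) returns the single most frequent label; on ties,
    # the label encountered first wins.
    majority, _count = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


if __name__ == "__main__":
    print(binarize_by_majority(["a", "b", "a", "c"]))
```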
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
617 runs - 0 likes - 11 downloads - 11 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
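The mean-threshold binarization described in the entry above can be sketched as follows. The description is truncated before it says which side of the mean receives which label, so the assignment here (below the mean is `'P'`, otherwise `'N'`) is an assumption, as is the function name `binarize_by_mean`.

```python
from statistics import mean


def binarize_by_mean(values, below="P", at_or_above="N"):
    """Convert a numeric target to two nominal classes by thresholding at the mean.

    Which side is treated as positive is an assumption; the source
    description is cut off before specifying it.
    """
    if not values:
        return []
    threshold = mean(values)
    return [below if v < threshold else at_or_above for v in values]


if __name__ == "__main__":
    print(binarize_by_mean([1.0, 2.0, 3.0, 10.0]))
```

Because the mean is sensitive to outliers, a skewed target can produce unbalanced classes, as in the usage example where a single large value pulls the threshold up.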
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4207 runs - 1 likes - 10 downloads - 11 reach - 15 impact
690 instances - 15 features - 2 classes - 0 missing values
This simple domain contains 7 Boolean attributes and 10 classes, the set of decimal digits. Recall that LED displays contain 7 light-emitting diodes -- hence the reason for 7 attributes. The class…
13156 runs - 0 likes - 10 downloads - 10 reach - 19 impact
500 instances - 8 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
620 runs - 0 likes - 10 downloads - 10 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
A dataset relating characteristics of telephony account features and usage and whether or not the customer churned. Originally used in [Discovering Knowledge in Data: An Introduction to Data…
7512 runs - 2 likes - 9 downloads - 11 reach - 25 impact
5000 instances - 21 features - 2 classes - 0 missing values
Originally from the StatLog project. The raw data is still available on [UCI](https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+(Splice-junction+Gene+Sequences)). The data consists of 3,186…
7063 runs - 0 likes - 9 downloads - 9 reach - 25 impact
3186 instances - 181 features - 3 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs - 1 likes - 9 downloads - 10 reach - 8 impact
284807 instances - 31 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
614 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
608 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
604 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
608 runs - 1 likes - 9 downloads - 10 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
794 runs - 0 likes - 9 downloads - 9 reach - 15 impact
2000 instances - 65 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
646 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au1-1000 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of…
3255 runs - 1 likes - 9 downloads - 10 reach - 23 impact
1000 instances - 21 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
638 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
In the early 2000s, Billy Beane and Paul DePodesta worked for the Oakland Athletics. While there, they literally changed the game of baseball. They didn't do it using a bat or glove, and they…
3 runs - 0 likes - 8 downloads - 8 reach - 14 impact
1232 instances - 15 features - 0 classes - 3600 missing values
Over 92 thousand images (32x32 pixels) of 46 characters from Devanagari script. Includes the alphabet as well as the numbers. Devanagari is an Indic script and forms a basis for over 100 languages…
43 runs - 2 likes - 8 downloads - 10 reach - 14 impact
92000 instances - 1025 features - 46 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
9969 runs - 0 likes - 8 downloads - 8 reach - 25 impact
2310 instances - 20 features - 7 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
621 runs - 0 likes - 8 downloads - 8 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
792 runs - 0 likes - 8 downloads - 8 reach - 15 impact
2000 instances - 48 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
636 runs - 0 likes - 8 downloads - 8 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
wine-quality-red-pmlb
31 runs - 1 likes - 7 downloads - 8 reach - 23 impact
1599 instances - 12 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
104 runs - 0 likes - 7 downloads - 7 reach - 15 impact
1302 instances - 34 features - 2 classes - 7830 missing values
Pizza cutter 3
188 runs - 0 likes - 7 downloads - 7 reach - 14 impact
1043 instances - 38 features - 2 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
6905 runs - 0 likes - 6 downloads - 6 reach - 19 impact
44819 instances - 7 features - 3 classes - 0 missing values
0. airplane 1. automobile 2. bird 3. cat 4. deer 5. dog 6. frog 7. horse 8. ship 9. truck CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
160 runs - 0 likes - 6 downloads - 6 reach - 21 impact
60000 instances - 3073 features - 10 classes - 0 missing values
Citation Request: This dataset is publicly available for research. The details are described in [Cortez et al., 2009]. Please include this citation if you plan to use this database: P. Cortez, A.…
64 runs - 2 likes - 6 downloads - 8 reach - 16 impact
4898 instances - 12 features - 7 classes - 0 missing values
####1. Summary This dataset contains attributes of dresses and their recommendations according to their sales. Sales are monitored on alternate days. The attributes analyzed are:…
19207 runs - 1 likes - 6 downloads - 7 reach - 19 impact
500 instances - 13 features - 2 classes - 835 missing values
pie chart 3
103 runs - 0 likes - 6 downloads - 6 reach - 13 impact
1077 instances - 38 features - 2 classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
1432 runs - 0 likes - 5 downloads - 5 reach - 23 impact
3279 instances - 1559 features - 2 classes - 0 missing values
This is Titanic survival prediction.
0 runs - 0 likes - 5 downloads - 5 reach - 7 impact
891 instances - 8 features - 0 classes - 0 missing values
Byron Roe (byronroe '@' umich.edu) Department of Physics University of Michigan Ann Arbor, MI 48109 This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos…
12 runs - 0 likes - 4 downloads - 4 reach - 14 impact
130064 instances - 51 features - 2 classes - 0 missing values
The goal is to predict the Fare. Variable description: pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower age: Age is fractional if less than 1. If the age is…
0 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1307 instances - 8 features - 0 classes - 0 missing values
This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features plus the price and the id columns,…
0 runs - 0 likes - 4 downloads - 4 reach - 9 impact
21613 instances - 20 features - 0 classes - 0 missing values
__Major changes w.r.t. version 1: deactivated first two variables as they describe the batch of the experiments and should not be used for prediction. Also transformed the target from numeric to…
8809 runs - 0 likes - 4 downloads - 4 reach - 13 impact
540 instances - 21 features - 2 classes - 0 missing values
This dataset was gathered to detect whether a person is running or walking based on deep neural networks and sensor data collected from iOS devices. The dataset represents 88588 sensor data samples…
1 runs - 0 likes - 4 downloads - 4 reach - 14 impact
88588 instances - 7 features - 2 classes - 0 missing values
Re-upload of the dataset as it is present in the Penn ML Benchmark (https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/fars). It's a dataset on traffic accidents,…
1 runs - 0 likes - 4 downloads - 4 reach - 23 impact
100968 instances - 30 features - 8 classes - 0 missing values
This is Titanic survival prediction.
0 runs - 0 likes - 4 downloads - 4 reach - 7 impact
891 instances - 8 features - 0 classes - 0 missing values
Source: [UCI](https://archive.ics.uci.edu/ml/datasets/Statlog+(Shuttle)) Donor: Jason Catlett Basser Department of Computer Science, University of Sydney, N.S.W., Australia Data Set Information:…
10 runs - 0 likes - 4 downloads - 4 reach - 24 impact
58000 instances - 10 features - 7 classes - 0 missing values
The dataset freMTPL2freq contains risk features for 677,991 motor third-party liability policies (observed mostly on one year). See https://github.com/dutangc/CASdatasets for more details. The dataset…
0 runs - 1 likes - 3 downloads - 4 reach - 9 impact
678013 instances - 12 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 1 likes - 3 downloads - 4 reach - 17 impact
5832 instances - 309 features - 2 classes - 0 missing values
One of the biggest challenges of an auto dealership purchasing a used car at an auto auction is the risk that the vehicle might have serious issues that prevent it from being sold to customers. The…
3 runs - 0 likes - 3 downloads - 3 reach - 13 impact
72983 instances - 33 features - 2 classes - 149271 missing values
This is a "supervised learning" challenge in machine learning. We are making available 30 datasets, all pre-formatted in given feature representations (this means that each example consists of a fixed…
10 runs - 0 likes - 3 downloads - 3 reach - 19 impact
65196 instances - 28 features - 100 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs - 1 likes - 3 downloads - 4 reach - 9 impact
284807 instances - 31 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
9051 runs - 1 likes - 3 downloads - 4 reach - 15 impact
1941 instances - 28 features - 7 classes - 0 missing values
Titanic survival prediction
0 runs - 0 likes - 3 downloads - 3 reach - 7 impact
891 instances - 8 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 3 downloads - 3 reach - 14 impact
1000 instances - 26 features - 0 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
8 runs - 0 likes - 2 downloads - 2 reach - 19 impact
2984 instances - 145 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_1000atts_0.4H_EDM-1_EDM-1_1-pmlb
0 runs - 0 likes - 2 downloads - 2 reach - 22 impact
1600 instances - 1001 features - 2 classes - 0 missing values
__Major changes w.r.t. version 2: ignored variable 3 in this upload as this seems to be a perfect predictor.__ Tamilnadu Electricity Board Hourly Readings dataset. Real-time readings were collected…
0 runs - 0 likes - 2 downloads - 2 reach - 19 impact
45781 instances - 4 features - 20 classes - 0 missing values
led24-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 22 impact
3200 instances - 25 features - 10 classes - 0 missing values
__Changes w.r.t. version 1: renamed variables such that they match description.__ ### Dataset: Wilt Data Set ### Abstract: High-resolution Remote Sensing data set (Quickbird). Small number of training…
10966 runs - 0 likes - 2 downloads - 2 reach - 21 impact
4839 instances - 6 features - 2 classes - 0 missing values
PMLB version of the Titanic dataset, which only uses 3 features. See version 1 for the complete version: https://www.openml.org/d/40945
35 runs - 0 likes - 2 downloads - 2 reach - 23 impact
2201 instances - 4 features - 2 classes - 0 missing values
Data contains the information of 9144 samples from 220 spectral bands. The classes represent land-use types: alfalfa, corn, grass, hay, oats, soybeans, trees, and wheat.
0 runs - 0 likes - 2 downloads - 2 reach - 11 impact
9144 instances - 221 features - 8 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
4 runs - 1 likes - 2 downloads - 3 reach - 19 impact
5418 instances - 1637 features - 2 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
4 runs - 0 likes - 2 downloads - 2 reach - 18 impact
10000 instances - 2001 features - 5 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
0 runs - 0 likes - 2 downloads - 2 reach - 17 impact
416188 instances - 61 features - 355 classes - 0 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
4 runs - 0 likes - 2 downloads - 2 reach - 19 impact
5124 instances - 21 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 2 downloads - 2 reach - 14 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 2 downloads - 2 reach - 13 impact
1000 instances - 26 features - 0 classes - 0 missing values
This data was extracted from the 1994 Census bureau database by Ronny Kohavi and Barry Becker (Data Mining and Visualization, Silicon Graphics). A set of reasonably clean records was extracted using…
0 runs - 0 likes - 1 downloads - 1 reach - 0 impact
32561 instances - 15 features - classes - 4262 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
13 runs - 1 likes - 1 downloads - 2 reach - 21 impact
20000 instances - 4297 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 1 downloads - 1 reach - 18 impact
425240 instances - 79 features - 2 classes - 2734000 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM-2_001-pmlb
0 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
The origin is not clear, but presumably this is an artificial problem representing M-of-N rules. The target is 1 if a certain M 'bits' are '1'? (Joaquin Vanschoren)
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1324 instances - 11 features - 2 classes - 0 missing values
The Sheffield (previously UMIST) Face Database consists of 564 images of 20 individuals (mixed race/gender/appearance). Each individual is shown in a range of poses from profile to frontal views -…
53 runs - 0 likes - 1 downloads - 1 reach - 16 impact
575 instances - 10305 features - 20 classes - 0 missing values
The dataset and this description are made available on http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html. Normalized handwritten digits, automatically scanned from envelopes by the U.S. Postal…
57 runs - 0 likes - 1 downloads - 1 reach - 11 impact
9298 instances - 257 features - 10 classes - 0 missing values
Data used in an analysis of the Brown and Frown corpora for my doctoral dissertation titled ``Variations in Written English: Characterizing Authors' Rhetorical Language Choices Across Corpora of…
2048 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000 instances - 24 features - 30 classes - 0 missing values
The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011. We cordially invite researchers…
1 runs - 0 likes - 1 downloads - 1 reach - 11 impact
51839 instances - 257 features - 43 classes - 0 missing values
#### Information A small classic dataset from Fisher, 1936. One of the earliest datasets used for the evaluation of classification methodologies. #### References * Fisher, R. A. (1936), The use of…
0 runs - 0 likes - 1 downloads - 1 reach - 0 impact
150 instances - 7 features - classes - 0 missing values
[PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/tokyo1) This is Performance co-pilot (PCP) data for the Tokyo server at Silicon Graphics International…
37 runs - 0 likes - 1 downloads - 1 reach - 22 impact
959 instances - 45 features - 2 classes - 0 missing values