This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix:
```
        Good  Bad   (predicted)
Good    0     1     (actual)
Bad     5     0
```
It is worse…
506307 runs - 28 likes - 311 downloads - 339 reach - 34 impact
1000 instances - 21 features - 2 classes - 0 missing values
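The cost matrix in the description above can be applied to a set of predictions as in the following minimal sketch. The label strings, the `COST` dictionary, and the `total_cost` helper are illustrative assumptions, not part of the dataset or of any library.

```python
# Cost matrix from the dataset description: rows are actual, columns are predicted.
# Keys are (actual, predicted); the label strings are assumed for illustration.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


# Hypothetical labels: one good customer rejected, one bad customer accepted.
actual = ["good", "good", "bad", "bad"]
predicted = ["good", "bad", "good", "bad"]
print(total_cost(actual, predicted))  # 1 + 5 = 6
```

Under this matrix, accepting a bad risk costs five times as much as rejecting a good one, so plain accuracy can favor a classifier that this cost sum penalizes.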
This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every other participant of the…
28210 runs - 19 likes - 169 downloads - 188 reach - 36 impact
8378 instances - 121 features - 2 classes - 18372 missing values
Data taken from the Blood Transfusion Service Center in Hsin-Chu City in Taiwan -- this is a classification problem. To demonstrate the RFMTC marketing model (a modified version of RFM), this study…
468688 runs - 6 likes - 101 downloads - 107 reach - 46 impact
748 instances - 5 features - 2 classes - 0 missing values
Author: Volker Lohweg (University of Applied Sciences, Ostwestfalen-Lippe) Source: [UCI](https://archive.ics.uci.edu/ml/datasets/banknote+authentication) - 2012 Please cite:…
138170 runs - 6 likes - 40 downloads - 46 reach - 34 impact
1372 instances - 5 features - 2 classes - 0 missing values
The original Titanic dataset, describing the survival status of individual passengers on the Titanic. The titanic data does not contain information from the crew, but it does contain actual ages of…
0 runs - 3 likes - 45 downloads - 48 reach - 12 impact
1309 instances - 14 features - 2 classes - 3855 missing values
The satellite dataset comprises features extracted from satellite observations. In particular, each image was taken under four different light wavelengths, two in visible light (green and red) and…
2074 runs - 3 likes - 70 downloads - 73 reach - 33 impact
5100 instances - 37 features - 2 classes - 0 missing values
Once upon a time, in July 1991, the monks of Corsendonk Priory were faced with a school held in their priory, namely the 2nd European Summer School on Machine Learning. After listening more than one…
394951 runs - 3 likes - 34 downloads - 37 reach - 39 impact
601 instances - 7 features - 2 classes - 0 missing values
A dataset relating characteristics of telephony account features and usage and whether or not the customer churned. Originally used in [Discovering Knowledge in Data: An Introduction to Data…
7512 runs - 2 likes - 9 downloads - 11 reach - 25 impact
5000 instances - 21 features - 2 classes - 0 missing values
Over 92 thousand images (32x32 pixels) of 46 characters from Devanagari script. Includes the alphabet as well as the numbers. Devanagari is an Indic script and forms a basis for over 100 languages…
43 runs - 2 likes - 8 downloads - 10 reach - 14 impact
92000 instances - 1025 features - 46 classes - 0 missing values
Citation Request: This dataset is publicly available for research. The details are described in [Cortez et al., 2009]. Please include this citation if you plan to use this database: P. Cortez, A.…
64 runs - 2 likes - 6 downloads - 8 reach - 16 impact
4898 instances - 12 features - 7 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143332 runs - 2 likes - 67 downloads - 69 reach - 419 impact
1599 instances - 65 features - 100 classes - 0 missing values
A dataset of steel plates' faults, classified into 7 different types. The goal was to train machine learning for automatic pattern recognition. The dataset consists of 27 features describing each…
277767 runs - 2 likes - 52 downloads - 54 reach - 26 impact
1941 instances - 34 features - 2 classes - 0 missing values
The dataset freMTPL2freq contains risk features for 677,991 motor third-party liability policies (observed mostly over one year). See https://github.com/dutangc/CASdatasets for more details. The dataset…
0 runs - 1 likes - 3 downloads - 4 reach - 9 impact
678013 instances - 12 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
13 runs - 1 likes - 1 downloads - 2 reach - 21 impact
20000 instances - 4297 features - 2 classes - 0 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4207 runs - 1 likes - 10 downloads - 11 reach - 15 impact
690 instances - 15 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 1 likes - 3 downloads - 4 reach - 17 impact
5832 instances - 309 features - 2 classes - 0 missing values
wine-quality-red-pmlb
31 runs - 1 likes - 7 downloads - 8 reach - 23 impact
1599 instances - 12 features - 6 classes - 0 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs - 1 likes - 1 downloads - 2 reach - 9 impact
70340 instances - 21 features - 3 classes - 2288 missing values
SOURCE: [ChaLearn Automatic Machine Learning Challenge (AutoML)](https://competitions.codalab.org/competitions/2321), [ChaLearn](https://automl.chalearn.org/data) This is a "supervised learning"…
4 runs - 1 likes - 2 downloads - 3 reach - 19 impact
5418 instances - 1637 features - 2 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs - 1 likes - 3 downloads - 4 reach - 9 impact
284807 instances - 31 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
9051 runs - 1 likes - 3 downloads - 4 reach - 15 impact
1941 instances - 28 features - 7 classes - 0 missing values
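The change noted in the description above, replacing seven binary target columns with a single seven-level target, could be sketched as follows. The column names and the `to_single_target` helper are hypothetical placeholders (the description does not list the actual names), and the sketch assumes exactly one indicator is set per row.

```python
# Hypothetical names for the seven binary target columns (not taken from the dataset).
FAULT_COLUMNS = [f"fault_{i}" for i in range(1, 8)]


def to_single_target(row):
    """Collapse seven 0/1 indicators into the name of the one that is set."""
    active = [name for name in FAULT_COLUMNS if row[name] == 1]
    if len(active) != 1:
        # Assumption of this sketch: each instance has exactly one fault type.
        raise ValueError(f"expected exactly one active fault, got {len(active)}")
    return active[0]


# Example row with only the third indicator set.
row = {name: 0 for name in FAULT_COLUMNS}
row["fault_3"] = 1
print(to_single_target(row))  # fault_3
```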
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs - 1 likes - 9 downloads - 10 reach - 8 impact
284807 instances - 31 features - 0 classes - 0 missing values
#### 1. Summary This dataset contains attributes of dresses and their recommendations according to their sales. Sales are monitored on alternate days. The attributes analyzed are:…
19207 runs - 1 likes - 6 downloads - 7 reach - 19 impact
500 instances - 13 features - 2 classes - 835 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
608 runs - 1 likes - 9 downloads - 10 reach - 15 impact
1000 instances - 26 features - 2 classes - 0 missing values
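The binarization described above (thresholding a numeric target at its mean) might look like the sketch below. The description is cut off before it says which side of the mean receives which label, so the label names, the handling of values equal to the mean, and the `binarize_by_mean` helper are all assumptions made for illustration.

```python
from statistics import mean


def binarize_by_mean(values):
    """Map each numeric target to one of two placeholder labels around the mean."""
    threshold = mean(values)
    # Placeholder labels; the real class names are not given in the description.
    return ["below_mean" if v < threshold else "at_or_above_mean" for v in values]


targets = [1.0, 2.0, 3.0, 10.0]  # made-up numeric targets, mean is 4.0
print(binarize_by_mean(targets))
# ['below_mean', 'below_mean', 'below_mean', 'at_or_above_mean']
```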
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au1-1000 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of…
3255 runs - 1 likes - 9 downloads - 10 reach - 23 impact
1000 instances - 21 features - 2 classes - 0 missing values
QSAR biodegradation Data Set * Abstract: Data set containing values for 41 attributes (molecular descriptors) used to classify 1055 chemicals into 2 classes (ready and not ready biodegradable). *…
267861 runs - 1 likes - 25 downloads - 26 reach - 30 impact
1055 instances - 42 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from flight software for an earth-orbiting satellite. The data comes from McCabe and Halstead feature extractors of source code. These features…
146026 runs - 1 likes - 18 downloads - 19 reach - 27 impact
1563 instances - 38 features - 2 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143764 runs - 1 likes - 40 downloads - 41 reach - 417 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143811 runs - 1 likes - 17 downloads - 18 reach - 419 impact
1600 instances - 65 features - 100 classes - 0 missing values
Context This is historical data on cryptocurrency trading for the period from 2016-01-01 to 2021-02-21. If you enjoy this dataset, please upvote so I can see that it is popular and that I need to update it.…
0 runs - 1 likes - 0 downloads - 1 reach - 0 impact
2382643 instances - 17 features - classes - 4862194 missing values
Context Getting access to high-quality historical stock market data can be very expensive and/or complicated; parsing SEC 10-Q filings direct from the SEC EDGAR is difficult due to the varying…
0 runs - 1 likes - 0 downloads - 1 reach - 0 impact
101787 instances - 45 features - classes - 2857964 missing values
Context This dataset was created for the project "AI Learn to invest" for SaturdaysAI - Euskadi 1st edition. The project can be found at https://github.com/ImanolR87/AI-Learn-to-invest Content…
0 runs - 1 likes - 1 downloads - 2 reach - 0 impact
405258 instances - 25 features - classes - 0 missing values
Context Find the best strategies to improve for the next marketing campaign. How can the financial institution have a greater effectiveness for future marketing campaigns? In order to answer this, we…
0 runs - 1 likes - 1 downloads - 2 reach - 0 impact
11162 instances - 17 features - classes - 0 missing values
This is a test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
The original Annealing dataset from UCI. The exact meaning of the features and classes is largely unknown. Annealing, in metallurgy and materials science, is a heat treatment that alters the physical…
13779 runs - 0 likes - 0 downloads - 0 reach - 0 impact
898 instances - 39 features - 5 classes - 22175 missing values
Author: Alen Shapiro Source: [UCI](https://archive.ics.uci.edu/ml/datasets/Chess+(King-Rook+vs.+King-Pawn)) Please cite: [UCI citation policy](https://archive.ics.uci.edu/ml/citation_policy.html) 1.…
274237 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3196 instances - 37 features - 2 classes - 0 missing values
Date: Tue, 15 Nov 88 15:44:08 EST From: stan To: aha@ICS.UCI.EDU 1. Title: Final settlements in labor negotiations in Canadian industry 2. Source Information -- Creators:…
7681 runs - 0 likes - 0 downloads - 0 reach - 0 impact
57 instances - 17 features - 2 classes - 326 missing values
The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal. Concerning the study of H. Altay…
4473 runs - 0 likes - 0 downloads - 0 reach - 0 impact
452 instances - 280 features - 13 classes - 408 missing values
1. TITLE: Letter Image Recognition Data The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The…
69913 runs - 0 likes - 0 downloads - 0 reach - 0 impact
20000 instances - 17 features - 26 classes - 0 missing values
This database is a standardized version of the original audiology database (see audiology.* in this directory). The non-standard set of attributes has been converted to a standard set of attributes…
7303 runs - 0 likes - 0 downloads - 0 reach - 0 impact
226 instances - 70 features - 24 classes - 317 missing values
The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the dataset constitutes the record of a…
238 runs - 0 likes - 0 downloads - 0 reach - 0 impact
345 instances - 6 features - 0 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as…
3252 runs - 0 likes - 0 downloads - 0 reach - 0 impact
205 instances - 26 features - 6 classes - 59 missing values
Citation Request: This lymphography domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
1973 runs - 0 likes - 0 downloads - 0 reach - 0 impact
148 instances - 19 features - 4 classes - 0 missing values
The current dataset was adapted to ARFF format from the UCI version. Sample code IDs were removed. Note that there is also a related Breast Cancer Wisconsin (Diagnosis) Data Set with a different set of…
28731 runs - 0 likes - 0 downloads - 0 reach - 0 impact
699 instances - 10 features - 2 classes - 16 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
36329 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 7 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. The maps were scanned in 8-bit grey value at a density of 400 dpi,…
26538 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 241 features - 10 classes - 0 missing values
We create a digit database by collecting 250 samples from 44 writers. The samples, written by 30 writers, are used for training, cross-validation and writer dependent testing, and the digits written…
37681 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10992 instances - 17 features - 10 classes - 0 missing values
1. Title: Postoperative Patient Data 2. Source Information: -- Creators: Sharon Summers, School of Nursing, University of Kansas Medical Center, Kansas City, KS 66160 Linda Woolery, School of Nursing,…
1758 runs - 0 likes - 0 downloads - 0 reach - 0 impact
90 instances - 9 features - 3 classes - 3 missing values
1. Title: Dermatology Database 2. Source Information: (a) Original owners: -- 1. Nilsel Ilter, M.D., Ph.D., Gazi University, School of Medicine 06510 Ankara, Turkey Phone: +90 (312) 214 1080 -- 2. H.…
1756 runs - 0 likes - 0 downloads - 0 reach - 0 impact
366 instances - 35 features - 6 classes - 8 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. ### Attribute…
23519 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2310 instances - 20 features - 7 classes - 0 missing values
1. Title: Pima Indians Diabetes Database 2. Sources: (a) Original owners: National Institute of Diabetes and Digestive and Kidney Diseases (b) Donor of database: Vincent Sigillito…
203502 runs - 0 likes - 0 downloads - 0 reach - 0 impact
768 instances - 9 features - 2 classes - 0 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19949 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3772 instances - 30 features - 2 classes - 6064 missing values
1. Title: Protein Localization Sites 2. Creator and Maintainer: Kenta Nakai, Institute of Molecular and Cellular Biology, Osaka University, 1-3 Yamada-oka, Suita 565 Japan nakai@imcb.osaka-u.ac.jp…
1806 runs - 0 likes - 0 downloads - 0 reach - 0 impact
336 instances - 8 features - 8 classes - 0 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2372 runs - 0 likes - 0 downloads - 0 reach - 0 impact
208 instances - 61 features - 2 classes - 0 missing values
1. Title: Contraceptive Method Choice 2. Sources: (a) Origin: This dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey (b) Creator: Tjen-Sien Lim (limt@stat.wisc.edu)…
24351 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1473 instances - 10 features - 3 classes - 0 missing values
### Description This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible. ### Source ``` (a) Origin: Mushroom records are drawn from…
16692 runs - 0 likes - 0 downloads - 0 reach - 0 impact
8124 instances - 23 features - 2 classes - 2480 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) Database of surgeries on horses. Possible class attributes: 24 (whether lesion is surgical), others include: 23, 25, 26, and 27 Notes: * Hospital_Number…
236 runs - 0 likes - 0 downloads - 0 reach - 0 impact
368 instances - 27 features - 2 classes - 1927 missing values
1. Title: Nursery Database 2. Sources: (a) Creator: Vladislav Rajkovic et al. (13 experts) (b) Donors: Marko Bohanec (marko.bohanec@ijs.si) Blaz Zupan (blaz.zupan@ijs.si) (c) Date: June, 1997 3. Past…
2210 runs - 0 likes - 0 downloads - 0 reach - 0 impact
12960 instances - 9 features - 5 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) In this version (version 2), some features were removed. It is unclear why or how this was done.
1883 runs - 0 likes - 0 downloads - 0 reach - 0 impact
368 instances - 23 features - 2 classes - 1927 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
36117 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5620 instances - 65 features - 10 classes - 0 missing values
This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect the confidentiality of the data. This dataset is interesting because…
25383 runs - 0 likes - 0 downloads - 0 reach - 0 impact
690 instances - 16 features - 2 classes - 67 missing values
1. Title of Database: Blocks Classification 2. Sources: (a) Donato Malerba Dipartimento di Informatica University of Bari via Orabona 4 70126 Bari - Italy phone: +39 - 80 - 5443269 fax: +39 - 80 -…
2719 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5473 instances - 11 features - 5 classes - 0 missing values
This data set was generated to model psychological experimental results. Each example is classified as having the balance scale tip to the right, tip to the left, or stay balanced. The attributes are…
30114 runs - 0 likes - 0 downloads - 0 reach - 0 impact
625 instances - 5 features - 3 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
37792 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 217 features - 10 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
2009 runs - 0 likes - 0 downloads - 0 reach - 0 impact
286 instances - 10 features - 2 classes - 9 missing values
No data.
70 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
72 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
194 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
73 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
87 runs - 0 likes - 0 downloads - 0 reach - 0 impact
295245 instances - 11 features - 5 classes - 0 missing values
No data.
68 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
67 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 17 features - 10 classes - 0 missing values
This data set consists of the petal and sepal lengths of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray
65 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
66 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
66 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
324 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
71 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
60 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
63 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
63 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
68 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 10 features - 2 classes - 0 missing values
No data.
48 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
50 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
67 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
66 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
51 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
1038 runs - 0 likes - 0 downloads - 0 reach - 0 impact
55296 instances - 10 features - 3 classes - 0 missing values
No data.
326 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
68 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
69 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
356 runs - 0 likes - 0 downloads - 0 reach - 0 impact
131072 instances - 17 features - 2 classes - 0 missing values
No data.
65 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 30 features - 4 classes - 0 missing values
No data.
230 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 35 features - 2 classes - 0 missing values
No data.
63 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
65 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 18 features - 7 classes - 0 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1455525 instances - 73 features - 10 classes - 0 missing values
Normalized version of the Forest Covertype dataset (see version 1), so that the numerical values are between 0 and 1. Contains the forest cover type for 30 x 30 meter cells obtained from US Forest…
342 runs - 0 likes - 0 downloads - 0 reach - 0 impact
581012 instances - 55 features - 7 classes - 0 missing values
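The description above says only that numerical values were rescaled to lie between 0 and 1, not how. One common way to get that range is min-max scaling, sketched below as an assumption; the `min_max_scale` helper and the sample values are illustrative and not taken from the dataset.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; mapping it to 0.0 is a choice made here.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


elevation = [1800, 2400, 3000]  # made-up values
print(min_max_scale(elevation))  # [0.0, 0.5, 1.0]
```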
No data.
73 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 30 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 61 features - 2 classes - 0 missing values