Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
854 runs - 0 likes - 0 downloads - 0 reach - 0 impact
250 instances - 6 features - 2 classes - 0 missing values
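The mean-based binarization described in these entries can be sketched as follows. This is a minimal illustration, not the code used to build the datasets: the function name `binarize_by_mean` is hypothetical, and because the description is truncated, the rule used here (values below the mean become 'P', all others 'N') is an assumption.

```python
from statistics import mean


def binarize_by_mean(targets, low_label="P", high_label="N"):
    """Map numeric targets to two nominal classes split at the mean.

    Assumption: values strictly below the mean get low_label and all
    other values get high_label. The source description is truncated,
    so the actual labeling rule may differ.
    """
    threshold = mean(targets)
    return [low_label if t < threshold else high_label for t in targets]


# Hypothetical example values; the mean is 4.0.
labels = binarize_by_mean([1.0, 2.0, 3.0, 10.0])
```

Splitting at the mean rather than the median means the two classes need not be balanced when the target distribution is skewed.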
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1119 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
866 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7129 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
598 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1899 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1156 instances - 6 features - 2 classes - 0 missing values
* Title: User Knowledge Modeling Data Set * Abstract: It is the real dataset about the students' knowledge status about the subject of Electrical DC Machines. The dataset had been obtained from Ph.D.…
153 runs - 0 likes - 0 downloads - 0 reach - 0 impact
403 instances - 6 features - 5 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
757 runs - 0 likes - 0 downloads - 0 reach - 0 impact
400 instances - 6 features - 2 classes - 0 missing values
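The majority-class relabeling described in this entry can be sketched as follows. This is an illustration under stated assumptions, not the original conversion code: the function name `binarize_by_majority` is hypothetical, the negative label 'N' is assumed because the description is truncated, and ties between equally frequent classes are broken by first occurrence.

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as positive and the rest as negative.

    Assumption: the negative label is 'N'; the source text is cut off
    before it names the label for the remaining classes.
    """
    # most_common(1) returns [(label, count)] for the top class;
    # equal counts keep first-encountered order.
    majority, _ = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


# Hypothetical example labels; "a" is the majority class.
relabeled = binarize_by_majority(["a", "b", "a", "c", "a"])
```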
Guess which points belong to signal track. COMET is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7619400 instances - 6 features - 0 classes - 0 missing values
And another sample. (v. 2 without OpenML metainfo)
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
89640 instances - 6 features - classes - 0 missing values
Sample with OpenML metadata.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
761940 instances - 6 features - 0 classes - 0 missing values
Guess which points belong to signal track. COMET is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7619400 instances - 6 features - 0 classes - 0 missing values
Guess which points belong to signal track. COMET is an experiment being constructed at the J-PARC proton beam laboratory in Japan. It will search for…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7619400 instances - 6 features - 0 classes - 0 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
761940 instances - 6 features - classes - 0 missing values
Another sample of COMET_MC
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
89640 instances - 6 features - 0 classes - 0 missing values
simple engine data
52 runs - 0 likes - 0 downloads - 0 reach - 0 impact
383 instances - 6 features - 3 classes - 0 missing values
Payments given by healthcare manufacturing companies to medical doctors or hospitals
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73558 instances - 6 features - 2 classes - 83182 missing values
Content Microsoft is an American multinational technology company. It develops, manufactures, licenses, supports, and sells computer software, consumer electronics, personal computers, and related…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
5000 instances - 6 features - classes - 0 missing values
Context Coronaviruses are a large family of viruses which may cause illness in animals or humans. In humans, several coronaviruses are known to cause respiratory infections ranging from the common…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2580 instances - 6 features - classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
We choose age, delivery number, delivery time, blood pressure and heart status. We classify delivery time as Premature, Timely, or Latecomer. As with delivery time, we consider blood pressure in…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
80 instances - 6 features - classes - 0 missing values
Context Dataset is generated through a long and complex process. Starting from scraping the whole URLs provided on for Game of Thrones series. Process on scraping and cleaning the dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
23911 instances - 6 features - classes - 3 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1022616 instances - 6 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3172 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset wilt (40983) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset wilt (40983) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset wilt (40983) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset wilt (40983) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset wilt (40983) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self, seed:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Context As you all know that, as per the observation of economists, according to the current trend, it seems that the yellow metal is performing better as an investment option in comparison to mutual…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4971 instances - 6 features - classes - 0 missing values
Context People love movies because: It takes you on a journey. It's an escape from reality. Being an avid movie watcher I always get amazed how sites like Netflix and Hotstar always exactly suggest the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 6 features - classes - 44 missing values
Context People love movies because: It takes you on a journey. It's an escape from reality. Being an avid movie watcher I always get amazed how sites like Netflix and Hotstar always exactly suggest the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 6 features - classes - 44 missing values
Please refer to UCI's citation policy. Donor: Dr Roberto Lopez robertolopez '@' Intelnics Creators: Thomas F. Brooks, D. Stuart…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1503 instances - 6 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on both numerical and categorical…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000000 instances - 6 features - 0 classes - 0 missing values
Introduction is a crowd-sourced movie information database used by many film-related consoles, sites and apps, such as XBMC, MythTV and Plex. Dozens of media managers, mobile apps and social…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 6 features - classes - 30 missing values
Context The stock prices dataset of a ticker is a good start to slice and dice and good for forecasting of the stock prices. The GOOG ticker data is taken Content Dataset is comprising of the below…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
249 instances - 6 features - classes - 0 missing values
Context I scraped all of the currently available Urban Dictionary pages (611) on 3/26/17 Content word - the slang term added to urban dictionary definition - the definition of said term author - the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4272 instances - 6 features - classes - 0 missing values
Apple stock price of each work day since January 1st 2021. Contains highest price, lowest price, open, close, volume and adjusted close
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
348 instances - 6 features - 0 classes - 0 missing values
Context I was exploring League of Legends datasets to play around but since Riot allows limited calls to their API, I've collected the data from OP.GG. Few goals of mine were to find out the best team…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4028 instances - 6 features - classes - 0 missing values
Apple stock price in the first month of 2022
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
22 instances - 6 features - 0 classes - 0 missing values
Apple stock price of each work day since January 1st 2021. Contains highest price, lowest price, open, close, volume and adjusted close
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
346 instances - 6 features - 0 classes - 0 missing values
Apple stock price of each trading day since January 1st 2021. Contains highest price, lowest price, open, close, volume and adjusted close
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
347 instances - 6 features - 0 classes - 0 missing values
Apple stock price in the first month of 2022
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
22 instances - 6 features - 0 classes - 0 missing values
Auto MPG (6 variables) dataset The data concerns city-cycle fuel consumption in miles per gallon (Mpg), to be predicted in terms of 1 multivalued discrete and 5 continuous attributes (two multivalued…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
392 instances - 6 features - 0 classes - 0 missing values
DESCRIPTIVE ABSTRACT: The data set contains the oral, written and combined test scores for 2003 New Haven Fire Department promotion exams. The Race and Position for each test taker are also given.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
118 instances - 6 features - 2 classes - 0 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 128136 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 128136 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 42138 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 42138 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 42138 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 6 features - classes - 0 missing values
Mammography is the most effective method for breast cancer screening available today. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
830 instances - 6 features - classes - 0 missing values
Context While brainstorming ideas for a statistics project for a course last semester, the idea of utilizing data about microbreweries came up. Unfortunately after some exploration and thought, we…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2407 instances - 6 features - classes - 8 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2407 instances - 6 features - classes - 8 missing values
This dataset contains Apple's (AAPL) stock data for the last 10 years (from 2010 to date). I believe insights from this data can be used to build useful price forecasting algorithms to aid investment.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2518 instances - 6 features - classes - 0 missing values
Oranges vs. Grapefruit The task of separating oranges and grapefruit is fairly obvious to a human, but even with manual observation there is still a bit of error. This dataset takes the color, weight,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
10000 instances - 6 features - classes - 0 missing values
Context Since as a beginner in machine learning it would be a great opportunity to try some techniques to predict the outcome of the drugs that might be accurate for the patient. Content The target…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
200 instances - 6 features - classes - 0 missing values
Context Ethereum a decentralized, open-source blockchain featuring smart contract functionality was proposed in 2013 by programmer Vitalik Buterin. Development was crowdfunded in 2014, and the network…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2202 instances - 6 features - classes - 0 missing values
Context - World Health Organization (WHO) Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus (2019-nCoV). Content This dataset has information on the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
37272 instances - 6 features - classes - 0 missing values
Context COVID19 is spreading across the globe and this data set helps in analyzing to what extent the pandemic has affected different countries. Content This data set contains the total number of COVID…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
26355 instances - 6 features - classes - 844 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
18063 instances - 6 features - 0 classes - 0 missing values
Dataset used in the tabular data benchmark, transformed in the same way. This dataset belongs to the "regression on numerical features" benchmark. Original…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
18063 instances - 6 features - 0 classes - 0 missing values
Subsampling of the dataset phoneme (44127) with seed=0 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset phoneme (44127) with seed=3 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset phoneme (44127) with seed=4 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset phoneme (44127) with seed=1 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
Subsampling of the dataset phoneme (44127) with seed=2 args.nrows=2000 args.ncols=100 args.nclasses=10 args.no_stratify=True Generated with the following source code: ```python def subsample( self,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2000 instances - 6 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 6 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
594 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
822 runs - 0 likes - 0 downloads - 0 reach - 0 impact
250 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1013 runs - 0 likes - 0 downloads - 0 reach - 0 impact
163 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
777 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1136 runs - 0 likes - 0 downloads - 0 reach - 0 impact
100 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs - 0 likes - 0 downloads - 0 reach - 0 impact
62 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
759 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
779 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1767 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3848 instances - 6 features - 2 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
109967 runs - 0 likes - 0 downloads - 0 reach - 0 impact
15545 instances - 6 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
806 runs - 0 likes - 0 downloads - 0 reach - 0 impact
500 instances - 6 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
631 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1000 instances - 6 features - 2 classes - 0 missing values
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
60197 instances - 6 features - classes - 42138 missing values
Probable risk factors for coronary thrombosis, comprising data from 1841 men. The coronary data set contains the following 6 variables: Smoking (smoking): a two-level factor with levels no and yes. M.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1841 instances - 6 features - classes - 0 missing values
Context While brainstorming ideas for a statistics project for a course last semester, the idea of utilizing data about microbreweries came up. Unfortunately after some exploration and thought, we…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2407 instances - 6 features - classes - 8 missing values
Introduction Dogecoin is an open source peer-to-peer digital currency, favored by Shiba Inus worldwide. It is qualitatively more fun while being technically nearly identical to its close relative…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1532 instances - 6 features - classes - 0 missing values
Context Yallamotor is a website in KSA that has a collection of used vehicles for sale. I used the Yallamotor website to create a dataset of used vehicles in KSA. Content Dataset includes (2287) vehicles…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2287 instances - 6 features - classes - 0 missing values
Context This dataset was scraped from, using the code in this repository. I designed the webscraping code to account for most of the variance in the website's formatting,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2096 instances - 6 features - classes - 0 missing values
Context This is a continually updated dataset of professional fighters making fight predictions. Content The data is gathered mostly from James Lynch's YouTube channel, where fighters are asked to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3401 instances - 6 features - classes - 0 missing values
What is World of Warcraft? According to Wikipedia: World of Warcraft (WoW) is a massively multiplayer online role-playing game (MMORPG) released in 2004 by Blizzard Entertainment. It is the fourth…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
478 instances - 6 features - classes - 1 missing values