OpenML
e3r4vr t4r
0 runs0 likes0 downloads0 reach0 impact
2 instances - 5 features - classes - 0 missing values
f fr
0 runs0 likes0 downloads0 reach0 impact
2 instances - 5 features - classes - 0 missing values
DESCRIPTIVE ABSTRACT: The data set contains the oral, written and combined test scores for 2003 New Haven Fire Department promotion exams. The Race and Position for each test taker are also given.…
0 runs0 likes0 downloads0 reach0 impact
118 instances - 6 features - 2 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
101 instances - 18 features - classes - 0 missing values
Testing dataset
0 runs0 likes0 downloads0 reach0 impact
134731 instances - 31 features - 2 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
20058 instances - 16 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
16598 instances - 11 features - classes - 329 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 128136 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 128136 missing values
data from yahoo finance
0 runs0 likes0 downloads0 reach0 impact
1259 instances - 7 features - classes - 0 missing values
This dataset contains 10692 houses to rent with 13 different features. Some values in the dataset can be considered as outliers for further analyses. Bear in mind that the Web Crawler was used only to…
0 runs0 likes0 downloads0 reach0 impact
10692 instances - 13 features - 0 classes - 0 missing values
MY Dataset
0 runs0 likes0 downloads0 reach0 impact
120 instances - 7 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs0 likes0 downloads0 reach0 impact
2580 instances - 7 features - classes - 2541 missing values
This is weather data in arff format
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - classes - 0 missing values
sample
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - classes - 0 missing values
test data test
0 runs0 likes0 downloads0 reach0 impact
2 instances - 5 features - classes - 0 missing values
this is test data
0 runs0 likes0 downloads0 reach0 impact
5 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
270 instances - 14 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
8553 instances - 10 features - classes - 18454 missing values
test
0 runs0 likes0 downloads0 reach0 impact
2580 instances - 7 features - classes - 2541 missing values
newtest3
0 runs0 likes0 downloads0 reach0 impact
2 instances - 6 features - classes - 0 missing values
test3
0 runs0 likes0 downloads0 reach0 impact
2 instances - 8 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach0 impact
150 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
2178 instances - 4 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
8124 instances - 23 features - classes - 2480 missing values
Salary Emp
0 runs0 likes0 downloads0 reach0 impact
31 instances - 2 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
150 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
336 instances - 8 features - classes - 0 missing values
This dataset describes 100,000 realistic, synthetically generated worker compensation insurance claims. Along with the ultimate financial losses, each claim is described by the initial case estimate, date…
0 runs0 likes0 downloads0 reach0 impact
100000 instances - 14 features - 0 classes - 0 missing values
This dataset contains 206 attributes of 70 children with physical and motor disability based on ICF-CY. In particular, the SCADI dataset is the only one that has been used by ML researchers for…
0 runs0 likes0 downloads0 reach0 impact
70 instances - 206 features - classes - 0 missing values
Autistic Spectrum Disorder (ASD) is a neurodevelopment condition associated with significant healthcare costs, and early diagnosis can significantly reduce these. Unfortunately, waiting times for an…
0 runs0 likes0 downloads0 reach0 impact
704 instances - 21 features - classes - 192 missing values
The database was created with records of behavior of the urban traffic of the city of Sao Paulo in Brazil from December 14, 2009 to December 18, 2009 (From Monday to Friday). Registered from 7:00 to…
0 runs0 likes0 downloads0 reach0 impact
135 instances - 18 features - classes - 0 missing values
This data is for the purpose of bias correction of next-day maximum and minimum air temperatures forecast of the LDAPS model operated by the Korea Meteorological Administration over Seoul, South…
0 runs0 likes0 downloads0 reach0 impact
7752 instances - 25 features - classes - 1248 missing values
One of the primary challenges in identifying the risks of the Burst Header Packet (BHP) flood attacks in Optical Burst Switching networks (OBS) is the scarcity of reliable historical data. ###…
0 runs0 likes0 downloads0 reach0 impact
1075 instances - 22 features - classes - 15 missing values
The dataset is about bankruptcy prediction of Polish companies. The data was collected from Emerging Markets Information Service (EMIS, [Web Link]), which is a database containing information on…
0 runs0 likes0 downloads0 reach0 impact
7027 instances - 65 features - classes - 5835 missing values
1987 National Indonesia Contraceptive Prevalence Survey
0 runs0 likes0 downloads0 reach0 impact
1473 instances - 10 features - classes - 0 missing values
dgf_test
0 runs0 likes0 downloads0 reach0 impact
3415 instances - 5 features - 2 classes - 1 missing values
dgf_test
0 runs0 likes0 downloads0 reach0 impact
3415 instances - 5 features - 2 classes - 1 missing values
The database was created with records of absenteeism at work from July 2007 to July 2010 at a courier company in Brazil. The data set allows for several new combinations of attributes and attribute…
0 runs0 likes0 downloads0 reach0 impact
740 instances - 21 features - classes - 0 missing values
We scraped a large number of eBay auctions of a popular product. After preprocessing the auction data, we built the SB dataset. The goal is to share the labelled SB dataset with researchers.
0 runs0 likes0 downloads0 reach0 impact
6321 instances - 13 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
178 instances - 16 features - classes - 0 missing values
Laboratory dataset
0 runs0 likes0 downloads0 reach0 impact
1750 instances - 7 features - classes - 0 missing values
Laboratorio_dataset_car
0 runs0 likes0 downloads0 reach0 impact
1750 instances - 7 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
194 instances - 32 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
1484 instances - 18 features - classes - 0 missing values
#### Information A small classic dataset from Fisher, 1936. One of the earliest datasets used for the evaluation of classification methodologies. #### References * Fisher, R. A. (1936), The use of…
0 runs0 likes1 downloads1 reach0 impact
150 instances - 7 features - classes - 0 missing values
This hourly data set contains the PM2.5 data of US Embassy in Beijing. Meanwhile, meteorological data from Beijing Capital International Airport are also included.
0 runs0 likes0 downloads0 reach0 impact
43824 instances - 13 features - classes - 2067 missing values
Author: Francesca Grisoni, Claudia S. Neuhaus, Miyabi Hishinuma, Gisela Gabernet, Jan A. Hiss, Masaaki Kotera, Gisbert Schneider Source:…
0 runs0 likes0 downloads0 reach0 impact
901 instances - 3 features - classes - 0 missing values
Author: Francesca Grisoni, Claudia S. Neuhaus, Miyabi Hishinuma, Gisela Gabernet, Jan A. Hiss, Masaaki Kotera, Gisbert Schneider Source:…
0 runs0 likes0 downloads0 reach0 impact
949 instances - 3 features - classes - 0 missing values
The dataset consists of 384 features extracted from CT images. The class variable is numeric and denotes the relative location of the CT slice on the axial axis of the human body. The data was…
0 runs0 likes0 downloads0 reach0 impact
53500 instances - 386 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
7411 instances - 11 features - classes - 0 missing values
This is part of a collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
126 instances - 42 features - classes - 446 missing values
This is part of a collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
127 instances - 42 features - classes - 788 missing values
This dataset can be used to predict chronic kidney disease; it was collected from a hospital over a period of nearly 2 months. ### Attribute information We use 24 + class = 25 (11 numeric, 14…
0 runs0 likes0 downloads0 reach0 impact
400 instances - 26 features - classes - 1009 missing values
test
0 runs0 likes0 downloads0 reach0 impact
7384 instances - 11 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
577 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
10173 instances - 65 features - classes - 12157 missing values
test
0 runs0 likes0 downloads0 reach0 impact
341 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
263 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
487 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
7158 instances - 11 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
7628 instances - 11 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
7152 instances - 11 features - classes - 0 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
arbres-urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
bases-de-donnees-annuelles-des-accidents-corporels-de-la-circulation-routiere-annees-de-2005-a-2019
0 runs0 likes0 downloads0 reach0 impact
132977 instances - 55 features - 0 classes - 550521 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
2 instances - 57 features - 1 classes - 22 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
421 instances - 3 features - 1 classes - 0 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
709 instances - 57 features - 6 classes - 8199 missing values
arbres-urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
In our research each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to…
0 runs0 likes0 downloads0 reach0 impact
750 instances - 16 features - classes - 60 missing values
This data set was collected from the internet traffic records on a university's firewall. There are 12 features in total. Action feature is used as a class. There are 4 classes in total. These are…
0 runs0 likes0 downloads0 reach0 impact
65532 instances - 12 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1643 instances - 3 features - classes - 0 missing values
https://archive.ics.uci.edu/ml/datasets/Diabetes
0 runs0 likes0 downloads0 reach0 impact
768 instances - 9 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1643 instances - 2 features - classes - 0 missing values
The classification task of this database is to determine where patients in a postoperative recovery area should be sent to next. Because hypothermia is a significant concern after surgery (Woolery, L.…
0 runs0 likes0 downloads0 reach0 impact
65532 instances - 12 features - classes - 0 missing values
Product listing data submitted to the U.S. FDA for all unfinished, unapproved drugs.
0 runs0 likes0 downloads0 reach0 impact
120215 instances - 20 features - 7 classes - 443305 missing values
artificial no anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - classes - 0 missing values
The data set contains laboratory values of blood donors and Hepatitis C patients and demographic values like age. The target attribute for classification is Category (blood donors vs. Hepatitis C…
0 runs0 likes0 downloads0 reach0 impact
615 instances - 14 features - classes - 31 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
1 instances - 57 features - 1 classes - 11 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - 0 classes - 0 missing values
artificial no anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - 0 classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1624 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1538 instances - 3 features - classes - 0 missing values
leak detection file
0 runs0 likes0 downloads0 reach0 impact
23 instances - 4 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - classes - 0 missing values
Context Projects are a great way to learn data science. So I started my own. The numerous housing data sets on Kaggle were the inspiration for this data set. Predicting housing prices is a simple yet…
0 runs0 likes0 downloads0 reach0 impact
10552 instances - 26 features - classes - 49282 missing values
Mammography is the most effective method for breast cancer screening available today. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to…
0 runs0 likes0 downloads0 reach0 impact
830 instances - 6 features - classes - 0 missing values
Context Chocolate is one of the most popular candies in the world. Each year, residents of the United States collectively eat more than 2.8 billion pounds. However, not all chocolate bars are created…
0 runs0 likes0 downloads0 reach0 impact
1795 instances - 9 features - classes - 962 missing values
Context Chocolate is one of the most popular candies in the world. Each year, residents of the United States collectively eat more than 2.8 billion pounds. However, not all chocolate bars are created…
0 runs0 likes0 downloads0 reach0 impact
1795 instances - 9 features - classes - 962 missing values
Context This dataset was collected by Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215 for research, non-commercial purposes only.…
0 runs0 likes0 downloads0 reach0 impact
952 instances - 18 features - classes - 48 missing values
Context This dataset was collected by Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215 for research, non-commercial purposes only. An article is also published implementing this dataset. For more information and citation of this dataset please refer: Tigga, N. P., Garg, S. (2020). Prediction of Type 2 Diabetes using Machine Learning Classification Methods. Procedia Computer Science, 167, 706-716. DOI: https://doi.org/10.1016/j.procs.2020.03.336 Content There is a total of 952 instances with 17 independent predictor variables and one binary target or dependent variable, Diabetes. Acknowledgements We would like to thank all the participants who contributed towards the building of this dataset. Inspiration To build a machine learning algorithm to predict if a person has diabetes or not…
0 runs0 likes0 downloads0 reach0 impact
952 instances - 18 features - classes - 48 missing values
Context Projects are a great way to learn data science. So I started my own. The numerous housing data sets on Kaggle were the inspiration for this data set. Predicting housing prices is a simple yet insightful regression problem. Understanding data takes time and the more people analyze it the faster the secrets can be uncovered. I acquired the data by scraping ImmoScout24, a marketplace for German real estate…
0 runs0 likes0 downloads0 reach0 impact
10552 instances - 26 features - classes - 49282 missing values
Context A corporate credit rating expresses the ability of a firm to repay its debt to creditors. Credit rating agencies are the entities responsible to make the assessment and give a verdict. When a big corporation from the US or anywhere in the world wants to issue a new bond it hires a credit agency to make an assessment so that investors can know how trustworthy is the company. The assessment is based especially in the financials indicators that come from the balance sheet. Some of the most important agencies in the world are Moody's, Fitch and Standard and Poors. Content A list of 2029 credit ratings issued by major agencies such as Standard and Poors to big US firms (traded on NYSE or Nasdaq) from 2010 to 2016. There are 30 features for every company of which 25 are financial indicators. They can be divided in: Liquidity Measurement Ratios: currentRatio, quickRatio, cashRatio, daysOfSalesOutstanding. Profitability Indicator Ratios: grossProfitMargin, operatingProfitMargin, pretaxProfitMargin, netProfitMargin, effectiveTaxRate, returnOnAssets, returnOnEquity, returnOnCapitalEmployed. Debt Ratios: debtRatio, debtEquityRatio. Operating Performance Ratios: assetTurnover. Cash Flow Indicator Ratios: operatingCashFlowPerShare, freeCashFlowPerShare, cashPerShare, operatingCashFlowSalesRatio, freeCashFlowOperatingCashFlowRatio. For more information about financial indicators visit https://financialmodelingprep.com/market-indexes-major-markets. The additional features are Name, Symbol (for trading), Rating Agency Name, Date and Sector. The dataset is unbalanced, here is the frequency of ratings: AAA: 7, AA: 89, A: 398, BBB: 671, BB: 490, B: 302, CCC: 64, CC: 5, C: 2, D: 1. Acknowledgements This dataset was possible thanks to financialmodelingprep and opendatasoft, the sources of the data. To see how the data was integrated and reshaped check here. Inspiration Is it possible to forecast the rating an agency will give to a company based on its financials?…
0 runs0 likes0 downloads0 reach0 impact
2029 instances - 31 features - classes - 0 missing values
Context Recent growing interest in cryptocurrencies, specifically as a speculative investment vehicle, has sparked global conversation over the past 12 months. Although this data is available across…
0 runs0 likes0 downloads0 reach0 impact
28944 instances - 8 features - classes - 0 missing values
Context While brainstorming ideas for a statistics project for a course last semester, the idea of utilizing data about microbreweries came up. Unfortunately after some exploration and thought, we…
0 runs0 likes0 downloads0 reach0 impact
2407 instances - 6 features - classes - 8 missing values