OpenML

Linear vs. Non Linear

Comparison of linear and non-linear models. [Jupyter Notebook](https://github.com/janvanrijn/linear-vs-non-linear/blob/master/notebook/Linear-vs-Non-Linear.ipynb)

0 datasets, 0 tasks, 0 flows, 0 runs

Heterogeneous Ensembles for Data Streams

Ensembles of classifiers are among the best performing classifiers available in many data mining applications. Rather than training one classifier, multiple classifiers are trained, and their…

0 datasets, 0 tasks, 0 flows, 0 runs

Does Feature Selection Improve Classification?

Feature selection can be of value to classification for a variety of reasons. Real world data sets can be rife with irrelevant features, especially if the data was not gathered specifically for the…

394 datasets, 394 tasks, 24 flows, 9454 runs

OpenML-CC18 Curated Classification benchmark

We advocate the use of curated, comprehensive benchmark suites of machine learning datasets, backed by standardized OpenML-based interfaces and complementary software toolkits written in Python, Java…

72 datasets, 72 tasks, 0 flows, 0 runs

Collaborative, reproducible benchmarking and analysis

Benchmarking in Machine Learning is often much more difficult than it seems, and hard to reproduce. This study is a new approach to collaborative, in-depth benchmarking of algorithms, and allows…

0 datasets, 0 tasks, 0 flows, 0 runs

Multi-class Classification

Multi-class Classification Study

0 datasets, 0 tasks, 0 flows, 0 runs

Machine Learning: An overview with the help of R software

This book intends to provide an overview of Machine Learning and its algorithms & models with the help of R software. Machine learning forms the basis for Artificial Intelligence, which will play a crucial…

0 datasets, 0 tasks, 0 flows, 0 runs

Deep Learning Models and its application: An overview with the help of R software

Deep learning models are widely used in different fields due to their capability to handle large and complex datasets and produce the desired results with more accuracy at a greater speed. In Deep…

0 datasets, 0 tasks, 0 flows, 0 runs

pandas

jhuilj;kl

0 datasets, 0 tasks, 0 flows, 0 runs

House_Price_Practice

Prediction of House price

0 datasets, 0 tasks, 0 flows, 0 runs

Mnist

Ggg

0 datasets, 0 tasks, 0 flows, 0 runs

na

0 datasets, 0 tasks, 0 flows, 0 runs

As

Hs

0 datasets, 0 tasks, 0 flows, 0 runs

efqer

qwerqwe

0 datasets, 0 tasks, 0 flows, 0 runs

Arusov study

Test study for arusov

0 datasets, 0 tasks, 0 flows, 0 runs

hanchao

hahaha

0 datasets, 0 tasks, 0 flows, 0 runs

Admissions123

0 datasets, 0 tasks, 0 flows, 0 runs

KEEL Imbalanced Datasets

A study of imbalanced classification data benchmarks from KEEL.

0 datasets, 0 tasks, 0 flows, 0 runs

Naive Bayes Classifier

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Tensorflow

Learning Tensorflow

0 datasets, 0 tasks, 0 flows, 0 runs

Random

0 datasets, 0 tasks, 0 flows, 0 runs

Nokia phone

Android phone scenarios

0 datasets, 0 tasks, 0 flows, 0 runs

Breast Cancer Detection

Detect breast cancer using various methods

0 datasets, 0 tasks, 0 flows, 0 runs

Honda Civic

This is a machine learning starter project: we will gather data from online resources and then run different algorithms on it.

0 datasets, 0 tasks, 0 flows, 0 runs

Primeiro Teste

First test

0 datasets, 0 tasks, 0 flows, 0 runs

xray

0 datasets, 0 tasks, 0 flows, 0 runs

Reviewanalyze

a

0 datasets, 0 tasks, 0 flows, 0 runs

Tweets Demo

0 datasets, 0 tasks, 0 flows, 0 runs

voice recognition

machine language

0 datasets, 0 tasks, 0 flows, 0 runs

Inconcise Mapping Prediction

A classifier for identifying inconcise mappings in DBpedia based on a set of features defined in the following paper. Rico, Mariano, Mihindukulasooriya, Nandana, Kontokostas, Dimitris, Paulheim,…

0 datasets, 0 tasks, 0 flows, 0 runs

Categorical Columns Analysis

Classification Datasets that are not too large (less than 40k rows) with at least one categorical column

0 datasets, 0 tasks, 0 flows, 0 runs
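The selection rule described for Categorical Columns Analysis (fewer than 40k rows, at least one categorical column) can be sketched as a simple filter. This is only an illustration: the metadata records, their field names, and the helper `select_datasets` are hypothetical and are not taken from the study or from any OpenML API.

```python
from typing import Dict, List

# Hypothetical metadata records; names and field names are assumptions.
datasets: List[Dict] = [
    {"name": "small_mixed", "rows": 1_000, "categorical_columns": 3},
    {"name": "small_numeric", "rows": 5_000, "categorical_columns": 0},
    {"name": "large_mixed", "rows": 250_000, "categorical_columns": 2},
]


def select_datasets(records: List[Dict], max_rows: int = 40_000) -> List[str]:
    """Return names of datasets with fewer than max_rows rows
    and at least one categorical column."""
    return [
        r["name"]
        for r in records
        if r["rows"] < max_rows and r["categorical_columns"] >= 1
    ]


selected = select_datasets(datasets)
print(selected)
```

The row limit is strict here ("less than 40k"), so a dataset with exactly 40,000 rows would be excluded under this reading.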

MetaQSAR - MLJ

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Yusuf

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Annotative Expert For Hyperparameter Selection

A list of the datasets used in the paper Annotative Expert For Hyperparameter Selection, as part of the AutoML workshop at ICML 2018

0 datasets, 0 tasks, 0 flows, 0 runs

trialyagiziris

trial for learning

0 datasets, 0 tasks, 0 flows, 0 runs

Breast Cancer Prediction

Test on wdbc dataset

0 datasets, 0 tasks, 0 flows, 0 runs

classify

0 datasets, 0 tasks, 0 flows, 0 runs

MURA_humerus

stanford stuff

0 datasets, 0 tasks, 0 flows, 0 runs

Ames Housing

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

OpenML Regression 30

Selected regression problems for aggregate model analysis

30 datasets, 1 tasks, 0 flows, 0 runs

AFH

0 datasets, 0 tasks, 0 flows, 0 runs

Recoil

Recoil estimation from drifting position data.

0 datasets, 0 tasks, 0 flows, 0 runs

How big is the impact of 1-0-encoding in trees

Show how one-hot-encoding impacts the performance of decision trees. See also https://roamanalytics.com/2016/10/28/are-categorical-variables-getting-lost-in-your-random-forests/

0 datasets, 0 tasks, 0 flows, 0 runs
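As a minimal sketch of what 1-0 (one-hot) encoding does to a categorical column before a tree model sees it, the snippet below expands one column into one binary column per category. The helper `one_hot_encode` and the sample values are hypothetical and are not part of the study.

```python
from typing import List, Tuple


def one_hot_encode(values: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Expand a categorical column into one 0/1 column per category.

    Returns the sorted category names and, for each input value,
    a row with a 1 in the position of its category.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


colors = ["red", "green", "red", "blue"]
categories, encoded = one_hot_encode(colors)

for value, row in zip(colors, encoded):
    print(value, row)
```

After this transformation a decision tree can only split on one category at a time, which is the kind of effect the study title refers to.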

Trail Run

My first test on the platform

0 datasets, 0 tasks, 0 flows, 0 runs

Dependency parser experiment

Dependency parser for news data

0 datasets, 0 tasks, 0 flows, 0 runs

Identifying Cultural Events in DBpedia

We want to predict the type of a DBpedia resource from its structure in the knowledge graph. Our preliminary study concludes that we can achieve it with accuracy above 90%. Paper submitted to ICWE…

0 datasets, 0 tasks, 0 flows, 0 runs

Weather Mael

Studying Weather with machine learning

0 datasets, 0 tasks, 0 flows, 0 runs

Meta-Sparsity

Runs made for constructing a meta-dataset in a study on the effects of sparsity on the meta-level.

0 datasets, 0 tasks, 0 flows, 0 runs

Test of random

Test Of Random

0 datasets, 0 tasks, 0 flows, 0 runs

Importance of hyperparameter tuning

Benchmark study, using 73 datasets from OpenML-CC18, on the importance of hyperparameter tuning: which parameters are important to tune and which might be set to a default value instead? For each…

0 datasets, 0 tasks, 0 flows, 0 runs

Multi-Parameter Cancer tests

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Survey Result Analysis

First analysis of CML survey results from over a year

0 datasets, 0 tasks, 0 flows, 0 runs

When to tune DTs hyperparameters via Meta-learning

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Sports Analytics

[Sport Data Valley](https://www.sportinnovator.nl/sport-data-valley) is a Dutch initiative to collect, share and analyse datasets on sports and exercise.…

0 datasets, 0 tasks, 0 flows, 0 runs

Prefetching for SPARQL endpoints

Data prefetching is a standard technique used to accelerate the access to data stores. In the context of SPARQL endpoints, previous approaches have been based on two main techniques: (1) query…

3 datasets, 3 tasks, 0 flows, 5 runs

Inferring new types on large datasets applying ontology class hierarchy classifiers: the DBpedia case

Paper submitted to ESWC 2018

0 datasets, 0 tasks, 0 flows, 0 runs

Datasets

0 datasets, 0 tasks, 0 flows, 0 runs

PRIME

project

0 datasets, 0 tasks, 0 flows, 0 runs

Classif

Classifiers in R

0 datasets, 0 tasks, 0 flows, 0 runs

test

1

0 datasets, 0 tasks, 0 flows, 0 runs

mqy_test

1

0 datasets, 0 tasks, 0 flows, 0 runs

Multi-class dataset library

The library contains different multi-class datasets.

0 datasets, 0 tasks, 0 flows, 0 runs

Decision Tree

just messing around

0 datasets, 0 tasks, 0 flows, 0 runs

Workflow recommendation

Workflow recommendation experiment using runs considered "human-made"

0 datasets, 0 tasks, 0 flows, 0 runs

ML introduction class

A small study of algorithms on datasets provided by the students.

0 datasets, 0 tasks, 0 flows, 0 runs

Hyperparameter Importance Across Datasets

With the advent of automated machine learning, automated hyperparameter optimization methods are by now routinely used. However, this progress is not yet matched by equal progress on automatic…

0 datasets, 0 tasks, 0 flows, 0 runs

Automatic Recommendation of Machine Learning Workflows - Master's Dissertation

This collection of datasets and runs was used in the study included in the dissertation, prepared by Miguel Viana Cachada, for the Master in Data Analytics from _Faculdade de Economia do Porto_…

0 datasets, 0 tasks, 0 flows, 0 runs

Layered TPOT datasets

Datasets used to evaluate Layered TPOT against 'vanilla' TPOT. Comprises a selection of large datasets, with between 100k and 1m instances each, contains pseudo-synthetic datasets.

0 datasets, 0 tasks, 0 flows, 0 runs

ML R Bot

Run experiments on study 14

0 datasets, 0 tasks, 0 flows, 0 runs

Performance of ctree with different settings of testtype.

A simple study created for a talk at CENISBS

0 datasets, 0 tasks, 0 flows, 0 runs

Experimenting with Survival Analysis

This study is intended for exploring the platform. Most things will be deleted.

0 datasets, 0 tasks, 0 flows, 0 runs

I applied multi-task and multimodal deep learning to financial forecasting, check out how it works!

Here is description in the form of a tutorial: https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-multimodal-and-multitask-deep-learning-5498e0098caf; a link to the Github repo is…

0 datasets, 0 tasks, 0 flows, 0 runs

When to tune SVMs hyper-parameters via Meta-learning

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Telecom Churn analysis

Identify best ML for predicting the churn

0 datasets, 0 tasks, 0 flows, 0 runs

Predicting wrong DBpedia mappings

This was a study started by Nandana and Mariano in 2016. We started with unsupervised methods, but we could not find good clusters. In 2017 we started with annotated data and here we are. ## Summary…

0 datasets, 0 tasks, 0 flows, 0 runs

Hyper-parameter tuning of Support Vector Machines

This study lists all the experiments described in the paper ...

157 datasets, 0 tasks, 0 flows, 0 runs

ensemble on diabetes

ensemble test on diabetes

0 datasets, 0 tasks, 0 flows, 0 runs

Hyper-parameter tuning of Decision Trees

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

ASLib OpenML Scenario

Containing all datasets, tasks, flows and runs used in the ASLib OpenML Scenario.

0 datasets, 0 tasks, 0 flows, 0 runs

Performance of new ctree implementations on classification problems

This is just to test the new ctree implementation on various problems to check if there is anything where it fails.

0 datasets, 0 tasks, 0 flows, 0 runs

Speeding up Algorithm Selection via Meta-learning and Active Testing

Authors: Salisu Mamman Abdulrahman, Pavel Brazdil, Jan N. van Rijn, Joaquin Vanschoren Abstract: Algorithm selection methods can be speeded-up substantially by incorporating multi-objective measures…

0 datasets, 0 tasks, 0 flows, 0 runs

Massively Collaborative Machine Learning

All datasets, tasks, flows and setups used for Chapter 6 in the PhD Thesis "Massively Collaborative Machine Learning"

0 datasets, 0 tasks, 0 flows, 0 runs

Data Streams and more

this study joins multiple data stream studies

0 datasets, 0 tasks, 0 flows, 0 runs

Iris Data set Study

Iris dataset

0 datasets, 0 tasks, 0 flows, 0 runs

OpenML Paper Study

Compare several trees, bagged trees and the random forest.

0 datasets, 0 tasks, 0 flows, 0 runs

Compare three different SVM versions of R package kernlab

Based on three different tasks, we want to compare three versions of ksvm:
- C-svc: C classification
- spoc-svc: Crammer, Singer native multi-class
- kbb-svc: Weston, Watkins native multi-class

0 datasets, 0 tasks, 0 flows, 0 runs

Bernd Demo Study for Multiclass SVMs OML WS 2016

none

0 datasets, 0 tasks, 0 flows, 0 runs

OpenML R paper

Paper on OpenML R library. Includes a case study on bagging vs forests

0 datasets, 0 tasks, 0 flows, 0 runs

Identifying critical paths in undergraduate programs and students grade reports based on graph mining approach

The number of registered undergraduate students in universities has grown considerably in recent years. However, the number of graduates remains low. The main cause of this issue is the evasion and / or retention…

0 datasets, 0 tasks, 0 flows, 0 runs

Mythbusting data mining urban legends through large scale experimentation

Data mining researchers and practitioners often use general rules of thumb or common data mining wisdom: the so-called data-mining myths. Even though these myths are not always proven or…

0 datasets, 0 tasks, 0 flows, 0 runs

Subgroup Discovery

A subgroup discovery study.

0 datasets, 0 tasks, 0 flows, 0 runs

Meta-QSAR: learning how to learn QSARs

Almost every form of statistical and machine learning method has been applied to learning QSARs at one time or another: linear regression, decision trees, neural networks, nearest-neighbour methods,…

0 datasets, 0 tasks, 0 flows, 0 runs

Subspace Clustering via Seeking Neighbors with Minimum Reconstruction Error

The work will be submitted to ECML-PKDD2016

0 datasets, 0 tasks, 0 flows, 0 runs

Having a Blast: Meta-Learning and Heterogeneous Ensembles for Data Streams

Ensembles of classifiers are among the best performing classifiers available in many data mining applications. However, most ensembles developed specifically for the dynamic data stream setting rely…

0 datasets, 0 tasks, 0 flows, 0 runs

Collaborative primer

Example of collaborative research conducted by means of OpenML NB:

0 datasets, 0 tasks, 0 flows, 0 runs

Decision tree comparison

No data.

0 datasets, 0 tasks, 0 flows, 0 runs

Massive machine learning experiments using mlr and OpenML

In this study, we investigate and summarize the performance of a wide range of ML algorithms (using their default hyper-parameter values) on a wide range of OpenML classification tasks. This will yield…

0 datasets, 0 tasks, 0 flows, 0 runs

Local and Global Feature Selection on Multilabel Transformed Classification Methods

This study compares local and global feature selection strategies on multilabel classification transformation methods.

0 datasets, 0 tasks, 0 flows, 0 runs

Multi-Task Learning with a Natural Metric for Quantitative Structure Activity Relationship Learning

The task of Quantitative Structure Activity Relationship (QSAR) Learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the…

0 datasets, 0 tasks, 0 flows, 0 runs

Fast Algorithm Selection using Learning Curves

One of the challenges in Machine Learning is to find a classifier and parameter settings that work well on a given dataset. Evaluating all possible combinations typically takes too much time, hence many…

0 datasets, 0 tasks, 0 flows, 0 runs
