AutoML Benchmark


Created 02-05-2019 by Pieter Gijsbers. Visibility: public.
The original set of tasks for the AutoML benchmark, presented in "An Open Source AutoML Benchmark" by Gijsbers et al. at the AutoML workshop at ICML 2019. The set of tasks aims to provide a challenging collection of datasets suitable for evaluating AutoML systems. It contains both binary and multiclass classification tasks in a wide range of sizes and domains, drawn from OpenML-CC18, AutoML challenges and competitions, and datasets used in earlier AutoML evaluations. We exclude artificial data and data that should not be evaluated with cross-validation (e.g. time-series data), and limit the amount of image classification data. Datasets contain at least 500 samples, and may include missing values, numerical and nominal features, heavy class imbalance, and any number of features.

Visit https://openml.github.io/automlbenchmark/ for more information on the original benchmark, instructions for using the benchmark tool, the latest state of the benchmark, or to get in touch with the authors.

If you use this work in a publication, please cite:

Gijsbers, Pieter, et al. "An open source AutoML benchmark." arXiv preprint arXiv:1907.00909 (2019).

Or in BibTeX:

```
@article{amlb2019,
  title={An Open Source AutoML Benchmark},
  author={Gijsbers, P. and LeDell, E. and Poirier, S. and Thomas, J. and Bischl, B. and Vanschoren, J.},
  journal={arXiv preprint arXiv:1907.00909 [cs.LG]},
  url={https://arxiv.org/abs/1907.00909},
  note={Accepted at AutoML Workshop at ICML 2019},
  year={2019}
}
```
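The inclusion criteria above (classification only, at least 500 samples, no artificial or time-series data) can be sketched as a simple metadata filter. This is an illustrative sketch only: the `DatasetMeta` record and its field names are hypothetical, not part of OpenML's API.

```python
from dataclasses import dataclass


@dataclass
class DatasetMeta:
    """Hypothetical metadata record, for illustration only."""
    n_samples: int
    n_classes: int
    is_artificial: bool
    is_time_series: bool


def meets_benchmark_criteria(meta: DatasetMeta) -> bool:
    """Sketch of the stated inclusion rules: binary or multiclass
    classification, at least 500 samples, no artificial data, and no
    data unsuitable for cross-validation (e.g. time series).

    Missing values, nominal features, class imbalance, and feature
    count are explicitly allowed, so they are not checked here.
    """
    if meta.is_artificial or meta.is_time_series:
        return False
    if meta.n_samples < 500:
        return False
    # Binary (2 classes) and multiclass (3+) tasks both qualify.
    return meta.n_classes >= 2


# A 1000-sample binary task qualifies; a 300-sample one does not.
print(meets_benchmark_criteria(DatasetMeta(1000, 2, False, False)))  # True
print(meets_benchmark_criteria(DatasetMeta(300, 2, False, False)))   # False
```

Note the filter deliberately ignores dataset size ceilings: the benchmark accepts tasks across a wide range of sizes and domains.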