Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way as in that benchmark. This dataset belongs to the "regression on both numerical and categorical features" benchmark.
Original link: https://openml.org/d/42721
Original description:
Author: Bureau of Transportation Statistics, Airline Service Quality Performance
Source: [original](http://www.transtats.bts.gov/) - 2013
Please cite:
Airlines Departure Delay Prediction (Regression).
Original data can be found at: http://www.transtats.bts.gov
This is a processed version of the original data, designed to predict departure delay (in seconds).
A CSV of the raw data (years 1987-2013) can be found [here](https://h2o-airlines-unpacked.s3.amazonaws.com/allyears.1987.2013.csv). This dataset consists of the first 1 million rows (and a subset of the columns) of that CSV file, in ARFF format.
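
The "first rows, subset of columns" selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the dataset: the helper name `head_subset` is hypothetical, the column names in the sample are illustrative, and a tiny in-memory CSV stands in for the raw file.

```python
import csv
import io
from itertools import islice


def head_subset(lines, columns, n_rows):
    """Yield the first n_rows data rows, keeping only the named columns.

    `lines` is any iterable of CSV text lines (an open file works too),
    so the input is streamed rather than loaded into memory at once.
    """
    reader = csv.DictReader(lines)
    for row in islice(reader, n_rows):
        yield {name: row[name] for name in columns}


# Tiny in-memory stand-in for the raw CSV; column names are assumptions
# chosen for illustration only.
sample = io.StringIO(
    "Year,Month,DepDelay,Origin\n"
    "1987,10,11,SAN\n"
    "1987,10,-1,SFO\n"
    "1987,10,3,BUR\n"
)

rows = list(head_subset(sample, ["Month", "DepDelay"], n_rows=2))
print(rows)
```

For the real file, one would pass an open file handle instead of `sample` and set `n_rows=1_000_000`; values come back as strings, so numeric columns would still need converting.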