OpenML

MiamiHousing2016

Status: active. Format: ARFF. License: Attribution-NonCommercial-ShareAlike (CC BY-NC-SA). Visibility: public. Uploaded 16-06-2022 by Leo Grin.

Dataset used in the tabular data benchmark (https://github.com/LeoGrin/tabular-benchmark), transformed in the same way. This dataset belongs to the "classification on numerical features" benchmark.

Original description: The dataset contains information on 13,932 single-family homes sold in Miami in 2016. Besides publicly available information, the dataset creator, Steven C. Bourassa, has added distance variables, aviation noise, and latitude and longitude.

The dataset contains the following columns:
- PARCELNO: unique identifier for each property. About 1% appear multiple times.
- SALE_PRC: sale price ($)
- LND_SQFOOT: land area (square feet)
- TOT_LVG_AREA: floor area (square feet)
- SPEC_FEAT_VAL: value of special features (e.g., swimming pools) ($)
- RAIL_DIST: distance to the nearest rail line (an indicator of noise) (feet)
- OCEAN_DIST: distance to the ocean (feet)
- WATER_DIST: distance to the nearest body of water (feet)
- CNTR_DIST: distance to the Miami central business district (feet)
- SUBCNTR_DI: distance to the nearest subcenter (feet)
- HWY_DIST: distance to the nearest highway (an indicator of noise) (feet)
- age: age of the structure
- avno60plus: dummy variable for airplane noise exceeding an acceptable level
- structure_quality: quality of the structure
- month_sold: sale month in 2016 (1 = Jan)
- LATITUDE
- LONGITUDE

A typical model would try to predict log(SALE_PRC) as a function of all variables except PARCELNO.
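The modelling setup described above (log of the sale price as the target, PARCELNO excluded from the predictors) could be prepared roughly as follows. This is a minimal sketch, not the benchmark's own preprocessing: the helper name `split_features_and_log_target` and the two example rows are hypothetical, and the column names default to those in the original description. In the feature table on this page the target appears as SALEPRC and PARCELNO is not listed, so the names may need adjusting for this version of the data.

```python
import math


def split_features_and_log_target(rows, target="SALE_PRC", id_column="PARCELNO"):
    """Split dict-like rows into predictor dicts and log-transformed prices.

    Hypothetical helper for illustration. `target` and `id_column` default to
    the names used in the original description; pass target="SALEPRC" if the
    data uses the name shown in the feature table.
    """
    features = []
    log_prices = []
    for row in rows:
        # The target is the natural log of the sale price.
        log_prices.append(math.log(row[target]))
        # Keep every column except the target and the parcel identifier.
        features.append(
            {name: value for name, value in row.items()
             if name not in (target, id_column)}
        )
    return features, log_prices


# Made-up rows for illustration only; these are not records from the dataset.
example_rows = [
    {"PARCELNO": 1, "SALE_PRC": 100000.0, "TOT_LVG_AREA": 1500.0, "age": 20},
    {"PARCELNO": 2, "SALE_PRC": 250000.0, "TOT_LVG_AREA": 2200.0, "age": 5},
]

X, y = split_features_and_log_target(example_rows)
```

Any regression model can then be fitted on `X` and `y`; predictions are on the log scale and can be mapped back to dollars with `math.exp`.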

14 features

SALEPRC (target)	numeric	2111 unique values 0 missing
LATITUDE	numeric	13776 unique values 0 missing
LONGITUDE	numeric	13776 unique values 0 missing
LND_SQFOOT	numeric	4696 unique values 0 missing
TOT_LVG_AREA	numeric	2978 unique values 0 missing
SPEC_FEAT_VAL	numeric	7583 unique values 0 missing
RAIL_DIST	numeric	13235 unique values 0 missing
OCEAN_DIST	numeric	13617 unique values 0 missing
WATER_DIST	numeric	13218 unique values 0 missing
CNTR_DIST	numeric	13682 unique values 0 missing
SUBCNTR_DI	numeric	13642 unique values 0 missing
HWY_DIST	numeric	13213 unique values 0 missing
age	numeric	96 unique values 0 missing
month_sold	numeric	12 unique values 0 missing