OpenML

Vehicle-Trips

Status: active
Format: ARFF
License: Creative Commons Attribution 4.0 International
Visibility: public
Uploaded 09-07-2024 by Bruno Belucci Teixeira

Daily pickup data for 329 FHV (for-hire vehicle) companies from January 2015 through August 2015.

From original source:
-----
There is also a file other-FHV-data-jan-aug-2015.csv containing daily pickup data for 329 FHV companies from January 2015 through August 2015.
-----

There are 5 columns:

id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d".
time_step: The time step on the time series.
value_X (X from 0 to 1): The values of the time series, which will be used for the forecasting task.

Preprocessing:

1 - Renamed columns: 'Number of Trips' to 'value_0', 'Number of Vehicles' to 'value_1', 'Base Number' to 'id_series', 'Pick Up Date' to 'date'.
2 - Dropped column 'Base Name', which contains the same information as 'id_series'.
3 - Trimmed whitespace from the values in column 'id_series' and capitalized them.
4 - Standardized the date to the format %Y-%m-%d.
5 - Replaced ' - ' in column 'value_1' with NaNs.
6 - Added missing dates to each time series so that values are evenly spaced at daily frequency. Some time series had missing dates, either entire months or individual days between two values; the values for these dates were set to NaN.
7 - Created column 'time_step' with increasing values of the time step for each time series.
8 - Cast the 'value_X' columns to float (to accommodate NaNs; all other values are integers) and 'id_series' to 'category'.
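The preprocessing steps listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the uploader's actual script. Several details are assumptions the description does not confirm: the raw date format ("%m/%d/%Y"), reading "capitalize" as upper-casing, filling dates only between each series' own first and last observation, and starting 'time_step' at 0. The function name `preprocess` and the toy base numbers are hypothetical.

```python
import numpy as np
import pandas as pd


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-creation of preprocessing steps 1-8 on a raw frame."""
    # Steps 1-2: rename the columns and drop the redundant 'Base Name'.
    df = raw.rename(columns={
        "Number of Trips": "value_0",
        "Number of Vehicles": "value_1",
        "Base Number": "id_series",
        "Pick Up Date": "date",
    }).drop(columns=["Base Name"])

    # Step 3: trim whitespace and upper-case the ids (assumed meaning of "capitalize").
    df["id_series"] = df["id_series"].astype(str).str.strip().str.upper()

    # Step 4: parse the dates; the raw format used here is an assumption.
    df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

    # Step 5: replace the ' - ' placeholder in 'value_1' with NaN.
    vehicles = df["value_1"].astype("object")
    df["value_1"] = vehicles.mask(vehicles == " - ")

    # Step 8 (first half): cast the value columns to float so they can hold NaN.
    for col in ["value_0", "value_1"]:
        df[col] = pd.to_numeric(df[col]).astype(float)

    # Steps 6-7: per series, insert the missing days (values become NaN)
    # and number the rows. Assumes at most one row per series and date.
    pieces = []
    for series_id, group in df.groupby("id_series", sort=True):
        full = pd.date_range(group["date"].min(), group["date"].max(), freq="D")
        filled = group.set_index("date")[["value_0", "value_1"]].reindex(full)
        filled = filled.rename_axis("date").reset_index()
        filled.insert(0, "id_series", series_id)
        filled["time_step"] = np.arange(len(filled))  # assumed to start at 0
        pieces.append(filled)
    out = pd.concat(pieces, ignore_index=True)

    # Step 4 (output format) and step 8 (second half).
    out["date"] = out["date"].dt.strftime("%Y-%m-%d")
    out["id_series"] = out["id_series"].astype("category")
    return out


# Toy input with made-up base numbers; one day is missing for the first series.
raw = pd.DataFrame({
    "Base Number": [" b001 ", " b001 ", "B002"],
    "Base Name": ["A", "A", "B"],
    "Pick Up Date": ["01/01/2015", "01/03/2015", "01/02/2015"],
    "Number of Trips": [10, 12, 7],
    "Number of Vehicles": [3, " - ", 2],
})
result = preprocess(raw)
print(result)
```

In the toy input, series B001 has no row for 2015-01-02, so the sketch inserts one with NaN in both value columns, which mirrors how the missing counts for 'value_0' and 'value_1' arise in the published data.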

5 features

Feature	Type	Unique values	Missing values
id_series	nominal	319	0
date	string	243	0
value_0	numeric	1906	3783
value_1	numeric	444	5493
time_step	numeric	243	0