Vehicle-Trips

Status: active. Format: ARFF. License: Creative Commons Attribution 4.0 International. Visibility: public. Uploaded 09-07-2024 by Bruno Belucci Teixeira.

Daily pickup data for 329 FHV (for-hire vehicle) companies from January 2015 through August 2015.

From the original source:
-----
There is also a file other-FHV-data-jan-aug-2015.csv containing daily pickup data for 329 FHV companies from January 2015 through August 2015.
-----

There are 5 columns:
id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d".
time_step: The time step on the time series.
value_X (X from 0 to 1): The values of the time series, which will be used for the forecasting task.

Preprocessing:
1 - Renamed columns: 'Number of Trips' to 'value_0', 'Number of Vehicles' to 'value_1', 'Base Number' to 'id_series', 'Pick Up Date' to 'date'.
2 - Dropped column 'Base Name', which contains the same information as id_series.
3 - Trimmed white spaces and capitalized the column 'id_series'.
4 - Standardized the date to the format %Y-%m-%d.
5 - Replaced ' - ' in column 'value_1' with NaNs.
6 - Added missing dates to each time series to obtain evenly spaced values with daily frequency. Some time series had missing dates, ranging from entire months to a few days between two values. The values for the added dates are set to NaN.
7 - Created column 'time_step' with increasing values of the time step for each time series.
8 - Cast 'value_X' columns to float (to accommodate NaNs, as all the other values are int) and 'id_series' to 'category'.
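The preprocessing steps above can be sketched in plain Python. This is an illustration only, not the uploader's actual code. The function names `to_float` and `preprocess`, the input date format "%m/%d/%Y", the reading of "capitalized" as upper-casing, the choice to start time_step at 0, and the choice to fill gaps only between each series' first and last observed date are all assumptions. The sample rows are invented for the demonstration and are not taken from the dataset.

```python
import math
from datetime import datetime, timedelta


def to_float(raw):
    """Step 5 and 8: map ' - ' (or an empty cell) to NaN, everything else to float."""
    text = str(raw).strip()
    return math.nan if text in ("", "-") else float(text)


def preprocess(rows, date_format="%m/%d/%Y"):
    """Apply steps 1 and 3 to 7 to a list of dicts keyed by the original column names.

    date_format is an assumed input format; the description does not state it.
    """
    series = {}
    for row in rows:
        # Step 3: trim white space and upper-case the identifier (assumed meaning).
        sid = row["Base Number"].strip().upper()
        # Step 4: parse the date so it can be re-emitted as %Y-%m-%d.
        day = datetime.strptime(row["Pick Up Date"].strip(), date_format).date()
        series.setdefault(sid, {})[day] = (
            to_float(row["Number of Trips"]),
            to_float(row["Number of Vehicles"]),
        )

    out = []
    for sid in sorted(series):
        days = series[sid]
        current, last = min(days), max(days)
        step = 0  # Step 7: assumed to start at 0.
        while current <= last:
            # Step 6: dates absent from the input get NaN values.
            value_0, value_1 = days.get(current, (math.nan, math.nan))
            out.append({
                "id_series": sid,                      # step 1: renamed column
                "date": current.strftime("%Y-%m-%d"),  # step 4
                "time_step": step,
                "value_0": value_0,
                "value_1": value_1,
            })
            current += timedelta(days=1)
            step += 1
    return out


# Invented rows for illustration; the identifier and values are hypothetical.
sample = [
    {"Base Number": " b00001 ", "Pick Up Date": "01/01/2015",
     "Number of Trips": "10", "Number of Vehicles": " - "},
    {"Base Number": "B00001", "Pick Up Date": "01/03/2015",
     "Number of Trips": "7", "Number of Vehicles": "3"},
]
result = preprocess(sample)
for record in result:
    print(record)
```

In this sketch, the two sample rows become three output rows: the missing 2015-01-02 is inserted with NaN in both value columns, and both spellings of the identifier collapse into one series.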

5 features

id_series (nominal): 319 unique values, 0 missing
date (string): 243 unique values, 0 missing
value_0 (numeric): 1906 unique values, 3783 missing
value_1 (numeric): 444 unique values, 5493 missing
time_step (numeric): 243 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 29826
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): no value listed
Number of missing values in the dataset: 9276
Number of instances with at least one value missing: 5493
Number of numeric attributes: 3
Number of nominal attributes: 1
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 18.42
Average class difference between consecutive instances: no value listed
Percentage of missing values: 6.22
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 60
Percentage of instances belonging to the most frequent class: no value listed
Percentage of nominal attributes: 20
Number of instances belonging to the most frequent class: no value listed
Percentage of instances belonging to the least frequent class: no value listed
Number of instances belonging to the least frequent class: no value listed
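
The missing-value figures listed above can be cross-checked against the per-feature counts with a few lines of arithmetic. The sketch below uses only numbers stated on this page; the variable names are illustrative.

```python
# Figures copied from the feature and property listings on this page.
n_rows = 29826
n_cols = 5
missing_value_0 = 3783
missing_value_1 = 5493
rows_with_missing = 5493

# Only value_0 and value_1 have missing entries, so their sum is the dataset total.
total_missing = missing_value_0 + missing_value_1

# Share of missing cells among all cells, and share of rows with any missing cell.
pct_missing_values = round(100 * total_missing / (n_rows * n_cols), 2)
pct_rows_missing = round(100 * rows_with_missing / n_rows, 2)

print(total_missing, pct_missing_values, pct_rows_missing)
```

The totals agree with the listed 9276 missing values, 6.22 percent missing values, and 18.42 percent of instances with missing values. Because the number of rows with at least one missing value equals the missing count of value_1, every row that lacks value_0 also lacks value_1.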

0 tasks