Bitcoin data scraped from BitInfoCharts, with preprocessing.
Several Bitcoin-related time series scraped directly from BitInfoCharts. 'date' is in the format %Y-%m-%d.
We kept only the rows between max(first date with a non-NaN value in each column) and min(last date with a non-NaN value in each column), which
leaves us with dates between 2014-04-09 and 2023-03-14.
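As an illustration, the date-range rule above could be sketched as follows. This is one reading of the rule, not the original scraping code: the helper name `trim_to_common_range` and the toy columns 'a' and 'b' are hypothetical. Gaps inside the kept range are left untouched here.

```python
import numpy as np
import pandas as pd


def trim_to_common_range(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows between the latest first-valid date and the earliest
    last-valid date across all non-date columns (hypothetical helper)."""
    value_cols = [c for c in df.columns if c != "date"]
    dates = pd.to_datetime(df["date"])
    # First and last date on which each column has a non-NaN value.
    first_valid = [dates[df[c].notna()].min() for c in value_cols]
    last_valid = [dates[df[c].notna()].max() for c in value_cols]
    start, end = max(first_valid), min(last_valid)
    return df[(dates >= start) & (dates <= end)].reset_index(drop=True)


# Toy data (made up): 'a' starts late, 'b' ends early, 'a' has an interior gap.
toy = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05"],
    "a": [np.nan, 1.0, np.nan, 3.0, 4.0],
    "b": [1.0, 2.0, 3.0, 4.0, np.nan],
})
trimmed = trim_to_common_range(toy)
print(trimmed)
```

In the toy example the kept range is 2020-01-02 to 2020-01-04, and the interior NaN in 'a' remains, which is why a separate fill step is still needed.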
There are 22 columns:
id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d".
time_step: The time step within the time series.
value_X (X from 0 to 18): The values of the time series, which will be used for the forecasting task.
Preprocessing:
1 - Renamed columns to 'date' and 'value_X' with X from 0 to 18 (one 'value_X' per value column of the original dataset, 19 in total).
2 - Created columns 'time_step' and 'id_series'. There is only one 'id_series' (0).
3 - Ensured that there are no missing dates and that the frequency of the time series is daily.
4 - Filled NaN values by propagating the last valid observation forward until the next valid one (ffill).
The columns with some missing values were:
'confirmationtime': 'value_10'
'tweets': 'value_14'
'activeaddresses': 'value_16'
'top100cap': 'value_17'
5 - Cast 'date' to str, 'time_step' to int, 'value_X' to float, and defined 'id_series' as 'category'.
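The five preprocessing steps can be sketched in pandas roughly as follows. This is a minimal sketch under stated assumptions, not the original pipeline: the function name `preprocess`, the toy input columns, the choice to enforce daily frequency by reindexing with `asfreq("D")`, and a 'time_step' that starts at 0 are all assumptions.

```python
import numpy as np
import pandas as pd


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-implementation of steps 1 to 5 on a frame with a
    'date' column plus one column per scraped series."""
    df = raw.copy()

    # Step 1: rename every non-date column to value_0, value_1, ...
    feature_cols = [c for c in df.columns if c != "date"]
    df = df.rename(columns={c: f"value_{i}" for i, c in enumerate(feature_cols)})

    # Step 3: enforce a daily frequency; missing dates become NaN rows
    # (assumes dates are unique).
    df["date"] = pd.to_datetime(df["date"])
    df = df.set_index("date").sort_index().asfreq("D")

    # Step 4: forward-fill NaN values with the last valid observation.
    df = df.ffill()

    # Step 2: add the single series id (0) and a running time step
    # (assumed to start at 0).
    df = df.reset_index()
    df.insert(0, "id_series", 0)
    df.insert(2, "time_step", np.arange(len(df)))

    # Step 5: cast to the final dtypes.
    df["date"] = df["date"].dt.strftime("%Y-%m-%d")
    df["time_step"] = df["time_step"].astype(int)
    value_cols = [c for c in df.columns if c.startswith("value_")]
    df[value_cols] = df[value_cols].astype(float)
    df["id_series"] = df["id_series"].astype("category")
    return df


# Toy input (made up): one missing date and one missing 'tweets' value.
raw = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-01-04"],
    "price": [10, 11, 13],
    "tweets": [5.0, np.nan, 7.0],
})
out = preprocess(raw)
print(out)
```

Steps 2 and 3 are applied in a different order than listed so that 'time_step' is computed after any missing dates have been filled in. Note that a leading NaN would survive ffill, since there is no earlier observation to propagate.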