BitInfoCharts-wo-tweets-preprocessed

active ARFF Public Domain Visibility: public Uploaded 25-06-2024 by Bruno Belucci Teixeira


Bitcoin data scraped from BitInfoCharts, without 'tweets' and with preprocessing.

Several Bitcoin-related data series scraped directly from BitInfoCharts. 'date' is in the format %Y-%m-%d. The 'tweets' column was dropped because it had too many NaN values (it only has values between 2014-04-09 and 2023-03-14). In addition, we kept only the rows between max(dates with non-NaN values of each column) and min(dates with non-NaN values of each column), which leaves dates between 2011-04-14 and 2024-05-26.

There are 21 columns:

id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d".
time_step: The time step on the time series.
value_X (X from 0 to 17): The values of the time series, which will be used for the forecasting task.

Preprocessing:

1 - Renamed columns to 'date' and 'value_X' with X from 0 to 17 (number of columns of original dataset).
2 - Created columns 'time_step' and 'id_series'. There is only one 'id_series' (0).
3 - Ensured that there are no missing dates and that the frequency of the time series is daily.
4 - Filled NaN values by propagating the last valid observation forward to the next valid one (ffill). The columns with some missing values were:
    'median_transaction_fee': 'value_9'
    'confirmationtime': 'value_10'
    'activeaddresses': 'value_15'
    'top100cap': 'value_16'
5 - Cast 'date' to str, 'time_step' to int, 'value_X' to float, and defined 'id_series' as 'category'.
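
Preprocessing steps 2 to 4 (adding 'id_series' and 'time_step', enforcing a daily frequency, and forward-filling gaps) can be illustrated with a minimal sketch. This is not the uploader's code: the term "ffill" suggests pandas was used, but that is not confirmed, so the sketch relies only on the Python standard library. The function name to_daily_ffill, the sample rows, and the choice to start 'time_step' at 0 are all assumptions made for illustration.

```python
from datetime import date, timedelta
import math


def is_missing(v):
    """Treat None and float NaN as missing."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def to_daily_ffill(rows, n_values):
    """Hypothetical helper: reindex (date_str, values) rows to a daily
    frequency and forward-fill missing values.

    Assumption: time_step starts at 0 and there is a single series "0".
    """
    by_date = {date.fromisoformat(d): list(vals) for d, vals in rows}
    day, end = min(by_date), max(by_date)
    last = [None] * n_values  # last valid observation per column
    out = []
    step = 0
    while day <= end:
        # A date absent from the input becomes a row of missing values.
        vals = by_date.get(day, [None] * n_values)
        # Forward fill: keep the previous valid value where today's is missing.
        last = [prev if is_missing(v) else float(v) for v, prev in zip(vals, last)]
        rec = {"id_series": "0", "date": day.isoformat(), "time_step": step}
        rec.update({f"value_{i}": v for i, v in enumerate(last)})
        out.append(rec)
        day += timedelta(days=1)
        step += 1
    return out


# Made-up sample: 2024-01-02 is absent and value_0 is NaN on 2024-01-03.
sample = [
    ("2024-01-01", [1.0, 5.0]),
    ("2024-01-03", [float("nan"), 7.0]),
]
daily = to_daily_ffill(sample, n_values=2)
```

Here daily contains three rows: the missing date 2024-01-02 is created with both values carried over from 2024-01-01, and value_0 on 2024-01-03 is filled with the last valid observation.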

21 features

id_series (nominal): 1 unique value, 0 missing
date (string): 4792 unique values, 0 missing
value_0 (numeric): 4745 unique values, 0 missing
value_1 (numeric): 4761 unique values, 0 missing
value_2 (numeric): 4773 unique values, 0 missing
value_3 (numeric): 726 unique values, 0 missing
value_4 (numeric): 4792 unique values, 0 missing
value_5 (numeric): 4628 unique values, 0 missing
value_6 (numeric): 2957 unique values, 0 missing
value_7 (numeric): 4792 unique values, 0 missing
value_8 (numeric): 3238 unique values, 0 missing
value_9 (numeric): 2414 unique values, 0 missing
value_10 (numeric): 159 unique values, 0 missing
value_11 (numeric): 4792 unique values, 0 missing
value_12 (numeric): 4624 unique values, 0 missing
value_13 (numeric): 4748 unique values, 0 missing
value_14 (numeric): 3572 unique values, 0 missing
value_15 (numeric): 4760 unique values, 0 missing
value_16 (numeric): 3362 unique values, 0 missing
value_17 (numeric): 3241 unique values, 0 missing
time_step (numeric): 4792 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 4792
Number of attributes (columns) of the dataset: 21
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 19
Number of nominal attributes: 1
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Average class difference between consecutive instances: (no value listed)
Percentage of numeric attributes: 90.48
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 4.76
Percentage of instances belonging to the most frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
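
The listed properties can be cross-checked against each other and against the date range given in the description. The short sketch below recomputes the percentages from the attribute counts and confirms that a gap-free daily series from 2011-04-14 to 2024-05-26 has exactly as many rows as the dataset reports. The variable names are illustrative, and rounding to two decimals is an assumption about how the page displays these values.

```python
from datetime import date

n_rows, n_cols = 4792, 21      # instances and attributes, as listed
n_numeric, n_nominal = 19, 1   # attribute counts by type, as listed

pct_numeric = round(100 * n_numeric / n_cols, 2)   # 90.48
pct_nominal = round(100 * n_nominal / n_cols, 2)   # 4.76
dimensionality = round(n_cols / n_rows, 2)         # 0.0 (about 0.0044 unrounded)

# Inclusive number of days in the date range stated in the description.
n_days = (date(2024, 5, 26) - date(2011, 4, 14)).days + 1  # 4792
```

The day count matching the number of instances is consistent with the description's claim that no dates are missing and the frequency is daily.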

0 tasks
