ETTm2


Status: active. Format: ARFF. License: Creative Commons Attribution-NoDerivatives 4.0 International. Visibility: public. Uploaded 24-06-2024 by Bruno Belucci Teixeira.


Electric power distribution, 15-minute data.

From the original source:
-----
The electric power distribution problem is that the distribution of electricity to different areas depends on its sequential usage. But predicting the future demand of a specific area is difficult, as it varies with weekdays, holidays, seasons, weather, temperatures, etc. However, no existing method can perform a long-term prediction based on super long-term real-world data with high precision. Any false prediction may damage the electrical transformer. So currently, without an efficient method to predict future electric usage, managers have to make decisions based on an empirical number, which is much higher than the real-world demand. This causes unnecessary waste of electricity and equipment depreciation. On the other hand, the oil temperature can reflect the condition of the electrical transformer. One of the most efficient strategies is to predict whether the electrical transformers' oil temperature is safe and so avoid unnecessary waste. As a result, to address this problem, our team and Beijing Guowang Fuda Science & Technology Development Company built a real-world platform and collected 2 years of data. We work on it to predict the electrical transformers' oil temperature and investigate the extreme load capacity.

We donated two years of data, in which each data point is recorded every 15 minutes (marked by m), and they were from two regions of a province of China, named ETT-small-m1 and ETT-small-m2, respectively. Each dataset contains 2 years * 365 days * 24 hours * 4 times = 70,080 data points. Besides, we also provide the hourly-level variants for fast development (marked by h), i.e. ETT-small-h1 and ETT-small-h2. Each data point consists of 8 features, including the date of the point, the predictive value "oil temperature", and 6 different types of external power load features.
-----

This data corresponds to the ETTm2 variant (69,680 rows in this upload).

There are 10 columns:
id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d %H:%M:%S".
time_step: The time step on the time series.
value_X (X from 0 to 6): The values of the time series, which will be used for the forecasting task.

Preprocessing:
1 - Standardized the 'date' column to the format "%Y-%m-%d %H:%M:%S".
2 - Renamed columns [1:] to 'value_X' with X from 0 to 6.
3 - Created 'id_series' with value 0. There is only one multivariate time series.
4 - Ensured that there are no missing dates and that the frequency of the time series is 15 minutes.
5 - Created the 'time_step' column from 'date' and 'id_series', with increasing values starting at 0, one per row of the time series.
6 - Cast 'date' to str, 'time_step' to int, and the 'value_X' columns to float, and defined 'id_series' as 'category'.
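The preprocessing steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the uploader's actual script: the function name `preprocess`, the tuple-based input layout, and the sample values are all assumptions made for the example, and the categorical dtype from step 6 is left out because it depends on the data-frame library used.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def preprocess(rows, step=timedelta(minutes=15), series_id=0):
    """Turn (date_string, v0, v1, ...) tuples, given in time order,
    into records with id_series, date, time_step and value_X keys.

    Hypothetical helper: the input layout is an assumption for this sketch.
    """
    records = []
    previous = None
    for time_step, (raw_date, *values) in enumerate(rows):
        stamp = datetime.fromisoformat(raw_date)
        # Step 4: each timestamp must follow the previous one by exactly `step`.
        if previous is not None and stamp - previous != step:
            raise ValueError(f"missing or irregular timestamp at {raw_date}")
        previous = stamp
        record = {
            "id_series": series_id,                # step 3: one series, constant id
            "date": stamp.strftime(DATE_FORMAT),   # steps 1 and 6: uniform string
            "time_step": time_step,                # step 5: 0, 1, 2, ...
        }
        # Steps 2 and 6: positional names value_0, value_1, ... stored as floats.
        for i, value in enumerate(values):
            record[f"value_{i}"] = float(value)
        records.append(record)
    return records


# Made-up sample rows (not taken from the dataset), with mixed date spellings.
rows = [
    ("2016-07-01 00:00:00", 1, 2),
    ("2016-07-01T00:15:00", 3, 4),
    ("2016-07-01 00:30:00", 5, 6),
]
out = preprocess(rows)
print(out[1])
```

Raising an error on an irregular gap, rather than silently filling it, keeps step 4 a pure check; whether the original preprocessing filled gaps or only verified them is not stated in the description.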

10 features

id_series (nominal): 1 unique value, 0 missing
date (string): 69680 unique values, 0 missing
value_0 (numeric): 819 unique values, 0 missing
value_1 (numeric): 398 unique values, 0 missing
value_2 (numeric): 1898 unique values, 0 missing
value_3 (numeric): 991 unique values, 0 missing
value_4 (numeric): 1858 unique values, 0 missing
value_5 (numeric): 388 unique values, 0 missing
value_6 (numeric): 1910 unique values, 0 missing
time_step (numeric): 69680 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 69680
Number of attributes (columns) of the dataset: 10
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 8
Number of nominal attributes: 1
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Average class difference between consecutive instances: not reported
Percentage of numeric attributes: 80
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 10
Percentage of instances belonging to the most frequent class: not reported
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
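
The attribute-type properties above follow directly from the column list. As a rough illustration, the sketch below recomputes a few of them from the column types shown in the feature list. The function name `column_type_summary` and the dictionary keys are hypothetical labels chosen for this example, not the platform's own property names.

```python
def column_type_summary(column_types, n_rows):
    """Count attribute types and derive the percentage-style properties.

    Hypothetical helper for illustration; key names are not official.
    """
    n_cols = len(column_types)
    counts = {}
    for kind in column_types.values():
        counts[kind] = counts.get(kind, 0) + 1
    numeric = counts.get("numeric", 0)
    nominal = counts.get("nominal", 0)
    return {
        "attributes": n_cols,
        "numeric_attributes": numeric,
        "nominal_attributes": nominal,
        "percent_numeric": 100 * numeric / n_cols,
        "percent_nominal": 100 * nominal / n_cols,
        "attributes_per_instance": n_cols / n_rows,
    }


# Column types as given in the feature list of this page.
column_types = {"id_series": "nominal", "date": "string", "time_step": "numeric"}
column_types.update({f"value_{i}": "numeric" for i in range(7)})

summary = column_type_summary(column_types, n_rows=69680)
print(summary["numeric_attributes"], summary["percent_numeric"])  # 8 80.0
print(summary["attributes_per_instance"])
```

The ratio of attributes to instances is 10 / 69680, a small positive number that appears as 0 once rounded to the precision shown in the property list.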

0 tasks
