Weather-Saaleaue


Status: active
Format: ARFF
License: Creative Commons Attribution 4.0 International
Visibility: public
Uploaded: 25-06-2024 by Bruno Belucci Teixeira
Weather measurements from Saaleaue, provided by the Max-Planck-Institute for Biogeochemistry. The institute publishes several weather measures from the weather station on top of the roof of the institute building. We have assembled all the files available as of 24-05-2024 on https://www.bgc-jena.mpg.de/wetter/weather_data.html

There are 33 columns:

id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d %H:%M:%S".
time_step: The time step on the time series.
value_X (X from 0 to 29): The values of the time series, which will be used for the forecasting task.

Preprocessing:

1 - Renamed column 'Date Time' to 'date'.
2 - Parsed the date with the format '%d.%m.%Y %H:%M:%S' and converted it to a string with the format '%Y-%m-%d %H:%M:%S'.
3 - Replaced values of -9999 with NaN. A value of -9999 seems to indicate a problem with the measurement. In addition, some measures seem to have been collected only from later in 2002 onward.
4 - Renamed columns [1:] to 'value_X' with X from 0 to 29.
5 - Rounded 'date' to the nearest 10 minutes. Some 'date' values were not exactly on the 10-minute grid (offset by a few seconds, or by 1 minute for '2011-09-26 13:41:00', '2012-07-24 06:51:00', '2013-08-23 10:11:00').
6 - Created 'id_series' with value 0. There is only one multivariate time series.
7 - Ensured that there are no missing dates and that the frequency of the time series is 10 minutes. Filled the missing dates with NaNs.
8 - Created the 'time_step' column from 'date' and 'id_series', with increasing values starting at 0, one per row of the time series.
9 - Cast 'date' to str, 'time_step' to int, 'value_X' to float, and defined 'id_series' as 'category'.
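The preprocessing steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the uploader's actual script: the function name `preprocess`, the use of pandas, the assumption that 'Date Time' is the first column of the raw table, and the dropping of duplicate timestamps after rounding are all assumptions that the description does not state.

```python
import numpy as np
import pandas as pd


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch of the nine described steps for one raw station table."""
    # Steps 1 and 4: rename 'Date Time' to 'date' and the rest to value_0..value_N.
    # Assumes 'Date Time' is the first column of the raw table.
    value_cols = [f"value_{i}" for i in range(raw.shape[1] - 1)]
    df = raw.copy()
    df.columns = ["date"] + value_cols

    # Step 2 (first half): parse the original day-first timestamp format.
    df["date"] = pd.to_datetime(df["date"], format="%d.%m.%Y %H:%M:%S")

    # Step 3: treat the -9999 sentinel as missing.
    df[value_cols] = df[value_cols].replace(-9999, np.nan)

    # Step 5: snap timestamps to the nearest 10-minute mark.
    df["date"] = df["date"].dt.round("10min")

    # Step 7: reindex on a complete 10-minute grid; gaps become NaN rows.
    # Dropping duplicate timestamps is an assumption, needed for reindexing.
    df = df.drop_duplicates("date").set_index("date").sort_index()
    grid = pd.date_range(df.index.min(), df.index.max(), freq="10min")
    df = df.reindex(grid).rename_axis("date").reset_index()

    # Steps 6 and 8: single series id and a running time step starting at 0.
    df.insert(0, "id_series", 0)
    df["time_step"] = np.arange(len(df))

    # Steps 2 (second half) and 9: final string format and dtypes.
    df["date"] = df["date"].dt.strftime("%Y-%m-%d %H:%M:%S")
    df[value_cols] = df[value_cols].astype(float)
    df["time_step"] = df["time_step"].astype(int)
    df["id_series"] = df["id_series"].astype("category")
    return df
```

Reindexing on a full `pd.date_range` grid is one way to satisfy step 7, since it both reveals and fills missing timestamps in a single pass.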

33 features

Feature     Type      Unique values   Missing
id_series   nominal   1               0
date        string    1151800         0
value_0     numeric   7447            3249
value_1     numeric   5886            2988
value_2     numeric   6653            3539
value_3     numeric   1767            4223
value_4     numeric   5984            3473
value_5     numeric   5233            3968
value_6     numeric   3876            2988
value_7     numeric   2746            3748
value_8     numeric   4618            3748
value_9     numeric   2748            4223
value_10    numeric   25518           4223
value_11    numeric   1331            27149
value_12    numeric   10809           26919
value_13    numeric   620             3494
value_14    numeric   54653           2777
value_15    numeric   2385            26610
value_16    numeric   6868            4089
value_17    numeric   50619           2818
value_18    numeric   4498            25471
value_19    numeric   4492            4720
value_20    numeric   3535            6510
value_21    numeric   2926            25473
value_22    numeric   2346            25469
value_23    numeric   1934            3297
value_24    numeric   1368            25473
value_25    numeric   3415            31940
value_26    numeric   3391            30155
value_27    numeric   2518            26403
value_28    numeric   2788            26393
value_29    numeric   1202            26410
time_step   numeric   1151865         0

19 properties

Number of instances (rows) of the dataset: 1151865
Number of attributes (columns) of the dataset: 33
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 391940
Number of instances with at least one value missing: 39850
Number of numeric attributes: 31
Number of nominal attributes: 1
Percentage of binary attributes: 0
Percentage of instances having missing values: 3.46
Percentage of missing values: 1.03
Average class difference between consecutive instances: not reported
Percentage of numeric attributes: 93.94
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 3.03
Percentage of instances belonging to the most frequent class: not reported
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
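The percentage properties can be cross-checked against the raw counts listed on this page. The short sketch below assumes that each percentage is a plain ratio of those counts, with the missing-value percentage taken over all rows times all columns; the page does not state the formulas, so this is an inference that happens to reproduce the listed figures. All variable names are illustrative.

```python
# Counts copied from the property list.
n_rows = 1_151_865
n_cols = 33
n_missing_values = 391_940
n_rows_with_missing = 39_850
n_numeric = 31
n_nominal = 1

# Assumed formulas: simple ratios expressed as percentages.
pct_missing_values = 100 * n_missing_values / (n_rows * n_cols)
pct_rows_with_missing = 100 * n_rows_with_missing / n_rows
pct_numeric = 100 * n_numeric / n_cols
pct_nominal = 100 * n_nominal / n_cols
attrs_per_instance = n_cols / n_rows

print(round(pct_missing_values, 2))     # 1.03
print(round(pct_rows_with_missing, 2))  # 3.46
print(round(pct_numeric, 2))            # 93.94
print(round(pct_nominal, 2))            # 3.03
print(round(attrs_per_instance, 2))     # 0.0
```

The per-feature missing counts in the feature list also sum to 391940, which agrees with the dataset-level total.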

0 tasks
