Weather measurements from Beutenberg provided by the Max-Planck-Institute for Biogeochemistry
Several weather measurements provided by the Max-Planck-Institute for Biogeochemistry, recorded by the weather station on the roof of the institute building.
We have assembled all the files available as of 24-05-2024 on https://www.bgc-jena.mpg.de/wetter/weather_data.html
There are 24 columns:
id_series: The id of the time series.
date: The date of the time series in the format "%Y-%m-%d %H:%M:%S".
time_step: The index of the time step within the time series.
value_X (X from 0 to 20): The values of the time series, which will be used for the forecasting task.
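As a minimal sketch of how this layout might be consumed, the snippet below builds a tiny hand-made frame with the documented columns, parses the 'date' strings, and collects the 'value_X' columns into an array. The sample values are made up for illustration, and only two value columns are shown instead of the full 21.

```python
import pandas as pd

# Tiny hand-made frame in the documented layout (illustrative values only).
df = pd.DataFrame({
    "id_series": pd.Categorical([0, 0, 0]),
    "date": ["2020-01-01 00:00:00", "2020-01-01 00:10:00", "2020-01-01 00:20:00"],
    "time_step": [0, 1, 2],
    "value_0": [996.5, 996.6, float("nan")],
    "value_1": [-8.0, -8.4, -8.5],
})

# 'date' is stored as a string, so parse it before doing time arithmetic.
timestamps = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S")

# Select the forecasting targets by their 'value_' prefix.
value_cols = [c for c in df.columns if c.startswith("value_")]
values = df[value_cols].to_numpy(dtype=float)  # shape: (time steps, variables)
```

Selecting by prefix rather than by position keeps the code independent of where 'id_series', 'date', and 'time_step' sit in the frame.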
Preprocessing:
1 - Renamed column 'Date Time' to 'date'
2 - Parsed the date with the format '%d.%m.%Y %H:%M:%S' and converted it to a string with the format '%Y-%m-%d %H:%M:%S'.
3 - Replaced values of -9999 with NaN.
Values of -9999 seem to indicate a problem with the measurement. In addition, the measurement for 'CO2 (ppm)' appears to have been recorded only from 2008 onward; before that,
all its values were already NaN.
4 - Renamed columns whose names contain characters that cannot be encoded in UTF-8.
5 - Renamed columns [1:] to 'value_X' with X from 0 to 20.
6 - Created 'id_series' with value 0. There is only one multivariate time series.
7 - Ensured that there are no missing dates and that the frequency of the time series is 10 minutes. Filled the rows for missing dates with NaNs.
8 - Created 'time_step' column from 'date' and 'id_series' with increasing values from 0 to the size of the time series.
9 - Cast 'date' to str, 'time_step' to int, 'value_X' to float, and defined 'id_series' as 'category'.
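The steps above can be sketched with pandas roughly as follows. This is an illustration under stated assumptions, not the original preprocessing script: the function name preprocess and the sample frame are hypothetical, the raw frame is assumed to have 'Date Time' as its first column, duplicate timestamps are assumed to be dropped (the list does not say how they were handled), and 'time_step' is assumed to run from 0 to the length of the series minus one.

```python
import numpy as np
import pandas as pd


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch of steps 1-9; 'Date Time' must be the first column."""
    df = raw.rename(columns={"Date Time": "date"})  # step 1
    # Step 2: parse now; the conversion back to a string happens in step 9.
    df["date"] = pd.to_datetime(df["date"], format="%d.%m.%Y %H:%M:%S")
    df = df.replace(-9999, np.nan)  # step 3
    # Steps 4-5: positional renaming also removes any problematic characters.
    value_cols = [f"value_{i}" for i in range(df.shape[1] - 1)]
    df.columns = ["date"] + value_cols
    # Assumption: duplicate timestamps are dropped so the reindex is valid.
    df = df.drop_duplicates(subset="date").set_index("date").sort_index()
    # Step 7: reindex on a complete 10-minute grid; missing rows become NaN.
    grid = pd.date_range(df.index.min(), df.index.max(), freq="10min")
    df = df.reindex(grid).rename_axis("date").reset_index()
    df.insert(0, "id_series", 0)  # step 6: a single multivariate series
    df.insert(2, "time_step", np.arange(len(df)))  # step 8
    # Step 9: final dtypes.
    df["date"] = df["date"].dt.strftime("%Y-%m-%d %H:%M:%S")
    df["time_step"] = df["time_step"].astype(int)
    df[value_cols] = df[value_cols].astype(float)
    df["id_series"] = df["id_series"].astype("category")
    return df


# Made-up sample with one -9999 marker per column and one missing timestamp.
raw = pd.DataFrame({
    "Date Time": ["01.01.2020 00:00:00", "01.01.2020 00:10:00", "01.01.2020 00:30:00"],
    "T (degC)": [1.5, -9999.0, 2.5],
    "CO2 (ppm)": [400.0, 410.0, -9999.0],
})
out = preprocess(raw)
```

In this sample the missing 00:20:00 timestamp is added as a row whose values are all NaN, so the result has four rows at a regular 10-minute spacing.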