{ "data_id": "46236", "name": "CIF-2016-competition", "exact_name": "CIF-2016-competition", "version": 1, "version_label": null, "description": "CIF 2016 time series forecasting competition, monthly data.\n\nFrom original source:\n-----\nCompetition Data Format\n\nData file containing time series to be predicted is a text file having the following format:\n\nEach row contains a single time series data record;\n\nitems in the row are delimited with semicolon (\";\");\n\nthe first item is an ID of the time series;\n\nthe second item determines the forecasting horizon, i.e., the number of values to be forecasted;\n\nthe third item determines the frequency of the time series (this year \"monthly\" only);\n\nthe rest of the row contains numeric data of the time series;\n\nthe number of values in each row may differ because each time series is of different length.\n\nExample of the competition data format:\n\nts1;4;yearly;26.5;38.2;5.3\nts2;12;monthly;1;2;4;5;5;6;8;9;10\n...\nts72;12;daily;1;2;4;5;5;6;8;9;10\n\n-----\n\nThere are 3 columns:\n\nid_series: The id of the time series.\n\ntime_step: The time step in the time series.\n\nvalue_0: The values of the time series, which will be used for the forecasting task.\n\nPreprocessing:\n\nTraining set:\n\n1 - Renamed the first three columns to 'id_series', 'horizon', and 'period', and renamed the other columns to reflect the actual time_step of the time series.\n\n2 - Melted the data, obtaining columns 'time_step' and 'value_0'.\n\n3 - Dropped nan values.\n\nThe nan values correspond to time series that are shorter than the time series with maximum length; there are no nans in the middle of a time series.\n\n4 - Defined column 'id_series' as 'category' and cast 'time_step' to int.\n\nTest set:\n\nSame as for the training set.\n\nFinally, we have concatenated both training and test set. 
If one wants to use the same train and test set of the competition, we invite them to get the\nforecasting horizon of the original data on the provided website.", "format": "arff", "uploader": "Bruno Belucci Teixeira", "uploader_id": 30703, "visibility": "public", "creator": "\"Martin Stepnicka, Michal Burda\"", "contributor": "\"Bruno Belucci\"", "date": "2024-06-25 00:32:29", "update_comment": null, "last_update": "2024-06-25 00:32:29", "licence": "Creative Commons Attribution 4.0 International", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22120700\/dataset", "kaggle_url": null, "default_target_attribute": null, "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "CIF-2016-competition", "CIF 2016 time series forecasting competition , monthly data. From original source: ----- Competition Data Format Data file containing time series to be predicted is a text file having the following format: Each row contains a single time series data record; items in the row are delimited with semicolon (\";\"); the first item is an ID of the time series; the second item determines the forecasting horizon, i.e., the number of values to be forecasted; the third item determines the frequency of the t " ], "weight": 5 }, "qualities": { "NumberOfInstances": 7108, "NumberOfFeatures": 3, "NumberOfClasses": null, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 2, "NumberOfSymbolicFeatures": 1, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 0, "PercentageOfMissingValues": 0, "AutoCorrelation": null, "PercentageOfNumericFeatures": 66.66666666666666, "Dimensionality": 0.0004220596510973551, "PercentageOfSymbolicFeatures": 33.33333333333333, "MajorityClassPercentage": null, "MajorityClassSize": null, "MinorityClassPercentage": null, "MinorityClassSize": null, "NumberOfBinaryFeatures": 0 }, "tags": [], "features": [ { "name": 
"id_series", "index": "0", "type": "nominal", "distinct": "72", "missing": "0", "distr": [] }, { "name": "time_step", "index": "1", "type": "numeric", "distinct": "120", "missing": "0", "min": "0", "max": "119", "mean": "54", "stdev": "34" }, { "name": "value_0", "index": "2", "type": "numeric", "distinct": "6994", "missing": "0", "min": "1", "max": "1495400234", "mean": "17995793", "stdev": "150607206" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }