OpenML
Forest-Fires-Data-Set-Portugal


Status: active
Format: ARFF
License: Database: Open Database, Contents: Database Contents
Visibility: public
Uploaded 24-03-2022 by Elif Ceren Gok
ABSTRACT

This is a difficult regression task in which the aim is to predict the burned area of forest fires in the northeast region of Portugal, using meteorological and other data (see details at: [Web Link]).

Data Set Information:

Data Set Characteristics: Multivariate
Number of Instances: 517
Area: Physical
Attribute Characteristics: Real
Number of Attributes: 13
Date Donated: 2008-02-29
Associated Tasks: Regression
Missing Values? N/A
Number of Web Hits: 871088

In [Cortez and Morais, 2007], the output 'area' was first transformed with a ln(x+1) function. Several Data Mining methods were then applied. After the models were fitted, their outputs were post-processed with the inverse of the ln(x+1) transform. Four different input setups were used. The experiments were conducted using 30 runs of 10-fold cross-validation. Two regression metrics were measured: MAD and RMSE.

A Gaussian support vector machine (SVM) fed with only 4 direct weather conditions (temp, RH, wind and rain) obtained the best MAD value: 12.71 +- 0.01 (mean and 95% confidence interval, using a Student's t-distribution). The best RMSE was attained by the naive mean predictor. An analysis of the regression error curve (REC) shows that the SVM model predicts more examples within a lower admitted error. In effect, the SVM model is better at predicting small fires, which are the majority.

Source:

Paulo Cortez, pcortez '@' dsi.uminho.pt, Department of Information Systems, University of Minho, Portugal.
Aníbal Morais, araimorais '@' gmail.com, Department of Information Systems, University of Minho, Portugal.

Relevant Papers:

[Cortez and Morais, 2007] P. Cortez and A. Morais. A Data Mining Approach to Predict Forest Fires using Meteorological Data. In J. Neves, M. F. Santos and J. Machado Eds., New Trends in Artificial Intelligence, Proceedings of the 13th EPIA 2007 - Portuguese Conference on Artificial Intelligence, December, Guimarães, Portugal, pp. 512-523, 2007. APPIA, ISBN-13 978-989-95618-0-9.
Available at: [Web Link]
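
The evaluation protocol described in the abstract (ln(x+1) transform, inverse transform on the predictions, MAD and RMSE over 30 runs of 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: MAD is read here as the mean absolute deviation, the function names are made up for this example, the 'area' values are synthetic stand-ins, and the placeholder model (training mean on the log scale) is neither the SVM nor the naive mean predictor from the paper.

```python
import math
import random

def log_transform(areas):
    """Apply ln(x + 1) to each burned area."""
    return [math.log1p(a) for a in areas]

def inverse_log_transform(values):
    """Invert ln(x + 1), mapping predictions back to the original scale."""
    return [math.expm1(v) for v in values]

def mad(y_true, y_pred):
    """Mean absolute deviation between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def repeated_kfold(n, k=10, runs=30, seed=0):
    """Yield (train_idx, test_idx) pairs for `runs` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test

# Synthetic, skewed stand-in for the 'area' column (NOT the real data).
rng = random.Random(42)
areas = [0.0 if rng.random() < 0.5 else rng.expovariate(0.1) for _ in range(100)]

mad_scores, rmse_scores = [], []
for train, test in repeated_kfold(len(areas)):
    # Placeholder model (assumption): predict the training mean on the log scale.
    log_train = log_transform([areas[i] for i in train])
    log_pred = sum(log_train) / len(log_train)
    # Post-process predictions with the inverse transform before scoring.
    preds = inverse_log_transform([log_pred] * len(test))
    actual = [areas[i] for i in test]
    mad_scores.append(mad(actual, preds))
    rmse_scores.append(rmse(actual, preds))

print(f"MAD:  {sum(mad_scores) / len(mad_scores):.2f}")
print(f"RMSE: {sum(rmse_scores) / len(rmse_scores):.2f}")
```

Scoring on the original scale, after the inverse transform, keeps the errors in the units of 'area' so that models trained on different target scales remain comparable.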

13 features

Feature  Type     Unique values  Missing
X        numeric  9              0
Y        numeric  7              0
month    string   12             0
day      string   7              0
FFMC     numeric  106            0
DMC      numeric  215            0
DC       numeric  219            0
ISI      numeric  119            0
temp     numeric  192            0
RH       numeric  75             0
wind     numeric  21             0
rain     numeric  7              0
area     numeric  251            0
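
Because month and day are stored as strings, most regression methods need them converted to numbers first. One possible approach, shown here only as a sketch, is a cyclical (sine/cosine) encoding so that December sits next to January and Sunday next to Monday. The three-letter lowercase labels are an assumption about how the values are spelled in the file, the helper name is hypothetical, and this encoding is not taken from the original paper.

```python
import math

# Assumed spellings of the 12 month and 7 day values (not confirmed by the page).
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def cyclical_encode(value, categories):
    """Map a category to (sin, cos) coordinates on the unit circle."""
    position = categories.index(value.strip().lower())
    angle = 2 * math.pi * position / len(categories)
    return math.sin(angle), math.cos(angle)

# Hypothetical row values, for illustration only.
month_sin, month_cos = cyclical_encode("aug", MONTHS)
day_sin, day_cos = cyclical_encode("fri", DAYS)
print(month_sin, month_cos, day_sin, day_cos)
```

A plain integer index (0 to 11, 0 to 6) is simpler, but it places the first and last categories far apart even though they are adjacent in time.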

19 properties

Number of instances (rows) of the dataset: 517
Number of attributes (columns) of the dataset: 13
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 11
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.03
Percentage of numeric attributes: 84.62
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: not reported
Percentage of missing values: 0
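
The two derived properties above follow directly from the counts. A short check, assuming the page rounds to two decimal places (the variable names are illustrative):

```python
n_instances = 517   # rows
n_attributes = 13   # columns
n_numeric = 11      # numeric columns

# Attributes divided by instances: 13 / 517 is about 0.0251.
dimensionality = round(n_attributes / n_instances, 2)

# Share of numeric attributes: 11 / 13 is about 84.615 percent.
pct_numeric = round(100 * n_numeric / n_attributes, 2)

# The two remaining columns are the string features month and day.
n_string = n_attributes - n_numeric

print(dimensionality, pct_numeric, n_string)  # 0.03 84.62 2
```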

0 tasks
