California-Environmental-Conditions-Dataset

Status: active | Format: ARFF | License: CC0: Public Domain | Visibility: public | Uploaded 24-03-2022 by Dustin Carrion
Tags: Computer Systems, Machine Learning
Context
Explore an environmental conditions dataframe scraped from CIMIS weather stations using a Selenium chromedriver. With California's wildfires setting records in 2020, it is worthwhile to explore factors that may contribute to creating at-risk environments. This dataset was used in building an XGBoost classifier to accurately predict the probability of fire given environmental condition features. This follows my Fire Risk Analysis project.

Content
262 station IDs correspond to California weather station IDs. There are approximately 14 numerical features for exploratory data analysis. Advanced users can keep the date feature for time series analysis. The target column corresponds to fires on the respective observation date, in the observation region.

Acknowledgements
CIMIS: https://cimis.water.ca.gov/Default.aspx

Inspiration
What additional features would be valuable in determining fire risk?
Which features are most important for specific models in determining the target?
Is there an accurate LSTM to determine feature predictions, and in turn to determine fire risk in the future?
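For readers who keep the date feature for time series analysis, a minimal sketch of ordering observations per station is given here. The sample rows are invented for illustration, and the month/day/year date format is an assumption, since the Date column is stored as a plain string and its exact format is not stated on this page.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"Stn_Id": 2, "Date": "1/3/2018", "Max_Air_Temp_(F)": 61.2, "Target": 0},
    {"Stn_Id": 2, "Date": "1/1/2018", "Max_Air_Temp_(F)": 58.9, "Target": 0},
    {"Stn_Id": 5, "Date": "1/2/2018", "Max_Air_Temp_(F)": 64.0, "Target": 1},
    {"Stn_Id": 2, "Date": "1/2/2018", "Max_Air_Temp_(F)": 60.1, "Target": 0},
]

# Assumption: dates are month/day/year strings. Adjust if the file differs.
DATE_FORMAT = "%m/%d/%Y"


def series_by_station(rows, date_format=DATE_FORMAT):
    """Group rows by Stn_Id and sort each group chronologically."""
    grouped = defaultdict(list)
    for row in rows:
        parsed = dict(row)  # copy so the input rows are left untouched
        parsed["Date"] = datetime.strptime(row["Date"], date_format).date()
        grouped[row["Stn_Id"]].append(parsed)
    for station_rows in grouped.values():
        station_rows.sort(key=lambda r: r["Date"])
    return dict(grouped)


series = series_by_station(rows)
for stn_id, station_rows in sorted(series.items()):
    print(stn_id, [r["Date"].isoformat() for r in station_rows])
```

Each station's list can then be fed to a sequence model one station at a time, so that observations from different stations are not interleaved.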

19 features

Feature                 Type      Unique values   Missing
Target (target)         numeric   2               0
Stn_Id                  numeric   153             0
Stn_Name                string    153             0
CIMIS_Region            string    14              0
Date                    string    991             0
ETo_(in)                numeric   50              83
Precip_(in)             numeric   312             0
Sol_Rad_(Ly/day)        numeric   976             0
Avg_Vap_Pres_(mBars)    numeric   333             0
Max_Air_Temp_(F)        numeric   922             3
Min_Air_Temp_(F)        numeric   884             1
Avg_Air_Temp_(F)        numeric   850             5
Max_Rel_Hum_(%)         numeric   101             0
Min_Rel_Hum_(%)         numeric   101             0
Avg_Rel_Hum_(%)         numeric   101             13
Dew_Point_(F)           numeric   857             13
Avg_Wind_Speed_(mph)    numeric   195             0
Wind_Run_(miles)        numeric   3308            0
Avg_Soil_Temp_(F)       numeric   648             20
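The per-feature missing counts can be recomputed from the raw file. The sketch below assumes the data section is comma-separated and relies on the ARFF convention of writing a missing value as "?". The excerpt, its values, and the shortened column list are hypothetical and serve only to show the counting logic.

```python
import csv
import io

# Hypothetical excerpt of an ARFF @data section; the values are made up.
ARFF_DATA = """\
0,2,StationA,0.06,0,219
0,2,StationA,?,0,120
1,5,StationB,0.05,0.01,?
"""

# A shortened column list for illustration; the real file has 19 columns.
COLUMNS = ["Target", "Stn_Id", "Stn_Name", "ETo_(in)", "Precip_(in)", "Sol_Rad_(Ly/day)"]


def missing_summary(text, columns):
    """Count '?' markers per column and the rows containing at least one."""
    per_column = {name: 0 for name in columns}
    rows_with_missing = 0
    for record in csv.reader(io.StringIO(text)):
        missing = [name for name, value in zip(columns, record) if value.strip() == "?"]
        for name in missing:
            per_column[name] += 1
        if missing:
            rows_with_missing += 1
    return per_column, rows_with_missing


per_column, rows_with_missing = missing_summary(ARFF_DATA, COLUMNS)
print(per_column)
print("rows with at least one missing value:", rows_with_missing)
```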

19 properties

Number of instances (rows) of the dataset: 128125
Number of attributes (columns) of the dataset: 19
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 138
Number of instances with at least one value missing: 116
Number of numeric attributes: 16
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 84.21
Percentage of instances belonging to the most frequent class: not shown
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not shown
Percentage of instances belonging to the least frequent class: not shown
Number of instances belonging to the least frequent class: not shown
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0.09
Average class difference between consecutive instances: 1
Percentage of missing values: 0.01
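The percentage properties follow from the counts reported on this page. As a quick check, the sketch below recomputes them from the row, column, and missing-value counts. The helper name pct is illustrative, and rounding to two decimals is an assumption about how the page displays its values.

```python
# Counts taken from the properties and feature listing on this page.
n_rows = 128125
n_cols = 19
n_numeric = 16
missing_values = 138
rows_with_missing = 116

# Non-zero missing counts per feature, from the feature listing.
missing_per_feature = {
    "ETo_(in)": 83,
    "Max_Air_Temp_(F)": 3,
    "Min_Air_Temp_(F)": 1,
    "Avg_Air_Temp_(F)": 5,
    "Avg_Rel_Hum_(%)": 13,
    "Dew_Point_(F)": 13,
    "Avg_Soil_Temp_(F)": 20,
}


def pct(part, whole):
    """Percentage rounded to two decimals (assumed display precision)."""
    return round(100 * part / whole, 2)


total_missing = sum(missing_per_feature.values())
pct_numeric = pct(n_numeric, n_cols)
pct_rows_missing = pct(rows_with_missing, n_rows)
pct_cells_missing = pct(missing_values, n_rows * n_cols)

print(total_missing, pct_numeric, pct_rows_missing, pct_cells_missing)
```

The per-feature counts sum to the reported total of 138, and the three percentages agree with the listed 84.21, 0.09, and 0.01.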

0 tasks
