COVID-19-Visualisation-and-Epidemic-Analysis-Data

active ARFF Database: Open Database, Contents: Database Contents Visibility: public Uploaded 23-03-2022 by Dustin Carrion
  • Computer Systems Machine Learning
COVID-19 Dataset for Epidemic Model Development

I combined several data sources into an integrated dataset of country-level COVID-19 confirmed cases, recovered cases and fatalities, which can be used to build epidemic models such as SIR and SIR with mortality. Population figures are included so that incidence rate and prevalence rate can be calculated. One of my applications based on this dataset is published at https://dylansp.shinyapps.io/COVID19_Visualization_Analysis_Tool/.

Content

My approach is to retrieve cumulative confirmed cases, fatalities and recovered cases from 2020-01-22 onwards from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) COVID-19 dataset and merge them with the country code and population of each country. For the purpose of building epidemic models, I calculated daily new confirmed cases, recovered cases and fatalities, together with remaining confirmed cases, which equal cumulative confirmed cases - cumulative recovered cases - cumulative fatalities. I have not yet found credible data sources for probable cases in the various countries; I will add them once I find them.

Date: The date of the record.
Country_Region: The name of the country/region.
alpha-3_code: Country code that can be used for map visualization.
Population: The population of the given country/region.
Total_Confirmed_Cases: Cumulative confirmed cases.
Total_Fatalities: Cumulative fatalities.
Total_Recovered_Cases: Cumulative recovered cases.
New_Confirmed_Cases: Daily new confirmed cases.
New_Fatalities: Daily new fatalities.
New_Recovered_Cases: Daily new recovered cases.
Remaining_Confirmed_Cases: Remaining infected cases, equal to (cumulative confirmed cases - cumulative recovered cases - cumulative fatalities).
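The derived columns described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual pipeline: the helper names `derive_columns` and `per_100k` are hypothetical, the sample numbers are made up, the first day's daily values are assumed to equal that day's cumulative values, and scaling the rates per 100,000 people is only one common convention.

```python
from typing import Dict, List

CUMULATIVE = ("Total_Confirmed_Cases", "Total_Fatalities", "Total_Recovered_Cases")
DAILY = ("New_Confirmed_Cases", "New_Fatalities", "New_Recovered_Cases")


def derive_columns(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Add daily and remaining-case columns to one country's date-ordered rows."""
    derived = []
    # Assumption: nothing was recorded before the first row.
    previous = {name: 0 for name in CUMULATIVE}
    for row in rows:
        out = dict(row)
        # Daily value = today's cumulative value minus yesterday's.
        for total, daily in zip(CUMULATIVE, DAILY):
            out[daily] = row[total] - previous[total]
        # Remaining = cumulative confirmed - cumulative recovered - cumulative fatalities.
        out["Remaining_Confirmed_Cases"] = (
            row["Total_Confirmed_Cases"]
            - row["Total_Recovered_Cases"]
            - row["Total_Fatalities"]
        )
        derived.append(out)
        previous = row
    return derived


def per_100k(count: float, population: float) -> float:
    """Scale a case count to a rate per 100,000 people (hypothetical helper)."""
    return 100_000 * count / population


# Made-up numbers for illustration only; they are not taken from the dataset.
sample = [
    {"Population": 1_000_000, "Total_Confirmed_Cases": 10,
     "Total_Fatalities": 0, "Total_Recovered_Cases": 0},
    {"Population": 1_000_000, "Total_Confirmed_Cases": 25,
     "Total_Fatalities": 1, "Total_Recovered_Cases": 4},
    {"Population": 1_000_000, "Total_Confirmed_Cases": 40,
     "Total_Fatalities": 2, "Total_Recovered_Cases": 10},
]

result = derive_columns(sample)
last = result[-1]
# Daily incidence uses new cases; point prevalence uses remaining cases.
incidence = per_100k(last["New_Confirmed_Cases"], last["Population"])
prevalence = per_100k(last["Remaining_Confirmed_Cases"], last["Population"])
print(last["Remaining_Confirmed_Cases"], incidence, prevalence)
```

In practice the rows would be grouped by Country_Region and sorted by Date before calling the helper, since differences across countries are meaningless.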
Acknowledgements

The data source for confirmed cases, recovered cases and deaths is JHU CSSE (https://github.com/CSSEGISandData/COVID-19). The country-level population mainly comes from https://storage.guidotti.dev/covid19/data/ and Worldometer (https://www.worldometers.info/population/).

Inspiration

Building a country-level COVID-19 case tracking dashboard.
Insights into the incidence rate, prevalence rate, mortality and recovery rate of various countries.
Building epidemic models for forecasting.
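As one possible starting point for the epidemic models mentioned in this description, the sketch below steps a discrete-time SIR model with mortality forward one day at a time. It is an illustration under stated assumptions, not the model used in the author's application: the function name `simulate_sird` is hypothetical, the rates `beta`, `gamma` and `mu` are arbitrary placeholder values rather than fitted parameters, and the total population is treated as constant. The initial state could be seeded from Population, Remaining_Confirmed_Cases, Total_Recovered_Cases and Total_Fatalities for a given country and date.

```python
from typing import List, Tuple

# (susceptible, infected, recovered, dead)
State = Tuple[float, float, float, float]


def simulate_sird(
    population: float,
    infected: float,
    recovered: float,
    dead: float,
    beta: float,   # assumed transmission rate per day
    gamma: float,  # assumed recovery rate per day
    mu: float,     # assumed mortality rate per day
    days: int,
) -> List[State]:
    """Advance a SIR-with-mortality model with one Euler step per day."""
    s = population - infected - recovered - dead
    i, r, d = float(infected), float(recovered), float(dead)
    history: List[State] = [(s, i, r, d)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        new_deaths = mu * i
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        history.append((s, i, r, d))
    return history


# Placeholder inputs for illustration only; they are not taken from the dataset.
trajectory = simulate_sird(
    population=1_000_000, infected=28, recovered=10, dead=2,
    beta=0.3, gamma=0.1, mu=0.01, days=30,
)
final_s, final_i, final_r, final_d = trajectory[-1]
print(f"day 30: infected={final_i:.1f}, recovered={final_r:.1f}, dead={final_d:.1f}")
```

The four compartments always sum to the initial population, which makes a convenient sanity check when swapping in fitted parameters.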

11 features

Feature                    Type     Unique values  Missing
Date                       string   367            0
Country_Region             string   192            0
alpha-3_code               string   187            1835
Population                 numeric  192            0
Total_Confirmed_Cases      numeric  29944          0
Total_Fatalities           numeric  10173          0
Total_Recovered_Cases      numeric  24521          0
New_Confirmed_Cases        numeric  6985           0
New_Fatalities             numeric  1168           0
New_Recovered_Cases        numeric  5789           0
Remaining_Confirmed_Cases  numeric  20994          0

19 properties

Number of instances (rows) of the dataset: 70464
Number of attributes (columns) of the dataset: 11
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 1835
Number of instances with at least one value missing: 1835
Number of numeric attributes: 8
Number of nominal attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 2.6
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 0.24
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 72.73
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
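
The percentage properties can be cross-checked against the raw counts listed here. The short sketch below does that arithmetic; it assumes that the percentage of missing values is taken over all cells (rows times columns) and that values are rounded to two decimals, which is consistent with the figures shown but is not documented on this page.

```python
# Counts copied from the property list.
instances = 70464
attributes = 11
missing_values = 1835
instances_with_missing = 1835
numeric_attributes = 8

# Share of rows that have at least one missing value.
pct_instances_missing = round(100 * instances_with_missing / instances, 2)

# Assumption: share of missing cells among all rows x columns.
pct_missing_values = round(100 * missing_values / (instances * attributes), 2)

# Share of columns that are numeric.
pct_numeric = round(100 * numeric_attributes / attributes, 2)

# Attributes divided by instances, which rounds to zero at two decimals.
dimensionality = round(attributes / instances, 2)

print(pct_instances_missing, pct_missing_values, pct_numeric, dimensionality)
```

All 1835 missing values sit in the alpha-3_code column, so the number of missing values and the number of instances with a missing value coincide.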

0 tasks
