COVID-19 Dataset for Epidemic Model Development
I combined several data sources into an integrated dataset of country-level COVID-19 confirmed cases, recovered cases and fatalities, which can be used to build epidemic models such as SIR and SIR with mortality. Population figures are included so that incidence and prevalence rates can be calculated.
One of my applications based on this dataset is published at https://dylansp.shinyapps.io/COVID19_Visualization_Analysis_Tool/.
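As a rough illustration of how the population figures feed into incidence and prevalence rates, here is a minimal sketch. It assumes one common convention: rates expressed per 100,000 people, with the whole population as the denominator. The function names and the numbers are made up for illustration and are not part of the dataset.

```python
def incidence_rate(new_cases: int, population: int, per: int = 100_000) -> float:
    """New cases in a period, scaled to `per` people (assumed convention)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases * per / population


def prevalence_rate(active_cases: int, population: int, per: int = 100_000) -> float:
    """Cases still active at a point in time, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return active_cases * per / population


# Made-up numbers for a hypothetical country on a single day.
population = 5_000_000
incidence = incidence_rate(new_cases=250, population=population)
prevalence = prevalence_rate(active_cases=4_000, population=population)
print(f"incidence: {incidence} per 100,000")
print(f"prevalence: {prevalence} per 100,000")
```

In terms of this dataset, the daily new confirmed cases would be the natural input for incidence and the remaining confirmed cases for prevalence, although other definitions are possible.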
Content
My approach is to retrieve cumulative confirmed cases, fatalities and recovered cases from 2020-01-22 onwards from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) COVID-19 dataset, and to merge them with the country code and population of each country. For the purpose of building epidemic models, I also calculated daily new confirmed cases, recovered cases and fatalities, together with remaining confirmed cases, which equal cumulative confirmed cases - cumulative recovered cases - cumulative fatalities. I have not yet found credible data sources for probable cases in the various countries. I'll add them once I find them.
Date: The date of the record.
CountryRegion: The name of the country/region.
-alpha-3code: The three-letter country code, which can be used for map visualization.
Population: The population of the given country/region.
TotalConfirmedCases: Cumulative confirmed cases.
Total_Fatalities: Cumulative fatalities.
TotalRecoveredCases: Cumulative recovered cases.
NewConfirmedCases: Daily new confirmed cases.
New_Fatalities: Daily new fatalities.
NewRecoveredCases: Daily new recovered cases.
RemainingConfirmedCases: Remaining infected cases, equal to (cumulative confirmed cases - cumulative recovered cases - cumulative fatalities).
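The derived columns above can be reproduced from the cumulative ones with simple differencing. The sketch below is not the code used to build the dataset. It uses made-up counts for a single hypothetical country and assumes that the first day's increment equals its cumulative value. The helper names are illustrative only.

```python
from typing import List


def daily_new(cumulative: List[int]) -> List[int]:
    """Difference a cumulative series into daily increments.

    The first day has no predecessor, so its increment is taken to be the
    cumulative value itself (an assumption of this sketch).
    """
    return [
        value - (cumulative[i - 1] if i > 0 else 0)
        for i, value in enumerate(cumulative)
    ]


def remaining_confirmed(
    confirmed: List[int], recovered: List[int], fatalities: List[int]
) -> List[int]:
    """Cumulative confirmed minus cumulative recovered minus cumulative fatalities."""
    return [c - r - f for c, r, f in zip(confirmed, recovered, fatalities)]


# Made-up cumulative counts for one hypothetical country over four days.
records = {
    "TotalConfirmedCases": [10, 15, 27, 40],
    "TotalRecoveredCases": [0, 2, 5, 9],
    "Total_Fatalities": [0, 0, 1, 2],
}
records["NewConfirmedCases"] = daily_new(records["TotalConfirmedCases"])
records["NewRecoveredCases"] = daily_new(records["TotalRecoveredCases"])
records["New_Fatalities"] = daily_new(records["Total_Fatalities"])
records["RemainingConfirmedCases"] = remaining_confirmed(
    records["TotalConfirmedCases"],
    records["TotalRecoveredCases"],
    records["Total_Fatalities"],
)
print(records["NewConfirmedCases"])        # [10, 5, 12, 13]
print(records["RemainingConfirmedCases"])  # [10, 13, 21, 29]
```

If a source revises a cumulative count downward, the corresponding daily increment comes out negative. This sketch does not correct for that.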
Acknowledgements
Confirmed cases, recovered cases and deaths come from JHU CSSE (https://github.com/CSSEGISandData/COVID-19).
The country-level population data mainly come from https://storage.guidotti.dev/covid19/data/ and Worldometer (https://www.worldometers.info/population/).
Inspiration
Building a country-level COVID-19 case-tracking dashboard.
Gaining insights into the incidence rate, prevalence rate, mortality and recovery rate of various countries.
Building epidemic models for forecasting.
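For the forecasting use case, an SIR model with mortality (one of the model families named in the introduction) could be sketched as follows. This is a minimal forward-Euler simulation and not a fitted model. The rates beta, gamma and mu are made-up values, the function name is hypothetical, and the initial population is kept as the denominator of the infection term for simplicity.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (susceptible, infected, recovered, dead)


def simulate_sir_with_mortality(
    population: float,
    infected0: float,
    recovered0: float,
    deaths0: float,
    beta: float,   # transmission rate per day (assumed value, not estimated)
    gamma: float,  # recovery rate per day (assumed value, not estimated)
    mu: float,     # disease-induced death rate per day (assumed value)
    days: int,
    steps_per_day: int = 10,
) -> List[State]:
    """Integrate dS=-bSI/N, dI=bSI/N-(g+m)I, dR=gI, dD=mI with Euler steps."""
    s = float(population - infected0 - recovered0 - deaths0)
    i, r, d = float(infected0), float(recovered0), float(deaths0)
    dt = 1.0 / steps_per_day
    trajectory: List[State] = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            new_deaths = mu * i * dt
            s -= new_infections
            i += new_infections - new_recoveries - new_deaths
            r += new_recoveries
            d += new_deaths
        trajectory.append((s, i, r, d))  # one record per simulated day
    return trajectory


# Made-up scenario: 1,000,000 people and 100 currently infected.
trajectory = simulate_sir_with_mortality(
    population=1_000_000, infected0=100, recovered0=0, deaths0=0,
    beta=0.3, gamma=0.1, mu=0.01, days=60,
)
s_end, i_end, r_end, d_end = trajectory[-1]
print(f"day 60: infected={i_end:.0f}, recovered={r_end:.0f}, dead={d_end:.0f}")
```

One possible mapping onto this dataset, offered as an assumption, is to start the infected compartment from RemainingConfirmedCases, the recovered compartment from TotalRecoveredCases and the dead compartment from Total_Fatalities on a chosen date, with Population as the total. Estimating beta, gamma and mu from the daily series is a separate fitting step that this sketch leaves out.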