Context
The data published by Mexico's General Directorate of Epidemiology contain a great deal of information on the current pandemic situation. However, they include many features that are of little use in a predictive analysis.
For this reason I cleaned and reformatted the original data and generated a dataset that groups confirmed, dead, recovered and active cases by state, municipality and date.
This is useful if you want to build geographically specific models.
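The grouping described above can be sketched as follows. This is only an illustration, not the author's cleaning code: the record layout `(state, municipality, date, status)`, the status labels and the sample records are all made up for the example.

```python
from collections import defaultdict

# Hypothetical case-level records: (state, municipality, date, status).
# The layout and the values are illustrative, not taken from the source data.
cases = [
    ("Ciudad de Mexico", "Iztapalapa", "2020-07-18", "active"),
    ("Ciudad de Mexico", "Iztapalapa", "2020-07-18", "dead"),
    ("Ciudad de Mexico", "Iztapalapa", "2020-07-19", "active"),
    ("Jalisco", "Guadalajara", "2020-07-18", "recovered"),
]


def group_cases(records):
    """Count cases per (state, municipality, date) key."""
    grouped = defaultdict(
        lambda: {"confirmed": 0, "dead": 0, "recovered": 0, "active": 0}
    )
    for state, municipality, day, status in records:
        counts = grouped[(state, municipality, day)]
        counts["confirmed"] += 1  # every record is a confirmed case
        counts[status] += 1       # plus one for its current status
    return dict(grouped)


summary = group_cases(cases)
```

Each key of `summary` then corresponds to one row of the grouped dataset.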
Content
The dataset contains the COVID-19 case columns (confirmed, dead, recovered and active), counted by state, municipality and date.
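For example:

| State            | Municipality | Date       | Deaths | Confirmed | Recovered | Active |
|------------------|--------------|------------|--------|-----------|-----------|--------|
| Ciudad de Mexico | Iztapalapa   | 2020-07-18 | 1      | 42        | 0         | 41     |
| Ciudad de Mexico | Iztapalapa   | 2020-07-19 | 0      | 14        | 0         | 14     |
| Ciudad de Mexico | Iztapalapa   | 2020-07-20 | 0      | 41        | 0         | 41     |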
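As a rough usage sketch, the sample rows above can be read and checked with the Python standard library. The CSV layout and the column names are assumptions based on the example, since the actual file format is not described here. The running total also assumes that each row holds per-day counts rather than cumulative ones.

```python
import csv
import io
from itertools import accumulate

# The three sample rows shown above, written as CSV (assumed layout).
SAMPLE = """State,Municipality,Date,Deaths,Confirmed,Recovered,Active
Ciudad de Mexico,Iztapalapa,2020-07-18,1,42,0,41
Ciudad de Mexico,Iztapalapa,2020-07-19,0,14,0,14
Ciudad de Mexico,Iztapalapa,2020-07-20,0,41,0,41
"""

COUNT_COLUMNS = ("Deaths", "Confirmed", "Recovered", "Active")

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    for col in COUNT_COLUMNS:
        row[col] = int(row[col])  # csv yields strings; convert the counts

# In the sample, Active equals Confirmed minus Deaths minus Recovered.
consistent = all(
    r["Active"] == r["Confirmed"] - r["Deaths"] - r["Recovered"] for r in rows
)

# Running total of confirmed cases for one municipality, ordered by date.
iztapalapa = sorted(
    (r for r in rows if r["Municipality"] == "Iztapalapa"),
    key=lambda r: r["Date"],  # ISO dates sort correctly as strings
)
cumulative_confirmed = list(accumulate(r["Confirmed"] for r in iztapalapa))
```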
Would you like to see the data cleaning notebook?
You can find it on my GitHub.
Classification criteria
Recovered cases: If the patient has not died and more than 15 days have passed, the patient is considered recovered.
Active cases: If the patient has neither recovered nor died, the case is considered active.
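The criteria above can be expressed as a small function. This is a minimal sketch rather than the code used to build the dataset: the function name, its parameters and the `"dead"` label are hypothetical, and the date from which the 15 days are counted is not stated above, so `reference_date` stands in for whichever date the cleaning notebook uses.

```python
from datetime import date

RECOVERY_DAYS = 15  # threshold from the classification criteria


def classify_case(reference_date, as_of, death_date=None):
    """Classify one confirmed case as 'dead', 'recovered' or 'active'.

    reference_date: date the 15 days are counted from (assumption: the
        source does not say which date this is).
    as_of: date on which the case is evaluated.
    death_date: date of death, or None if the patient has not died.
    """
    if death_date is not None and death_date <= as_of:
        return "dead"
    if (as_of - reference_date).days > RECOVERY_DAYS:
        return "recovered"
    return "active"
```

For instance, `classify_case(date(2020, 7, 1), date(2020, 7, 18))` gives `"recovered"` because 17 days have passed, while a case that is exactly 15 days old is still `"active"`.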
Time lapse
The first documented case is dated 2020-01-13.
The dataset will be updated every day with new cases.
Acknowledgements
For this project, the data are obtained from the official URL of the government of México, whose author is the Dirección General de Epidemiología:
Corona Virus Data: https://www.gob.mx/salud/documentos/datos-abiertos-152127
Data Dictionary: https://www.gob.mx/salud/documentos/datos-abiertos-152127
Differences in results
These differences are relative to the official results published at https://coronavirus.gob.mx/datos/
The main difference between the official data and this dataset is in the recovered cases. The Mexican government counts only outpatient cases as recovered, whereas this dataset counts both outpatient and inpatient cases.
The second difference is that some rows contained nonsensical information (I think this was a data collection error by the institution); these rows were removed.