OpenML
Coronavirus-Worldwide-Dataset


Status: active
Format: ARFF
License: CC0: Public Domain
Visibility: public
Uploaded: 24-03-2022 by Elif Ceren Gok


Context

From the World Health Organization: On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province, China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people. Daily information on the affected people can therefore yield interesting insights when it is made available to the broader data science community.

The European CDC (ECDC) publishes daily statistics on the COVID-19 pandemic, not just for Europe but for the entire world. We rely on the ECDC because it collects and harmonizes data from around the world, which allows us to compare what is happening in different countries.

Content

This dataset has daily information on the number of cases, deaths, recoveries, and so on from coronavirus. It also contains other parameters, such as average life expectancy, population density, and smoking population, which users may find useful for further predictions. The data is available from 31 December 2019.

Inspiration

Give people weekly data so that they can use it to make accurate predictions.

36 features

iso_code (string): 211 unique values, 64 missing
continent (string): 6 unique values, 289 missing
location (string): 211 unique values, 64 missing
date (string): 225 unique values, 64 missing
total_cases (numeric): 12229 unique values, 409 missing
new_cases (numeric): 3228 unique values, 409 missing
total_deaths (numeric): 4200 unique values, 409 missing
new_deaths (numeric): 895 unique values, 409 missing
total_cases_per_million (numeric): 22087 unique values, 409 missing
new_cases_per_million (numeric): 12225 unique values, 409 missing
total_deaths_per_million (numeric): 10822 unique values, 409 missing
new_deaths_per_million (numeric): 2626 unique values, 409 missing
new_tests (numeric): 7227 unique values, 24852 missing
total_tests (numeric): 11225 unique values, 24500 missing
total_tests_per_thousand (numeric): 9321 unique values, 24500 missing
new_tests_per_thousand (numeric): 2313 unique values, 24852 missing
new_tests_smoothed (numeric): 8042 unique values, 23444 missing
new_tests_smoothed_per_thousand (numeric): 2338 unique values, 23444 missing
tests_per_case (numeric): 10832 unique values, 24296 missing
positive_rate (numeric): 508 unique values, 23947 missing
tests_units (string): 5 unique values, 22698 missing
stringency_index (numeric): 158 unique values, 6520 missing
population (numeric): 211 unique values, 64 missing
population_density (numeric): 200 unique values, 1650 missing
median_age (numeric): 133 unique values, 3655 missing
aged_65_older (numeric): 183 unique values, 4130 missing
aged_70_older (numeric): 182 unique values, 3823 missing
gdp_per_capita (numeric): 184 unique values, 4060 missing
extreme_poverty (numeric): 73 unique values, 14679 missing
cardiovasc_death_rate (numeric): 186 unique values, 3630 missing
diabetes_prevalence (numeric): 141 unique values, 2534 missing
female_smokers (numeric): 107 unique values, 10421 missing
male_smokers (numeric): 122 unique values, 10733 missing
handwashing_facilities (numeric): 92 unique values, 21177 missing
hospital_beds_per_thousand (numeric): 100 unique values, 6633 missing
life_expectancy (numeric): 197 unique values, 505 missing
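
The per-feature counts above (unique values and missing entries) can be recomputed from the raw rows. The following is a minimal sketch using only the Python standard library. The helper name `summarize_columns`, the sample rows, and the choice of missing-value markers are assumptions made for illustration; ARFF files mark a missing value with `?`, and the sketch also treats empty strings and `None` as missing.

```python
from collections import defaultdict


def summarize_columns(rows, missing_markers=("", "?", None)):
    """Return {column: (n_unique, n_missing)} for a list of dict rows.

    Hypothetical helper: a value counts as missing if it equals one of
    `missing_markers`; every other value contributes to the unique count.
    """
    unique = defaultdict(set)
    missing = defaultdict(int)
    for row in rows:
        for column, value in row.items():
            if value in missing_markers:
                missing[column] += 1
            else:
                unique[column].add(value)
    columns = list(rows[0].keys()) if rows else []
    return {c: (len(unique[c]), missing[c]) for c in columns}


# Made-up sample rows that reuse four of the dataset's column names.
rows = [
    {"iso_code": "AFG", "date": "2020-03-01", "new_cases": 1.0, "new_tests": None},
    {"iso_code": "AFG", "date": "2020-03-02", "new_cases": 0.0, "new_tests": None},
    {"iso_code": "ALB", "date": "2020-03-01", "new_cases": None, "new_tests": 25.0},
]

summary = summarize_columns(rows)
for column, (n_unique, n_missing) in summary.items():
    print(f"{column}: {n_unique} unique values, {n_missing} missing")
```

Applied to the full dataset, a summary like this makes it easy to see that the testing-related columns are missing in most rows, which matters when choosing features for a model.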

19 properties

Number of instances (rows) of the dataset: 36137
Number of attributes (columns) of the dataset: 36
Number of distinct values of the target attribute (if it is nominal): (no value reported)
Number of missing values in the dataset: 314500
Number of instances with at least one value missing: 33643
Number of numeric attributes: 31
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 86.11
Percentage of instances belonging to the most frequent class: (no value reported)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value reported)
Percentage of instances belonging to the least frequent class: (no value reported)
Number of instances belonging to the least frequent class: (no value reported)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 93.1
Average class difference between consecutive instances: (no value reported)
Percentage of missing values: 24.17
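
The percentage properties follow from the raw counts by simple arithmetic. The sketch below recomputes them from the numbers reported on this page. The function name `dataset_properties` and the dictionary keys are assumptions for illustration, and the formulas are the obvious ratios rather than a confirmed description of how OpenML computes them.

```python
def dataset_properties(n_instances, n_attributes, n_numeric,
                       n_missing_values, n_instances_with_missing):
    """Derive ratio-style properties from raw counts (hypothetical helper)."""
    n_cells = n_instances * n_attributes  # total number of values in the table
    return {
        "attributes_per_instance": n_attributes / n_instances,
        "pct_numeric_attributes": 100 * n_numeric / n_attributes,
        "pct_instances_with_missing": 100 * n_instances_with_missing / n_instances,
        "pct_missing_values": 100 * n_missing_values / n_cells,
    }


# Counts taken from the properties listed on this page.
props = dataset_properties(
    n_instances=36137,
    n_attributes=36,
    n_numeric=31,
    n_missing_values=314500,
    n_instances_with_missing=33643,
)
for name, value in props.items():
    print(f"{name}: {value:.2f}")
```

Rounded, these ratios agree with the reported 86.11, 93.1 and 24.17. The attributes-per-instance ratio is about 0.001, which the page displays as 0.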

0 tasks
