OpenML
Covid-19-Research-Articles-(NCBI)


Status: active; Format: ARFF; License: CC0: Public Domain; Visibility: public; Uploaded 24-03-2022 by Elif Ceren Gok
  • Machine Learning Manufacturing


Context

I collected about 1200 Covid-19 research articles from the NCBI.NLM.NIH website for use in ML algorithms and data analysis such as sentiment analysis, time series, recommender systems and/or classification.

Content

  • link: URL to the research article
  • title: title of the research article
  • keywords: words under which the research article is categorized
  • dates: online publication date
  • abstract: a brief summary of the article (methods and hypothesis included)
  • conclusion: findings of the research

For the sake of time, I left some columns with 'null' string values. It is your choice how to filter these values and use whatever is most appropriate for your ML model. I did not include authors/contributors, as they would not serve a purpose in this dataset.

Inspiration

I am interested in knowing the focus of those studies (by analyzing word frequencies) as well as analyzing the volume of publications over time.
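The description suggests filtering the 'null' placeholders and analyzing word frequencies. A minimal sketch of that idea follows, using only the Python standard library. The sample rows, the helper names `null_to_none` and `word_frequencies`, and the `min_length` cutoff are all hypothetical, and the sketch assumes the placeholders survive as the literal string 'null' after loading, which may not hold for every loader.

```python
import re
from collections import Counter

# Hypothetical sample rows shaped like the dataset's columns; not real records.
rows = [
    {"title": "Vaccine efficacy in adults",
     "abstract": "We study vaccine efficacy.",
     "conclusion": "null"},
    {"title": "Viral transmission indoors",
     "abstract": "null",
     "conclusion": "Ventilation reduces transmission."},
]

def null_to_none(row):
    """Replace the literal string 'null' with None so it is treated as missing."""
    return {key: (None if value == "null" else value) for key, value in row.items()}

def word_frequencies(texts, min_length=4):
    """Count lowercase alphabetic words of at least min_length characters."""
    counts = Counter()
    for text in texts:
        if text is None:
            continue  # skip missing entries
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts

cleaned = [null_to_none(row) for row in rows]
abstract_counts = word_frequencies(row["abstract"] for row in cleaned)
print(abstract_counts.most_common(3))
```

Converting the placeholder to `None` first keeps the word 'null' itself out of the frequency counts. A real analysis would probably also remove stop words, which this sketch only approximates with the length cutoff.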

5 features

link (ignore): string, 1198 unique values, 0 missing
title: string, 1197 unique values, 0 missing
keywords: string, 843 unique values, 351 missing
conclusion: string, 825 unique values, 372 missing
dates: string, 266 unique values, 319 missing
abstract: string, 779 unique values, 417 missing

19 properties

Number of instances (rows) of the dataset: 1198
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 1459
Number of instances with at least one value missing: 768
Number of numeric attributes: 0
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 0
Percentage of instances belonging to the most frequent class: (no value)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the least frequent class: (no value)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 64.11
Average class difference between consecutive instances: (no value)
Percentage of missing values: 24.36
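The missing-value figures above can be cross-checked against the per-feature counts. The short sketch below redoes that arithmetic; the variable names are illustrative, and it assumes the percentages are computed over the 5 counted attributes, that is, without the ignored link column.

```python
# Figures copied from the feature and property listings on this page.
instances = 1198
attributes = 5  # assumption: the ignored link column is not counted
missing_per_feature = {"keywords": 351, "conclusion": 372, "dates": 319, "abstract": 417}
rows_with_missing = 768

# Total missing cells, then the share of all cells and of all rows.
total_missing = sum(missing_per_feature.values())
pct_missing_values = round(100 * total_missing / (instances * attributes), 2)
pct_rows_with_missing = round(100 * rows_with_missing / instances, 2)

print(total_missing, pct_missing_values, pct_rows_with_missing)
```

Under that assumption the per-feature counts sum to the reported 1459 missing values, and the two percentages match the reported 24.36 and 64.11.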

0 tasks
