OpenML
Data-Science-YouTube-channels-Video-Metadata

Status: active. Format: ARFF. License: CC BY-NC-SA 4.0. Visibility: public. Uploaded 23-03-2022 by Onur Yildirim.


Content
This dataset contains video metadata from around 60 data science YouTube channels.

Acknowledgements
Data scraped from https://wiki.digitalmethods.net/Dmi/ToolDatabase. Cover photo by Rachit Tank on Unsplash.

Motivation
Dataset by Gabriel Preda.

Inspiration
Possible uses for this dataset include:
- Sentiment analysis, or categorising YouTube videos based on their comments and statistics.
- Training ML algorithms such as RNNs to generate their own YouTube descriptions.
- Finding the most popular data science YouTube channel based on total likes, dislikes, and vote counts.
- Statistical analysis across different channels.
Feel free to check the notebook for other possible uses.
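One of the uses suggested above, ranking channels by total likes, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows below are hypothetical sample values, not records from the dataset, and `None` is used here to stand in for a missing `likeCount`. Only the column names `channelTitle`, `videoId`, and `likeCount` come from the feature list.

```python
from collections import defaultdict

# Hypothetical sample rows; these values are invented for illustration.
# None stands in for a missing likeCount (the column has missing values).
rows = [
    {"channelTitle": "Channel A", "videoId": "a1", "likeCount": 120},
    {"channelTitle": "Channel A", "videoId": "a2", "likeCount": None},
    {"channelTitle": "Channel B", "videoId": "b1", "likeCount": 300},
    {"channelTitle": "Channel B", "videoId": "b2", "likeCount": 50},
    {"channelTitle": "Channel C", "videoId": "c1", "likeCount": 10},
]


def total_likes_by_channel(rows):
    """Sum likeCount per channelTitle, skipping rows where it is missing."""
    totals = defaultdict(int)
    for row in rows:
        likes = row["likeCount"]
        if likes is not None:
            totals[row["channelTitle"]] += likes
    return dict(totals)


def rank_channels(rows):
    """Return (channelTitle, total likes) pairs, most liked first."""
    totals = total_likes_by_channel(rows)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_channels(rows)
for title, likes in ranking:
    print(title, likes)
```

Skipping missing values rather than treating them as zero is a design choice; with 859 missing `likeCount` entries in the real data, either convention should be stated alongside any ranking.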

21 features

channelId (string): 60 unique values, 0 missing
channelTitle (string): 60 unique values, 0 missing
videoId (string): 43709 unique values, 0 missing
publishedAt (string): 36001 unique values, 0 missing
publishedAtSQL (string): 32799 unique values, 0 missing
videoTitle (string): 43777 unique values, 1 missing
videoDescription (string): 36257 unique values, 482 missing
videoCategoryId (numeric): 15 unique values, 1 missing
videoCategoryLabel (string): 15 unique values, 1 missing
duration (string): 5769 unique values, 1 missing
durationSec (numeric): 3506 unique values, 1 missing
dimension (string): 1 unique value, 1 missing
definition (string): 2 unique values, 1 missing
caption (nominal): 2 unique values, 1 missing
thumbnail_maxres (string): 34232 unique values, 10029 missing
licensedContent (numeric): 1 unique value, 25450 missing
viewCount (numeric): 18914 unique values, 3 missing
likeCount (numeric): 3019 unique values, 859 missing
dislikeCount (numeric): 517 unique values, 859 missing
favoriteCount (numeric): 1 unique value, 1 missing
commentCount (numeric): 873 unique values, 9365 missing
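The per-feature counts listed above (unique values and missing values) can be recomputed for any table of records. The following is a minimal sketch, assuming rows are held as dictionaries, that `None` marks a missing value, and that missing values are not counted as distinct; the sample rows and the helper name `summarize_columns` are hypothetical.

```python
def summarize_columns(rows, columns):
    """Count distinct non-missing values and missing values per column."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary[col] = {
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
    return summary


# Hypothetical sample rows; None marks a missing value.
sample_rows = [
    {"channelId": "c1", "definition": "hd", "commentCount": 4},
    {"channelId": "c1", "definition": "sd", "commentCount": None},
    {"channelId": "c2", "definition": "hd", "commentCount": 4},
    {"channelId": "c2", "definition": None, "commentCount": None},
]

summary = summarize_columns(
    sample_rows, ["channelId", "definition", "commentCount"]
)
for name, stats in summary.items():
    print(f"{name}: {stats['unique']} unique values, {stats['missing']} missing")
```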

19 properties

Number of instances (rows) of the dataset: 44261
Number of attributes (columns) of the dataset: 21
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 47056
Number of instances with at least one value missing: 31554
Number of numeric attributes: 8
Number of nominal attributes: 1
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 38.1
Percentage of instances belonging to the most frequent class: (no value)
Percentage of nominal attributes: 4.76
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the least frequent class: (no value)
Number of binary attributes: 1
Percentage of binary attributes: 4.76
Percentage of instances having missing values: 71.29
Average class difference between consecutive instances: (no value)
Percentage of missing values: 5.06
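The percentages listed above follow from the raw counts on this page. The short check below reproduces them, assuming that the percentage of missing values is taken over all cells (instances times attributes) and that values are rounded to two decimals; the helper name `pct` is introduced here for illustration only.

```python
# Raw counts as listed in the dataset properties.
n_instances = 44261
n_attributes = 21
n_missing_values = 47056
n_instances_with_missing = 31554
n_numeric = 8
n_nominal = 1
n_binary = 1


def pct(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


pct_numeric = pct(n_numeric, n_attributes)
pct_nominal = pct(n_nominal, n_attributes)
pct_binary = pct(n_binary, n_attributes)
pct_instances_missing = pct(n_instances_with_missing, n_instances)
# Assumption: the denominator is the total number of cells in the table.
pct_missing_values = pct(n_missing_values, n_instances * n_attributes)

print(pct_numeric, pct_nominal, pct_binary)
print(pct_instances_missing, pct_missing_values)
```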

0 tasks
