youtube-spam-eminem


Status: active
Format: ARFF
License: CC-BY
Visibility: public
Uploaded: 19-05-2021 by Meilina Reksoprodjo
Author: Unknown
Source: [UCI](https://archive.ics.uci.edu/ml/datasets/YouTube+Spam+Collection) - 2017
Please cite: [Paper](http://dcomp.sor.ufscar.br/talmeida/youtubespamcollection/)

YouTube Spam Collection Eminem dataset

The YouTube Spam Collection is a public set of comments collected for spam research. It comprises five datasets totaling 1,956 real messages extracted from five videos that were among the 10 most viewed during the collection period.

This dataset contains only the comments from the Eminem video. It consists of 245 spam entries and 203 ham entries, for a total of 448 samples.

### Attribute information

The collection is composed of one CSV file per dataset, where each line has the following attributes: COMMENT_ID, AUTHOR, DATE, CONTENT, TAG. In the feature list for this dataset, the label column appears as CLASS rather than TAG.
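The column layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not the dataset's official loader: the sample rows are hypothetical, and the encoding of the label (1 = spam, 0 = ham) is an assumption that the description does not state.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows in the documented column order.
# The real file has 448 rows; these values are made up for illustration.
SAMPLE_CSV = """COMMENT_ID,AUTHOR,DATE,CONTENT,TAG
id-001,alice,2015-05-20T10:00:00,"Great song, still love it",0
id-002,bob,,"Check out my channel!!!",1
id-003,carol,2015-05-21T08:30:00,"Subscribe to me, please",1
"""


def count_labels(csv_text, label_column="TAG"):
    """Count rows per label. Assumes 1 means spam and 0 means ham."""
    names = {"1": "spam", "0": "ham"}
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(names[row[label_column]] for row in reader)


counts = count_labels(SAMPLE_CSV)
print(counts)
```

If the label column is named CLASS in the file at hand, passing `label_column="CLASS"` should be the only change needed.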

5 features

COMMENT_ID: string, 446 unique values, 0 missing
AUTHOR: string, 392 unique values, 0 missing
DATE: string, 203 unique values, 245 missing
CONTENT: string, 412 unique values, 0 missing
CLASS: numeric, 2 unique values, 0 missing
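A per-feature summary like the one above (unique values and missing counts) can be computed in a few lines. This is a sketch under stated assumptions: the sample rows are hypothetical, and treating an empty field or `?` as missing is an assumption about how the file encodes missing values.

```python
import csv
import io

# Hypothetical sample rows; the real dataset has 448 rows.
SAMPLE_CSV = """COMMENT_ID,AUTHOR,DATE,CONTENT,CLASS
id-001,alice,2015-05-20T10:00:00,Great song,0
id-002,bob,?,Check out my channel,1
id-003,alice,,Check out my channel,1
"""

# Assumption: missing values appear as an empty field or as "?".
MISSING = {"", "?"}


def summarize_columns(csv_text):
    """Return {column: {"unique": n, "missing": m}} for every column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for name in reader.fieldnames:
        values = [row[name] for row in rows]
        present = [v for v in values if v not in MISSING]
        summary[name] = {
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
    return summary


summary = summarize_columns(SAMPLE_CSV)
for name, stats in summary.items():
    print(name, stats)
```

Missing entries are excluded from the unique count here; whether the page counts them the same way is not stated.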

19 properties

Number of instances (rows) of the dataset: 448
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): no value listed
Number of missing values in the dataset: 245
Number of instances with at least one value missing: 245
Number of numeric attributes: 1
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 20
Percentage of instances belonging to the most frequent class: no value listed
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: no value listed
Percentage of instances belonging to the least frequent class: no value listed
Number of instances belonging to the least frequent class: no value listed
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 54.69
Average class difference between consecutive instances: no value listed
Percentage of missing values: 10.94
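The ratio-type properties follow from the raw counts on this page (448 instances, 5 attributes, 1 numeric attribute, 245 missing values, 245 instances with a missing value). The sketch below recomputes them as a consistency check. The variable names are illustrative, and rounding to two decimals is an assumption about how the page displays values.

```python
# Raw counts as listed on the page.
instances = 448
attributes = 5
numeric_attributes = 1
missing_values = 245
instances_with_missing = 245

# Total number of cells in the table (rows x columns).
cells = instances * attributes

ratios = {
    "attributes_per_instance": round(attributes / instances, 2),
    "pct_numeric_attributes": round(100 * numeric_attributes / attributes, 2),
    "pct_instances_with_missing": round(100 * instances_with_missing / instances, 2),
    "pct_missing_values": round(100 * missing_values / cells, 2),
}

for name, value in ratios.items():
    print(f"{name}: {value}")
```

The percentage of missing values is taken over all 2,240 cells, which is why it is one fifth of the percentage of instances with a missing value: all 245 gaps sit in a single column (DATE).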

0 tasks
