OpenML
The-Social-Dilemma-Tweets---Text-Classification

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Onur Yildirim.
  • Computer Systems Machine Learning
Context

The Social Dilemma is a documentary-drama hybrid that explores the dangerous human impact of social networking, with tech experts sounding the alarm on their own creations.

Initial release: January 2020
Director: Jeff Orlowski
Producer: Larissa Rhodes
Music director: Mark A. Crawford
Screenplay: Jeff Orlowski, Vickie Curtis, Davis Coombe

Content

This dataset contains the Twitter responses posted with the TheSocialDilemma hashtag after the documentary "The Social Dilemma" was released on an OTT platform (Netflix) on September 9th, 2020. The dataset was extracted using the Twitter API and consists of nearly 10,526 tweets from Twitter users all over the globe.

No. Column: Description
1. user_name: The name of the user, as they've defined it.
2. user_location: The user-defined location for this account's profile.
3. user_description: The user-defined UTF-8 string describing their account.
4. user_created: Time and date when the account was created.
5. user_followers: The number of followers an account currently has.
6. user_friends: The number of friends an account currently has.
7. user_favourites: The number of favorites an account currently has.
8. user_verified: When true, indicates that the user has a verified account.
9. date: UTC time and date when the Tweet was created.
10. text: The actual UTF-8 text of the Tweet.
11. hashtags: All the other hashtags posted in the Tweet along with TheSocialDilemma.
12. source: Utility used to post the Tweet. Tweets from the Twitter website have a source value of web.
13. is_retweet: Indicates whether this Tweet has been retweeted by the authenticating user.
14. Sentiment (target variable): Indicates the sentiment of the Tweet. It takes one of three categories: positive, neutral, or negative.

Inspiration

You can use this data to dive into the subjects that use this hashtag, look at the geographical distribution, evaluate sentiments, and look at trends.
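As an illustration of the "subjects that use this hashtag" idea, here is a minimal sketch that pulls the other hashtags out of tweet text and counts them. The sample tweets are invented placeholders, not rows from the dataset, and the helper names `other_hashtags` and `count_hashtags` are hypothetical. The dataset also ships a `hashtags` column, so this is only one possible approach.

```python
import re
from collections import Counter

# Invented placeholder tweets standing in for the dataset's `text` column.
sample_texts = [
    "Just watched #TheSocialDilemma on Netflix. Deleting my apps! #privacy",
    "#thesocialdilemma is a must watch #Netflix #privacy",
    "Not convinced by #TheSocialDilemma, to be honest.",
]

# A hashtag is approximated here as '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def other_hashtags(text, main_tag="TheSocialDilemma"):
    """Return the lowercased hashtags in `text`, excluding `main_tag`."""
    tags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    return [tag for tag in tags if tag != main_tag.lower()]


def count_hashtags(texts):
    """Count co-occurring hashtags across an iterable of tweet texts."""
    counts = Counter()
    for text in texts:
        counts.update(other_hashtags(text))
    return counts


counts = count_hashtags(sample_texts)
print(counts.most_common(2))  # [('privacy', 2), ('netflix', 1)]
```

Lowercasing before counting treats `#Netflix` and `#netflix` as the same subject, which is usually what a topic overview wants.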

14 features

user_name (string): 15184 unique values, 328 missing
user_location (string): 5652 unique values, 4293 missing
user_description (string): 14724 unique values, 1511 missing
user_created (string): 16109 unique values, 0 missing
user_followers (numeric): 4141 unique values, 0 missing
user_friends (numeric): 3220 unique values, 0 missing
user_favourites (numeric): 9945 unique values, 0 missing
user_verified (nominal): 2 unique values, 0 missing
date (string): 19811 unique values, 0 missing
text (string): 19804 unique values, 0 missing
hashtags (string): 1753 unique values, 4297 missing
source (string): 82 unique values, 0 missing
is_retweet (nominal): 1 unique value, 0 missing
Sentiment (string): 3 unique values, 0 missing
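To put the per-column missing counts in proportion, the sketch below divides each count by the 20068 rows listed under the dataset properties. The counts are copied from the feature list above. The names `missing_by_column` and `missing_share` are illustrative only.

```python
# Row count as listed under the dataset properties.
N_ROWS = 20068

# Missing-value counts copied from the feature list; all other columns have 0.
missing_by_column = {
    "user_name": 328,
    "user_location": 4293,
    "user_description": 1511,
    "hashtags": 4297,
}


def missing_share(count, n_rows=N_ROWS):
    """Percentage of rows in which a column is missing, rounded to 2 places."""
    return round(100 * count / n_rows, 2)


shares = {col: missing_share(n) for col, n in missing_by_column.items()}
total_missing = sum(missing_by_column.values())

for col, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{col}: {pct}% missing")
print("total missing values:", total_missing)  # 10429
```

The total matches the "Number of missing values in the dataset" property, so the four columns above account for every missing cell.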

19 properties

Number of instances (rows) of the dataset: 20068
Number of attributes (columns) of the dataset: 14
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 10429
Number of instances with at least one value missing: 8638
Number of numeric attributes: 3
Number of nominal attributes: 2
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 21.43
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 14.29
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 2
Percentage of binary attributes: 14.29
Percentage of instances having missing values: 43.04
Average class difference between consecutive instances: not reported
Percentage of missing values: 3.71
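The percentage properties follow from the raw counts listed alongside them. The sketch below recomputes them as a sanity check. The variable names are illustrative, and the formulas are the plain ratios implied by each property's description, not a statement of how OpenML computes them internally.

```python
# Raw counts taken from the property list.
n_instances = 20068
n_attributes = 14
n_numeric = 3
n_nominal = 2
n_missing_values = 10429
n_instances_with_missing = 8638


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimal places."""
    return round(100 * part / whole, 2)


pct_numeric = pct(n_numeric, n_attributes)                         # 21.43
pct_nominal = pct(n_nominal, n_attributes)                         # 14.29
pct_rows_missing = pct(n_instances_with_missing, n_instances)      # 43.04
# Missing cells are measured against all cells (rows times columns).
pct_missing = pct(n_missing_values, n_instances * n_attributes)    # 3.71
attrs_per_instance = round(n_attributes / n_instances, 2)          # 0.0

print(pct_numeric, pct_nominal, pct_rows_missing, pct_missing, attrs_per_instance)
```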

0 tasks