Capitol-Riot-Tweets

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Elif Ceren Gok.
  • Computer Systems Machine Learning


A CSV file with 80,000+ tweets from January 6th, 2021, the day of the Capitol Hill riots. Made using the Twitter Developer API and Tweepy. Nowhere close to the size of the Parler data dumps, but anyone with NLP experience might be able to find something useful here.

Tweets have mentions, hyperlinks, emojis, and punctuation removed, and all text is converted to lowercase. Some tweets have coordinates (if users had geotagging enabled). Verified users have their usernames included. user_location is the user's self-reported location from their profile; it is blank if it doesn't correspond to a US state (or DC).
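The uploader's actual cleaning code is not shown on this page. As a rough illustration only, the preprocessing described above (dropping mentions, hyperlinks, emojis, and punctuation, then lowercasing) could be sketched as follows. The function name `clean_tweet`, the regular expressions, and the emoji code-point ranges are all assumptions made for this sketch and will not reproduce the dataset's text column exactly.

```python
import re
import string

# Assumed patterns; the dataset's real rules are not documented on this page.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"@\w+")
# Approximate emoji coverage: common pictograph blocks, dingbats,
# the variation selector, and the zero-width joiner.
EMOJI_RE = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text: str) -> str:
    """Hypothetical helper approximating the cleaning described above."""
    text = URL_RE.sub(" ", text)         # remove links before stripping punctuation
    text = MENTION_RE.sub(" ", text)     # remove @mentions
    text = EMOJI_RE.sub(" ", text)       # remove (most) emoji
    text = text.translate(PUNCT_TABLE)   # remove ASCII punctuation
    return " ".join(text.lower().split())  # lowercase, collapse whitespace


if __name__ == "__main__":
    print(clean_tweet("Hello @user! Check https://t.co/abc 😀 NOW..."))
```

Links are removed first on purpose: stripping punctuation beforehand would break a URL into ordinary-looking word fragments that the URL pattern could no longer match.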

14 features

tweet_id (numeric): 81987 unique values, 0 missing
text (string): 39078 unique values, 1 missing
query (string): 16 unique values, 0 missing
user_id (numeric): 72442 unique values, 0 missing
user_name (string): 2071 unique values, 79831 missing
follower_count (numeric): 10791 unique values, 0 missing
user_tweet_count (numeric): 41197 unique values, 0 missing
likes (numeric): 353 unique values, 0 missing
retweets (numeric): 1702 unique values, 0 missing
location_name (string): 207 unique values, 82024 missing
longitude (numeric): 207 unique values, 82024 missing
latitude (numeric): 207 unique values, 82024 missing
user_location (string): 52 unique values, 66419 missing
date (string): 1 unique value, 0 missing

19 properties

Number of instances (rows) of the dataset: 82309
Number of attributes (columns) of the dataset: 14
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 392323
Number of instances with at least one value missing: 82296
Number of numeric attributes: 8
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 57.14
Percentage of instances belonging to the most frequent class: (no value)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the least frequent class: (no value)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 99.98
Average class difference between consecutive instances: (no value)
Percentage of missing values: 34.05
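The percentage properties follow from the raw counts listed on this page. The short check below recomputes them from the per-feature missing counts; the variable names are chosen for this sketch, and the rounding to two decimals is an assumption about how the page displays the values.

```python
# Per-feature missing counts, copied from the feature list on this page.
missing = {
    "tweet_id": 0, "text": 1, "query": 0, "user_id": 0, "user_name": 79831,
    "follower_count": 0, "user_tweet_count": 0, "likes": 0, "retweets": 0,
    "location_name": 82024, "longitude": 82024, "latitude": 82024,
    "user_location": 66419, "date": 0,
}

n_rows, n_cols = 82309, 14   # instances and attributes
rows_with_missing = 82296    # instances with at least one missing value
numeric_cols = 8             # numeric attributes

total_missing = sum(missing.values())
pct_missing = round(100 * total_missing / (n_rows * n_cols), 2)
pct_rows_missing = round(100 * rows_with_missing / n_rows, 2)
pct_numeric = round(100 * numeric_cols / n_cols, 2)

# Rows that carry coordinates, i.e. rows where longitude is not missing.
geotagged_rows = n_rows - missing["longitude"]

print(total_missing, pct_missing, pct_rows_missing, pct_numeric, geotagged_rows)
```

The per-feature counts sum to the reported 392323 missing values, and the coordinate columns are populated for only a few hundred rows, which is worth keeping in mind before planning any geographic analysis.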

0 tasks
