Reddit-WallStreetBets-Posts

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Onur Yildirim.
  • Computer Systems Machine Learning


Context: WallStreetBets (r/wallstreetbets, also known as WSB) is a subreddit where participants discuss stock and option trading. It has become notable for its profane nature and for allegations of users manipulating securities. The community recently became mainstream again through its interest in GameStop shares. The posts were not filtered, so the data may contain a small percentage of harsh language.

Content: Reddit posts from the subreddit WallStreetBets, downloaded from https://www.reddit.com/r/wallstreetbets/ using praw (The Python Reddit API Wrapper).

Inspiration: You can use the data to perform sentiment analysis, identify discussion topics, and follow trends (such as the appearance of keywords like GME, AMP, NOK, and any other trends present in the data).
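As a rough illustration of the keyword-trend idea, the sketch below counts how many post titles mention each ticker as a whole word. The function name `count_ticker_mentions` and the sample titles are hypothetical and are not taken from the dataset. Matching is case-sensitive, and missing titles are skipped.

```python
import re
from collections import Counter


def count_ticker_mentions(titles, tickers):
    """Count how many titles mention each ticker as a whole word (case-sensitive)."""
    patterns = {t: re.compile(r"\b" + re.escape(t) + r"\b") for t in tickers}
    counts = Counter()
    for title in titles:
        if title is None:  # some titles are missing in the dataset
            continue
        for ticker, pattern in patterns.items():
            if pattern.search(title):
                counts[ticker] += 1
    return counts


# Hypothetical titles, for illustration only.
sample_titles = [
    "GME to the moon",
    "Holding NOK and GME",
    None,
    "Thoughts on the market today",
]
ticker_counts = count_ticker_mentions(sample_titles, ["GME", "AMP", "NOK"])
print(ticker_counts)
```

Grouping the same counts by day (using the `timestamp` column) would be one way to follow a keyword over time. Sentiment analysis and topic identification would need additional tooling that is not shown here.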

7 features

title (string): 51750 unique values, 130 missing
score (numeric): 5210 unique values, 0 missing
id (string, ignore): 53187 unique values, 0 missing
url (string): 53172 unique values, 0 missing
comms_num (numeric): 2045 unique values, 0 missing
created (numeric): 43460 unique values, 0 missing
body (string): 24077 unique values, 28525 missing
timestamp (string): 43460 unique values, 0 missing
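The per-column unique and missing counts listed above can be recomputed once the rows are loaded. The following is a minimal sketch, assuming each row is available as a Python dict with `None` marking a missing value. The helper name `summarize_columns` and the sample rows are hypothetical.

```python
def summarize_columns(rows, columns):
    """Return {column: {"unique": n, "missing": m}} for a list of dict rows."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary[col] = {
            "unique": len(set(present)),             # distinct non-missing values
            "missing": len(values) - len(present),   # None or absent entries
        }
    return summary


# Hypothetical rows, for illustration only.
sample_rows = [
    {"title": "GME to the moon", "score": 10, "body": None},
    {"title": "Holding NOK", "score": 10, "body": "Still holding."},
    {"title": None, "score": 3, "body": None},
]
column_summary = summarize_columns(sample_rows, ["title", "score", "body"])
print(column_summary)
```

On the full dataset, the same computation should reproduce figures such as 130 missing titles and 28525 missing bodies, assuming the loader maps missing ARFF values to `None`.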

19 properties

Number of instances (rows) of the dataset: 53187
Number of attributes (columns) of the dataset: 7
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 28655
Number of instances with at least one value missing: 28543
Number of numeric attributes: 3
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 42.86
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 53.67
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 7.7
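The derived percentages can be checked against the raw counts on this page. The sketch below assumes the percentage of missing values is computed over instances times 7 attributes (that is, with the ignored `id` column excluded from the denominator), which is the reading that matches the reported 7.7. The variable names are illustrative.

```python
# Counts taken from the feature and property listings on this page.
instances = 53187
attributes = 7            # the ignored "id" column is not counted
numeric_attributes = 3
missing_title = 130
missing_body = 28525
rows_with_missing = 28543

# Total missing values is the sum of the per-column missing counts.
missing_values = missing_title + missing_body

pct_numeric = round(100 * numeric_attributes / attributes, 2)
pct_rows_missing = round(100 * rows_with_missing / instances, 2)
pct_missing = round(100 * missing_values / (instances * attributes), 2)

print(missing_values, pct_numeric, pct_rows_missing, pct_missing)
```

The results agree with the listed values: 28655 missing values, 42.86 percent numeric attributes, 53.67 percent of instances with missing values, and 7.7 percent missing values overall.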

0 tasks
