(Non-)depressive_tweet_data

Status: active. Format: ARFF. License: Attribution (CC BY). Visibility: public. Uploaded 15-05-2024 by Iwo Godzwon.


Description: The dataset, named "clean_tweet_Dec19ToDec20.csv", is a collection of tweets spanning December 2019 to December 2020 that have been post-processed for clarity and analysis. It is designed to provide insight into public sentiment during this period, capturing personal and societal narratives that emerged from global circumstances including the COVID-19 pandemic. The dataset has three columns: an index that uniquely identifies each tweet, the text of the tweet, and a sentiment label.

Attribute Description:
- Index: A numerical identifier assigned to each tweet, e.g., 98655, 59794.
- Text: The cleaned and processed text of the tweet. Topics range widely, from personal appliance purchases and mental health advice to electricity waste, unemployment, and cryptocurrency-related dietary suggestions.
- Sentiment: A binary label assigned to each tweet, where 0 indicates negative sentiment and 1 indicates positive sentiment. It gives a simple, coarse indication of the general mood of each tweet.

Use Case: This dataset is aimed at researchers and data scientists working on natural language processing (NLP), sentiment analysis, and trend spotting. It can be used to train machine learning models that classify public sentiment, to detect shifts in societal concerns or interests over time, and to explore the correlation between external events and public mood. Marketing professionals might also use it to gauge consumer sentiment, refine brand communication, and identify areas for product or service improvement based on public feedback.
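The column layout described above can be read with a few lines of Python. What follows is a minimal sketch, assuming a CSV export whose header row is Index,text,sentiment. The names SAMPLE_CSV, load_rows and label_counts are hypothetical, and the sample rows are invented for illustration; they are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows, invented for illustration only. They are NOT
# taken from the dataset; only the column names follow the description.
SAMPLE_CSV = """Index,text,sentiment
1,feeling hopeful about the new year,1
2,,0
3,another sleepless night and no energy,0
4,bought a new kettle today,1
"""


def load_rows(handle):
    """Read rows from a CSV handle, dropping rows whose text is empty."""
    reader = csv.DictReader(handle)
    return [row for row in reader if (row.get("text") or "").strip()]


def label_counts(rows):
    """Count how many rows carry each sentiment label ("0" or "1")."""
    return Counter(row["sentiment"] for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = label_counts(rows)
print(len(rows), dict(counts))
```

Dropping rows with empty text is one possible way to handle the missing text values that the feature listing reports; imputing a placeholder string would be an equally reasonable choice, depending on the model.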

3 features

Index (string): 134348 unique values, 0 missing
text (string): 124016 unique values, 18 missing
sentiment (nominal): 2 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 134348
Number of attributes (columns) of the dataset: 3
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 18
Number of instances with at least one value missing: 18
Number of numeric attributes: 0
Number of nominal attributes: 1
Number of binary attributes: 1
Percentage of binary attributes: 33.33
Percentage of instances having missing values: 0.01
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 0
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 33.33
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
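Several of the percentage and ratio properties above show as 0 only because they are rounded. The sketch below recomputes them from the listed counts. It assumes that the page rounds to two decimals and that the percentage of missing values is taken over all cells (instances times attributes); neither assumption is stated in the listing.

```python
# Counts copied from the property listing.
n_instances = 134348
n_attributes = 3
n_missing_values = 18
n_instances_with_missing = 18
n_binary_attributes = 1

# Share of rows that have at least one missing value, in percent.
pct_instances_with_missing = 100 * n_instances_with_missing / n_instances

# Share of all cells that are missing, in percent.
# Assumption: the denominator is instances * attributes.
pct_missing_values = 100 * n_missing_values / (n_instances * n_attributes)

# Attributes divided by instances.
attributes_per_instance = n_attributes / n_instances

# Share of attributes that are binary, in percent.
pct_binary_attributes = 100 * n_binary_attributes / n_attributes

for name, value in [
    ("Percentage of instances having missing values", pct_instances_with_missing),
    ("Percentage of missing values", pct_missing_values),
    ("Number of attributes divided by the number of instances", attributes_per_instance),
    ("Percentage of binary attributes", pct_binary_attributes),
]:
    print(f"{name}: {value:.6f} (rounded: {round(value, 2)})")
```

Under these assumptions the unrounded values are small but nonzero, which is consistent with the 0.01, 0 and 0 shown in the listing.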

0 tasks
