Disaster-Tweets

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Onur Yildirim.
  • Computer Systems
  • Machine Learning
Context

The file contains over 11,000 tweets associated with disaster keywords such as "crash", "quarantine", and "bush fires", along with each tweet's location and the keyword itself. The data structure was inherited from the "Disasters on social media" dataset.

The tweets were collected on Jan 14th, 2020. Topics people were tweeting about include:

  • The eruption of Taal Volcano in Batangas, Philippines
  • Coronavirus
  • Bushfires in Australia
  • Iran's downing of flight PS752

Disclaimer: The dataset contains text that may be considered profane, vulgar, or offensive.

Inspiration

The intention was to enrich the data already available on this topic with newly collected and manually classified tweets. The initial source is "Disasters on social media", which is used in the "Real or Not? NLP with Disaster Tweets" competition on Kaggle.

5 features

id (numeric): 11370 unique values, 0 missing
keyword (string): 219 unique values, 0 missing
location (string): 4358 unique values, 3535 missing
text (string): 11220 unique values, 0 missing
target (numeric): 2 unique values, 0 missing
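
The per-feature figures above (unique values and missing counts) can be recomputed once the rows are loaded. The following is a minimal sketch, assuming the rows are available as a list of dictionaries keyed by the five feature names and that a missing value appears as `None` or an empty string. The helper `summarize_columns` and the toy rows are hypothetical illustrations, not part of the dataset or of any OpenML tooling.

```python
def summarize_columns(rows, columns):
    """Return {column: (unique_count, missing_count)} for a list of dict rows.

    Assumption: a value counts as missing when it is None or an empty string.
    Unique values are counted over the non-missing values only.
    """
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        summary[col] = (len(set(present)), len(values) - len(present))
    return summary


# Hypothetical toy rows for illustration only; not taken from the dataset.
toy_rows = [
    {"id": 0, "keyword": "crash", "location": "", "text": "example tweet one", "target": 1},
    {"id": 1, "keyword": "crash", "location": "Manila", "text": "example tweet two", "target": 0},
    {"id": 2, "keyword": "quarantine", "location": None, "text": "example tweet two", "target": 0},
]
columns = ["id", "keyword", "location", "text", "target"]

summary = summarize_columns(toy_rows, columns)
for name in columns:
    unique, missing = summary[name]
    print(f"{name}: {unique} unique values, {missing} missing")
```

Run against the full file, a summary of this kind would show whether duplicated tweet texts explain why text has fewer unique values (11220) than there are rows (11370).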

19 properties

Number of instances (rows) of the dataset: 11370
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 3535
Number of instances with at least one value missing: 3535
Number of numeric attributes: 2
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 40
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 31.09
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 6.22
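
The percentage properties follow directly from the counts listed on this page. The short check below is a sketch of that arithmetic; the variable names are illustrative, and the formulas are the ones implied by the property descriptions (for example, the percentage of missing values is taken over all cells, that is, rows times columns).

```python
# Counts as reported on this page.
n_instances = 11370          # rows
n_features = 5               # columns
n_numeric = 2                # numeric attributes (id, target)
n_missing_values = 3535      # missing cells, all in the location column
n_rows_with_missing = 3535   # rows with at least one missing value

# Percentage of numeric attributes.
pct_numeric = round(100 * n_numeric / n_features, 2)

# Percentage of instances having missing values.
pct_rows_missing = round(100 * n_rows_with_missing / n_instances, 2)

# Percentage of missing values over all cells (rows * columns).
pct_values_missing = round(100 * n_missing_values / (n_instances * n_features), 2)

# Number of attributes divided by the number of instances.
dimensionality = round(n_features / n_instances, 2)

print(pct_numeric, pct_rows_missing, pct_values_missing, dimensionality)
# prints: 40.0 31.09 6.22 0.0
```

The two missing-value counts are equal because only one feature (location) has missing entries, so each affected row is missing exactly one value.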

0 tasks
