Popular-Halloween-2020--Costumes-Amazon-Reviews


Status: active | Format: ARFF | License: CC0: Public Domain | Visibility: public | Uploaded 23-03-2022 by Elif Ceren Gok
Context
So it's Halloween again, dear Kagglers! And what better way of celebrating than with some NLP? The dataset brings you the reviews of popular Halloween costumes sold on Amazon as of November 2020.

Content
The dataset contains popular costumes from the Amazon website. For each costume there are user review texts, including the review title and the review score, along with the publishing date and location. The data hasn't been preprocessed in any way, so I think it can be a great exercise for aspiring data scientists who are looking to sharpen their text preprocessing and feature extraction skills.
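Since the reviews are raw text, a first preprocessing pass might look like the sketch below. It is only an illustration under stated assumptions: the sample review and the tiny stopword list are made up for the example and are not taken from the dataset, and a plain bag-of-words count is just one of many possible feature extraction choices.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text, stopwords=frozenset()):
    """Count each token in the text, skipping any token listed in stopwords."""
    return Counter(token for token in tokenize(text) if token not in stopwords)


# Hypothetical review text and stopword list, for illustration only.
sample_review = "Great costume! My son LOVED it, great fit."
STOPWORDS = frozenset({"my", "it", "the", "a"})

counts = bag_of_words(sample_review, STOPWORDS)
print(counts.most_common(3))
```

In practice a fuller stopword list and a proper tokenizer would probably replace the regular expression, but the overall shape (normalize, tokenize, count) stays the same.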

5 features

Feature        Type      Unique values   Missing
text           string    7666            0
date           string    2247            0
title          string    5522            16
rating         numeric   5               0
product_name   string    73              0
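The title feature is the only one with missing values (16 rows), so any step that uses titles has to tolerate an absent value. One possible way to handle this is sketched below, assuming each row is available as a dictionary keyed by the feature names listed above and that a missing title shows up as None or an empty string; the example rows are invented for illustration.

```python
def combine_title_and_text(row):
    """Join the review title and body into one string, skipping missing parts."""
    title = row.get("title") or ""
    text = row.get("text") or ""
    parts = [part.strip() for part in (title, text) if part.strip()]
    return " ".join(parts)


# Hypothetical rows, not taken from the dataset.
rows = [
    {"title": "Five stars", "text": "Fits well.", "rating": 5},
    {"title": None, "text": "Too small for an adult.", "rating": 2},
]

documents = [combine_title_and_text(row) for row in rows]
print(documents)
```

Merging the two fields keeps all 7814 rows usable; dropping the 16 rows with a missing title would be an alternative if titles are modeled separately.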

19 properties

Number of instances (rows) of the dataset: 7814
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 16
Number of instances with at least one value missing: 16
Number of numeric attributes: 1
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 20
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0.2
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 0.04
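
The ratio and percentage properties can be reproduced from the raw counts listed above. The sketch below assumes that the percentage of missing values is taken over all cells (instances times attributes) and that the displayed figures are rounded to two decimal places; both assumptions are inferred from the numbers, not stated on the page.

```python
# Counts as listed in the properties above.
n_instances = 7814
n_attributes = 5
n_numeric_attributes = 1
n_missing_values = 16
n_instances_with_missing = 16

# Derived properties (assumed formulas, see the note above).
dimensionality = n_attributes / n_instances
pct_numeric = 100 * n_numeric_attributes / n_attributes
pct_instances_missing = 100 * n_instances_with_missing / n_instances
pct_missing_values = 100 * n_missing_values / (n_instances * n_attributes)

for name, value in [
    ("Attributes / instances", dimensionality),
    ("Percentage of numeric attributes", pct_numeric),
    ("Percentage of instances having missing values", pct_instances_missing),
    ("Percentage of missing values", pct_missing_values),
]:
    print(f"{name}: {round(value, 2)}")
```

Rounded this way, the results agree with the listed values of 0, 20, 0.2 and 0.04.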

0 tasks
