Spotify---All-Time-Top-2000s-Mega-Dataset

Status: active | Format: ARFF | License: CC0: Public Domain | Visibility: public | Uploaded 23-03-2022 by Onur Yildirim
  • Computer Systems Machine Learning


Context

This dataset contains audio statistics of the top 2000 tracks on Spotify. It has 15 columns, each describing a track and its qualities. Songs released from 1956 to 2019 are included, from notable artists such as Queen, The Beatles, and Guns N' Roses. http://sortyourmusic.playlistmachinery.com/ by plamere uses the Spotify API to extract audio features from the tracks of a given Spotify playlist URI. The data contains audio features such as Danceability, BPM, Liveness, and Valence (positivity), among others. Each feature is described below.

Content

  • Index: ID
  • Title: Name of the track
  • Artist: Name of the artist
  • Top Genre: Genre of the track
  • Year: Release year of the track
  • Beats per Minute (BPM): The tempo of the song
  • Energy: The higher the value, the more energetic the song.
  • Danceability: The higher the value, the easier it is to dance to the song.
  • Loudness: The higher the value, the louder the song.
  • Valence: The higher the value, the more positive the mood of the song.
  • Length: The duration of the song.
  • Acoustic: The higher the value, the more acoustic the song is.
  • Speechiness: The higher the value, the more spoken words the song contains.
  • Popularity: The higher the value, the more popular the song is.

Acknowledgements

This data was extracted from the Spotify playlist "Top 2000s" on PlaylistMachinery (plamere) using Selenium with Python. More specifically, it was scraped from http://sortyourmusic.playlistmachinery.com/. Thanks to Paul for providing a free and open-source tool to extract features and do cool stuff with your Spotify playlists!

Inspiration

This is a fun dataset to explore for the patterns that land songs in the Top 2000s. With this dataset, I wanted to be able to answer questions like:

  • Which genres were more popular from the 1950s through the 2000s?
  • Which genres landed most often in the Top 2000s?
  • Which artists were more likely to make a top song?
  • Which words appear in the titles of more popular songs?
  • How has the average tempo of songs changed over the years?
  • Were acoustic songs more popular in the 1960s than they are now?
  • Have the preferred genres changed between then and now?

and a lot more.
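As an illustration of the tempo question above, here is a minimal, dependency-free sketch that averages BPM per release year. It assumes each row is available as a dict keyed by the column names `Year` and `Beats_Per_Minute_(BPM)`; the function name `mean_bpm_by_year` and the sample rows are hypothetical placeholders, not values from the dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_bpm_by_year(rows):
    """Return {year: mean BPM} for an iterable of dict-like rows.

    Assumes the keys "Year" and "Beats_Per_Minute_(BPM)" are present
    and hold values convertible to int and float respectively.
    """
    by_year = defaultdict(list)
    for row in rows:
        by_year[int(row["Year"])].append(float(row["Beats_Per_Minute_(BPM)"]))
    # Sort by year so the result reads chronologically.
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Placeholder rows for illustration only; NOT taken from the dataset.
sample_rows = [
    {"Year": 1970, "Beats_Per_Minute_(BPM)": 100},
    {"Year": 1970, "Beats_Per_Minute_(BPM)": 120},
    {"Year": 2005, "Beats_Per_Minute_(BPM)": 90},
]

result = mean_bpm_by_year(sample_rows)
```

The same grouping pattern would apply to the other questions, for example by swapping the grouping key for `Top_Genre` or the averaged column for `Popularity`.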

15 features

Feature                   Type     Unique values  Missing
Index                     numeric  1994           0
Title                     string   1958           0
Artist                    string   731            0
Top_Genre                 string   149            0
Year                      numeric  63             0
Beats_Per_Minute_(BPM)    numeric  145            0
Energy                    numeric  98             0
Danceability              numeric  84             0
Loudness_(dB)             numeric  23             0
Liveness                  numeric  94             0
Valence                   numeric  97             0
Length_(Duration)         string   350            0
Acousticness              numeric  100            0
Speechiness               numeric  37             0
Popularity                numeric  81             0
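Length_(Duration) is typed as string rather than numeric. One possible cause, which is an assumption and not something this page confirms, is that some values carry thousands separators (for example "1,412"). Under that assumption, a conversion could be sketched as follows; `parse_length` is a hypothetical helper name.

```python
def parse_length(value):
    """Convert a length string such as "1,412" or "207" to an int.

    Assumption: non-numeric characters in the column are limited to
    comma thousands separators and surrounding whitespace.
    Raises ValueError for anything else, so bad values are not hidden.
    """
    return int(str(value).replace(",", "").strip())


# Placeholder inputs for illustration only; NOT taken from the dataset.
examples = [parse_length("1,412"), parse_length(" 207 ")]
```

Letting the function raise on unexpected input, rather than returning a default, makes it easier to find out whether the assumption about the column's format actually holds.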

19 properties

  • Number of instances (rows) of the dataset: 1994
  • Number of attributes (columns) of the dataset: 15
  • Number of distinct values of the target attribute (if it is nominal): (no value listed)
  • Number of missing values in the dataset: 0
  • Number of instances with at least one value missing: 0
  • Number of numeric attributes: 11
  • Number of nominal attributes: 0
  • Number of attributes divided by the number of instances: 0.01
  • Percentage of numeric attributes: 73.33
  • Percentage of instances belonging to the most frequent class: (no value listed)
  • Percentage of nominal attributes: 0
  • Number of instances belonging to the most frequent class: (no value listed)
  • Percentage of instances belonging to the least frequent class: (no value listed)
  • Number of instances belonging to the least frequent class: (no value listed)
  • Number of binary attributes: 0
  • Percentage of binary attributes: 0
  • Percentage of instances having missing values: 0
  • Average class difference between consecutive instances: (no value listed)
  • Percentage of missing values: 0
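The derived properties follow from the three counts reported on this page (1994 instances, 15 attributes, 11 numeric attributes). A short check, assuming the page rounds to two decimal places:

```python
# Counts as reported on this page.
n_instances = 1994
n_attributes = 15
n_numeric = 11
n_nominal = 0

# Number of attributes divided by the number of instances.
dimensionality = round(n_attributes / n_instances, 2)

# Percentage of numeric attributes.
pct_numeric = round(100 * n_numeric / n_attributes, 2)

# Attributes that are neither numeric nor nominal, i.e. the string columns.
n_string = n_attributes - n_numeric - n_nominal
```

The remaining four attributes match the four string-typed features (Title, Artist, Top_Genre, Length_(Duration)).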

0 tasks
