Game-of-Thrones-Script-All-Seasons

Status: active | Format: ARFF | License: CC0: Public Domain | Visibility: public | Uploaded 23-03-2022 by Onur Yildirim


Context

This dataset was generated through a long and complex process, starting with scraping all of the URLs that Genius.com provides for the Game of Thrones series. Scraping and cleaning the data took a lot of time and effort, and I used a wide range of packages for collecting and compiling data scattered across the internet. The dataset is inspired by a similar one published earlier by Ander Fernández Jauregui at https://www.kaggle.com/anderfj/game-of-thrones-series-scripts-breakdowns. I was waiting for him to update that dataset so I could analyze it, but it had not been updated for a long time. So, following some of his practices, I generated this dataset. I hope it will be of use to others, or at least to my own analysis.

Content

The content is the complete Game of Thrones script for all seasons, in the form of a table with 6 columns of different data types used for various purposes. Each column is described in the data description section.

Acknowledgements

Great credit goes to Genius.com for publishing the complete script of the Game of Thrones series. Kudos also to all the open-source packages out there, and to the people who contribute to them, so that we can use those packages as we please.

Inspiration

There is only one question I want to answer with this dataset: who is the true hero or heroine of the whole series?
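One simple way to start on the question posed under Inspiration is to count how often each character speaks. The sketch below is only an illustration under stated assumptions: the rows are made-up placeholders (not taken from the dataset), the helper names `lines_per_speaker` and `words_per_speaker` are hypothetical, and "most lines spoken" is just one possible proxy for "hero", not a method the dataset author describes.

```python
from collections import Counter

# Placeholder rows using the dataset's six column names.
# The values are invented for illustration and are NOT real script lines.
ROWS = [
    {"Release_Date": "d1", "Season": "s1", "Episode": "e1",
     "Episode_Title": "t1", "Name": "speaker a", "Sentence": "one two three"},
    {"Release_Date": "d1", "Season": "s1", "Episode": "e1",
     "Episode_Title": "t1", "Name": "speaker b", "Sentence": "six"},
    {"Release_Date": "d2", "Season": "s1", "Episode": "e2",
     "Episode_Title": "t2", "Name": "Speaker A ", "Sentence": "four five"},
    {"Release_Date": "d2", "Season": "s1", "Episode": "e2",
     "Episode_Title": "t2", "Name": None, "Sentence": "unattributed"},
]


def _speaker(row):
    """Return a normalized speaker name, or None if Name is missing."""
    name = row.get("Name")
    if not name:
        return None
    return name.strip().lower()


def lines_per_speaker(rows):
    """Count rows (spoken lines) per speaker, skipping missing names."""
    counts = Counter()
    for row in rows:
        speaker = _speaker(row)
        if speaker is not None:
            counts[speaker] += 1
    return counts


def words_per_speaker(rows):
    """Count whitespace-separated words per speaker, skipping missing names."""
    counts = Counter()
    for row in rows:
        speaker = _speaker(row)
        if speaker is not None:
            counts[speaker] += len(row["Sentence"].split())
    return counts


if __name__ == "__main__":
    print(lines_per_speaker(ROWS).most_common(3))
    print(words_per_speaker(ROWS).most_common(3))
```

Rows without a `Name` are skipped because the dataset reports a few missing values in that column. Normalizing case and whitespace is an assumption about how names might vary, so check the real values before relying on it.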

6 features

Release_Date (string): 73 unique values, 0 missing
Season (string): 8 unique values, 0 missing
Episode (string): 10 unique values, 0 missing
Episode_Title (string): 73 unique values, 0 missing
Name (string): 564 unique values, 3 missing
Sentence (string): 22297 unique values, 0 missing
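The per-feature figures above (unique values and missing counts) can be recomputed from any row-oriented copy of the table. The following is a minimal sketch, assuming rows are dictionaries keyed by column name and that a missing value shows up as `None` or an empty string; the function name `summarize` and the sample rows are hypothetical placeholders, not data from the dataset.

```python
COLUMNS = ["Release_Date", "Season", "Episode",
           "Episode_Title", "Name", "Sentence"]

# Invented placeholder rows, used only to exercise the function.
SAMPLE = [
    {"Release_Date": "d1", "Season": "s1", "Episode": "e1",
     "Episode_Title": "t1", "Name": "speaker a", "Sentence": "Line one."},
    {"Release_Date": "d1", "Season": "s1", "Episode": "e1",
     "Episode_Title": "t1", "Name": None, "Sentence": "Line two."},
    {"Release_Date": "d2", "Season": "s1", "Episode": "e2",
     "Episode_Title": "t2", "Name": "speaker a", "Sentence": "Line one."},
]


def summarize(rows, columns):
    """Return {column: {"unique": n, "missing": m}} for the given rows."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Assumption: None and "" both count as missing.
        present = [v for v in values if v not in (None, "")]
        summary[col] = {
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
    return summary


if __name__ == "__main__":
    for col, stats in summarize(SAMPLE, COLUMNS).items():
        print(f"{col}: {stats['unique']} unique values, {stats['missing']} missing")
```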

19 properties

Number of instances (rows) of the dataset: 23911
Number of attributes (columns) of the dataset: 6
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 3
Number of instances with at least one value missing: 3
Number of numeric attributes: 0
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 0
Percentage of instances belonging to the most frequent class: (no value)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the least frequent class: (no value)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0.01
Average class difference between consecutive instances: (no value)
Percentage of missing values: 0
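The ratio and percentage properties follow from the raw counts listed here (23911 instances, 6 attributes, 3 missing values in 3 instances). The sketch below reproduces them. It assumes values are rounded to two decimals and that "percentage of missing values" is taken over all cells (instances times attributes). Both are assumptions about how the page computes them, not documented behavior.

```python
# Counts taken from the properties listed for this dataset.
N_INSTANCES = 23911
N_ATTRIBUTES = 6
N_MISSING_VALUES = 3
N_INSTANCES_WITH_MISSING = 3


def pct(part, whole):
    """Percentage of part in whole, rounded to two decimals (assumed rounding)."""
    return round(100.0 * part / whole, 2)


# Attributes divided by instances: about 0.00025, which rounds to 0.
dimensionality = round(N_ATTRIBUTES / N_INSTANCES, 2)

# Share of rows with at least one missing value.
pct_instances_missing = pct(N_INSTANCES_WITH_MISSING, N_INSTANCES)

# Share of missing cells, assuming the denominator is rows * columns.
pct_values_missing = pct(N_MISSING_VALUES, N_INSTANCES * N_ATTRIBUTES)

if __name__ == "__main__":
    print(dimensionality, pct_instances_missing, pct_values_missing)
```

Under these assumptions the results match the listed values: 0 for the attribute-to-instance ratio, 0.01 for the percentage of instances with missing values, and 0 for the percentage of missing values.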

0 tasks
