Pantheon-Project-Historical-Popularity-Index

Status: active. Format: ARFF. License: CC BY-SA 4.0. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.
Context

Pantheon is a project celebrating the cultural information that endows our species with these fantastic capacities. To celebrate our global cultural heritage we are compiling, analyzing and visualizing datasets that can help us understand the process of global cultural development. Dive in, visualize, and enjoy.

Content

The Pantheon 1.0 data measures the global popularity of historical characters using two measures. The simpler of the two, denoted L, is the number of different Wikipedia language editions that have an article about a historical character. The more sophisticated measure, named the Historical Popularity Index (HPI), corrects L by adding information on the age of the historical character, the concentration of page views among different languages, the coefficient of variation in page views, and the number of page views in languages other than English.

For annotations of specific values, visit the column metadata in the /Data tab. A more comprehensive breakdown is available on the Pantheon website.

Acknowledgements

Pantheon is a project developed by the Macro Connections group at the Massachusetts Institute of Technology Media Lab. For more on the dataset and to see visualizations using it, visit its landing page on the MIT website.

Inspiration

Which historical figures have a biography in the most languages? Who received the most Wikipedia page views? Which occupations or industries are the most popular? What country has the most individuals with a historical popularity index over twenty?
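The questions under Inspiration can be approached with a few aggregations over the dataset's columns. The following is a minimal sketch in plain Python, assuming the rows have already been loaded as dictionaries keyed by the column names from the feature list on this page. The rows shown are invented placeholders (the names "Person A", "Country X" and all numbers are hypothetical), not values from the dataset.

```python
from collections import Counter

# Hypothetical placeholder rows: names and numbers are invented for
# illustration only and are NOT taken from the Pantheon dataset.
rows = [
    {"full_name": "Person A", "country": "Country X", "occupation": "Writer",
     "article_languages": 120, "page_views": 5_000_000,
     "historical_popularity_index": 27.5},
    {"full_name": "Person B", "country": "Country X", "occupation": "Politician",
     "article_languages": 90, "page_views": 9_000_000,
     "historical_popularity_index": 22.1},
    {"full_name": "Person C", "country": "Country Y", "occupation": "Writer",
     "article_languages": 40, "page_views": 1_200_000,
     "historical_popularity_index": 18.3},
    {"full_name": "Person D", "country": "Country Y", "occupation": "Writer",
     "article_languages": 60, "page_views": 800_000,
     "historical_popularity_index": 21.0},
]

# Biography in the most languages (the measure L, column article_languages).
most_languages = max(rows, key=lambda r: r["article_languages"])

# Most Wikipedia page views.
most_viewed = max(rows, key=lambda r: r["page_views"])

# Most common occupation, counted by number of individuals.
occupation_counts = Counter(r["occupation"] for r in rows)

# Individuals per country with a historical popularity index over twenty.
hpi_over_20 = Counter(
    r["country"] for r in rows if r["historical_popularity_index"] > 20
)

print(most_languages["full_name"])
print(most_viewed["full_name"])
print(occupation_counts.most_common(1))
print(hpi_over_20.most_common(1))
```

Here "most popular occupation" is read as the occupation with the most individuals; ranking occupations by total page views or mean HPI would be an equally reasonable reading and needs only a different aggregation.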

17 features

article_id (numeric): 11341 unique values, 0 missing
full_name (string): 11325 unique values, 3 missing
sex (string): 2 unique values, 0 missing
birth_year (string): 1486 unique values, 0 missing
city (string): 5091 unique values, 0 missing
state (string): 79 unique values, 9169 missing
country (string): 195 unique values, 33 missing
continent (string): 7 unique values, 30 missing
latitude (numeric): 4493 unique values, 1047 missing
longitude (numeric): 4768 unique values, 1047 missing
occupation (string): 88 unique values, 0 missing
industry (string): 27 unique values, 0 missing
domain (string): 8 unique values, 0 missing
article_languages (numeric): 137 unique values, 0 missing
page_views (numeric): 11333 unique values, 0 missing
average_views (numeric): 10832 unique values, 0 missing
historical_popularity_index (numeric): 10710 unique values, 0 missing
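To see where the missing values are concentrated, the per-feature counts above can be turned into shares of the 11341 rows. This is a small sketch using only the counts listed on this page; the variable names are illustrative.

```python
# Row count and per-feature missing counts as listed on this page.
# Features with 0 missing values are omitted.
n_rows = 11341
missing = {
    "full_name": 3,
    "state": 9169,
    "country": 33,
    "continent": 30,
    "latitude": 1047,
    "longitude": 1047,
}

# Share of rows in which each feature is missing, in percent.
share = {col: 100 * n / n_rows for col, n in missing.items()}

# Total number of missing cells across all features.
total_missing = sum(missing.values())

for col, pct in sorted(share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:<10} {pct:6.2f}%")
print("total missing cells:", total_missing)
```

The state feature accounts for most of the missing cells, so analyses that drop incomplete rows may prefer to drop that column first.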

19 properties

Number of instances (rows) of the dataset: 11341
Number of attributes (columns) of the dataset: 17
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 11329
Number of instances with at least one value missing: 9211
Number of numeric attributes: 7
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 41.18
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 81.22
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 5.88
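The percentage properties follow from the counts listed above. The sketch below recomputes them; it assumes that the percentage of missing values is taken over all cells (rows times columns), which is an inference from the numbers rather than something stated on this page.

```python
# Counts as listed on this page.
n_instances = 11341
n_attributes = 17
n_numeric = 7
n_missing_values = 11329
n_instances_with_missing = 9211

# Percentage of numeric attributes.
pct_numeric = 100 * n_numeric / n_attributes

# Percentage of instances having at least one missing value.
pct_instances_missing = 100 * n_instances_with_missing / n_instances

# Percentage of missing values, assuming the denominator is the
# total number of cells (instances * attributes).
pct_missing_values = 100 * n_missing_values / (n_instances * n_attributes)

# Attributes divided by instances; small enough to display as 0.
dimensionality = n_attributes / n_instances

print(round(pct_numeric, 2))
print(round(pct_instances_missing, 2))
print(round(pct_missing_values, 2))
print(round(dimensionality, 2))
```

Under that assumption the recomputed values agree with the listed 41.18, 81.22, and 5.88 after rounding to two decimals.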

0 tasks
