GeographicalOriginalofMusic

Status: active | Format: ARFF | Publicly available | Visibility: public | Uploaded 17-02-2016 by Hilda Fabiola Bernard
Tags: Economics, Machine Learning, OpenML-Reg19


Author: Fang Zhou (fang.zhou '@' nottingham.edu.cn), The University of Nottingham, Ningbo, China
Source: UCI
Please cite: Fang Zhou, Claire Q, and Ross D. King. Predicting the Geographical Origin of Music. ICDM, 2014.

Abstract: Instances in this dataset contain audio features extracted from 1059 wave files. The task associated with the data is to predict the geographical origin of music.

Source:
Creators: Fang Zhou (fang.zhou '@' nottingham.edu.cn), The University of Nottingham, Ningbo, China
Donors of the Dataset: Fang Zhou (fang.zhou '@' nottingham.edu.cn), Claire Q (eskoala '@' gmail.com), Ross D. King (ross.king '@' manchester.ac.uk)

Data Set Information:
The dataset was built from a personal collection of 1059 tracks covering 33 countries/areas. The music used is traditional, ethnic or 'world' only, as classified by the publishers of the product on which it appears. Western music is not included because its influence is global; what we seek are the aspects of music that most influence location. Thus, being able to specify a location with strong influence on the music is central.

The geographical location of origin was collected manually from the CD sleeve notes, and when this information was inadequate we searched other information sources. The location data is limited in precision to the country of origin. The country of origin was determined by the artist's or artists' main country/area of residence. Any track with an ambiguous origin is not included. We have taken the position of each country's capital city (or the province of the area), by latitude and longitude, as the absolute point of origin.

The program MARSYAS [1] was used to extract audio features from the wave files. We used the default MARSYAS settings in single vector format (68 features) to estimate the performance with basic timbral information covering the entire length of each track. No feature weighting or pre-filtering was applied. All features were transformed to have a mean of 0 and a standard deviation of 1.

We also investigated the utility of adding chromatic attributes. These describe the notes of the scale being used, which is especially important as a distinguishing feature in geographical ethnomusicology. MARSYAS provides 12 chromatic features per octave (Western tuning), but it may be possible to tell something from how similar to or different from Western tuning the music is.

[1] G. Tzanetakis and P. Cook, “MARSYAS: a framework for audio analysis,” Organised Sound, vol. 4, pp. 169–175, 2000.

Attribute Information:
The dataset is present in two files, each corresponding to a different feature set. Both files contain the audio features of 1059 tracks.
In the 'default_features_1059_tracks.txt' file, the first 68 columns are audio features of the track, and the last two columns are the origin of the music, represented by latitude and longitude.
In the 'default_plus_chromatic_features_1059_tracks.txt' file, the first 116 columns are audio features of the track, and the last two columns are the origin of the music.

Relevant Papers:
The description of the music collection and audio features can be found in: Fang Zhou, Claire Q, and Ross D. King. Predicting the Geographical Origin of Music. ICDM, 2014.

Citation Request:
The following citation is requested if you use the dataset: Fang Zhou, Claire Q, and Ross D. King. Predicting the Geographical Origin of Music. ICDM, 2014.
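The column layout and the standardization described above can be illustrated with a short sketch. It assumes the rows are already parsed into lists of floats (the on-disk delimiter is not stated here, so no file parsing is shown), and the helper names `split_features_targets` and `standardize_columns` are hypothetical. The source also does not say whether the population or the sample standard deviation was used; the population form is assumed.

```python
import statistics

def split_features_targets(rows, n_targets=2):
    """Split each row into (audio features, [latitude, longitude])."""
    features = [row[:-n_targets] for row in rows]
    targets = [row[-n_targets:] for row in rows]
    return features, targets

def standardize_columns(features):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation is an assumption; the source does not specify.
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in features
    ]

# Tiny synthetic stand-in: 3 feature columns followed by latitude and longitude.
rows = [
    [1.0, 10.0, 5.0, 39.9, 116.4],
    [2.0, 20.0, 5.5, 51.5, -0.1],
    [3.0, 60.0, 6.5, -1.3, 36.8],
]
features, targets = split_features_targets(rows)
scaled = standardize_columns(features)
```

With the real files the same split would leave 68 or 116 feature columns per row, depending on which file is used.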

118 features

V100 (target): numeric, 1057 unique values, 0 missing
V1: numeric, 1054 unique values, 0 missing
V2: numeric, 1045 unique values, 0 missing
V3: numeric, 1055 unique values, 0 missing
V4: numeric, 1037 unique values, 0 missing
V5: numeric, 1058 unique values, 0 missing
V6: numeric, 1059 unique values, 0 missing
V7: numeric, 1059 unique values, 0 missing
V8: numeric, 1059 unique values, 0 missing
V9: numeric, 1059 unique values, 0 missing
V10: numeric, 1059 unique values, 0 missing
V11: numeric, 1059 unique values, 0 missing
V12: numeric, 1059 unique values, 0 missing
V13: numeric, 1059 unique values, 0 missing
V14: numeric, 1059 unique values, 0 missing
V15: numeric, 1059 unique values, 0 missing
V16: numeric, 1059 unique values, 0 missing
V17: numeric, 1058 unique values, 0 missing
V18: numeric, 746 unique values, 0 missing
V19: numeric, 746 unique values, 0 missing
V20: numeric, 746 unique values, 0 missing
V21: numeric, 746 unique values, 0 missing
V22: numeric, 746 unique values, 0 missing
V23: numeric, 746 unique values, 0 missing
V24: numeric, 746 unique values, 0 missing
V25: numeric, 746 unique values, 0 missing
V26: numeric, 746 unique values, 0 missing
V27: numeric, 746 unique values, 0 missing
V28: numeric, 746 unique values, 0 missing
V29: numeric, 746 unique values, 0 missing
V30: numeric, 1042 unique values, 0 missing
V31: numeric, 1037 unique values, 0 missing
V32: numeric, 1058 unique values, 0 missing
V33: numeric, 1029 unique values, 0 missing
V34: numeric, 1059 unique values, 0 missing
V35: numeric, 1058 unique values, 0 missing
V36: numeric, 1059 unique values, 0 missing
V37: numeric, 1057 unique values, 0 missing
V38: numeric, 1053 unique values, 0 missing
V39: numeric, 1058 unique values, 0 missing
V40: numeric, 1055 unique values, 0 missing
V41: numeric, 1059 unique values, 0 missing
V42: numeric, 1058 unique values, 0 missing
V43: numeric, 1058 unique values, 0 missing
V44: numeric, 1058 unique values, 0 missing
V45: numeric, 1058 unique values, 0 missing
V46: numeric, 1055 unique values, 0 missing
V47: numeric, 681 unique values, 0 missing
V48: numeric, 681 unique values, 0 missing
V49: numeric, 681 unique values, 0 missing
V50: numeric, 681 unique values, 0 missing
V51: numeric, 681 unique values, 0 missing
V52: numeric, 681 unique values, 0 missing
V53: numeric, 681 unique values, 0 missing
V54: numeric, 681 unique values, 0 missing
V55: numeric, 681 unique values, 0 missing
V56: numeric, 681 unique values, 0 missing
V57: numeric, 681 unique values, 0 missing
V58: numeric, 681 unique values, 0 missing
V59: numeric, 1041 unique values, 0 missing
V60: numeric, 1054 unique values, 0 missing
V61: numeric, 1056 unique values, 0 missing
V62: numeric, 997 unique values, 0 missing
V63: numeric, 1059 unique values, 0 missing
V64: numeric, 1059 unique values, 0 missing
V65: numeric, 1058 unique values, 0 missing
V66: numeric, 1059 unique values, 0 missing
V67: numeric, 1059 unique values, 0 missing
V68: numeric, 1058 unique values, 0 missing
V69: numeric, 1057 unique values, 0 missing
V70: numeric, 1057 unique values, 0 missing
V71: numeric, 1059 unique values, 0 missing
V72: numeric, 1059 unique values, 0 missing
V73: numeric, 1058 unique values, 0 missing
V74: numeric, 1059 unique values, 0 missing
V75: numeric, 1058 unique values, 0 missing
V76: numeric, 680 unique values, 0 missing
V77: numeric, 680 unique values, 0 missing
V78: numeric, 680 unique values, 0 missing
V79: numeric, 680 unique values, 0 missing
V80: numeric, 680 unique values, 0 missing
V81: numeric, 680 unique values, 0 missing
V82: numeric, 680 unique values, 0 missing
V83: numeric, 680 unique values, 0 missing
V84: numeric, 680 unique values, 0 missing
V85: numeric, 680 unique values, 0 missing
V86: numeric, 680 unique values, 0 missing
V87: numeric, 680 unique values, 0 missing
V88: numeric, 1037 unique values, 0 missing
V89: numeric, 1046 unique values, 0 missing
V90: numeric, 1052 unique values, 0 missing
V91: numeric, 932 unique values, 0 missing
V92: numeric, 1059 unique values, 0 missing
V93: numeric, 1058 unique values, 0 missing
V94: numeric, 1058 unique values, 0 missing
V95: numeric, 1055 unique values, 0 missing
V96: numeric, 1057 unique values, 0 missing
V97: numeric, 1056 unique values, 0 missing
V98: numeric, 1058 unique values, 0 missing
V99: numeric, 1055 unique values, 0 missing
V101: numeric, 1056 unique values, 0 missing
V102: numeric, 1058 unique values, 0 missing
V103: numeric, 1054 unique values, 0 missing
V104: numeric, 1052 unique values, 0 missing
V105: numeric, 621 unique values, 0 missing
V106: numeric, 621 unique values, 0 missing
V107: numeric, 621 unique values, 0 missing
V108: numeric, 621 unique values, 0 missing
V109: numeric, 621 unique values, 0 missing
V110: numeric, 621 unique values, 0 missing
V111: numeric, 621 unique values, 0 missing
V112: numeric, 621 unique values, 0 missing
V113: numeric, 621 unique values, 0 missing
V114: numeric, 621 unique values, 0 missing
V115: numeric, 621 unique values, 0 missing
V116: numeric, 621 unique values, 0 missing
V117: numeric, 31 unique values, 0 missing
V118: numeric, 33 unique values, 0 missing
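Each feature above is listed with a count of unique values and a count of missing values. As a rough illustration of how such a per-column summary could be computed, here is a minimal sketch; the helper names `is_missing` and `summarize_column` are hypothetical, and treating `None` and NaN as "missing" is an assumption rather than the site's documented rule.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def summarize_column(values):
    """Return (number of distinct non-missing values, number of missing values)."""
    present = [v for v in values if not is_missing(v)]
    return len(set(present)), len(values) - len(present)

# Toy column with one repeated value and two missing entries.
column = [0.5, 0.5, 1.0, float("nan"), None]
unique_count, missing_count = summarize_column(column)
```

Columns that share a unique-value count (for example the runs of 746, 681, 680, and 621) may share a common origin in the feature extraction, though the listing itself does not say so.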

107 properties

Properties with a value:
Number of instances (rows) of the dataset: 1059
Number of attributes (columns) of the dataset: 118
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 118
Number of nominal attributes: 0
Maximum of means among attributes of the numeric type: 38.41
Second quartile (Median) of skewness among attributes of the numeric type: 2.08
Number of attributes divided by the number of instances: 0.11
Percentage of binary attributes: 0
Second quartile (Median) of standard deviation of attributes of the numeric type: 1.01
Minimum skewness among attributes of the numeric type: -1.93
Percentage of instances having missing values: 0
Maximum skewness among attributes of the numeric type: 6.75
Minimum standard deviation of attributes of the numeric type: 0.91
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 13.49
Average class difference between consecutive instances: 0.04
Maximum standard deviation of attributes of the numeric type: 50.42
Percentage of numeric attributes: 100
Third quartile of means among attributes of the numeric type: 0.02
Percentage of nominal attributes: 0
Mean kurtosis among attributes of the numeric type: 9.14
Third quartile of skewness among attributes of the numeric type: 2.98
Mean of means among attributes of the numeric type: 0.55
First quartile of kurtosis among attributes of the numeric type: 2.94
Third quartile of standard deviation of attributes of the numeric type: 1.02
First quartile of means among attributes of the numeric type: -0.01
Number of binary attributes: 0
First quartile of skewness among attributes of the numeric type: 0.95
Mean skewness among attributes of the numeric type: 1.96
First quartile of standard deviation of attributes of the numeric type: 1
Mean standard deviation of attributes of the numeric type: 1.57
Second quartile (Median) of kurtosis among attributes of the numeric type: 8.31
Minimum kurtosis among attributes of the numeric type: -0.27
Second quartile (Median) of means among attributes of the numeric type: 0.01
Maximum kurtosis among attributes of the numeric type: 65.76
Minimum of means among attributes of the numeric type: -0.07

Properties listed without a value:
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
Minimal mutual information between the nominal attributes and the target attribute.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Maximum mutual information between the nominal attributes and the target attribute.
The minimal number of distinct values among attributes of the nominal type.
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
The maximum number of distinct values among attributes of the nominal type.
Third quartile of entropy among attributes.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
Percentage of instances belonging to the least frequent class.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
Average entropy of the attributes.
Number of instances belonging to the least frequent class.
Third quartile of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
First quartile of entropy among attributes.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
Average mutual information between the nominal attributes and the target attribute.
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
First quartile of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Standard deviation of the number of distinct values among attributes of the nominal type.
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
Average number of distinct values among the attributes of the nominal type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.lazy.IBk
Percentage of instances belonging to the most frequent class.
Second quartile (Median) of entropy among attributes.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Entropy of the target attribute values.
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
Number of instances belonging to the most frequent class.
Minimal entropy among attributes.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
Maximum entropy among attributes.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
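Several of the numeric properties above are simple statistics over the attributes. The sketch below shows one way such quantities could be computed. It is not the site's implementation: the moment-based estimators are an assumption, and the negative minimum kurtosis (-0.27) suggests, but does not confirm, that excess kurtosis (kurtosis minus 3) is what is reported. The function names are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: mean of cubed z-scores (population form)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / std) ** 3 for v in values)

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / std) ** 4 for v in values) - 3.0

# "Number of attributes divided by the number of instances" for this dataset.
dimensionality = round(118 / 1059, 2)

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]
```

Here `dimensionality` reproduces the 0.11 listed above, and `skewness(right_skewed)` is positive, matching the convention that a long right tail gives positive skew.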

14 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: V100
0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: mean_absolute_error - target_feature: V100
0 runs - estimation_procedure: 33% Holdout set - target_feature: V100
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
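The regression tasks above name 10-fold cross-validation and mean absolute error. As a minimal sketch of that estimation procedure, the code below scores a deliberately trivial baseline (always predict the mean of the training folds) on synthetic target values. The fold assignment, the seed, the baseline, and all function names are assumptions for illustration and do not reflect how the platform splits the data.

```python
import random
import statistics

def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))

def cross_validated_mae(y, k=10, seed=0):
    """MAE of a predict-the-training-mean baseline, averaged over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = statistics.fmean(train)
        truth = [y[i] for i in test_idx]
        scores.append(mean_absolute_error(truth, [prediction] * len(truth)))
    return statistics.fmean(scores)

# Synthetic numeric target standing in for a column such as V100.
rng = random.Random(1)
y = [rng.uniform(-3.0, 3.0) for _ in range(100)]
score = cross_validated_mae(y)
```

Any real model would replace the mean baseline inside the loop; the baseline is useful mainly as a floor that a learned regressor should beat.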