sylva_prior

Status: active. Format: ARFF. Publicly available (visibility: public). Uploaded 06-10-2014 by Joaquin Vanschoren.
Tags: Chemistry, derived, Life Science, mythbusting_1, study_1, study_15, study_20, study_41


Author:
Source: Unknown - Date unknown
Please cite: Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch)

Note: Derived from the covertype dataset.

Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php
Modified by TunedIT (converted to ARFF format).

SYLVA is the ecology database.

The task of SYLVA is to classify forest cover types. The forest cover type for 30 x 30 meter cells is obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. We reduced it to a two-class classification problem (classifying Ponderosa pine vs. everything else).

The "agnostic learning track" data consists of 216 input variables. Each pattern is composed of 4 records: 2 true records matching the target and 2 records picked at random. Thus 1/2 of the features are distracters.

The "prior knowledge track" data is identical to the "agnostic learning track" data, except that the distracters are removed and the identity of the features is revealed. For that track, the original forest cover ids are revealed for the training data.

Data type: non-sparse
Number of features: 108

Number of examples and check-sums:

        Pos_ex  Neg_ex  Tot_ex  Check_sum
Train      805   12281   13086  118996108.00
Valid       81    1228    1309   11904801.00
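The example counts in the description can be cross-checked with a few lines of arithmetic. The sketch below only uses the Pos_ex, Neg_ex and Tot_ex figures quoted above; the `summarize` helper is a hypothetical name, and pooling the Train and Valid rows is an assumption made for illustration (the pooled total happens to equal the 14395 instances reported for this ARFF file).

```python
# Pos/neg counts copied from the check-sum table in the description.
SPLITS = {
    "train": {"pos": 805, "neg": 12281, "total": 13086},
    "valid": {"pos": 81, "neg": 1228, "total": 1309},
}


def summarize(splits):
    """Check each row's total and return pooled (positive, negative) counts."""
    pos = neg = 0
    for name, row in splits.items():
        # Each row's total should equal positives plus negatives.
        assert row["pos"] + row["neg"] == row["total"], name
        pos += row["pos"]
        neg += row["neg"]
    return pos, neg


pos, neg = summarize(SPLITS)
total = pos + neg
# Share of the positive (Ponderosa pine) class, in percent.
minority_pct = round(100 * pos / total, 2)
print(pos, neg, total, minority_pct)
```

The pooled positive share comes out at roughly 6%, so plain accuracy is a weak yardstick here: a model that always predicts the negative class is already right about 94% of the time.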

109 features

label (target, nominal): 2 unique values, 0 missing
Elevation (numeric): 801 unique values, 0 missing
Aspect (numeric): 360 unique values, 0 missing
Slope (numeric): 51 unique values, 0 missing
Horizontal_Distance_To_Hydrology (numeric): 354 unique values, 0 missing
Vertical_Distance_To_Hydrology (numeric): 427 unique values, 0 missing
Horizontal_Distance_To_Roadways (numeric): 919 unique values, 0 missing
Hillshade_9am (numeric): 165 unique values, 0 missing
Hillshade_Noon (numeric): 131 unique values, 0 missing
Hillshade_3pm (numeric): 244 unique values, 0 missing
Horizontal_Distance_To_Fire_Points (numeric): 906 unique values, 0 missing
Rawah_Wilderness_Area (numeric): 2 unique values, 0 missing
Neota_Wilderness_Area (numeric): 2 unique values, 0 missing
Comanche_Peak_Wilderness_Area (numeric): 2 unique values, 0 missing
Cache_la_Poudre_Wilderness_Area (numeric): 2 unique values, 0 missing
Soil_Type_1 (numeric): 2 unique values, 0 missing
Soil_Type_2 (numeric): 2 unique values, 0 missing
Soil_Type_3 (numeric): 2 unique values, 0 missing
Soil_Type_4 (numeric): 2 unique values, 0 missing
Soil_Type_5 (numeric): 2 unique values, 0 missing
Soil_Type_6 (numeric): 2 unique values, 0 missing
Soil_Type_7 (numeric): 2 unique values, 0 missing
Soil_Type_8 (numeric): 2 unique values, 0 missing
Soil_Type_9 (numeric): 2 unique values, 0 missing
Soil_Type_10 (numeric): 2 unique values, 0 missing
Soil_Type_11 (numeric): 2 unique values, 0 missing
Soil_Type_12 (numeric): 2 unique values, 0 missing
Soil_Type_13 (numeric): 2 unique values, 0 missing
Soil_Type_14 (numeric): 2 unique values, 0 missing
Soil_Type_15 (numeric): 2 unique values, 0 missing
Soil_Type_16 (numeric): 2 unique values, 0 missing
Soil_Type_17 (numeric): 2 unique values, 0 missing
Soil_Type_18 (numeric): 2 unique values, 0 missing
Soil_Type_19 (numeric): 2 unique values, 0 missing
Soil_Type_20 (numeric): 2 unique values, 0 missing
Soil_Type_21 (numeric): 2 unique values, 0 missing
Soil_Type_22 (numeric): 2 unique values, 0 missing
Soil_Type_23 (numeric): 2 unique values, 0 missing
Soil_Type_24 (numeric): 2 unique values, 0 missing
Soil_Type_25 (numeric): 2 unique values, 0 missing
Soil_Type_26 (numeric): 2 unique values, 0 missing
Soil_Type_27 (numeric): 2 unique values, 0 missing
Soil_Type_28 (numeric): 2 unique values, 0 missing
Soil_Type_29 (numeric): 2 unique values, 0 missing
Soil_Type_30 (numeric): 2 unique values, 0 missing
Soil_Type_31 (numeric): 2 unique values, 0 missing
Soil_Type_32 (numeric): 2 unique values, 0 missing
Soil_Type_33 (numeric): 2 unique values, 0 missing
Soil_Type_34 (numeric): 2 unique values, 0 missing
Soil_Type_35 (numeric): 2 unique values, 0 missing
Soil_Type_36 (numeric): 2 unique values, 0 missing
Soil_Type_37 (numeric): 2 unique values, 0 missing
Soil_Type_38 (numeric): 2 unique values, 0 missing
Soil_Type_39 (numeric): 2 unique values, 0 missing
Soil_Type_40 (numeric): 2 unique values, 0 missing
dup_Elevation (numeric): 803 unique values, 0 missing
dup_Aspect (numeric): 360 unique values, 0 missing
dup_Slope (numeric): 50 unique values, 0 missing
dup_Horizontal_Distance_To_Hydrology (numeric): 353 unique values, 0 missing
dup_Vertical_Distance_To_Hydrology (numeric): 430 unique values, 0 missing
dup_Horizontal_Distance_To_Roadways (numeric): 915 unique values, 0 missing
dup_Hillshade_9am (numeric): 170 unique values, 0 missing
dup_Hillshade_Noon (numeric): 128 unique values, 0 missing
dup_Hillshade_3pm (numeric): 239 unique values, 0 missing
dup_Horizontal_Distance_To_Fire_Points (numeric): 918 unique values, 0 missing
dup_Rawah_Wilderness_Area (numeric): 2 unique values, 0 missing
dup_Neota_Wilderness_Area (numeric): 2 unique values, 0 missing
dup_Comanche_Peak_Wilderness_Area (numeric): 2 unique values, 0 missing
dup_Cache_la_Poudre_Wilderness_Area (numeric): 2 unique values, 0 missing
dup_Soil_Type_1 (numeric): 2 unique values, 0 missing
dup_Soil_Type_2 (numeric): 2 unique values, 0 missing
dup_Soil_Type_3 (numeric): 2 unique values, 0 missing
dup_Soil_Type_4 (numeric): 2 unique values, 0 missing
dup_Soil_Type_5 (numeric): 2 unique values, 0 missing
dup_Soil_Type_6 (numeric): 2 unique values, 0 missing
dup_Soil_Type_7 (numeric): 2 unique values, 0 missing
dup_Soil_Type_8 (numeric): 2 unique values, 0 missing
dup_Soil_Type_9 (numeric): 2 unique values, 0 missing
dup_Soil_Type_10 (numeric): 2 unique values, 0 missing
dup_Soil_Type_11 (numeric): 2 unique values, 0 missing
dup_Soil_Type_12 (numeric): 2 unique values, 0 missing
dup_Soil_Type_13 (numeric): 2 unique values, 0 missing
dup_Soil_Type_14 (numeric): 2 unique values, 0 missing
dup_Soil_Type_15 (numeric): 1 unique value, 0 missing
dup_Soil_Type_16 (numeric): 2 unique values, 0 missing
dup_Soil_Type_17 (numeric): 2 unique values, 0 missing
dup_Soil_Type_18 (numeric): 2 unique values, 0 missing
dup_Soil_Type_19 (numeric): 2 unique values, 0 missing
dup_Soil_Type_20 (numeric): 2 unique values, 0 missing
dup_Soil_Type_21 (numeric): 2 unique values, 0 missing
dup_Soil_Type_22 (numeric): 2 unique values, 0 missing
dup_Soil_Type_23 (numeric): 2 unique values, 0 missing
dup_Soil_Type_24 (numeric): 2 unique values, 0 missing
dup_Soil_Type_25 (numeric): 2 unique values, 0 missing
dup_Soil_Type_26 (numeric): 2 unique values, 0 missing
dup_Soil_Type_27 (numeric): 2 unique values, 0 missing
dup_Soil_Type_28 (numeric): 2 unique values, 0 missing
dup_Soil_Type_29 (numeric): 2 unique values, 0 missing
dup_Soil_Type_30 (numeric): 2 unique values, 0 missing
dup_Soil_Type_31 (numeric): 2 unique values, 0 missing
dup_Soil_Type_32 (numeric): 2 unique values, 0 missing
dup_Soil_Type_33 (numeric): 2 unique values, 0 missing
dup_Soil_Type_34 (numeric): 2 unique values, 0 missing
dup_Soil_Type_35 (numeric): 2 unique values, 0 missing
dup_Soil_Type_36 (numeric): 2 unique values, 0 missing
dup_Soil_Type_37 (numeric): 2 unique values, 0 missing
dup_Soil_Type_38 (numeric): 2 unique values, 0 missing
dup_Soil_Type_39 (numeric): 2 unique values, 0 missing
dup_Soil_Type_40 (numeric): 2 unique values, 0 missing
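The feature listing follows a regular pattern: every covertype attribute appears once under its plain name and once with a dup_ prefix, which is consistent with the two true records per pattern mentioned in the description (that reading is an interpretation, not something the page states). The sketch below rebuilds the expected column names under that assumption; `expected_columns` is a hypothetical helper, and the column order is simply the order of the listing.

```python
# Attribute groups of one covertype record, as named in the feature listing.
QUANTITATIVE = [
    "Elevation", "Aspect", "Slope",
    "Horizontal_Distance_To_Hydrology", "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways",
    "Hillshade_9am", "Hillshade_Noon", "Hillshade_3pm",
    "Horizontal_Distance_To_Fire_Points",
]
WILDERNESS = [
    "Rawah_Wilderness_Area", "Neota_Wilderness_Area",
    "Comanche_Peak_Wilderness_Area", "Cache_la_Poudre_Wilderness_Area",
]
SOIL = [f"Soil_Type_{i}" for i in range(1, 41)]


def expected_columns():
    """Return the target followed by the plain and dup_ copies of one record."""
    base = QUANTITATIVE + WILDERNESS + SOIL  # 10 + 4 + 40 = 54 columns
    dup = [f"dup_{name}" for name in base]   # second record, same attributes
    return ["label"] + base + dup


columns = expected_columns()
print(len(columns))  # 1 target + 2 * 54 inputs
```

A list like this can serve as a quick schema check after loading the ARFF file, for example by comparing it with the column names of the loaded table.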

107 properties

Number of instances (rows) of the dataset: 14395
Number of attributes (columns) of the dataset: 109
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 108
Number of nominal attributes: 1
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.91
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump: 0
Maximum of means among attributes of the numeric type: 878.7
Minimal mutual information between the nominal attributes and the target attribute: (no value listed)
Second quartile (Median) of skewness among attributes of the numeric type: 6.39
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.9
Number of attributes divided by the number of instances: 0.01
Maximum mutual information between the nominal attributes and the target attribute: (no value listed)
The minimal number of distinct values among attributes of the nominal type: 2
Percentage of binary attributes: 0.92
Second quartile (Median) of standard deviation of attributes of the numeric type: 0.15
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.02
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes); equals ClassEntropy divided by MeanMutualInformation: (no value listed)
The maximum number of distinct values among attributes of the nominal type: 2
Minimum skewness among attributes of the numeric type: -1.18
Percentage of instances having missing values: 0
Third quartile of entropy among attributes: (no value listed)
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.97
Maximum skewness among attributes of the numeric type: 119.98
Minimum standard deviation of attributes of the numeric type: 0
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 283
Average class difference between consecutive instances: 0.88
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.9
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.01
Maximum standard deviation of attributes of the numeric type: 309.83
Percentage of instances belonging to the least frequent class: 6.15
Percentage of numeric attributes: 99.08
Third quartile of means among attributes of the numeric type: 0.1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.96
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.02
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.94
Average entropy of the attributes: (no value listed)
Number of instances belonging to the least frequent class: 886
Percentage of nominal attributes: 0.92
Third quartile of mutual information between the nominal attributes and the target attribute: (no value listed)
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.01
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.97
Mean kurtosis among attributes of the numeric type: 784.38
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 1
First quartile of entropy among attributes: (no value listed)
Third quartile of skewness among attributes of the numeric type: 16.88
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.92
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.9
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.01
Mean of means among attributes of the numeric type: 84.2
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.02
First quartile of kurtosis among attributes of the numeric type: 5.6
Third quartile of standard deviation of attributes of the numeric type: 0.3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.96
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.02
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.94
Average mutual information between the nominal attributes and the target attribute: (no value listed)
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.82
First quartile of means among attributes of the numeric type: 0
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.99
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.01
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.97
An estimate of the amount of irrelevant information in the attributes regarding the class; equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: (no value listed)
Number of binary attributes: 1
First quartile of mutual information between the nominal attributes and the target attribute: (no value listed)
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.01
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.92
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.96
Standard deviation of the number of distinct values among attributes of the nominal type: 0
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.01
Average number of distinct values among the attributes of the nominal type: 2
First quartile of skewness among attributes of the numeric type: 2.65
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.91
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.01
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk: 0.96
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.94
Mean skewness among attributes of the numeric type: 15.05
First quartile of standard deviation of attributes of the numeric type: 0.06
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.99
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.92
Error rate achieved by the landmarker weka.classifiers.lazy.IBk: 0.01
Percentage of instances belonging to the most frequent class: 93.85
Mean standard deviation of attributes of the numeric type: 28.35
Second quartile (Median) of entropy among attributes: (no value listed)
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.01
Entropy of the target attribute values: 0.33
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk: 0.91
Number of instances belonging to the most frequent class: 13509
Minimal entropy among attributes: (no value listed)
Second quartile (Median) of kurtosis among attributes of the numeric type: 38.79
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.91
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.99
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.91
Maximum entropy among attributes: (no value listed)
Minimum kurtosis among attributes of the numeric type: -1.96
Second quartile (Median) of means among attributes of the numeric type: 0.02
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.01
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.06
Maximum kurtosis among attributes of the numeric type: 14395
Minimum of means among attributes of the numeric type: 0
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: (no value listed)
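Two of the properties above can be reproduced from the class counts alone: the entropy of the target and the Kappa coefficient of a trivial classifier. The sketch below uses the standard definitions of Shannon entropy (in bits) and Cohen's kappa; the function names are hypothetical. The majority-class baseline is an illustration only: its error rate and kappa are consistent with the DecisionStump figures listed here, but the page does not say that the stump actually predicted a single class.

```python
import math

N_MINORITY = 886     # instances in the least frequent class (from the properties)
N_MAJORITY = 13509   # instances in the most frequent class (from the properties)


def binary_entropy(a, b):
    """Shannon entropy in bits of a two-class distribution given its counts."""
    total = a + b
    return -sum((c / total) * math.log2(c / total) for c in (a, b) if c)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa computed from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: predicted-rate times actual-rate, summed over classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        return 0.0  # degenerate case: only one class present and predicted
    return (observed - expected) / (1 - expected)


entropy = binary_entropy(N_MINORITY, N_MAJORITY)

# Hypothetical baseline that always predicts the majority (negative) class,
# so it produces no true positives and no false positives.
baseline_error = N_MINORITY / (N_MINORITY + N_MAJORITY)
baseline_kappa = cohen_kappa(tp=0, fp=0, fn=N_MINORITY, tn=N_MAJORITY)

print(round(entropy, 2), round(baseline_error, 2), round(baseline_kappa, 2))
```

The baseline shows why kappa is reported next to the error rate: with about 94% of instances in one class, a low error rate can coexist with a kappa of zero.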

16 tasks

286 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: label
127 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: label
0 runs - estimation_procedure: 33% Holdout set - evaluation_measure: predictive_accuracy - target_feature: label
73 runs - estimation_procedure: 10-fold Learning Curve - target_feature: label
0 runs - estimation_procedure: Interleaved Test then Train - target_feature: label
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
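The 10-fold cross-validation procedure named in the first tasks can be sketched in plain Python. This is a minimal illustration under stated assumptions, not OpenML's implementation: the page does not say whether the folds are stratified or how they are seeded, the function names are hypothetical, and the labels are synthetic 1/0 values built only from the class counts reported on this page.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign each index to one of k folds, keeping class ratios roughly equal."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal shuffled indices round-robin so each fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Synthetic labels: 886 positives and 13509 negatives (assumed 1/0 encoding).
labels = [1] * 886 + [0] * 13509
splits = list(cross_validation_splits(labels))
for train, test in splits:
    positives = sum(labels[i] for i in test)
    print(len(train), len(test), positives)
```

Under the same assumptions, a "10 times 10-fold" procedure would repeat this with ten different seeds and average the resulting scores.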