OpenML

kc1-top5

Status: active, format: ARFF, publicly available, visibility: public, uploaded 06-10-2014 by Joaquin Vanschoren

Author:
Source: Unknown - Date unknown
Please cite:

This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering.

If you publish material based on PROMISE data sets, then please follow the acknowledgment guidelines posted on the PROMISE repository web page http://promise.site.uottawa.ca/SERepository .

1. Title: Class-level data for KC1

This one includes a {DEF,NODEF} attribute (DL) to indicate defectiveness. DL is equal to DEF if the module is in the top 5% in defect count ranking, NODEF otherwise.

2. Sources
(a) Creator: A. Gunes Koru
(b) Date: February 21, 2005
(c) Contact: gkoru AT umbc DOT edu Phone: +1 (410) 455 8843

3. Donor: A. Gunes Koru

4. Past Usage:

This data set was used for:

A. Gunes Koru and Hongfang Liu, "An Investigation of the Effect of Module Size on Defect Prediction Using Static Measures", PROMISE - Predictive Models in Software Engineering Workshop, ICSE 2005, May 15th 2005, Saint Louis, Missouri, US.

We used several machine learning algorithms to predict the defective modules in five NASA products, namely, CM1, JM1, KC1, KC2, and PC1. A set of static measures was used as predictor variables. While doing so, we observed that a large portion of the modules were small, as measured by lines of code (LOC). When we experimented on the data subsets created by partitioning according to module size, we obtained higher prediction performance for the subsets that include larger modules. We also performed defect prediction using class-level data for KC1 rather than method-level data. In this case, the use of class-level data resulted in improved prediction performance compared to using method-level data.
These findings suggest that quality assurance activities can be guided even better if defect predictions are made by using data that belong to larger modules.

5. Features:

The descriptions of the features are taken from http://mdp.ivv.nasa.gov/mdp_glossary.html

Feature Used as the Response Variable:
======================================

DL: Defect level. DEF if the class is in the top 5% in defect ranking, NODEF otherwise.

Features at Class Level Originally
==================================

PERCENT_PUB_DATA: The percentage of data that is public and protected data in a class. In general, lower values indicate greater encapsulation. It is a measure of encapsulation.

ACCESS_TO_PUB_DATA: The number of times that a class's public and protected data is accessed. In general, lower values indicate greater encapsulation. It is a measure of encapsulation.

COUPLING_BETWEEN_OBJECTS: The number of distinct non-inheritance-related classes on which a class depends. If a class that is heavily dependent on many classes outside of its hierarchy is introduced into a library, all the classes upon which it depends need to be introduced as well. This may be acceptable, especially if the classes which it references are already part of a class library and are even more fundamental than the specified class.

DEPTH: The level for a class. For instance, if a parent has one child, the depth for the child is two. Depth indicates at what level a class is located within its class hierarchy. In general, inheritance increases when depth increases.

LACK_OF_COHESION_OF_METHODS: For each data field in a class, the percentage of the methods in the class using that data field; the percentages are averaged, then subtracted from 100%. The locm metric indicates a low or high percentage of cohesion. If the percentage is low, the class is cohesive. If it is high, it may indicate that the class could be split into separate classes that will individually have greater cohesion.
NUM_OF_CHILDREN: The number of classes derived from a specified class.

DEP_ON_CHILD: Whether a class is dependent on a descendant.

FAN_IN: This is a count of calls by higher modules.

RESPONSE_FOR_CLASS: A count of methods implemented within a class plus the number of methods accessible to an object class due to inheritance. In general, lower values indicate greater polymorphism.

WEIGHTED_METHODS_PER_CLASS: A count of methods implemented within a class (rather than all methods accessible within the class hierarchy). In general, lower values indicate greater polymorphism.

Features Transformed to Class Level (Originally at Method Level)
================================================================

Transformation was achieved by obtaining min, max, sum, and avg values over all the methods in a class. Therefore, this data set includes four features for each of the following features that were originally at the method level but transformed to the class level. For example, LOC_BLANK has minLOC_BLANK, maxLOC_BLANK, avgLOC_BLANK, and sumLOC_BLANK.

LOC_BLANK: Lines with only white space or no text content.

BRANCH_COUNT: This metric is the number of branches for each module. Branches are defined as those edges that exit from a decision node. The greater the number of branches in a program's modules, the more testing resources required.

LOC_CODE_AND_COMMENT: Lines that contain both code and comment.

LOC_COMMENTS: The number of lines in a module. This particular metric includes all blank lines, comment lines, and source lines.

CYCLOMATIC_COMPLEXITY: It is a measure of the complexity of a module's decision structure. It is the number of linearly independent paths.

DESIGN_COMPLEXITY: Design complexity is a measure of a module's decision structure as it relates to calls to other modules. This quantifies the testing effort related to integration.

ESSENTIAL_COMPLEXITY: Essential complexity is a measure of the degree to which a module contains unstructured constructs.
LOC_EXECUTABLE: Source lines of code that contain only code and white space.

HALSTEAD_CONTENT: Complexity of a given algorithm independent of the language used to express the algorithm.

HALSTEAD_DIFFICULTY: Level of difficulty in the program.

HALSTEAD_EFFORT: Estimated mental effort required to develop the program.

HALSTEAD_ERROR_EST: Estimated number of errors in the program.

HALSTEAD_LENGTH: This is a Halstead metric that includes the total number of operator occurrences and total number of operand occurrences.

HALSTEAD_LEVEL: Level at which the program can be understood.

HALSTEAD_PROG_TIME: Estimated amount of time to implement the algorithm.

HALSTEAD_VOLUME: This is a Halstead metric that contains the minimum number of bits required for coding the program.

NUM_OPERANDS: Variables and identifiers; constants (numeric literal/string); function names when used during calls.

NUM_UNIQUE_OPERANDS: Variables and identifiers; constants (numeric literal/string); function names when used during calls.

NUM_UNIQUE_OPERATORS: Number of unique operators.

LOC_TOTAL: Total Lines of Code.
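The DL rule stated in section 1 (DEF for the top 5% in defect count ranking, NODEF otherwise) can be sketched as follows. This is only an illustration, not the donor's procedure: the function name `label_top_percent`, the round-up cutoff, and breaking ties by original order are all assumptions, because the description does not say how the cutoff or ties were handled.

```python
def label_top_percent(defect_counts, percent=5):
    """Label the highest-ranked items DEF and all others NODEF.

    Assumptions (not stated in the data set description):
    - the cutoff is percent% of the items, rounded up;
    - ties are broken by original order (Python's sort is stable).
    """
    n = len(defect_counts)
    n_def = (n * percent + 99) // 100  # integer ceiling of n * percent / 100
    # Indices ordered by defect count, highest first.
    ranked = sorted(range(n), key=lambda i: defect_counts[i], reverse=True)
    top = set(ranked[:n_def])
    return ["DEF" if i in top else "NODEF" for i in range(n)]


# Hypothetical defect counts for 20 classes; only the top one is labeled DEF.
counts = [0, 3, 1, 0, 12, 0, 2, 0, 0, 1, 0, 0, 5, 0, 0, 0, 1, 0, 0, 0]
labels = label_top_percent(counts)

# With 145 items the round-up cutoff selects 8 of them.
n_def_145 = label_top_percent(list(range(145))).count("DEF")
```

Under the round-up assumption, 145 classes yield 8 DEF labels. That agrees with the 145 instances and the majority class size of 137 reported on this page, but the agreement does not confirm how the labels were actually derived.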

95 features

DL (target)	nominal	2 unique values 0 missing
PERCENT_PUB_DATA	numeric	12 unique values 0 missing
ACCESS_TO_PUB_DATA	numeric	1 unique values 0 missing
COUPLING_BETWEEN_OBJECTS	numeric	25 unique values 0 missing
DEPTH	numeric	7 unique values 0 missing
LACK_OF_COHESION_OF_METHODS	numeric	41 unique values 0 missing
NUM_OF_CHILDREN	numeric	6 unique values 0 missing
DEP_ON_CHILD	numeric	2 unique values 0 missing
FAN_IN	numeric	4 unique values 0 missing
RESPONSE_FOR_CLASS	numeric	63 unique values 0 missing
WEIGHTED_METHODS_PER_CLASS	numeric	39 unique values 0 missing
minLOC_BLANK	numeric	1 unique values 0 missing
minBRANCH_COUNT	numeric	1 unique values 0 missing
minLOC_CODE_AND_COMMENT	numeric	1 unique values 0 missing
minLOC_COMMENTS	numeric	1 unique values 0 missing
minCYCLOMATIC_COMPLEXITY	numeric	1 unique values 0 missing
minDESIGN_COMPLEXITY	numeric	1 unique values 0 missing
minESSENTIAL_COMPLEXITY	numeric	1 unique values 0 missing
minLOC_EXECUTABLE	numeric	5 unique values 0 missing
minHALSTEAD_CONTENT	numeric	13 unique values 0 missing
minHALSTEAD_DIFFICULTY	numeric	7 unique values 0 missing
minHALSTEAD_EFFORT	numeric	12 unique values 0 missing
minHALSTEAD_ERROR_EST	numeric	2 unique values 0 missing
minHALSTEAD_LENGTH	numeric	9 unique values 0 missing
minHALSTEAD_LEVEL	numeric	16 unique values 0 missing
minHALSTEAD_PROG_TIME	numeric	12 unique values 0 missing
minHALSTEAD_VOLUME	numeric	8 unique values 0 missing
minNUM_OPERANDS	numeric	6 unique values 0 missing
minNUM_OPERATORS	numeric	7 unique values 0 missing
minNUM_UNIQUE_OPERANDS	numeric	6 unique values 0 missing
minNUM_UNIQUE_OPERATORS	numeric	6 unique values 0 missing
minLOC_TOTAL	numeric	7 unique values 0 missing
maxLOC_BLANK	numeric	25 unique values 0 missing
maxBRANCH_COUNT	numeric	38 unique values 0 missing
maxLOC_CODE_AND_COMMENT	numeric	12 unique values 0 missing
maxLOC_COMMENTS	numeric	26 unique values 0 missing
maxCYCLOMATIC_COMPLEXITY	numeric	30 unique values 0 missing
maxDESIGN_COMPLEXITY	numeric	24 unique values 0 missing
maxESSENTIAL_COMPLEXITY	numeric	19 unique values 0 missing
maxLOC_EXECUTABLE	numeric	82 unique values 0 missing
maxHALSTEAD_CONTENT	numeric	122 unique values 0 missing
maxHALSTEAD_DIFFICULTY	numeric	112 unique values 0 missing
maxHALSTEAD_EFFORT	numeric	123 unique values 0 missing
maxHALSTEAD_ERROR_EST	numeric	63 unique values 0 missing
maxHALSTEAD_LENGTH	numeric	104 unique values 0 missing
maxHALSTEAD_LEVEL	numeric	18 unique values 0 missing
maxHALSTEAD_PROG_TIME	numeric	123 unique values 0 missing
maxHALSTEAD_VOLUME	numeric	118 unique values 0 missing
maxNUM_OPERANDS	numeric	88 unique values 0 missing
maxNUM_OPERATORS	numeric	97 unique values 0 missing
maxNUM_UNIQUE_OPERANDS	numeric	63 unique values 0 missing
maxNUM_UNIQUE_OPERATORS	numeric	31 unique values 0 missing
maxLOC_TOTAL	numeric	85 unique values 0 missing
avgLOC_BLANK	numeric	83 unique values 0 missing
avgBRANCH_COUNT	numeric	95 unique values 0 missing
avgLOC_CODE_AND_COMMENT	numeric	33 unique values 0 missing
avgLOC_COMMENTS	numeric	69 unique values 0 missing
avgCYCLOMATIC_COMPLEXITY	numeric	90 unique values 0 missing
avgDESIGN_COMPLEXITY	numeric	92 unique values 0 missing
avgESSENTIAL_COMPLEXITY	numeric	60 unique values 0 missing
avgLOC_EXECUTABLE	numeric	114 unique values 0 missing
avgHALSTEAD_CONTENT	numeric	133 unique values 0 missing
avgHALSTEAD_DIFFICULTY	numeric	125 unique values 0 missing
avgHALSTEAD_EFFORT	numeric	133 unique values 0 missing
avgHALSTEAD_ERROR_EST	numeric	110 unique values 0 missing
avgHALSTEAD_LENGTH	numeric	129 unique values 0 missing
avgHALSTEAD_LEVEL	numeric	129 unique values 0 missing
avgHALSTEAD_PROG_TIME	numeric	132 unique values 0 missing
avgHALSTEAD_VOLUME	numeric	133 unique values 0 missing
avgNUM_OPERANDS	numeric	122 unique values 0 missing
avgNUM_OPERATORS	numeric	126 unique values 0 missing
avgNUM_UNIQUE_OPERANDS	numeric	116 unique values 0 missing
avgNUM_UNIQUE_OPERATORS	numeric	115 unique values 0 missing
avgLOC_TOTAL	numeric	124 unique values 0 missing
sumLOC_BLANK	numeric	57 unique values 0 missing
sumBRANCH_COUNT	numeric	85 unique values 0 missing
sumLOC_CODE_AND_COMMENT	numeric	16 unique values 0 missing
sumLOC_COMMENTS	numeric	41 unique values 0 missing
sumCYCLOMATIC_COMPLEXITY	numeric	70 unique values 0 missing
sumDESIGN_COMPLEXITY	numeric	70 unique values 0 missing
sumESSENTIAL_COMPLEXITY	numeric	51 unique values 0 missing
sumLOC_EXECUTABLE	numeric	108 unique values 0 missing
sumHALSTEAD_CONTENT	numeric	131 unique values 0 missing
sumHALSTEAD_DIFFICULTY	numeric	124 unique values 0 missing
sumHALSTEAD_EFFORT	numeric	131 unique values 0 missing
sumHALSTEAD_ERROR_EST	numeric	90 unique values 0 missing
sumHALSTEAD_LENGTH	numeric	118 unique values 0 missing
sumHALSTEAD_LEVEL	numeric	119 unique values 0 missing
sumHALSTEAD_PROG_TIME	numeric	130 unique values 0 missing
sumHALSTEAD_VOLUME	numeric	131 unique values 0 missing
sumNUM_OPERANDS	numeric	117 unique values 0 missing
sumNUM_OPERATORS	numeric	116 unique values 0 missing
sumNUM_UNIQUE_OPERANDS	numeric	99 unique values 0 missing
sumNUM_UNIQUE_OPERATORS	numeric	101 unique values 0 missing
sumLOC_TOTAL	numeric	121 unique values 0 missing
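The min, max, avg, and sum columns listed above come from aggregating method-level metrics per class, as the description explains. A minimal sketch of that aggregation is shown here. The row layout and the `class_id` key are hypothetical, since this page does not show the method-level table.

```python
from collections import defaultdict
from statistics import mean


def aggregate_to_class_level(method_rows, metric):
    """Collapse one method-level metric into min/max/avg/sum per class.

    `method_rows` is a list of dicts with a hypothetical "class_id" key
    and one numeric value per metric name.
    """
    per_class = defaultdict(list)
    for row in method_rows:
        per_class[row["class_id"]].append(row[metric])

    result = {}
    for class_id, values in per_class.items():
        # Column names follow the prefix convention used in this data set.
        result[class_id] = {
            "min" + metric: min(values),
            "max" + metric: max(values),
            "avg" + metric: mean(values),
            "sum" + metric: sum(values),
        }
    return result


# Hypothetical method-level rows: three methods in class A, one in class B.
methods = [
    {"class_id": "A", "LOC_BLANK": 0},
    {"class_id": "A", "LOC_BLANK": 4},
    {"class_id": "A", "LOC_BLANK": 2},
    {"class_id": "B", "LOC_BLANK": 1},
]
class_level = aggregate_to_class_level(methods, "LOC_BLANK")

# Feature count implied by the list: DL + 10 class-level + 4 x 21 method-level.
n_features = 1 + 10 + 4 * 21
```

The feature list is consistent with this scheme: 21 method-level metrics with four aggregates each, plus 10 class-level metrics and the DL target, gives the 95 features shown.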

107 properties

NumberOfInstances

145

Number of instances (rows) of the dataset.

NumberOfFeatures

Number of attributes (columns) of the dataset.

NumberOfClasses

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

Number of instances with at least one value missing.

NumberOfNumericFeatures

Number of numeric attributes.

NumberOfSymbolicFeatures

Number of nominal attributes.

J48.001.Kappa

0.31

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001

MeanSkewnessOfNumericAtts

2.73

Mean skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

0.94

First quartile of standard deviation of attributes of the numeric type.

REPTreeDepth2AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

CfsSubsetEval_kNN1NErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

kNN1NAUC

0.66

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

MajorityClassPercentage

94.48

Percentage of instances belonging to the most frequent class.

MeanStdDevOfNumericAtts

2999.29

Mean standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

Second quartile (Median) of entropy among attributes.

REPTreeDepth2ErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

CfsSubsetEval_kNN1NKappa

0.16

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

kNN1NErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

MajorityClassSize

137

Number of instances belonging to the most frequent class.

MinAttributeEntropy

Minimal entropy among attributes.

Quartile2KurtosisOfNumericAtts

8.18

Second quartile (Median) of kurtosis among attributes of the numeric type.

REPTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

ClassEntropy

0.31

Entropy of the target attribute values.

kNN1NKappa

0.4

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk

MaxAttributeEntropy

Maximum entropy among attributes.

MinKurtosisOfNumericAtts

-0.82

Minimum kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

7.4

Second quartile (Median) of means among attributes of the numeric type.

REPTreeDepth3AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpAUC

0.78

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxKurtosisOfNumericAtts

69.94

Maximum kurtosis among attributes of the numeric type.

MinMeansOfNumericAtts

Minimum of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

REPTreeDepth3ErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxMeansOfNumericAtts

76249.59

Maximum of means among attributes of the numeric type.

MinMutualInformation

Minimal mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

2.4

Second quartile (Median) of skewness among attributes of the numeric type.

REPTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpKappa

0.36

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxMutualInformation

Maximum mutual information between the nominal attributes and the target attribute.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

PercentageOfBinaryFeatures

1.05

Percentage of binary attributes.

Quartile2StdDevOfNumericAtts

7.96

Second quartile (Median) of standard deviation of attributes of the numeric type.

RandomTreeDepth1AUC

0.72

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

Dimensionality

0.66

Number of attributes divided by the number of instances.

MaxNominalAttDistinctValues

The maximum number of distinct values among attributes of the nominal type.

MinSkewnessOfNumericAtts

-1.11

Minimum skewness among attributes of the numeric type.

PercentageOfInstancesWithMissingValues

Percentage of instances having missing values.

Quartile3AttributeEntropy

Third quartile of entropy among attributes.

RandomTreeDepth1ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

EquivalentNumberOfAtts

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.

MaxSkewnessOfNumericAtts

8.42

Maximum skewness among attributes of the numeric type.

MinStdDevOfNumericAtts

Minimum standard deviation of attributes of the numeric type.

PercentageOfMissingValues

Percentage of missing values.

Quartile3KurtosisOfNumericAtts

16.95

Third quartile of kurtosis among attributes of the numeric type.

AutoCorrelation

0.91

Average class difference between consecutive instances.

RandomTreeDepth1Kappa

0.38

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

J48.00001.AUC

0.69

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MaxStdDevOfNumericAtts

200468.25

Maximum standard deviation of attributes of the numeric type.

MinorityClassPercentage

5.52

Percentage of instances belonging to the least frequent class.

PercentageOfNumericFeatures

98.95

Percentage of numeric attributes.

Quartile3MeansOfNumericAtts

59.62

Third quartile of means among attributes of the numeric type.

CfsSubsetEval_DecisionStumpAUC

0.77

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2AUC

0.72

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.00001.ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MeanAttributeEntropy

Average entropy of the attributes.

MinorityClassSize

Number of instances belonging to the least frequent class.

PercentageOfSymbolicFeatures

1.05

Percentage of nominal attributes.

Quartile3MutualInformation

Third quartile of mutual information between the nominal attributes and the target attribute.

CfsSubsetEval_DecisionStumpErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.00001.Kappa

0.31

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MeanKurtosisOfNumericAtts

13.16

Mean kurtosis among attributes of the numeric type.

NaiveBayesAUC

0.73

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1AttributeEntropy

First quartile of entropy among attributes.

Quartile3SkewnessOfNumericAtts

3.96

Third quartile of skewness among attributes of the numeric type.

CfsSubsetEval_DecisionStumpKappa

0.16

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2Kappa

0.38

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.0001.AUC

0.69

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanMeansOfNumericAtts

1290.47

Mean of means among attributes of the numeric type.

NaiveBayesErrRate

0.1

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1KurtosisOfNumericAtts

2.44

First quartile of kurtosis among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

62.41

Third quartile of standard deviation of attributes of the numeric type.

CfsSubsetEval_NaiveBayesAUC

0.77

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3AUC

0.72

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.0001.ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanMutualInformation

Average mutual information between the nominal attributes and the target attribute.

NaiveBayesKappa

0.3

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1MeansOfNumericAtts

First quartile of means among attributes of the numeric type.

REPTreeDepth1AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_NaiveBayesErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.0001.Kappa

0.31

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanNoiseToSignalRatio

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.

NumberOfBinaryFeatures

Number of binary attributes.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

REPTreeDepth1ErrRate

0.06

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_NaiveBayesKappa

0.16

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3Kappa

0.38

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.001.AUC

0.69

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

MeanNominalAttDistinctValues

Average number of distinct values among the attributes of the nominal type.

Quartile1SkewnessOfNumericAtts

1.46

First quartile of skewness among attributes of the numeric type.

REPTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_kNN1NAUC

0.77

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

StdvNominalAttDistinctValues

Standard deviation of the number of distinct values among attributes of the nominal type.

J48.001.ErrRate

0.08

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

14 tasks

Supervised Classification on kc1-top5

560 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: DL

Supervised Classification on kc1-top5

187 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: DL

Supervised Data Stream Classification on kc1-top5

0 runs - estimation_procedure: Interleaved Test then Train - target_feature: DL

Clustering on kc1-top5