Code_Smells_Data_Class

Status: active. Format: ARFF. Visibility: public. Uploaded 10-08-2021 by Jan van Rijn.
This dataset combines records from the MLCQ dataset with metrics extracted using the PMD tool and the Understand tool, with the goal of determining whether a file contains code smells. Note that the records are at the (sub)class level. This is a classification task: the default target (severity) is numeric and should be binarized with a static threshold (preferably between 0.5 and 2.5). Please read the publication carefully to understand how to use this dataset.
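The binarization step described above can be sketched as follows. This is a minimal illustration, not the procedure from the publication: the function name `binarize_severity`, the default threshold of 1.5, the choice of a strict `>` comparison, and the sample severity values are all assumptions made for the example.

```python
from typing import Iterable, List


def binarize_severity(severities: Iterable[float], threshold: float = 1.5) -> List[int]:
    """Map numeric severity scores to 0/1 labels using a static threshold.

    Assumption: a score strictly above the threshold is labeled 1 (smelly),
    anything else is labeled 0. The publication may define this differently.
    """
    return [1 if s > threshold else 0 for s in severities]


# Hypothetical severity values, for illustration only (not taken from the dataset).
example_severities = [0.0, 0.5, 1.0, 2.0, 3.0]
labels = binarize_severity(example_severities, threshold=1.5)
print(labels)  # [0, 0, 0, 1, 1]
```

Because the recommended range is 0.5 to 2.5, it may be worth repeating an experiment with several thresholds in that range and reporting which one was used.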

67 features

severity (target): numeric, 26 unique values, 0 missing
repository (ignore): string, 431 unique values, 0 missing
package (ignore): string, 2078 unique values, 0 missing
filename (ignore): string, 2252 unique values, 0 missing
code_name (ignore): string, 2329 unique values, 0 missing
commit_hash (ignore): string, 431 unique values, 0 missing
smell (ignore): string, 1 unique value, 0 missing
AvgCyclomatic: numeric, 21 unique values, 33950 missing
AvgCyclomaticModified: numeric, 21 unique values, 33950 missing
AvgCyclomaticStrict: numeric, 24 unique values, 33950 missing
AvgEssential: numeric, 13 unique values, 33950 missing
AvgLine: numeric, 74 unique values, 33950 missing
AvgLineBlank: numeric, 20 unique values, 33950 missing
AvgLineCode: numeric, 64 unique values, 33950 missing
AvgLineComment: numeric, 29 unique values, 33950 missing
CountClassBase: numeric, 9 unique values, 33950 missing
CountClassCoupled: numeric, 77 unique values, 33950 missing
CountClassCoupledModified: numeric, 77 unique values, 33950 missing
CountClassDerived: numeric, 29 unique values, 33950 missing
CountDeclClass: numeric, 0 unique values, 86467 missing
CountDeclClassMethod: numeric, 32 unique values, 33950 missing
CountDeclClassVariable: numeric, 26 unique values, 33950 missing
CountDeclExecutableUnit: numeric, 0 unique values, 86467 missing
CountDeclFile: numeric, 0 unique values, 86467 missing
CountDeclFunction: numeric, 0 unique values, 86467 missing
CountDeclInstanceMethod: numeric, 64 unique values, 33950 missing
CountDeclInstanceVariable: numeric, 41 unique values, 33950 missing
CountDeclMethod: numeric, 66 unique values, 33950 missing
CountDeclMethodAll: numeric, 139 unique values, 33950 missing
CountDeclMethodDefault: numeric, 27 unique values, 33950 missing
CountDeclMethodPrivate: numeric, 29 unique values, 33950 missing
CountDeclMethodProtected: numeric, 21 unique values, 33950 missing
CountDeclMethodPublic: numeric, 58 unique values, 33950 missing
CountInput: numeric, 0 unique values, 86467 missing
CountLine: numeric, 428 unique values, 33950 missing
CountLineBlank: numeric, 122 unique values, 33950 missing
CountLineCode: numeric, 338 unique values, 33950 missing
CountLineCodeDecl: numeric, 159 unique values, 33950 missing
CountLineCodeExe: numeric, 251 unique values, 33950 missing
CountLineComment: numeric, 188 unique values, 33950 missing
CountOutput: numeric, 0 unique values, 86467 missing
CountPath: numeric, 0 unique values, 86467 missing
CountPathLog: numeric, 0 unique values, 86467 missing
CountSemicolon: numeric, 217 unique values, 33950 missing
CountStmt: numeric, 268 unique values, 33950 missing
CountStmtDecl: numeric, 140 unique values, 33950 missing
CountStmtExe: numeric, 223 unique values, 33950 missing
Cyclomatic: numeric, 0 unique values, 86467 missing
CyclomaticModified: numeric, 0 unique values, 86467 missing
CyclomaticStrict: numeric, 0 unique values, 86467 missing
Essential: numeric, 0 unique values, 86467 missing
Knots: numeric, 0 unique values, 86467 missing
MaxCyclomatic: numeric, 48 unique values, 33950 missing
MaxCyclomaticModified: numeric, 47 unique values, 33950 missing
MaxCyclomaticStrict: numeric, 54 unique values, 33950 missing
MaxEssential: numeric, 28 unique values, 33950 missing
MaxEssentialKnots: numeric, 0 unique values, 86467 missing
MaxInheritanceTree: numeric, 11 unique values, 33950 missing
MaxNesting: numeric, 10 unique values, 33950 missing
MinEssentialKnots: numeric, 0 unique values, 86467 missing
PercentLackOfCohesion: numeric, 83 unique values, 33950 missing
PercentLackOfCohesionModified: numeric, 96 unique values, 33950 missing
RatioCommentToCode: numeric, 284 unique values, 33950 missing
SumCyclomatic: numeric, 125 unique values, 33950 missing
SumCyclomaticModified: numeric, 123 unique values, 33950 missing
SumCyclomaticStrict: numeric, 133 unique values, 33950 missing
SumEssential: numeric, 84 unique values, 33950 missing
WOC: numeric, 321 unique values, 920 missing
NOPA: numeric, 35 unique values, 194 missing
NOAM: numeric, 43 unique values, 194 missing
WMC: numeric, 276 unique values, 194 missing
TCC: numeric, 901 unique values, 26455 missing
ATFD: numeric, 235 unique values, 194 missing
class_name (ignore): string, 16908 unique values, 194 missing
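Several features in the list have 0 unique values and 86467 missing, meaning they are empty in every row and carry no information for a classifier. A minimal sketch of how such features could be flagged from the missing-value counts is shown below. The helper name `empty_features` is hypothetical, and the dictionary holds only a small subset of the counts listed on this page.

```python
from typing import Dict, List

N_ROWS = 86467  # number of instances reported for this dataset

# Missing-value counts copied from the feature list (subset, for illustration).
missing_counts: Dict[str, int] = {
    "AvgCyclomatic": 33950,
    "CountDeclClass": 86467,
    "CountPath": 86467,
    "Knots": 86467,
    "WOC": 920,
    "TCC": 26455,
    "ATFD": 194,
}


def empty_features(counts: Dict[str, int], n_rows: int) -> List[str]:
    """Return the names of features that are missing in every row."""
    return sorted(name for name, missing in counts.items() if missing == n_rows)


print(empty_features(missing_counts, N_ROWS))
# ['CountDeclClass', 'CountPath', 'Knots']
```

Whether to drop, impute, or keep the partially missing features (for example those with 33950 missing values) is a modelling decision that this sketch does not address.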

19 properties

Number of instances (rows) of the dataset: 86467
Number of attributes (columns) of the dataset: 67
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 2852906
Number of instances with at least one value missing: 86467
Number of numeric attributes: 67
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 100
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 100
Average class difference between consecutive instances: 0.99
Percentage of missing values: 49.25
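The reported percentage of missing values can be reproduced from the other properties, under the assumption that it is computed as missing cells divided by all cells (instances times attributes). The short check below uses only figures listed on this page; the variable names are illustrative.

```python
n_instances = 86467   # number of instances (rows)
n_attributes = 67     # number of attributes (columns)
n_missing = 2852906   # number of missing values in the dataset

# Assumption: percentage of missing values = missing cells / total cells.
total_cells = n_instances * n_attributes
pct_missing = 100 * n_missing / total_cells

# Attributes per instance is tiny, which is why the page displays it as 0.
attrs_per_instance = n_attributes / n_instances

print(total_cells)
print(round(pct_missing, 2))
print(round(attrs_per_instance, 5))
```

The result agrees with the listed 49.25 after rounding, which suggests the missing-value count is taken over the 67 attributes counted on this page.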
