QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL4251


Status: deactivated; format: ARFF; publicly available; visibility: public; uploaded 14-07-2016 by Noureddin Sadawi.


This dataset contains QSAR data (from ChEMBL version 17) giving activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL ID CHEMBL4251 (TID: 100234). It has 23 rows and 231 features, not counting the molecule identifier (molecule_id) and the class feature (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the median, and feature selection was also applied.
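The description says missing values were imputed with the median but does not show how. The following is a minimal sketch of per-column median imputation, not the pipeline actually used to build this dataset. The function name `impute_median` and the sample values are hypothetical, and missing entries are assumed to be represented as `None`.

```python
from statistics import median
from typing import List, Optional, Sequence


def impute_median(column: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Hypothetical descriptor column with two gaps (values are made up).
raw = [0.5, None, 1.5, 2.0, None]
print(impute_median(raw))
```

Applied column by column, this leaves every descriptor with no missing entries, which matches the "0 missing" counts reported for each feature.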

233 features

pXC50 (target, numeric): 7 unique values, 0 missing
molecule_id (row identifier, nominal): 23 unique values, 0 missing
JGI5 (numeric): 12 unique values, 0 missing
DLS_cons (numeric): 10 unique values, 0 missing
GGI5 (numeric): 16 unique values, 0 missing
MATS5s (numeric): 22 unique values, 0 missing
ATS7s (numeric): 22 unique values, 0 missing
Eig06_AEA.dm. (numeric): 18 unique values, 0 missing
Eig06_AEA.ed. (numeric): 14 unique values, 0 missing
Eig06_EA.ed. (numeric): 15 unique values, 0 missing
Eig07_AEA.dm. (numeric): 15 unique values, 0 missing
Eig08_AEA.dm. (numeric): 16 unique values, 0 missing
GGI1 (numeric): 9 unique values, 0 missing
GGI3 (numeric): 13 unique values, 0 missing
SM15_AEA.dm. (numeric): 15 unique values, 0 missing
SpMax1_Bh.e. (numeric): 16 unique values, 0 missing
SpMax1_Bh.v. (numeric): 14 unique values, 0 missing
X0Av (numeric): 20 unique values, 0 missing
X3Av (numeric): 18 unique values, 0 missing
ZM1Per (numeric): 23 unique values, 0 missing
ZM2Kup (numeric): 21 unique values, 0 missing
ZM2MulPer (numeric): 23 unique values, 0 missing
ZM2Per (numeric): 23 unique values, 0 missing
ZM2V (numeric): 19 unique values, 0 missing
Eta_B (numeric): 15 unique values, 0 missing
GATS1v (numeric): 20 unique values, 0 missing
SpMax5_Bh.s. (numeric): 22 unique values, 0 missing
Eig03_AEA.ri. (numeric): 21 unique values, 0 missing
Eig03_EA.ri. (numeric): 22 unique values, 0 missing
Eig04_AEA.ri. (numeric): 22 unique values, 0 missing
Eig04_EA.ri. (numeric): 21 unique values, 0 missing
Eig05_AEA.ri. (numeric): 22 unique values, 0 missing
Eig06_AEA.ri. (numeric): 17 unique values, 0 missing
Eig08_EA.ri. (numeric): 20 unique values, 0 missing
Eig15_AEA.ri. (numeric): 20 unique values, 0 missing
MATS8v (numeric): 23 unique values, 0 missing
SpMax2_Bh.i. (numeric): 19 unique values, 0 missing
SpMax2_Bh.p. (numeric): 17 unique values, 0 missing
SpMax2_Bh.v. (numeric): 18 unique values, 0 missing
SpMax3_Bh.e. (numeric): 22 unique values, 0 missing
SpMax3_Bh.i. (numeric): 20 unique values, 0 missing
SpMax3_Bh.m. (numeric): 21 unique values, 0 missing
SpMax3_Bh.p. (numeric): 19 unique values, 0 missing
SpMax3_Bh.v. (numeric): 22 unique values, 0 missing
SpMax4_Bh.e. (numeric): 20 unique values, 0 missing
SpMax4_Bh.i. (numeric): 19 unique values, 0 missing
SpMax4_Bh.m. (numeric): 20 unique values, 0 missing
SpMax4_Bh.v. (numeric): 20 unique values, 0 missing
SpMin2_Bh.e. (numeric): 14 unique values, 0 missing
SpMin2_Bh.i. (numeric): 15 unique values, 0 missing
SpMin3_Bh.i. (numeric): 19 unique values, 0 missing
GATS8m (numeric): 23 unique values, 0 missing
ATS3s (numeric): 21 unique values, 0 missing
CENT (numeric): 19 unique values, 0 missing
Chi1_AEA.bo. (numeric): 17 unique values, 0 missing
Chi1_AEA.dm. (numeric): 17 unique values, 0 missing
Chi1_AEA.ed. (numeric): 17 unique values, 0 missing
Chi1_AEA.ri. (numeric): 17 unique values, 0 missing
Chi1_EA (numeric): 17 unique values, 0 missing
Chi1_EA.ed. (numeric): 19 unique values, 0 missing
CSI (numeric): 19 unique values, 0 missing
D.Dtr09 (numeric): 16 unique values, 0 missing
ECC (numeric): 19 unique values, 0 missing
Eig03_EA (numeric): 19 unique values, 0 missing
Eig04_AEA.bo. (numeric): 20 unique values, 0 missing
Eig04_EA (numeric): 18 unique values, 0 missing
Eig04_EA.bo. (numeric): 20 unique values, 0 missing
Eig04_EA.ed. (numeric): 19 unique values, 0 missing
Eig05_EA (numeric): 19 unique values, 0 missing
Eig06_EA (numeric): 16 unique values, 0 missing
Eig07_AEA.ed. (numeric): 14 unique values, 0 missing
Eig07_EA.bo. (numeric): 14 unique values, 0 missing
Eig07_EA.ed. (numeric): 14 unique values, 0 missing
Eig08_AEA.bo. (numeric): 18 unique values, 0 missing
Eig08_AEA.ed. (numeric): 16 unique values, 0 missing
Eig08_AEA.ri. (numeric): 19 unique values, 0 missing
Eig08_EA (numeric): 16 unique values, 0 missing
Eig08_EA.bo. (numeric): 18 unique values, 0 missing
Eig08_EA.ed. (numeric): 17 unique values, 0 missing
Eig09_AEA.bo. (numeric): 14 unique values, 0 missing
Eig09_EA (numeric): 13 unique values, 0 missing
Eig09_EA.ed. (numeric): 12 unique values, 0 missing
Eig12_AEA.bo. (numeric): 20 unique values, 0 missing
Eig12_AEA.dm. (numeric): 16 unique values, 0 missing
Eig12_AEA.ed. (numeric): 14 unique values, 0 missing
Eig12_EA (numeric): 15 unique values, 0 missing
Eig13_AEA.bo. (numeric): 18 unique values, 0 missing
Eig13_AEA.ed. (numeric): 19 unique values, 0 missing
Eig14_AEA.bo. (numeric): 19 unique values, 0 missing
Eig14_AEA.dm. (numeric): 18 unique values, 0 missing
Eig14_AEA.ed. (numeric): 18 unique values, 0 missing
Eig14_EA (numeric): 16 unique values, 0 missing
Eig14_EA.bo. (numeric): 17 unique values, 0 missing
Eig15_AEA.dm. (numeric): 18 unique values, 0 missing
Eig15_EA (numeric): 15 unique values, 0 missing
Eig15_EA.bo. (numeric): 17 unique values, 0 missing
Eig15_EA.ed. (numeric): 16 unique values, 0 missing
GGI9 (numeric): 14 unique values, 0 missing
GMTI (numeric): 19 unique values, 0 missing
IDET (numeric): 19 unique values, 0 missing
IDMT (numeric): 19 unique values, 0 missing
LPRS (numeric): 19 unique values, 0 missing
MATS3s (numeric): 20 unique values, 0 missing
MATS4i (numeric): 21 unique values, 0 missing
MDDD (numeric): 19 unique values, 0 missing
MPC10 (numeric): 17 unique values, 0 missing
RDCHI (numeric): 19 unique values, 0 missing
SM02_AEA.dm. (numeric): 16 unique values, 0 missing
SM02_AEA.ri. (numeric): 14 unique values, 0 missing
SM03_AEA.dm. (numeric): 13 unique values, 0 missing
SM03_AEA.ri. (numeric): 17 unique values, 0 missing
SM04_AEA.ri. (numeric): 12 unique values, 0 missing
SM06_AEA.dm. (numeric): 15 unique values, 0 missing
SM08_AEA.dm. (numeric): 16 unique values, 0 missing
SM09_AEA.dm. (numeric): 15 unique values, 0 missing
SM10_AEA.ri. (numeric): 16 unique values, 0 missing
SM11_AEA.bo. (numeric): 19 unique values, 0 missing
SM12_AEA.bo. (numeric): 18 unique values, 0 missing
SM13_AEA.bo. (numeric): 19 unique values, 0 missing
SM13_AEA.dm. (numeric): 19 unique values, 0 missing
SM14_AEA.bo. (numeric): 16 unique values, 0 missing
SM14_AEA.ri. (numeric): 20 unique values, 0 missing
SMTI (numeric): 19 unique values, 0 missing
SpMax1_Bh.m. (numeric): 19 unique values, 0 missing
SpMax8_Bh.p. (numeric): 16 unique values, 0 missing
SpMaxA_AEA.bo. (numeric): 18 unique values, 0 missing
SpMaxA_AEA.ed. (numeric): 16 unique values, 0 missing
SpMaxA_EA (numeric): 14 unique values, 0 missing
SpMaxA_EA.bo. (numeric): 15 unique values, 0 missing
SpMaxA_EA.ri. (numeric): 16 unique values, 0 missing
SpMin8_Bh.e. (numeric): 19 unique values, 0 missing
UNIP (numeric): 17 unique values, 0 missing
Wap (numeric): 17 unique values, 0 missing
X2 (numeric): 17 unique values, 0 missing
X2sol (numeric): 17 unique values, 0 missing
Xu (numeric): 19 unique values, 0 missing
AMR (numeric): 22 unique values, 0 missing
Mv (numeric): 19 unique values, 0 missing
SpAD_AEA.dm. (numeric): 21 unique values, 0 missing
SpMaxA_AEA.dm. (numeric): 16 unique values, 0 missing
AMW (numeric): 19 unique values, 0 missing
ATS1s (numeric): 21 unique values, 0 missing
Chi1_EA.ri. (numeric): 22 unique values, 0 missing
Eig09_AEA.dm. (numeric): 17 unique values, 0 missing
Eig13_EA.ri. (numeric): 19 unique values, 0 missing
Eig14_EA.ri. (numeric): 19 unique values, 0 missing
Eig15_EA.ri. (numeric): 22 unique values, 0 missing
GATS1i (numeric): 20 unique values, 0 missing
GATS6i (numeric): 23 unique values, 0 missing
GMTIV (numeric): 23 unique values, 0 missing
H. (numeric): 17 unique values, 0 missing
MATS3m (numeric): 20 unique values, 0 missing
SMTIV (numeric): 22 unique values, 0 missing
SpMax3_Bh.s. (numeric): 21 unique values, 0 missing
SpMax8_Bh.s. (numeric): 13 unique values, 0 missing
SpMaxA_AEA.ri. (numeric): 17 unique values, 0 missing
ZM1MulPer (numeric): 23 unique values, 0 missing
GATS3m (numeric): 21 unique values, 0 missing
SpMax2_Bh.e. (numeric): 23 unique values, 0 missing
MSD (numeric): 19 unique values, 0 missing
SpMaxA_EA.ed. (numeric): 17 unique values, 0 missing
SpMax8_Bh.v. (numeric): 16 unique values, 0 missing
SpMin8_Bh.i. (numeric): 17 unique values, 0 missing
AAC (numeric): 17 unique values, 0 missing
AECC (numeric): 19 unique values, 0 missing
ALOGP (numeric): 22 unique values, 0 missing
ALOGP2 (numeric): 22 unique values, 0 missing
ARR (numeric): 14 unique values, 0 missing
ATS1e (numeric): 20 unique values, 0 missing
ATS1i (numeric): 19 unique values, 0 missing
ATS1m (numeric): 20 unique values, 0 missing
ATS1p (numeric): 18 unique values, 0 missing
ATS1v (numeric): 20 unique values, 0 missing
ATS2e (numeric): 20 unique values, 0 missing
ATS2i (numeric): 19 unique values, 0 missing
ATS2m (numeric): 20 unique values, 0 missing
ATS2p (numeric): 20 unique values, 0 missing
ATS2s (numeric): 21 unique values, 0 missing
ATS2v (numeric): 20 unique values, 0 missing
ATS3e (numeric): 21 unique values, 0 missing
ATS3i (numeric): 22 unique values, 0 missing
ATS3m (numeric): 22 unique values, 0 missing
ATS3p (numeric): 22 unique values, 0 missing
ATS3v (numeric): 22 unique values, 0 missing
ATS4e (numeric): 22 unique values, 0 missing
ATS4i (numeric): 22 unique values, 0 missing
ATS4m (numeric): 22 unique values, 0 missing
ATS4p (numeric): 23 unique values, 0 missing
ATS4s (numeric): 23 unique values, 0 missing
ATS4v (numeric): 23 unique values, 0 missing
ATS5e (numeric): 23 unique values, 0 missing
ATS5i (numeric): 23 unique values, 0 missing
ATS5m (numeric): 21 unique values, 0 missing
ATS5p (numeric): 23 unique values, 0 missing
ATS5s (numeric): 23 unique values, 0 missing
ATS5v (numeric): 22 unique values, 0 missing
ATS6e (numeric): 23 unique values, 0 missing
ATS6i (numeric): 22 unique values, 0 missing
ATS6m (numeric): 22 unique values, 0 missing
ATS6p (numeric): 23 unique values, 0 missing
ATS6s (numeric): 23 unique values, 0 missing
ATS6v (numeric): 23 unique values, 0 missing
ATS7e (numeric): 23 unique values, 0 missing
ATS7i (numeric): 22 unique values, 0 missing
ATS7m (numeric): 23 unique values, 0 missing
ATS7p (numeric): 23 unique values, 0 missing
ATS7v (numeric): 22 unique values, 0 missing
ATS8e (numeric): 23 unique values, 0 missing
ATS8i (numeric): 23 unique values, 0 missing
ATS8m (numeric): 23 unique values, 0 missing
ATS8p (numeric): 22 unique values, 0 missing
ATS8s (numeric): 23 unique values, 0 missing
ATS8v (numeric): 23 unique values, 0 missing
ATSC1e (numeric): 18 unique values, 0 missing
ATSC1i (numeric): 19 unique values, 0 missing
ATSC1m (numeric): 20 unique values, 0 missing
ATSC1p (numeric): 20 unique values, 0 missing
ATSC1s (numeric): 21 unique values, 0 missing
ATSC1v (numeric): 20 unique values, 0 missing
ATSC2e (numeric): 20 unique values, 0 missing
ATSC2i (numeric): 20 unique values, 0 missing
ATSC2m (numeric): 20 unique values, 0 missing
ATSC2p (numeric): 20 unique values, 0 missing
ATSC2s (numeric): 23 unique values, 0 missing
ATSC2v (numeric): 20 unique values, 0 missing
ATSC3e (numeric): 21 unique values, 0 missing
ATSC3i (numeric): 20 unique values, 0 missing
ATSC3m (numeric): 22 unique values, 0 missing
ATSC3p (numeric): 21 unique values, 0 missing
ATSC3s (numeric): 23 unique values, 0 missing
ATSC3v (numeric): 22 unique values, 0 missing
ATSC4e (numeric): 23 unique values, 0 missing
ATSC4i (numeric): 22 unique values, 0 missing
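The target column pXC50 is on a negative-log scale. The page does not define it, so the sketch below assumes the conventional definition pX = -log10(X in mol/L), with the raw activity given in nanomolar. The function name `pxc50_from_nanomolar` is hypothetical, and nothing here is confirmed to be how the values in this dataset were derived.

```python
import math


def pxc50_from_nanomolar(xc50_nm: float) -> float:
    """Convert an activity in nM to a p-scale value.

    Assumes pX = -log10(X in mol/L); since 1 nM = 1e-9 mol/L,
    this equals 9 - log10(X in nM).
    """
    if xc50_nm <= 0:
        raise ValueError("activity must be a positive concentration")
    return 9.0 - math.log10(xc50_nm)


# Under this assumption, a 1000 nM (1 micromolar) activity maps to about 6.
print(pxc50_from_nanomolar(1000.0))
```

On this scale a larger value means a more potent compound, and one unit corresponds to a tenfold change in concentration.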

107 properties

Number of instances (rows) of the dataset: 23
Number of attributes (columns) of the dataset: 233
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 232
Number of nominal attributes: 1
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump: no value reported
Maximum of means among attributes of the numeric type: 20480.04
Minimal mutual information between the nominal attributes and the target attribute: no value reported
Second quartile (Median) of skewness among attributes of the numeric type: 0.68
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: no value reported
Number of attributes divided by the number of instances: 10.13
Maximum mutual information between the nominal attributes and the target attribute: no value reported
The minimal number of distinct values among attributes of the nominal type: no value reported
Percentage of binary attributes: 0
Second quartile (Median) of standard deviation of attributes of the numeric type: 0.24
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: no value reported
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes; equals ClassEntropy divided by MeanMutualInformation): no value reported
The maximum number of distinct values among attributes of the nominal type: no value reported
Minimum skewness among attributes of the numeric type: -1.48
Percentage of instances having missing values: 0
Third quartile of entropy among attributes: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001: no value reported
Maximum skewness among attributes of the numeric type: 3.48
Minimum standard deviation of attributes of the numeric type: 0
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 2.93
Average class difference between consecutive instances: 0.7
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001: no value reported
Maximum standard deviation of attributes of the numeric type: 25084.79
Percentage of instances belonging to the least frequent class: no value reported
Percentage of numeric attributes: 99.57
Third quartile of means among attributes of the numeric type: 5.01
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001: no value reported
Average entropy of the attributes: no value reported
Number of instances belonging to the least frequent class: no value reported
Percentage of nominal attributes: 0.43
Third quartile of mutual information between the nominal attributes and the target attribute: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001: no value reported
Mean kurtosis among attributes of the numeric type: 2
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes: no value reported
First quartile of entropy among attributes: no value reported
Third quartile of skewness among attributes of the numeric type: 1.36
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001: no value reported
Mean of means among attributes of the numeric type: 365.2
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes: no value reported
First quartile of kurtosis among attributes of the numeric type: 0.72
Third quartile of standard deviation of attributes of the numeric type: 0.85
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001: no value reported
Average mutual information between the nominal attributes and the target attribute: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes: no value reported
First quartile of means among attributes of the numeric type: 1.28
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1: no value reported
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001: no value reported
An estimate of the amount of irrelevant information in the attributes regarding the class (equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation): no value reported
Number of binary attributes: 0
First quartile of mutual information between the nominal attributes and the target attribute: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Standard deviation of the number of distinct values among attributes of the nominal type: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001: no value reported
Average number of distinct values among the attributes of the nominal type: no value reported
First quartile of skewness among attributes of the numeric type: 0.03
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1: no value reported
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001: no value reported
Mean skewness among attributes of the numeric type: 0.71
First quartile of standard deviation of attributes of the numeric type: 0.16
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: no value reported
Error rate achieved by the landmarker weka.classifiers.lazy.IBk: no value reported
Percentage of instances belonging to the most frequent class: no value reported
Mean standard deviation of attributes of the numeric type: 302.4
Second quartile (Median) of entropy among attributes: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2: no value reported
Entropy of the target attribute values: no value reported
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk: no value reported
Number of instances belonging to the most frequent class: no value reported
Minimal entropy among attributes: no value reported
Second quartile (Median) of kurtosis among attributes of the numeric type: 1.78
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3: no value reported
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump: no value reported
Maximum entropy among attributes: no value reported
Minimum kurtosis among attributes of the numeric type: -1.37
Second quartile (Median) of means among attributes of the numeric type: 3.59
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3: no value reported
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump: no value reported
Maximum kurtosis among attributes of the numeric type: 13.11
Minimum of means among attributes of the numeric type: -1.55
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: no value reported
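Several of the simple properties above follow directly from the row and column counts. As a sanity check, the sketch below recomputes three of them from the reported counts (23 instances, 232 numeric attributes, 1 nominal attribute). The function name `simple_meta_features` and the dictionary keys are hypothetical, and rounding to two decimals is an assumption based on how the values are displayed.

```python
from typing import Dict


def simple_meta_features(n_rows: int, n_numeric: int, n_nominal: int) -> Dict[str, float]:
    """Recompute count-based meta-features, rounded to two decimals."""
    n_attrs = n_numeric + n_nominal
    return {
        "attributes": n_attrs,
        # Number of attributes divided by the number of instances.
        "dimensionality": round(n_attrs / n_rows, 2),
        "pct_numeric": round(100 * n_numeric / n_attrs, 2),
        "pct_nominal": round(100 * n_nominal / n_attrs, 2),
    }


features = simple_meta_features(n_rows=23, n_numeric=232, n_nominal=1)
print(features)
```

The results agree with the reported 10.13 attributes per instance, 99.57 percent numeric attributes, and 0.43 percent nominal attributes.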

12 tasks

2 runs - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
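The only task with runs uses a custom 10-fold cross-validation on pXC50. The actual fold assignment is not shown on the page, so the following is a generic k-fold sketch rather than the custom procedure itself. The function name `k_fold_indices` and the seed are hypothetical. It illustrates what 10 folds look like on a dataset of only 23 rows.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_rows: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering every row exactly once as test."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [order[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


for train, test in k_fold_indices(n_rows=23, k=10):
    print(len(train), len(test))
```

With 23 rows, three folds hold three test rows and seven hold two, so each per-fold error estimate rests on very few compounds.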