QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL1914272

Status: deactivated; format: ARFF; publicly available; visibility: public; uploaded 16-07-2016 by Noureddin Sadawi
This dataset contains QSAR data (from ChEMBL version 17) giving activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL ID CHEMBL1914272 (TID: 104368). It has 38 rows and 286 features, not counting the molecule identifier (molecule_id) and the class feature (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the median, and feature selection was also applied.
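The median imputation mentioned in the description can be sketched as follows. This is a minimal illustration, not the uploader's actual preprocessing: it assumes imputation is done per feature column and that missing entries are represented as None. The function name impute_median and the sample values are hypothetical and are not taken from the dataset.

```python
from statistics import median


def impute_median(column):
    """Replace None entries in a numeric column with the median of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to take a median from")
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Hypothetical descriptor column with two gaps (illustrative values only).
raw = [1.2, None, 3.4, 2.0, None]
imputed = impute_median(raw)
print(imputed)  # [1.2, 2.0, 3.4, 2.0, 2.0]
```

The median is less sensitive to outliers than the mean, which may matter for descriptors whose values span several orders of magnitude.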

288 features

pXC50 (target, numeric): 11 unique values, 0 missing
molecule_id (row identifier, nominal): 38 unique values, 0 missing
ATS8i (numeric): 33 unique values, 0 missing
P_VSA_s_3 (numeric): 31 unique values, 0 missing
ATS8m (numeric): 37 unique values, 0 missing
SaaN (numeric): 33 unique values, 0 missing
ATS8p (numeric): 35 unique values, 0 missing
BLTA96 (numeric): 30 unique values, 0 missing
ATS8s (numeric): 37 unique values, 0 missing
BLTD48 (numeric): 30 unique values, 0 missing
ATS8v (numeric): 34 unique values, 0 missing
BLTF96 (numeric): 31 unique values, 0 missing
ATSC1e (numeric): 33 unique values, 0 missing
MLOGP (numeric): 31 unique values, 0 missing
ATSC1i (numeric): 34 unique values, 0 missing
MLOGP2 (numeric): 32 unique values, 0 missing
ATSC1m (numeric): 35 unique values, 0 missing
P_VSA_e_3 (numeric): 15 unique values, 0 missing
ATSC1p (numeric): 35 unique values, 0 missing
P_VSA_i_4 (numeric): 21 unique values, 0 missing
ATSC1s (numeric): 37 unique values, 0 missing
GATS2m (numeric): 33 unique values, 0 missing
ATSC1v (numeric): 34 unique values, 0 missing
GATS4p (numeric): 36 unique values, 0 missing
ATSC2e (numeric): 33 unique values, 0 missing
N.075 (numeric): 6 unique values, 0 missing
ATSC2i (numeric): 33 unique values, 0 missing
NaaN (numeric): 6 unique values, 0 missing
ATSC2m (numeric): 35 unique values, 0 missing
NaasC (numeric): 8 unique values, 0 missing
ATSC2p (numeric): 35 unique values, 0 missing
P_VSA_MR_7 (numeric): 19 unique values, 0 missing
ATSC2s (numeric): 38 unique values, 0 missing
X3 (numeric): 31 unique values, 0 missing
ATSC2v (numeric): 35 unique values, 0 missing
Yindex (numeric): 30 unique values, 0 missing
ATSC3e (numeric): 36 unique values, 0 missing
AECC (numeric): 30 unique values, 0 missing
ATSC3i (numeric): 38 unique values, 0 missing
Eig15_AEA.bo. (numeric): 25 unique values, 0 missing
ATSC3m (numeric): 38 unique values, 0 missing
HVcpx (numeric): 31 unique values, 0 missing
ATSC3p (numeric): 38 unique values, 0 missing
MSD (numeric): 31 unique values, 0 missing
ATSC3s (numeric): 38 unique values, 0 missing
ZM2Kup (numeric): 37 unique values, 0 missing
ATSC3v (numeric): 38 unique values, 0 missing
Eig06_AEA.bo. (numeric): 23 unique values, 0 missing
ATSC4e (numeric): 37 unique values, 0 missing
Eig06_AEA.ed. (numeric): 18 unique values, 0 missing
ATSC4i (numeric): 35 unique values, 0 missing
Eig06_EA.bo. (numeric): 24 unique values, 0 missing
ATSC4m (numeric): 38 unique values, 0 missing
Eig06_EA.ed. (numeric): 21 unique values, 0 missing
ATSC4p (numeric): 38 unique values, 0 missing
Eta_betaP (numeric): 11 unique values, 0 missing
ATSC4s (numeric): 38 unique values, 0 missing
GATS6p (numeric): 31 unique values, 0 missing
ATSC4v (numeric): 38 unique values, 0 missing
IDDE (numeric): 28 unique values, 0 missing
ATSC5e (numeric): 35 unique values, 0 missing
MATS6p (numeric): 33 unique values, 0 missing
ATSC5i (numeric): 35 unique values, 0 missing
MATS6v (numeric): 33 unique values, 0 missing
nBM (numeric): 10 unique values, 0 missing
SM15_AEA.dm. (numeric): 21 unique values, 0 missing
SpMax1_Bh.e. (numeric): 20 unique values, 0 missing
SpMax1_Bh.i. (numeric): 19 unique values, 0 missing
SpMax1_Bh.v. (numeric): 16 unique values, 0 missing
Uc (numeric): 10 unique values, 0 missing
GATS1m (numeric): 34 unique values, 0 missing
P_VSA_m_2 (numeric): 33 unique values, 0 missing
ALOGP (numeric): 35 unique values, 0 missing
ALOGP2 (numeric): 35 unique values, 0 missing
Hypertens.80 (numeric): 2 unique values, 0 missing
Chi0_AEA.bo. (numeric): 27 unique values, 0 missing
Chi0_AEA.dm. (numeric): 27 unique values, 0 missing
Chi0_AEA.ed. (numeric): 27 unique values, 0 missing
Chi0_AEA.ri. (numeric): 27 unique values, 0 missing
Chi0_EA (numeric): 27 unique values, 0 missing
Chi0_EA.ed. (numeric): 31 unique values, 0 missing
Chi0_EA.ri. (numeric): 38 unique values, 0 missing
CSI (numeric): 30 unique values, 0 missing
Eig14_EA (numeric): 21 unique values, 0 missing
Eig14_EA.ri. (numeric): 26 unique values, 0 missing
Eig15_AEA.ri. (numeric): 29 unique values, 0 missing
Eig15_EA (numeric): 25 unique values, 0 missing
Eig15_EA.bo. (numeric): 26 unique values, 0 missing
Eig15_EA.ed. (numeric): 26 unique values, 0 missing
Eig15_EA.ri. (numeric): 30 unique values, 0 missing
Eta_betaS (numeric): 25 unique values, 0 missing
SM08_AEA.dm. (numeric): 21 unique values, 0 missing
SM09_AEA.dm. (numeric): 25 unique values, 0 missing
SM10_AEA.ri. (numeric): 26 unique values, 0 missing
SMTIV (numeric): 36 unique values, 0 missing
UNIP (numeric): 26 unique values, 0 missing
Eta_sh_p (numeric): 30 unique values, 0 missing
ZM1Kup (numeric): 34 unique values, 0 missing
Eig15_AEA.dm. (numeric): 31 unique values, 0 missing
GMTIV (numeric): 35 unique values, 0 missing
MW (numeric): 34 unique values, 0 missing
XMOD (numeric): 36 unique values, 0 missing
D.Dtr05 (numeric): 31 unique values, 0 missing
Eig14_EA.ed. (numeric): 26 unique values, 0 missing
Eig15_AEA.ed. (numeric): 27 unique values, 0 missing
JGI9 (numeric): 8 unique values, 0 missing
MATS8v (numeric): 35 unique values, 0 missing
SIC3 (numeric): 27 unique values, 0 missing
SM09_AEA.ri. (numeric): 26 unique values, 0 missing
Vindex (numeric): 25 unique values, 0 missing
Xindex (numeric): 26 unique values, 0 missing
Eta_beta (numeric): 29 unique values, 0 missing
Eta_FL (numeric): 37 unique values, 0 missing
Eig01_EA.bo. (numeric): 11 unique values, 0 missing
SM11_AEA.ri. (numeric): 11 unique values, 0 missing
SpDiam_EA.bo. (numeric): 11 unique values, 0 missing
SpMax1_Bh.p. (numeric): 13 unique values, 0 missing
SpMax_EA.bo. (numeric): 11 unique values, 0 missing
GATS6v (numeric): 35 unique values, 0 missing
IDE (numeric): 31 unique values, 0 missing
SM12_EA.bo. (numeric): 29 unique values, 0 missing
SM13_EA.bo. (numeric): 29 unique values, 0 missing
SM14_EA.bo. (numeric): 29 unique values, 0 missing
SM15_EA.bo. (numeric): 29 unique values, 0 missing
X0Av (numeric): 30 unique values, 0 missing
Eig06_AEA.dm. (numeric): 30 unique values, 0 missing
Eig12_AEA.dm. (numeric): 27 unique values, 0 missing
MATS4p (numeric): 36 unique values, 0 missing
MATS6m (numeric): 35 unique values, 0 missing
N.068 (numeric): 2 unique values, 0 missing
NsssN (numeric): 2 unique values, 0 missing
SsssN (numeric): 8 unique values, 0 missing
P_VSA_LogP_6 (numeric): 10 unique values, 0 missing
SpMax1_Bh.m. (numeric): 18 unique values, 0 missing
D.Dtr06 (numeric): 31 unique values, 0 missing
P_VSA_LogP_5 (numeric): 25 unique values, 0 missing
X5 (numeric): 31 unique values, 0 missing
X5sol (numeric): 27 unique values, 0 missing
Eta_beta_A (numeric): 32 unique values, 0 missing
ICR (numeric): 26 unique values, 0 missing
Chi0_EA.bo. (numeric): 31 unique values, 0 missing
Dz (numeric): 23 unique values, 0 missing
ECC (numeric): 30 unique values, 0 missing
IDDM (numeric): 28 unique values, 0 missing
IDMT (numeric): 31 unique values, 0 missing
LPRS (numeric): 31 unique values, 0 missing
H.051 (numeric): 9 unique values, 0 missing
SpMin1_Bh.i. (numeric): 12 unique values, 0 missing
ATS1s (numeric): 36 unique values, 0 missing
SpMin1_Bh.v. (numeric): 17 unique values, 0 missing
CENT (numeric): 30 unique values, 0 missing
Chi1_AEA.bo. (numeric): 31 unique values, 0 missing
Chi1_AEA.dm. (numeric): 31 unique values, 0 missing
Chi1_AEA.ed. (numeric): 31 unique values, 0 missing
Chi1_AEA.ri. (numeric): 31 unique values, 0 missing
Chi1_EA (numeric): 31 unique values, 0 missing
Chi1_EA.ri. (numeric): 38 unique values, 0 missing
CID (numeric): 26 unique values, 0 missing
Eig07_AEA.bo. (numeric): 21 unique values, 0 missing
Eig07_AEA.ed. (numeric): 13 unique values, 0 missing
Eig07_EA (numeric): 19 unique values, 0 missing
Eig07_EA.bo. (numeric): 18 unique values, 0 missing
Eig07_EA.ed. (numeric): 17 unique values, 0 missing
Eig09_EA.ed. (numeric): 26 unique values, 0 missing
Eta_F (numeric): 38 unique values, 0 missing
GMTI (numeric): 31 unique values, 0 missing
IDET (numeric): 31 unique values, 0 missing
IVDM (numeric): 25 unique values, 0 missing
MPC01 (numeric): 13 unique values, 0 missing
MWC01 (numeric): 13 unique values, 0 missing
nBO (numeric): 13 unique values, 0 missing
ON1 (numeric): 27 unique values, 0 missing
RDCHI (numeric): 31 unique values, 0 missing
RDSQ (numeric): 31 unique values, 0 missing
S0K (numeric): 26 unique values, 0 missing
SM02_AEA.ri. (numeric): 17 unique values, 0 missing
SM04_AEA.ri. (numeric): 26 unique values, 0 missing
SM15_AEA.bo. (numeric): 19 unique values, 0 missing
SMTI (numeric): 31 unique values, 0 missing
SpAD_AEA.ed. (numeric): 31 unique values, 0 missing
SpAD_AEA.ri. (numeric): 38 unique values, 0 missing
SpAD_EA (numeric): 31 unique values, 0 missing
SpMaxA_AEA.bo. (numeric): 23 unique values, 0 missing
SpMaxA_AEA.ed. (numeric): 23 unique values, 0 missing
SpMaxA_AEA.ri. (numeric): 21 unique values, 0 missing
SpMaxA_EA (numeric): 19 unique values, 0 missing
SpMaxA_EA.bo. (numeric): 19 unique values, 0 missing
SpMaxA_EA.ed. (numeric): 20 unique values, 0 missing
SpMaxA_EA.ri. (numeric): 15 unique values, 0 missing
SRW02 (numeric): 13 unique values, 0 missing
X1 (numeric): 29 unique values, 0 missing
Xu (numeric): 31 unique values, 0 missing
ZM2Per (numeric): 36 unique values, 0 missing
Eig13_AEA.dm. (numeric): 34 unique values, 0 missing
GATS6s (numeric): 36 unique values, 0 missing
SpMin3_Bh.s. (numeric): 24 unique values, 0 missing
nHAcc (numeric): 8 unique values, 0 missing
SaaO (numeric): 20 unique values, 0 missing
MATS6s (numeric): 35 unique values, 0 missing
TIC1 (numeric): 33 unique values, 0 missing
GATS3e (numeric): 37 unique values, 0 missing
IAC (numeric): 34 unique values, 0 missing
S1K (numeric): 31 unique values, 0 missing
TIC0 (numeric): 34 unique values, 0 missing
Eig14_AEA.ri. (numeric): 27 unique values, 0 missing
SpMin1_Bh.e. (numeric): 10 unique values, 0 missing
IC2 (numeric): 32 unique values, 0 missing
MATS1s (numeric): 36 unique values, 0 missing
MPC09 (numeric): 29 unique values, 0 missing
MPC10 (numeric): 29 unique values, 0 missing
X1Mad (numeric): 37 unique values, 0 missing
MATS6e (numeric): 34 unique values, 0 missing
ATSC6e (numeric): 36 unique values, 0 missing
Eig13_AEA.bo. (numeric): 23 unique values, 0 missing
Eig13_AEA.ri. (numeric): 33 unique values, 0 missing
Eig13_EA.bo. (numeric): 31 unique values, 0 missing
Eig13_EA.ri. (numeric): 34 unique values, 0 missing
Eig14_AEA.bo. (numeric): 22 unique values, 0 missing
Eig14_AEA.dm. (numeric): 33 unique values, 0 missing
Eig14_EA.bo. (numeric): 26 unique values, 0 missing
nAB (numeric): 6 unique values, 0 missing
P_VSA_MR_2 (numeric): 23 unique values, 0 missing
SpAD_AEA.bo. (numeric): 32 unique values, 0 missing
SpAD_EA.ri. (numeric): 38 unique values, 0 missing
IC1 (numeric): 32 unique values, 0 missing
Chi1_EA.ed. (numeric): 31 unique values, 0 missing
Eig10_EA (numeric): 28 unique values, 0 missing
Eig13_EA (numeric): 28 unique values, 0 missing
Eig13_EA.ed. (numeric): 30 unique values, 0 missing
NRS (numeric): 3 unique values, 0 missing
Rperim (numeric): 6 unique values, 0 missing
SM04_AEA.dm. (numeric): 28 unique values, 0 missing
SM07_AEA.dm. (numeric): 28 unique values, 0 missing
SM08_AEA.ri. (numeric): 30 unique values, 0 missing
SNar (numeric): 20 unique values, 0 missing
X4 (numeric): 30 unique values, 0 missing
Xt (numeric): 20 unique values, 0 missing
MATS1v (numeric): 28 unique values, 0 missing
Eig01_AEA.bo. (numeric): 13 unique values, 0 missing
SpDiam_AEA.bo. (numeric): 14 unique values, 0 missing
SpMax_AEA.bo. (numeric): 13 unique values, 0 missing
SpMax4_Bh.i. (numeric): 28 unique values, 0 missing
AAC (numeric): 33 unique values, 0 missing
AMR (numeric): 35 unique values, 0 missing
AMW (numeric): 34 unique values, 0 missing
ARR (numeric): 19 unique values, 0 missing
ATS1e (numeric): 31 unique values, 0 missing
ATS1i (numeric): 33 unique values, 0 missing
ATS1m (numeric): 31 unique values, 0 missing
ATS1p (numeric): 34 unique values, 0 missing
ATS1v (numeric): 31 unique values, 0 missing
ATS2e (numeric): 32 unique values, 0 missing
ATS2i (numeric): 34 unique values, 0 missing
ATS2m (numeric): 33 unique values, 0 missing
ATS2p (numeric): 32 unique values, 0 missing
ATS2s (numeric): 37 unique values, 0 missing
ATS2v (numeric): 34 unique values, 0 missing
ATS3e (numeric): 35 unique values, 0 missing
ATS3i (numeric): 29 unique values, 0 missing
ATS3m (numeric): 35 unique values, 0 missing
ATS3p (numeric): 35 unique values, 0 missing
ATS3s (numeric): 34 unique values, 0 missing
ATS3v (numeric): 33 unique values, 0 missing
ATS4e (numeric): 38 unique values, 0 missing
ATS4i (numeric): 35 unique values, 0 missing
ATS4m (numeric): 37 unique values, 0 missing
ATS4p (numeric): 35 unique values, 0 missing
ATS4s (numeric): 37 unique values, 0 missing
ATS4v (numeric): 35 unique values, 0 missing
ATS5e (numeric): 37 unique values, 0 missing
ATS5i (numeric): 36 unique values, 0 missing
ATS5m (numeric): 36 unique values, 0 missing
ATS5p (numeric): 32 unique values, 0 missing
ATS5s (numeric): 36 unique values, 0 missing
ATS5v (numeric): 35 unique values, 0 missing
ATS6e (numeric): 33 unique values, 0 missing
ATS6i (numeric): 34 unique values, 0 missing
ATS6m (numeric): 36 unique values, 0 missing
ATS6p (numeric): 34 unique values, 0 missing
ATS6s (numeric): 36 unique values, 0 missing
ATS6v (numeric): 35 unique values, 0 missing
ATS7e (numeric): 35 unique values, 0 missing
ATS7i (numeric): 32 unique values, 0 missing
ATS7m (numeric): 34 unique values, 0 missing
ATS7p (numeric): 34 unique values, 0 missing
ATS7s (numeric): 35 unique values, 0 missing
ATS7v (numeric): 31 unique values, 0 missing
ATS8e (numeric): 36 unique values, 0 missing

62 properties

Number of instances (rows) of the dataset: 38
Number of attributes (columns) of the dataset: 288
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 287
Number of nominal attributes: 1
Entropy of the target attribute values: not reported
Estimate of the amount of irrelevant information in the attributes regarding the class, equal to (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: not reported
Second quartile (median) of entropy among attributes: not reported
Number of attributes divided by the number of instances: 7.58
Average number of distinct values among the attributes of the nominal type: not reported
Second quartile (median) of kurtosis among attributes of the numeric type: 0.32
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes), equal to ClassEntropy divided by MeanMutualInformation: not reported
Mean skewness among attributes of the numeric type: -0.18
Second quartile (median) of means among attributes of the numeric type: 4.52
Percentage of instances belonging to the most frequent class: not reported
Mean standard deviation of attributes of the numeric type: 111.26
Second quartile (median) of mutual information between the nominal attributes and the target attribute: not reported
Number of instances belonging to the most frequent class: not reported
Minimal entropy among attributes: not reported
Second quartile (median) of skewness among attributes of the numeric type: -0.39
Maximum entropy among attributes: not reported
Minimum kurtosis among attributes of the numeric type: -2.11
Percentage of binary attributes: 0
Second quartile (median) of standard deviation of attributes of the numeric type: 0.3
Maximum kurtosis among attributes of the numeric type: 9.79
Minimum of means among attributes of the numeric type: -5.45
Percentage of instances having missing values: 0
Third quartile of entropy among attributes: not reported
Maximum of means among attributes of the numeric type: 34971.47
Minimal mutual information between the nominal attributes and the target attribute: not reported
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 1.21
Maximum mutual information between the nominal attributes and the target attribute: not reported
Minimal number of distinct values among attributes of the nominal type: not reported
Percentage of numeric attributes: 99.65
Third quartile of means among attributes of the numeric type: 16.87
Maximum number of distinct values among attributes of the nominal type: not reported
Minimum skewness among attributes of the numeric type: -2.71
Percentage of nominal attributes: 0.35
Third quartile of mutual information between the nominal attributes and the target attribute: not reported
Maximum skewness among attributes of the numeric type: 2.54
Minimum standard deviation of attributes of the numeric type: 0
First quartile of entropy among attributes: not reported
Third quartile of skewness among attributes of the numeric type: 0.45
Maximum standard deviation of attributes of the numeric type: 11166.01
Percentage of instances belonging to the least frequent class: not reported
First quartile of kurtosis among attributes of the numeric type: -0.18
Third quartile of standard deviation of attributes of the numeric type: 1.71
Average entropy of the attributes: not reported
Number of instances belonging to the least frequent class: not reported
First quartile of means among attributes of the numeric type: 1.34
Standard deviation of the number of distinct values among attributes of the nominal type: not reported
Mean kurtosis among attributes of the numeric type: 0.73
Number of binary attributes: 0
First quartile of mutual information between the nominal attributes and the target attribute: not reported
Mean of means among attributes of the numeric type: 408.54
First quartile of skewness among attributes of the numeric type: -0.85
Average class difference between consecutive instances: 0.3
Average mutual information between the nominal attributes and the target attribute: not reported
First quartile of standard deviation of attributes of the numeric type: 0.1
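
Several of the properties above are simple ratios of the listed counts, and two are defined by explicit formulas. The sketch below recomputes the ratios from the counts on this page and writes the two formulas as functions. The function and variable names are hypothetical; the page lists no values for the entropy-based properties, so the functions are shown only as definitions.

```python
# Counts taken from the property list on this page.
n_instances = 38
n_attributes = 288
n_numeric = 287
n_nominal = 1

# Number of attributes divided by the number of instances.
dimensionality = n_attributes / n_instances
# Percentages of numeric and nominal attributes.
pct_numeric = 100 * n_numeric / n_attributes
pct_nominal = 100 * n_nominal / n_attributes


def noise_to_signal(mean_attribute_entropy, mean_mutual_information):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    return (mean_attribute_entropy - mean_mutual_information) / mean_mutual_information


def equivalent_number_of_attributes(class_entropy, mean_mutual_information):
    """ClassEntropy / MeanMutualInformation."""
    return class_entropy / mean_mutual_information


print(round(dimensionality, 2), round(pct_numeric, 2), round(pct_nominal, 2))
# 7.58 99.65 0.35
```

The recomputed ratios agree with the listed values of 7.58, 99.65 and 0.35.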

12 tasks

2 runs - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
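
For orientation, a generic 10-fold cross-validation split over the 38 rows might look like the sketch below. This is an assumption-laden illustration: the task above uses a custom 10-fold procedure whose fold assignments are not shown on this page, so the shuffled round-robin split, the function name k_fold_indices and the seed are all hypothetical.

```python
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Return k (train, test) index pairs covering every row exactly once as test data."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(38, k=10)
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)  # [4, 4, 4, 4, 4, 4, 4, 4, 3, 3]
```

With only 38 instances, each test fold holds three or four molecules, so per-fold error estimates for pXC50 are likely to be noisy.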