QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL279

deactivated - ARFF - Publicly available - Visibility: public - Uploaded 15-07-2016 by Noureddin Sadawi


This dataset contains QSAR data (from ChEMBL version 17) giving activity values (unit: pseudo-pCI50) of several compounds against the drug target with ChEMBL_ID CHEMBL279 (TID: 10980). It has 5766 rows and 283 features, not counting the molecule identifier (molecule_id) and the class feature (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed using the median, and feature selection was also applied.
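The description says missing values were filled in with the median. As a minimal sketch of that idea, and not the uploader's actual pipeline, per-column median imputation could look like the following. The helper name `impute_median` and the sample values are hypothetical, and missing entries are assumed to be represented as `None`.

```python
from statistics import median


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Hypothetical descriptor column with two missing entries.
mw = [310.4, None, 452.1, 298.3, None]
imputed = impute_median(mw)
```

The median is less sensitive to outliers than the mean, which matters for descriptors with heavily skewed distributions.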

285 features

pXC50 (target) - numeric - 1382 unique values - 0 missing
molecule_id (row identifier) - nominal - 5766 unique values - 0 missing
Eig03_EA.ri. - numeric - 963 unique values - 0 missing
nCONN - numeric - 4 unique values - 0 missing
SpMaxA_EA - numeric - 166 unique values - 0 missing
C.041 - numeric - 6 unique values - 0 missing
IC5 - numeric - 1172 unique values - 0 missing
SM03_EA.ri. - numeric - 1147 unique values - 0 missing
SM02_AEA.bo. - numeric - 645 unique values - 0 missing
CATS2D_02_DD - numeric - 6 unique values - 0 missing
SM03_AEA.bo. - numeric - 957 unique values - 0 missing
ATSC1i - numeric - 990 unique values - 0 missing
XMOD - numeric - 4888 unique values - 0 missing
Eig01_EA.ed. - numeric - 1683 unique values - 0 missing
MPC04 - numeric - 150 unique values - 0 missing
SM10_AEA.dm. - numeric - 1683 unique values - 0 missing
CATS2D_03_DL - numeric - 17 unique values - 0 missing
SpMax_EA.ed. - numeric - 1683 unique values - 0 missing
nR05 - numeric - 5 unique values - 0 missing
N.072 - numeric - 7 unique values - 0 missing
SM06_EA - numeric - 1251 unique values - 0 missing
SpMin8_Bh.i. - numeric - 915 unique values - 0 missing
SpAD_AEA.ri. - numeric - 5212 unique values - 0 missing
SpMax7_Bh.e. - numeric - 954 unique values - 0 missing
Xt - numeric - 155 unique values - 0 missing
Eig04_EA.ri. - numeric - 1117 unique values - 0 missing
Eig13_EA.ri. - numeric - 1976 unique values - 0 missing
SdO - numeric - 2567 unique values - 0 missing
Eig05_EA - numeric - 1148 unique values - 0 missing
SpAD_EA.dm. - numeric - 1310 unique values - 0 missing
SM13_AEA.bo. - numeric - 1148 unique values - 0 missing
SpMin8_Bh.e. - numeric - 925 unique values - 0 missing
Sv - numeric - 3672 unique values - 0 missing
Eig13_AEA.dm. - numeric - 1833 unique values - 0 missing
SpMin5_Bh.i. - numeric - 866 unique values - 0 missing
Eig15_AEA.bo. - numeric - 1865 unique values - 0 missing
Eig06_AEA.dm. - numeric - 1467 unique values - 0 missing
Eig14_AEA.bo. - numeric - 1851 unique values - 0 missing
X5sol - numeric - 3286 unique values - 0 missing
SpMax1_Bh.s. - numeric - 386 unique values - 0 missing
SM11_EA.dm. - numeric - 256 unique values - 0 missing
SpMin8_Bh.p. - numeric - 958 unique values - 0 missing
BID - numeric - 204 unique values - 0 missing
SM04_EA.dm. - numeric - 1165 unique values - 0 missing
SpMax6_Bh.e. - numeric - 925 unique values - 0 missing
SM03_EA.ed. - numeric - 602 unique values - 0 missing
SpMaxA_AEA.ed. - numeric - 303 unique values - 0 missing
nR09 - numeric - 7 unique values - 0 missing
MW - numeric - 3579 unique values - 0 missing
ATS2s - numeric - 1353 unique values - 0 missing
ON1 - numeric - 438 unique values - 0 missing
Hypnotic.80 - numeric - 2 unique values - 0 missing
IC4 - numeric - 1188 unique values - 0 missing
nCb. - numeric - 16 unique values - 0 missing
SpAD_EA.bo. - numeric - 4214 unique values - 0 missing
Xindex - numeric - 363 unique values - 0 missing
SM13_EA.dm. - numeric - 234 unique values - 0 missing
UNIP - numeric - 304 unique values - 0 missing
ATSC2e - numeric - 1029 unique values - 0 missing
Eig08_AEA.ed. - numeric - 1661 unique values - 0 missing
Eta_B - numeric - 723 unique values - 0 missing
GGI1 - numeric - 29 unique values - 0 missing
GGI3 - numeric - 464 unique values - 0 missing
GGI9 - numeric - 542 unique values - 0 missing
Vindex - numeric - 283 unique values - 0 missing
Eig07_AEA.dm. - numeric - 1522 unique values - 0 missing
SpMax8_Bh.s. - numeric - 1450 unique values - 0 missing
O.058 - numeric - 7 unique values - 0 missing
SM10_EA.dm. - numeric - 761 unique values - 0 missing
NdO - numeric - 7 unique values - 0 missing
D.Dtr10 - numeric - 1212 unique values - 0 missing
SM06_EA.dm. - numeric - 1081 unique values - 0 missing
Eig11_AEA.ed. - numeric - 1686 unique values - 0 missing
Chi1_EA.dm. - numeric - 4007 unique values - 0 missing
BAC - numeric - 121 unique values - 0 missing
SM07_EA - numeric - 869 unique values - 0 missing
SM07_AEA.bo. - numeric - 1140 unique values - 0 missing
SpMax5_Bh.i. - numeric - 929 unique values - 0 missing
SpMax5_Bh.e. - numeric - 965 unique values - 0 missing
Eig06_EA - numeric - 1238 unique values - 0 missing
SM14_AEA.bo. - numeric - 1238 unique values - 0 missing
Eig04_EA - numeric - 1042 unique values - 0 missing
SM12_AEA.bo. - numeric - 1042 unique values - 0 missing
X2 - numeric - 3361 unique values - 0 missing
Eig10_AEA.dm. - numeric - 1673 unique values - 0 missing
SM04_AEA.bo. - numeric - 1005 unique values - 0 missing
Yindex - numeric - 723 unique values - 0 missing
Eig07_AEA.ed. - numeric - 1700 unique values - 0 missing
CATS2D_07_LL - numeric - 36 unique values - 0 missing
Eig09_AEA.ri. - numeric - 1517 unique values - 0 missing
SpAD_AEA.dm. - numeric - 4790 unique values - 0 missing
Eig10_AEA.bo. - numeric - 1435 unique values - 0 missing
Eig09_AEA.ed. - numeric - 1711 unique values - 0 missing
SpAD_AEA.bo. - numeric - 4186 unique values - 0 missing
Eig10_EA.bo. - numeric - 1672 unique values - 0 missing
SPI - numeric - 3714 unique values - 0 missing
ATSC4i - numeric - 1857 unique values - 0 missing
P_VSA_MR_6 - numeric - 3715 unique values - 0 missing
MSD - numeric - 3028 unique values - 0 missing
nHet - numeric - 22 unique values - 0 missing
ATSC7s - numeric - 5480 unique values - 0 missing
SpMin5_Bh.p. - numeric - 835 unique values - 0 missing
Eig09_EA.ed. - numeric - 2474 unique values - 0 missing
SM04_AEA.ri. - numeric - 2474 unique values - 0 missing
Eig09_AEA.bo. - numeric - 1310 unique values - 0 missing
X1sol - numeric - 2685 unique values - 0 missing
SM08_EA.bo. - numeric - 1482 unique values - 0 missing
Eig09_EA - numeric - 1364 unique values - 0 missing
SM03_AEA.dm. - numeric - 1364 unique values - 0 missing
Eig04_AEA.ri. - numeric - 1137 unique values - 0 missing
Eig11_EA - numeric - 1532 unique values - 0 missing
SM05_AEA.dm. - numeric - 1532 unique values - 0 missing
SM05_AEA.bo. - numeric - 1030 unique values - 0 missing
Eig12_AEA.ed. - numeric - 1681 unique values - 0 missing
MWC10 - numeric - 1353 unique values - 0 missing
SM08_AEA.bo. - numeric - 1205 unique values - 0 missing
Eig02_EA.bo. - numeric - 980 unique values - 0 missing
SM12_AEA.ri. - numeric - 980 unique values - 0 missing
DBI - numeric - 65 unique values - 0 missing
SRW04 - numeric - 173 unique values - 0 missing
MWC02 - numeric - 114 unique values - 0 missing
ZM1 - numeric - 114 unique values - 0 missing
ATSC2s - numeric - 5223 unique values - 0 missing
Eig07_EA.bo. - numeric - 1381 unique values - 0 missing
Eig09_EA.ri. - numeric - 1475 unique values - 0 missing
SRW06 - numeric - 675 unique values - 0 missing
IDMT - numeric - 4243 unique values - 0 missing
Eig10_EA - numeric - 1473 unique values - 0 missing
SM04_AEA.dm. - numeric - 1473 unique values - 0 missing
GGI10 - numeric - 430 unique values - 0 missing
SMTI - numeric - 3885 unique values - 0 missing
X2sol - numeric - 3462 unique values - 0 missing
Eig08_AEA.bo. - numeric - 1218 unique values - 0 missing
MWC04 - numeric - 570 unique values - 0 missing
P_VSA_LogP_2 - numeric - 754 unique values - 0 missing
Dz - numeric - 473 unique values - 0 missing
CSI - numeric - 1391 unique values - 0 missing
Xu - numeric - 3818 unique values - 0 missing
Eig05_EA.ed. - numeric - 2521 unique values - 0 missing
SM14_AEA.dm. - numeric - 2521 unique values - 0 missing
CID - numeric - 896 unique values - 0 missing
GMTI - numeric - 3902 unique values - 0 missing
SM06_EA.ri. - numeric - 1385 unique values - 0 missing
Eig09_EA.bo. - numeric - 1554 unique values - 0 missing
SpMAD_EA.dm. - numeric - 490 unique values - 0 missing
Eig08_EA.ed. - numeric - 2538 unique values - 0 missing
SM03_AEA.ri. - numeric - 2538 unique values - 0 missing
SM02_EA.ri. - numeric - 1071 unique values - 0 missing
Eig12_AEA.ri. - numeric - 1897 unique values - 0 missing
DECC - numeric - 1654 unique values - 0 missing
piID - numeric - 2950 unique values - 0 missing
Eig05_EA.ri. - numeric - 1245 unique values - 0 missing
SpMax3_Bh.v. - numeric - 720 unique values - 0 missing
Eig03_AEA.ri. - numeric - 941 unique values - 0 missing
SpMax6_Bh.v. - numeric - 956 unique values - 0 missing
SpMax8_Bh.v. - numeric - 1061 unique values - 0 missing
Psi_e_1 - numeric - 3111 unique values - 0 missing
Chi0_EA.bo. - numeric - 3369 unique values - 0 missing
SpMax5_Bh.v. - numeric - 987 unique values - 0 missing
SpMax3_Bh.p. - numeric - 757 unique values - 0 missing
SpDiam_AEA.dm. - numeric - 989 unique values - 0 missing
ATSC2i - numeric - 1567 unique values - 0 missing
P_VSA_e_2 - numeric - 4626 unique values - 0 missing
ICR - numeric - 1088 unique values - 0 missing
ATSC6s - numeric - 5497 unique values - 0 missing
Eig08_EA.bo. - numeric - 1426 unique values - 0 missing
Eig11_AEA.dm. - numeric - 1719 unique values - 0 missing
Chi1_EA.ed. - numeric - 2807 unique values - 0 missing
ZM1Kup - numeric - 4566 unique values - 0 missing
Eig07_EA.ri. - numeric - 1300 unique values - 0 missing
Chi0_EA.ed. - numeric - 3365 unique values - 0 missing
Eig01_AEA.dm. - numeric - 986 unique values - 0 missing
SpMax_AEA.dm. - numeric - 986 unique values - 0 missing
RDSQ - numeric - 4242 unique values - 0 missing
SM02_AEA.ed. - numeric - 263 unique values - 0 missing
Eig08_EA.ri. - numeric - 1380 unique values - 0 missing
MWC05 - numeric - 1032 unique values - 0 missing
Eig07_AEA.bo. - numeric - 1163 unique values - 0 missing
MAXDN - numeric - 2204 unique values - 0 missing
X4sol - numeric - 3387 unique values - 0 missing
ATSC6i - numeric - 1965 unique values - 0 missing
LPRS - numeric - 4213 unique values - 0 missing
SpMax7_Bh.v. - numeric - 970 unique values - 0 missing
ATS2v - numeric - 1084 unique values - 0 missing
ECC - numeric - 754 unique values - 0 missing
IDET - numeric - 4217 unique values - 0 missing
Eig07_EA - numeric - 1233 unique values - 0 missing
SM15_AEA.bo. - numeric - 1233 unique values - 0 missing
ON0 - numeric - 215 unique values - 0 missing
ATSC3i - numeric - 1719 unique values - 0 missing
Eig06_AEA.bo. - numeric - 1165 unique values - 0 missing
Eig02_EA.dm. - numeric - 177 unique values - 0 missing
Eig13_AEA.ed. - numeric - 1598 unique values - 0 missing
piPC04 - numeric - 1169 unique values - 0 missing
Eig15_AEA.ed. - numeric - 1596 unique values - 0 missing
IVDM - numeric - 616 unique values - 0 missing
ATS2m - numeric - 1066 unique values - 0 missing
SpMax8_Bh.m. - numeric - 1044 unique values - 0 missing
SpMin7_Bh.i. - numeric - 882 unique values - 0 missing
SM05_EA.ri. - numeric - 1404 unique values - 0 missing
ATSC5s - numeric - 5537 unique values - 0 missing
Eig05_AEA.ed. - numeric - 1461 unique values - 0 missing
Eig07_AEA.ri. - numeric - 1346 unique values - 0 missing
SM03_EA - numeric - 25 unique values - 0 missing
Eig06_EA.bo. - numeric - 1359 unique values - 0 missing
Chi1_AEA.bo. - numeric - 2830 unique values - 0 missing
Chi1_AEA.dm. - numeric - 2830 unique values - 0 missing
Chi1_AEA.ed. - numeric - 2830 unique values - 0 missing
Chi1_AEA.ri. - numeric - 2830 unique values - 0 missing
Chi1_EA - numeric - 2830 unique values - 0 missing
Eig08_EA - numeric - 1296 unique values - 0 missing
SM02_AEA.dm. - numeric - 1296 unique values - 0 missing
Eig10_AEA.ed. - numeric - 1713 unique values - 0 missing
Chi0_AEA.bo. - numeric - 1981 unique values - 0 missing
Chi0_AEA.dm. - numeric - 1981 unique values - 0 missing
Chi0_AEA.ed. - numeric - 1981 unique values - 0 missing
Chi0_AEA.ri. - numeric - 1981 unique values - 0 missing
Chi0_EA - numeric - 1981 unique values - 0 missing
Eta_alpha - numeric - 1457 unique values - 0 missing
SRW08 - numeric - 1096 unique values - 0 missing
Chi1_EA.bo. - numeric - 3541 unique values - 0 missing
SM04_EA - numeric - 330 unique values - 0 missing
Eig10_EA.ed. - numeric - 2435 unique values - 0 missing
SM05_AEA.ri. - numeric - 2435 unique values - 0 missing
Eig08_AEA.ri. - numeric - 1407 unique values - 0 missing
X1 - numeric - 2103 unique values - 0 missing
Eig04_AEA.bo. - numeric - 996 unique values - 0 missing
Eig07_EA.ed. - numeric - 2620 unique values - 0 missing
SM02_AEA.ri. - numeric - 2620 unique values - 0 missing
IDM - numeric - 1898 unique values - 0 missing
Eig01_AEA.ed. - numeric - 936 unique values - 0 missing
SpMax_AEA.ed. - numeric - 936 unique values - 0 missing
Eig04_EA.bo. - numeric - 1224 unique values - 0 missing
SM14_AEA.ri. - numeric - 1224 unique values - 0 missing
Eig08_AEA.dm. - numeric - 1561 unique values - 0 missing
Eig05_AEA.dm. - numeric - 1374 unique values - 0 missing
X5 - numeric - 3259 unique values - 0 missing
Chi0_EA.ri. - numeric - 4504 unique values - 0 missing
SM03_AEA.ed. - numeric - 949 unique values - 0 missing
AECC - numeric - 2127 unique values - 0 missing
Eig01_EA.bo. - numeric - 869 unique values - 0 missing
SM11_AEA.ri. - numeric - 869 unique values - 0 missing
SpMax_EA.bo. - numeric - 869 unique values - 0 missing
SM06_AEA.bo. - numeric - 1088 unique values - 0 missing
MPC03 - numeric - 103 unique values - 0 missing
Eig14_AEA.ri. - numeric - 2080 unique values - 0 missing
Eig12_EA - numeric - 1637 unique values - 0 missing
SM06_AEA.dm. - numeric - 1637 unique values - 0 missing
SNar - numeric - 315 unique values - 0 missing
X0sol - numeric - 1442 unique values - 0 missing
SpMax5_Bh.m. - numeric - 1001 unique values - 0 missing
IDDM - numeric - 595 unique values - 0 missing
Psi_i_0 - numeric - 4150 unique values - 0 missing
CATS2D_04_DL - numeric - 16 unique values - 0 missing
SpDiam_EA.bo. - numeric - 872 unique values - 0 missing
RDCHI - numeric - 1954 unique values - 0 missing
Chi1_EA.ri. - numeric - 4544 unique values - 0 missing
Eig12_EA.ed. - numeric - 2453 unique values - 0 missing
SM07_AEA.ri. - numeric - 2453 unique values - 0 missing
ATS4i - numeric - 1437 unique values - 0 missing
SpAD_EA - numeric - 4036 unique values - 0 missing
VvdwMG - numeric - 3047 unique values - 0 missing
Vx - numeric - 3047 unique values - 0 missing
Eig09_AEA.dm. - numeric - 1630 unique values - 0 missing
MWC06 - numeric - 1151 unique values - 0 missing
ATSC6m - numeric - 5213 unique values - 0 missing
SpMax5_Bh.p. - numeric - 990 unique values - 0 missing
SpMaxA_AEA.ri. - numeric - 214 unique values - 0 missing
Eig02_AEA.dm. - numeric - 1045 unique values - 0 missing
Eig14_AEA.dm. - numeric - 1932 unique values - 0 missing
ATS4m - numeric - 1305 unique values - 0 missing
piPC01 - numeric - 123 unique values - 0 missing
SCBO - numeric - 123 unique values - 0 missing
SpMaxA_AEA.bo. - numeric - 230 unique values - 0 missing
ATS5s - numeric - 1496 unique values - 0 missing
Eig13_AEA.bo. - numeric - 1794 unique values - 0 missing
SaasN - numeric - 779 unique values - 0 missing
SM05_AEA.ed. - numeric - 1243 unique values - 0 missing
ATS3v - numeric - 1194 unique values - 0 missing
Eta_beta - numeric - 216 unique values - 0 missing
SM05_EA.dm. - numeric - 283 unique values - 0 missing
MDDD - numeric - 3927 unique values - 0 missing
Pol - numeric - 85 unique values - 0 missing
ATSC3s - numeric - 5382 unique values - 0 missing
Eig10_EA.ri. - numeric - 1600 unique values - 0 missing

62 properties

Number of instances (rows) of the dataset: 5766
Number of attributes (columns) of the dataset: 285
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 284
Number of nominal attributes: 1
Maximum skewness among attributes of the numeric type: 8.09
Minimum standard deviation of attributes of the numeric type: 0.02
First quartile of entropy among attributes: (no value)
Third quartile of skewness among attributes of the numeric type: 0.54
Maximum standard deviation of attributes of the numeric type: 21502.69
Percentage of instances belonging to the least frequent class: (no value)
First quartile of kurtosis among attributes of the numeric type: 0.87
Third quartile of standard deviation of attributes of the numeric type: 2.65
Average entropy of the attributes: (no value)
Number of instances belonging to the least frequent class: (no value)
First quartile of means among attributes of the numeric type: 1.96
Standard deviation of the number of distinct values among attributes of the nominal type: (no value)
Mean kurtosis among attributes of the numeric type: 4.93
Number of binary attributes: 0
First quartile of mutual information between the nominal attributes and the target attribute: (no value)
Mean of means among attributes of the numeric type: 211.75
First quartile of skewness among attributes of the numeric type: -0.84
Average mutual information between the nominal attributes and the target attribute: (no value)
First quartile of standard deviation of attributes of the numeric type: 0.28
Average class difference between consecutive instances: -0.12
An estimate of the amount of irrelevant information in the attributes regarding the class, equal to (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: (no value)
Second quartile (Median) of entropy among attributes: (no value)
Entropy of the target attribute values: (no value)
Average number of distinct values among the attributes of the nominal type: (no value)
Second quartile (Median) of kurtosis among attributes of the numeric type: 1.69
Number of attributes divided by the number of instances: 0.05
Mean skewness among attributes of the numeric type: -0.01
Second quartile (Median) of means among attributes of the numeric type: 4.05
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes), equal to ClassEntropy divided by MeanMutualInformation: (no value)
Percentage of instances belonging to the most frequent class: (no value)
Mean standard deviation of attributes of the numeric type: 151.26
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: (no value)
Number of instances belonging to the most frequent class: (no value)
Minimal entropy among attributes: (no value)
Second quartile (Median) of skewness among attributes of the numeric type: -0.32
Maximum entropy among attributes: (no value)
Minimum kurtosis among attributes of the numeric type: -0.79
Percentage of binary attributes: 0
Second quartile (Median) of standard deviation of attributes of the numeric type: 0.54
Maximum kurtosis among attributes of the numeric type: 170.73
Minimum of means among attributes of the numeric type: 0.1
Percentage of instances having missing values: 0
Third quartile of entropy among attributes: (no value)
Maximum of means among attributes of the numeric type: 25866.96
Minimal mutual information between the nominal attributes and the target attribute: (no value)
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 3.71
Maximum mutual information between the nominal attributes and the target attribute: (no value)
The minimal number of distinct values among attributes of the nominal type: (no value)
Percentage of numeric attributes: 99.65
Third quartile of means among attributes of the numeric type: 11.56
The maximum number of distinct values among attributes of the nominal type: (no value)
Minimum skewness among attributes of the numeric type: -2.67
Percentage of nominal attributes: 0.35
Third quartile of mutual information between the nominal attributes and the target attribute: (no value)
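
Several of the properties above follow directly from the row and column counts. As a quick check, the sketch below recomputes the attribute-to-instance ratio and the numeric and nominal percentages from the counts reported on this page. The variable names are illustrative and are not OpenML identifiers.

```python
# Counts as reported in the properties list.
n_instances = 5766
n_attributes = 285
n_numeric = 284
n_nominal = 1

# Number of attributes divided by the number of instances.
dimensionality = n_attributes / n_instances

# Share of numeric and nominal attributes, in percent.
pct_numeric = 100 * n_numeric / n_attributes
pct_nominal = 100 * n_nominal / n_attributes

print(round(dimensionality, 2), round(pct_numeric, 2), round(pct_nominal, 2))
```

Rounded to two decimals, these give 0.05, 99.65 and 0.35, matching the listed values.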

12 tasks

1 run - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
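
The first task evaluates models on pXC50 with a custom 10-fold cross-validation. The page does not give the fold assignments, so the following is only a generic sketch of how 5766 rows could be split into 10 folds. The function name `k_fold_indices`, the shuffling and the seed are assumptions, not the task's actual splits.

```python
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 5766 rows, as reported for this dataset.
splits = list(k_fold_indices(5766, k=10))
```

Each row lands in exactly one test fold, so every molecule is predicted once by a model that did not see it during training.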