QSAR-DATASET-FOR-DRUG-TARGET-CHEMBL4860

deactivated - ARFF - Publicly available - Visibility: public - Uploaded 15-07-2016 by Noureddin Sadawi
This dataset contains QSAR data from ChEMBL version 17: activity values (in pseudo-pCI50 units) of several compounds against the drug target with ChEMBL_ID CHEMBL4860 (TID: 100304). It has 514 rows and 341 features, not counting the row identifier (molecule_id) and the target feature (pXC50). The features are molecular descriptors generated from SMILES strings. Missing values were imputed with the median, and feature selection was also applied.
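The description mentions median imputation without giving details. As a rough illustration only, the sketch below fills missing entries column by column with that column's median. The per-column choice, the function name `impute_median`, and the toy matrix are assumptions made for this example; they are not taken from the dataset or from the uploader's pipeline.

```python
from statistics import median

def impute_median(rows):
    """Return a copy of `rows` where each None is replaced by its column's median.

    Assumption: the median is computed per column over the observed values.
    """
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(median(observed))
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy descriptor matrix (4 molecules x 3 descriptors), not real data.
toy = [
    [1.0, None, 3.0],
    [2.0, 5.0, None],
    [None, 7.0, 9.0],
    [4.0, 9.0, 6.0],
]
imputed = impute_median(toy)
for row in imputed:
    print(row)
```

The published file already reports 0 missing values, so a step like this would only matter when rebuilding the descriptors from scratch.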

343 features

pXC50 (target) - numeric - 348 unique values - 0 missing
molecule_id (row identifier) - nominal - 514 unique values - 0 missing
SM15_AEA.bo. - numeric - 302 unique values - 0 missing
SpMax4_Bh.m. - numeric - 224 unique values - 0 missing
ATSC6m - numeric - 500 unique values - 0 missing
SpMax3_Bh.p. - numeric - 196 unique values - 0 missing
MPC01 - numeric - 89 unique values - 0 missing
SpMax3_Bh.e. - numeric - 214 unique values - 0 missing
MWC01 - numeric - 89 unique values - 0 missing
P_VSA_MR_7 - numeric - 80 unique values - 0 missing
nBO - numeric - 89 unique values - 0 missing
SpMax3_Bh.i. - numeric - 194 unique values - 0 missing
SRW02 - numeric - 89 unique values - 0 missing
Eig06_EA.bo. - numeric - 292 unique values - 0 missing
ATS1p - numeric - 367 unique values - 0 missing
nHM - numeric - 6 unique values - 0 missing
ZM1MulPer - numeric - 481 unique values - 0 missing
P_VSA_m_4 - numeric - 32 unique values - 0 missing
ZM1V - numeric - 277 unique values - 0 missing
SpMin3_Bh.e. - numeric - 175 unique values - 0 missing
RDCHI - numeric - 413 unique values - 0 missing
SpMin3_Bh.v. - numeric - 189 unique values - 0 missing
ATS8v - numeric - 430 unique values - 0 missing
SpMax2_Bh.e. - numeric - 203 unique values - 0 missing
ATSC7s - numeric - 498 unique values - 0 missing
nCb. - numeric - 21 unique values - 0 missing
ATS3p - numeric - 394 unique values - 0 missing
SpMax2_Bh.v. - numeric - 201 unique values - 0 missing
Eig14_AEA.bo. - numeric - 283 unique values - 0 missing
Eig02_EA.ri. - numeric - 215 unique values - 0 missing
SpMax7_Bh.s. - numeric - 260 unique values - 0 missing
SpMax4_Bh.e. - numeric - 235 unique values - 0 missing
Eig13_EA.bo. - numeric - 298 unique values - 0 missing
SpMin3_Bh.i. - numeric - 170 unique values - 0 missing
ATS6i - numeric - 414 unique values - 0 missing
Cl.089 - numeric - 5 unique values - 0 missing
IC5 - numeric - 383 unique values - 0 missing
nCL - numeric - 5 unique values - 0 missing
SpMin1_Bh.p. - numeric - 124 unique values - 0 missing
NsCl - numeric - 5 unique values - 0 missing
S2K - numeric - 434 unique values - 0 missing
P_VSA_e_4 - numeric - 5 unique values - 0 missing
CIC0 - numeric - 380 unique values - 0 missing
DLS_02 - numeric - 7 unique values - 0 missing
Xt - numeric - 133 unique values - 0 missing
N.070 - numeric - 4 unique values - 0 missing
SpMax6_Bh.p. - numeric - 293 unique values - 0 missing
SpMax3_Bh.v. - numeric - 205 unique values - 0 missing
GATS5m - numeric - 288 unique values - 0 missing
SpMax5_Bh.p. - numeric - 241 unique values - 0 missing
SpMin4_Bh.e. - numeric - 205 unique values - 0 missing
SpMin3_Bh.p. - numeric - 187 unique values - 0 missing
P_VSA_MR_1 - numeric - 87 unique values - 0 missing
SpMin3_Bh.s. - numeric - 204 unique values - 0 missing
Eig06_AEA.bo. - numeric - 260 unique values - 0 missing
SpMin2_Bh.v. - numeric - 151 unique values - 0 missing
SpMin8_Bh.e. - numeric - 278 unique values - 0 missing
LLS_02 - numeric - 7 unique values - 0 missing
SpMin7_Bh.v. - numeric - 300 unique values - 0 missing
P_VSA_i_2 - numeric - 445 unique values - 0 missing
SM12_EA.bo. - numeric - 339 unique values - 0 missing
SpMin4_Bh.i. - numeric - 201 unique values - 0 missing
ATS5v - numeric - 420 unique values - 0 missing
H.047 - numeric - 58 unique values - 0 missing
X1Kup - numeric - 468 unique values - 0 missing
SpMax5_Bh.m. - numeric - 284 unique values - 0 missing
Chi1_EA.bo. - numeric - 438 unique values - 0 missing
nSO2N - numeric - 3 unique values - 0 missing
ATS2v - numeric - 373 unique values - 0 missing
SpMin5_Bh.m. - numeric - 240 unique values - 0 missing
Eig14_AEA.dm. - numeric - 307 unique values - 0 missing
Rperim - numeric - 41 unique values - 0 missing
CATS2D_05_LL - numeric - 35 unique values - 0 missing
SpMax5_Bh.v. - numeric - 258 unique values - 0 missing
MATS5m - numeric - 215 unique values - 0 missing
SpMin3_Bh.m. - numeric - 164 unique values - 0 missing
SM14_EA - numeric - 362 unique values - 0 missing
NsssN - numeric - 6 unique values - 0 missing
SM07_EA - numeric - 331 unique values - 0 missing
Eig07_AEA.bo. - numeric - 274 unique values - 0 missing
SpMaxA_AEA.ri. - numeric - 121 unique values - 0 missing
SpMax6_Bh.m. - numeric - 314 unique values - 0 missing
ATSC6v - numeric - 498 unique values - 0 missing
D.Dtr06 - numeric - 422 unique values - 0 missing
SM05_EA.ri. - numeric - 377 unique values - 0 missing
SpMin2_Bh.i. - numeric - 154 unique values - 0 missing
nArNHR - numeric - 4 unique values - 0 missing
SpMin4_Bh.p. - numeric - 207 unique values - 0 missing
SM05_AEA.ed. - numeric - 362 unique values - 0 missing
Xindex - numeric - 204 unique values - 0 missing
MDDD - numeric - 431 unique values - 0 missing
NddssS - numeric - 3 unique values - 0 missing
X1MulPer - numeric - 460 unique values - 0 missing
P_VSA_s_1 - numeric - 5 unique values - 0 missing
Eig15_EA.ed. - numeric - 326 unique values - 0 missing
S.110 - numeric - 3 unique values - 0 missing
SM10_AEA.ri. - numeric - 326 unique values - 0 missing
SpMax3_Bh.m. - numeric - 245 unique values - 0 missing
ATS8p - numeric - 420 unique values - 0 missing
CATS2D_04_DA - numeric - 9 unique values - 0 missing
ATS6s - numeric - 411 unique values - 0 missing
Eig03_AEA.ri. - numeric - 254 unique values - 0 missing
PHI - numeric - 435 unique values - 0 missing
SpMin6_Bh.i. - numeric - 262 unique values - 0 missing
CENT - numeric - 420 unique values - 0 missing
SM08_EA.bo. - numeric - 362 unique values - 0 missing
CSI - numeric - 402 unique values - 0 missing
Vindex - numeric - 162 unique values - 0 missing
ATSC1i - numeric - 395 unique values - 0 missing
nCar - numeric - 28 unique values - 0 missing
ATS2s - numeric - 412 unique values - 0 missing
Eig03_EA - numeric - 230 unique values - 0 missing
X1Mad - numeric - 470 unique values - 0 missing
SM11_AEA.bo. - numeric - 230 unique values - 0 missing
ECC - numeric - 381 unique values - 0 missing
NRS - numeric - 9 unique values - 0 missing
SpMin1_Bh.i. - numeric - 117 unique values - 0 missing
Yindex - numeric - 270 unique values - 0 missing
SM10_EA - numeric - 371 unique values - 0 missing
IC3 - numeric - 409 unique values - 0 missing
SM06_AEA.ed. - numeric - 355 unique values - 0 missing
nAB - numeric - 23 unique values - 0 missing
SpMin4_Bh.s. - numeric - 244 unique values - 0 missing
SpMin1_Bh.e. - numeric - 117 unique values - 0 missing
ATS5p - numeric - 429 unique values - 0 missing
SpMax4_Bh.v. - numeric - 223 unique values - 0 missing
GMTIV - numeric - 485 unique values - 0 missing
piPC10 - numeric - 401 unique values - 0 missing
CATS2D_09_AL - numeric - 42 unique values - 0 missing
MATS1e - numeric - 250 unique values - 0 missing
Eig05_AEA.bo. - numeric - 254 unique values - 0 missing
SpMin2_Bh.s. - numeric - 151 unique values - 0 missing
Se - numeric - 431 unique values - 0 missing
SssCH2 - numeric - 444 unique values - 0 missing
ATS1e - numeric - 371 unique values - 0 missing
NaasC - numeric - 19 unique values - 0 missing
ZM1Mad - numeric - 454 unique values - 0 missing
Wap - numeric - 401 unique values - 0 missing
CATS2D_08_AL - numeric - 39 unique values - 0 missing
nCbH - numeric - 24 unique values - 0 missing
ATS7e - numeric - 426 unique values - 0 missing
SpMax2_Bh.m. - numeric - 221 unique values - 0 missing
P_VSA_e_1 - numeric - 73 unique values - 0 missing
SsCl - numeric - 167 unique values - 0 missing
P_VSA_m_1 - numeric - 73 unique values - 0 missing
IC2 - numeric - 403 unique values - 0 missing
P_VSA_s_2 - numeric - 94 unique values - 0 missing
CATS2D_03_LL - numeric - 39 unique values - 0 missing
P_VSA_v_1 - numeric - 73 unique values - 0 missing
SpMax5_Bh.i. - numeric - 254 unique values - 0 missing
SpMaxA_AEA.ed. - numeric - 148 unique values - 0 missing
MATS4s - numeric - 256 unique values - 0 missing
ATSC3p - numeric - 471 unique values - 0 missing
nArX - numeric - 5 unique values - 0 missing
SpMin4_Bh.m. - numeric - 193 unique values - 0 missing
SpMin5_Bh.i. - numeric - 244 unique values - 0 missing
Psi_e_0 - numeric - 490 unique values - 0 missing
NaaCH - numeric - 24 unique values - 0 missing
ATS1i - numeric - 373 unique values - 0 missing
SM06_EA.bo. - numeric - 367 unique values - 0 missing
Eta_F - numeric - 500 unique values - 0 missing
Eig04_EA.bo. - numeric - 247 unique values - 0 missing
nC - numeric - 65 unique values - 0 missing
SM14_AEA.ri. - numeric - 247 unique values - 0 missing
ATSC2p - numeric - 455 unique values - 0 missing
SpMax5_Bh.e. - numeric - 254 unique values - 0 missing
Eig02_AEA.ri. - numeric - 225 unique values - 0 missing
SpMin2_Bh.p. - numeric - 164 unique values - 0 missing
ALOGP - numeric - 457 unique values - 0 missing
C.006 - numeric - 14 unique values - 0 missing
ATSC3i - numeric - 444 unique values - 0 missing
SpMax7_Bh.e. - numeric - 287 unique values - 0 missing
CATS2D_06_AA - numeric - 14 unique values - 0 missing
nBnz - numeric - 8 unique values - 0 missing
SpMin6_Bh.e. - numeric - 247 unique values - 0 missing
Eig04_AEA.bo. - numeric - 241 unique values - 0 missing
SM04_EA.bo. - numeric - 360 unique values - 0 missing
SpMin6_Bh.v. - numeric - 271 unique values - 0 missing
PCD - numeric - 401 unique values - 0 missing
Eig07_EA.ed. - numeric - 303 unique values - 0 missing
SM02_AEA.ri. - numeric - 303 unique values - 0 missing
piPC03 - numeric - 340 unique values - 0 missing
Eig03_AEA.bo. - numeric - 226 unique values - 0 missing
SpMax2_Bh.p. - numeric - 202 unique values - 0 missing
SM05_EA.bo. - numeric - 330 unique values - 0 missing
Eig04_EA - numeric - 275 unique values - 0 missing
SM12_AEA.bo. - numeric - 275 unique values - 0 missing
SpMax7_Bh.v. - numeric - 306 unique values - 0 missing
Eig07_AEA.ri. - numeric - 318 unique values - 0 missing
ON1V - numeric - 436 unique values - 0 missing
IDDE - numeric - 332 unique values - 0 missing
SpMax8_Bh.s. - numeric - 259 unique values - 0 missing
ATS6m - numeric - 434 unique values - 0 missing
MATS1m - numeric - 187 unique values - 0 missing
Eig07_EA.bo. - numeric - 275 unique values - 0 missing
piPC04 - numeric - 354 unique values - 0 missing
SAtot - numeric - 454 unique values - 0 missing
piPC05 - numeric - 359 unique values - 0 missing
Eig03_EA.ed. - numeric - 249 unique values - 0 missing
SM12_AEA.dm. - numeric - 249 unique values - 0 missing
piID - numeric - 403 unique values - 0 missing
Eta_sh_x - numeric - 57 unique values - 0 missing
TRS - numeric - 35 unique values - 0 missing
nR06 - numeric - 10 unique values - 0 missing
SpMax7_Bh.p. - numeric - 293 unique values - 0 missing
SM15_EA.bo. - numeric - 341 unique values - 0 missing
SM03_EA.bo. - numeric - 168 unique values - 0 missing
Eig04_AEA.ri. - numeric - 287 unique values - 0 missing
TIC1 - numeric - 449 unique values - 0 missing
X1sol - numeric - 412 unique values - 0 missing
SpMax7_Bh.m. - numeric - 284 unique values - 0 missing
C.024 - numeric - 24 unique values - 0 missing
Eig04_EA.ed. - numeric - 284 unique values - 0 missing
SM13_AEA.dm. - numeric - 284 unique values - 0 missing
Eta_alpha - numeric - 370 unique values - 0 missing
GGI9 - numeric - 334 unique values - 0 missing
SpMax6_Bh.s. - numeric - 251 unique values - 0 missing
GATS1e - numeric - 290 unique values - 0 missing
ATSC7i - numeric - 476 unique values - 0 missing
DLS_cons - numeric - 63 unique values - 0 missing
HVcpx - numeric - 366 unique values - 0 missing
IC4 - numeric - 387 unique values - 0 missing
ON0 - numeric - 233 unique values - 0 missing
SM06_EA - numeric - 375 unique values - 0 missing
SpMax4_Bh.p. - numeric - 230 unique values - 0 missing
ATSC7m - numeric - 499 unique values - 0 missing
piPC07 - numeric - 382 unique values - 0 missing
MATS2s - numeric - 276 unique values - 0 missing
CATS2D_06_AL - numeric - 40 unique values - 0 missing
Eig05_EA - numeric - 280 unique values - 0 missing
SM13_AEA.bo. - numeric - 280 unique values - 0 missing
BID - numeric - 143 unique values - 0 missing
Eig10_AEA.dm. - numeric - 351 unique values - 0 missing
Eig03_AEA.ed. - numeric - 213 unique values - 0 missing
X5v - numeric - 486 unique values - 0 missing
SpMin5_Bh.e. - numeric - 242 unique values - 0 missing
ZM2Per - numeric - 484 unique values - 0 missing
MW - numeric - 435 unique values - 0 missing
XMOD - numeric - 468 unique values - 0 missing
IVDM - numeric - 315 unique values - 0 missing
SM09_EA.bo. - numeric - 362 unique values - 0 missing
GGI2 - numeric - 57 unique values - 0 missing
CID - numeric - 324 unique values - 0 missing
P_VSA_MR_6 - numeric - 339 unique values - 0 missing
TIC2 - numeric - 475 unique values - 0 missing
AECC - numeric - 406 unique values - 0 missing
ALOGP2 - numeric - 463 unique values - 0 missing
Eig09_AEA.dm. - numeric - 313 unique values - 0 missing
SpMax2_Bh.i. - numeric - 187 unique values - 0 missing
CATS2D_05_AL - numeric - 34 unique values - 0 missing
ATSC6i - numeric - 461 unique values - 0 missing
Xu - numeric - 429 unique values - 0 missing
X2sol - numeric - 431 unique values - 0 missing
ATSC6p - numeric - 493 unique values - 0 missing
ATS7p - numeric - 421 unique values - 0 missing
Eig12_AEA.dm. - numeric - 307 unique values - 0 missing
Chi1_AEA.bo. - numeric - 416 unique values - 0 missing
Chi1_AEA.dm. - numeric - 416 unique values - 0 missing
Chi1_AEA.ed. - numeric - 416 unique values - 0 missing
Chi1_AEA.ri. - numeric - 416 unique values - 0 missing
Chi1_EA - numeric - 416 unique values - 0 missing
S1K - numeric - 412 unique values - 0 missing
P_VSA_p_3 - numeric - 446 unique values - 0 missing
P_VSA_v_3 - numeric - 446 unique values - 0 missing
Chi0_EA.ri. - numeric - 470 unique values - 0 missing
IDDM - numeric - 320 unique values - 0 missing
X1 - numeric - 395 unique values - 0 missing
ZM1Per - numeric - 479 unique values - 0 missing
GGI5 - numeric - 328 unique values - 0 missing
SRW10 - numeric - 364 unique values - 0 missing
nSK - numeric - 84 unique values - 0 missing
IDET - numeric - 431 unique values - 0 missing
MSD - numeric - 423 unique values - 0 missing
MAXDP - numeric - 441 unique values - 0 missing
Psi_i_0 - numeric - 454 unique values - 0 missing
TIC3 - numeric - 458 unique values - 0 missing
Eig15_EA - numeric - 276 unique values - 0 missing
SM09_AEA.dm. - numeric - 276 unique values - 0 missing
P_VSA_s_4 - numeric - 275 unique values - 0 missing
SM11_EA.bo. - numeric - 343 unique values - 0 missing
ATSC8i - numeric - 467 unique values - 0 missing
IDE - numeric - 376 unique values - 0 missing
CIC1 - numeric - 386 unique values - 0 missing
LPRS - numeric - 431 unique values - 0 missing
Uindex - numeric - 428 unique values - 0 missing
Eig02_EA.bo. - numeric - 211 unique values - 0 missing
SM12_AEA.ri. - numeric - 211 unique values - 0 missing
SpMax8_Bh.p. - numeric - 274 unique values - 0 missing
X2 - numeric - 422 unique values - 0 missing
SM05_EA - numeric - 165 unique values - 0 missing
SM10_EA.bo. - numeric - 346 unique values - 0 missing
Eig11_AEA.dm. - numeric - 297 unique values - 0 missing
Chi1_EA.ri. - numeric - 489 unique values - 0 missing
SpAD_AEA.dm. - numeric - 485 unique values - 0 missing
X1v - numeric - 472 unique values - 0 missing
X0v - numeric - 452 unique values - 0 missing
TPC - numeric - 349 unique values - 0 missing
ATS6v - numeric - 422 unique values - 0 missing
Chi0_EA.bo. - numeric - 422 unique values - 0 missing
Dz - numeric - 288 unique values - 0 missing
ATS1m - numeric - 376 unique values - 0 missing
IAC - numeric - 427 unique values - 0 missing
TIC0 - numeric - 427 unique values - 0 missing
ICR - numeric - 340 unique values - 0 missing
ATSC8m - numeric - 494 unique values - 0 missing
IDMT - numeric - 431 unique values - 0 missing
SM07_EA.bo. - numeric - 365 unique values - 0 missing
Psi_i_1 - numeric - 473 unique values - 0 missing
ATS7m - numeric - 442 unique values - 0 missing
ATS1v - numeric - 367 unique values - 0 missing
Eta_betaS - numeric - 165 unique values - 0 missing
MWC02 - numeric - 162 unique values - 0 missing
ZM1 - numeric - 162 unique values - 0 missing
SpMax1_Bh.p. - numeric - 167 unique values - 0 missing
SpMax6_Bh.e. - numeric - 284 unique values - 0 missing
ATS6e - numeric - 428 unique values - 0 missing
ATS2p - numeric - 381 unique values - 0 missing
ATSC8p - numeric - 494 unique values - 0 missing
ZM2Mad - numeric - 474 unique values - 0 missing
SMTI - numeric - 430 unique values - 0 missing
ATS2m - numeric - 391 unique values - 0 missing
Chi0_AEA.bo. - numeric - 388 unique values - 0 missing
Chi0_AEA.dm. - numeric - 388 unique values - 0 missing
Chi0_AEA.ed. - numeric - 388 unique values - 0 missing
Chi0_AEA.ri. - numeric - 388 unique values - 0 missing
Chi0_EA - numeric - 388 unique values - 0 missing
Eig07_AEA.ed. - numeric - 273 unique values - 0 missing
SpMax7_Bh.i. - numeric - 292 unique values - 0 missing
Eta_F_A - numeric - 356 unique values - 0 missing
nCIC - numeric - 11 unique values - 0 missing
TIC5 - numeric - 416 unique values - 0 missing
SpMax6_Bh.i. - numeric - 274 unique values - 0 missing
SpMaxA_EA - numeric - 100 unique values - 0 missing
TIC4 - numeric - 426 unique values - 0 missing
SpMin2_Bh.e. - numeric - 161 unique values - 0 missing
X3v - numeric - 491 unique values - 0 missing
CATS2D_09_AA - numeric - 12 unique values - 0 missing
Eig15_AEA.dm. - numeric - 289 unique values - 0 missing
SpMax8_Bh.e. - numeric - 282 unique values - 0 missing
SpMax4_Bh.i. - numeric - 215 unique values - 0 missing
IDM - numeric - 392 unique values - 0 missing
X2v - numeric - 485 unique values - 0 missing
Eig07_EA - numeric - 302 unique values - 0 missing

62 properties

Number of instances (rows) of the dataset: 514
Number of attributes (columns) of the dataset: 343
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 342
Number of nominal attributes: 1
Average entropy of the attributes: not reported
Number of instances belonging to the least frequent class: not reported
First quartile of means among attributes of the numeric type: 2.92
Standard deviation of the number of distinct values among attributes of the nominal type: not reported
Mean kurtosis among attributes of the numeric type: 5.92
Number of binary attributes: 0
First quartile of mutual information between the nominal attributes and the target attribute: not reported
Mean of means among attributes of the numeric type: 3711.5
First quartile of skewness among attributes of the numeric type: -0.83
Average mutual information between the nominal attributes and the target attribute: not reported
First quartile of standard deviation of attributes of the numeric type: 0.31
Average class difference between consecutive instances: -0.26
Estimate of the amount of irrelevant information in the attributes regarding the class, equal to (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: not reported
Second quartile (median) of entropy among attributes: not reported
Entropy of the target attribute values: not reported
Average number of distinct values among the attributes of the nominal type: not reported
Second quartile (median) of kurtosis among attributes of the numeric type: 2.03
Number of attributes divided by the number of instances: 0.67
Mean skewness among attributes of the numeric type: 0.74
Second quartile (median) of means among attributes of the numeric type: 4.83
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes), equal to ClassEntropy divided by MeanMutualInformation: not reported
Mean standard deviation of attributes of the numeric type: 11067.21
Second quartile (median) of mutual information between the nominal attributes and the target attribute: not reported
Percentage of instances belonging to the most frequent class: not reported
Minimal entropy among attributes: not reported
Second quartile (median) of skewness among attributes of the numeric type: 0.26
Number of instances belonging to the most frequent class: not reported
Minimum kurtosis among attributes of the numeric type: -1.28
Percentage of binary attributes: 0
Second quartile (median) of standard deviation of attributes of the numeric type: 0.67
Maximum entropy among attributes: not reported
Maximum kurtosis among attributes of the numeric type: 87.36
Minimum of means among attributes of the numeric type: 0.01
Percentage of instances having missing values: 0
Third quartile of entropy among attributes: not reported
Maximum of means among attributes of the numeric type: 407401.94
Minimal mutual information between the nominal attributes and the target attribute: not reported
Percentage of missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 11.31
Maximum mutual information between the nominal attributes and the target attribute: not reported
Minimal number of distinct values among attributes of the nominal type: not reported
Percentage of numeric attributes: 99.71
Third quartile of means among attributes of the numeric type: 23.41
Maximum number of distinct values among attributes of the nominal type: not reported
Minimum skewness among attributes of the numeric type: -3.54
Percentage of nominal attributes: 0.29
Third quartile of mutual information between the nominal attributes and the target attribute: not reported
Maximum skewness among attributes of the numeric type: 7.92
Minimum standard deviation of attributes of the numeric type: 0.02
First quartile of entropy among attributes: not reported
Third quartile of skewness among attributes of the numeric type: 2.82
Maximum standard deviation of attributes of the numeric type: 1525118.66
Percentage of instances belonging to the least frequent class: not reported
First quartile of kurtosis among attributes of the numeric type: 0.73
Third quartile of standard deviation of attributes of the numeric type: 13.81
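Several of the properties above follow directly from the row and column counts. The short sketch below recomputes three of them as a consistency check; the variable names are chosen for this example, and only the four counts are taken from the listing.

```python
# Counts reported in the property listing.
n_instances = 514   # rows
n_attributes = 343  # columns, including molecule_id and pXC50
n_numeric = 342
n_nominal = 1

# "Number of attributes divided by the number of instances."
dimensionality = n_attributes / n_instances

# "Percentage of numeric attributes." and "Percentage of nominal attributes."
pct_numeric = 100 * n_numeric / n_attributes
pct_nominal = 100 * n_nominal / n_attributes

print(round(dimensionality, 2), round(pct_numeric, 2), round(pct_nominal, 2))
# prints: 0.67 99.71 0.29
```

The rounded results agree with the reported 0.67, 99.71 and 0.29.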

12 tasks

2 runs - estimation_procedure: Custom 10-fold Crossvalidation - target_feature: pXC50
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
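The first task above evaluates models with 10-fold cross-validation on pXC50. The sketch below shows how 514 rows could be partitioned into 10 folds in general. It is a generic shuffled split written for illustration: the "Custom" procedure presumably relies on splits supplied with the task, which are not reproduced here, and the function name `k_fold_indices` and the seed are assumptions.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering range(n_rows).

    Generic shuffled k-fold split; not the task's own predefined folds.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(514, k=10)
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)
```

With 514 rows, four folds hold 52 test rows and six hold 51, and every row appears in exactly one test fold.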