OpenML

qsar

Status: active. Format: ARFF. Visibility: public. Uploaded 27-01-2023 by Young Lee.

The QSAR biodegradation dataset was built by the Milano Chemometrics and QSAR Research Group. The research leading to these results received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] under Grant Agreement n. 238701 of the Marie Curie ITN Environmental Chemoinformatics (ECO) project.

The data have been used to develop QSAR (Quantitative Structure Activity Relationships) models for studying the relationship between chemical structure and the biodegradation of molecules. Experimental biodegradation values for 1055 chemicals were collected from the webpage of the National Institute of Technology and Evaluation of Japan (NITE). Classification models were developed to discriminate ready (356) from not ready (699) biodegradable molecules using three modelling methods: k Nearest Neighbours, Partial Least Squares Discriminant Analysis and Support Vector Machines.

Details on the attributes (molecular descriptors) selected in each model can be found in the quoted reference: Mansouri, K., Ringsted, T., Ballabio, D., Todeschini, R., Consonni, V. (2013). Quantitative Structure - Activity Relationship models for ready biodegradability of chemicals. Journal of Chemical Information and Modeling, 53, 867-878.

Source: https://archive.ics.uci.edu/ml/datasets/QSAR+biodegradation
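The class counts quoted above imply an imbalanced problem, so it can help to know what accuracy a trivial classifier would reach before judging any model. The following is a minimal sketch that uses only the two counts stated in the description; the function name `majority_baseline` and the label strings are illustrative assumptions, not part of the dataset.

```python
# Class counts as stated in the dataset description.
READY = 356
NOT_READY = 699


def majority_baseline(counts):
    """Return (label, accuracy) for always predicting the largest class."""
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    return label, counts[label] / total


# Label strings are placeholders; the actual values in the file may differ.
counts = {"ready": READY, "not ready": NOT_READY}
label, acc = majority_baseline(counts)
print(f"total={sum(counts.values())}, majority={label!r}, baseline accuracy={acc:.3f}")
```

Any classifier trained on this data would need to beat this baseline to be informative, and a class-aware metric such as per-class sensitivity is likely a better yardstick than raw accuracy.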

41 features

class (target)	string	2 unique values 0 missing
0	numeric	440 unique values 0 missing
1	numeric	1022 unique values 0 missing
7	numeric	188 unique values 0 missing
11	numeric	384 unique values 0 missing
12	numeric	756 unique values 0 missing
13	numeric	373 unique values 0 missing
14	numeric	510 unique values 0 missing
16	numeric	167 unique values 0 missing
17	numeric	125 unique values 0 missing
21	numeric	352 unique values 0 missing
26	numeric	329 unique values 0 missing
27	numeric	205 unique values 0 missing
29	numeric	470 unique values 0 missing
30	numeric	553 unique values 0 missing
35	numeric	705 unique values 0 missing
36	numeric	624 unique values 0 missing
38	numeric	862 unique values 0 missing
2	numeric	11 unique values 0 missing
4	numeric	16 unique values 0 missing
5	numeric	13 unique values 0 missing
6	numeric	15 unique values 0 missing
8	numeric	15 unique values 0 missing
9	numeric	12 unique values 0 missing
10	numeric	21 unique values 0 missing
15	numeric	24 unique values 0 missing
31	numeric	8 unique values 0 missing
32	numeric	11 unique values 0 missing
33	numeric	16 unique values 0 missing
37	numeric	8 unique values 0 missing
40	numeric	17 unique values 0 missing
39	nominal	5 unique values 0 missing
20	nominal	4 unique values 0 missing
28	nominal	2 unique values 0 missing
23	nominal	2 unique values 0 missing
3	nominal	4 unique values 0 missing
22	nominal	13 unique values 0 missing
34	nominal	8 unique values 0 missing
19	nominal	4 unique values 0 missing
25	nominal	4 unique values 0 missing
24	nominal	2 unique values 0 missing
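The listing above mixes numeric and nominal columns, which usually need different preprocessing (for example scaling versus categorical encoding). The sketch below groups the column names by the types shown in the listing. The helper `split_row` and the assumption that a record arrives as a dict of raw strings are hypothetical and only for illustration.

```python
# Column names grouped by the type shown in the feature listing.
NUMERIC = [
    "0", "1", "7", "11", "12", "13", "14", "16", "17", "21",
    "26", "27", "29", "30", "35", "36", "38", "2", "4", "5",
    "6", "8", "9", "10", "15", "31", "32", "33", "37", "40",
]
NOMINAL = ["39", "20", "28", "23", "3", "22", "34", "19", "25", "24"]
TARGET = "class"

ALL_COLUMNS = NUMERIC + NOMINAL + [TARGET]


def split_row(row):
    """Split one record (dict: column name -> raw string) into typed parts.

    Hypothetical helper: assumes every listed column is present in `row`.
    """
    numeric = {name: float(row[name]) for name in NUMERIC}
    nominal = {name: row[name] for name in NOMINAL}
    return numeric, nominal, row[TARGET]


# Indices in 0..40 that do not appear in the listing at all.
unlisted = sorted(
    (i for i in range(41) if str(i) not in NUMERIC and str(i) not in NOMINAL)
)
print(len(ALL_COLUMNS), unlisted)
```

The total matches the "41 features" heading (30 numeric, 10 nominal, plus the target). Note that index 18 does not appear among the listed column names; the page gives no explanation for this.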