QSAR_fish_toxicity

Status: active. Format: ARFF. License: CC BY 4.0. Visibility: public. Uploaded 22-12-2022 by Sebastian Fischer.
Tags: Life Science, Machine Learning, study_353

Data Description

Data set containing values for 6 molecular descriptors of 908 chemicals, used to develop quantitative regression QSAR models that predict acute aquatic toxicity towards the fish Pimephales promelas (fathead minnow). The model response is LC50, the concentration that causes death in 50% of test fish over a test duration of 96 hours.

Attribute Description

The data set comprises 6 molecular descriptors and 1 response:

1. *CIC0* - information indices
2. *SM1_Dz* - 2D matrix-based descriptors
3. *GATS1i* - 2D autocorrelations
4. *NdsCH* - atom-type counts
5. *NdssC* - atom-type counts
6. *MLOGP* - molecular properties
7. *LC50* - quantitative response, LC50 [-LOG(mol/L)], target feature
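Because the target is stored as LC50 [-LOG(mol/L)], larger values correspond to lower lethal concentrations, that is, to more toxic chemicals. The sketch below converts a target value back to a concentration. It assumes the logarithm is base 10 (the description only says "LOG"); the function names and the molar mass argument are hypothetical and not part of the dataset.

```python
def lc50_to_mol_per_l(neg_log_lc50: float) -> float:
    """Convert an LC50 value from -log10(mol/L) back to mol/L.

    Assumption: the dataset's "-LOG" is a base-10 logarithm.
    """
    return 10.0 ** (-neg_log_lc50)


def lc50_to_mg_per_l(neg_log_lc50: float, molar_mass_g_per_mol: float) -> float:
    """Convert to mg/L given a molar mass (not provided by the dataset)."""
    mol_per_l = lc50_to_mol_per_l(neg_log_lc50)
    # mol/L * g/mol = g/L; multiply by 1000 to get mg/L.
    return mol_per_l * molar_mass_g_per_mol * 1000.0


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    print(lc50_to_mol_per_l(3.0))
    print(lc50_to_mg_per_l(3.0, 100.0))
```

A chemical with a target value of 5 is therefore lethal at a concentration 100 times lower than one with a target value of 3.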

7 features

LC50 (target): numeric, 827 unique values, 0 missing
CIC0: numeric, 502 unique values, 0 missing
SM1_Dz: numeric, 186 unique values, 0 missing
GATS1i: numeric, 557 unique values, 0 missing
NdsCH: numeric, 5 unique values, 0 missing
NdssC: numeric, 7 unique values, 0 missing
MLOGP: numeric, 559 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 908
Number of attributes (columns) of the dataset: 7
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 7
Number of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value reported)
Percentage of instances belonging to the least frequent class: (no value reported)
Number of instances belonging to the least frequent class: (no value reported)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: -0.43
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 100
Percentage of instances belonging to the most frequent class: (no value reported)
Percentage of nominal attributes: 0
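Several of these properties follow directly from the counts. The sketch below recomputes a few of them as a sanity check. The formulas are a plain reading of the property descriptions, not a confirmed reproduction of how OpenML computes them; in particular, taking the percentage of missing values relative to all cells is an assumption, and the function name is hypothetical.

```python
def dataset_properties(n_instances: int, n_numeric: int,
                       n_nominal: int, n_missing_values: int) -> dict:
    """Derive simple ratio properties from basic dataset counts."""
    n_attributes = n_numeric + n_nominal
    n_cells = n_instances * n_attributes
    return {
        # Number of attributes divided by the number of instances.
        "dimensionality": n_attributes / n_instances,
        "percent_numeric": 100.0 * n_numeric / n_attributes,
        "percent_nominal": 100.0 * n_nominal / n_attributes,
        # Assumption: missing values are counted relative to all cells.
        "percent_missing_values": 100.0 * n_missing_values / n_cells,
    }


# Counts reported for this dataset: 908 rows, 7 numeric columns, no missing values.
props = dataset_properties(n_instances=908, n_numeric=7,
                           n_nominal=0, n_missing_values=0)

if __name__ == "__main__":
    for name, value in props.items():
        print(f"{name}: {value:.2f}")
```

The unrounded ratio 7 / 908 is about 0.0077, which matches the listed 0.01 after rounding to two decimals.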

2 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: LC50
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: LC50
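The two estimation procedures differ only in how often the 10-fold split is repeated. The sketch below generates train/test index splits for both using the standard library. It illustrates the general technique only: the function names, the seeding scheme, and the shuffling are assumptions, and the resulting folds will not match the fixed splits that OpenML defines for its tasks.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train, test) index lists for one shuffled k-fold split."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def repeated_k_fold_indices(n_instances, k=10, repeats=10, seed=0):
    """Yield (train, test) splits for `repeats` independent k-fold splits."""
    for r in range(repeats):
        # A different seed per repeat gives a different shuffle each time.
        yield from k_fold_indices(n_instances, k=k, seed=seed + r)


if __name__ == "__main__":
    single = list(k_fold_indices(908, k=10))
    repeated = list(repeated_k_fold_indices(908, k=10, repeats=10))
    print(len(single), len(repeated))
```

With 908 instances and 10 folds, eight test folds hold 91 instances and two hold 90, and the repeated procedure evaluates a model on 100 splits instead of 10.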