Dataset Information

Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR.


ABSTRACT: QSAR/QSPR (quantitative structure-activity/property relationship) modeling has been a prevalent approach in various, overlapping sub-fields of computational, medicinal and environmental chemistry for decades. The generation and selection of molecular descriptors is an essential part of this process. In typical QSAR workflows, the starting pool of molecular descriptors is rationalized based on filtering out descriptors which are (i) constant throughout the whole dataset, or (ii) very strongly correlated to another descriptor. While the former is fairly straightforward, the latter involves a level of subjectivity when deciding what exactly is considered to be a strong correlation. Despite that, most QSAR modeling studies do not report on this step. In this study, we examine in detail the effect of various possible descriptor intercorrelation limits on the resulting QSAR models. Statistical comparisons are carried out based on four case studies from contemporary QSAR literature, using a combined methodology based on sum of ranking differences (SRD) and analysis of variance (ANOVA).
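The two preselection filters named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the greedy, order-dependent removal rule, the use of the absolute Pearson correlation, the function names `pearson` and `preselect`, the toy descriptor values, and the limit of 0.95 are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def preselect(descriptors, limit=0.95):
    """Return the names of descriptors that survive both filters.

    descriptors: dict mapping a descriptor name to its list of values
                 (one value per molecule).
    limit:       hypothetical intercorrelation limit; a descriptor is dropped
                 if |r| with an already kept descriptor exceeds it.
    """
    kept = []
    for name, values in descriptors.items():
        # Filter (i): constant throughout the whole dataset.
        if max(values) == min(values):
            continue
        # Filter (ii): too strongly correlated with a descriptor kept earlier.
        if any(abs(pearson(values, descriptors[k])) > limit for k in kept):
            continue
        kept.append(name)
    return kept


# Toy data, invented for this example only.
toy = {
    "d_const": [1.0, 1.0, 1.0, 1.0, 1.0],
    "d_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # roughly 2 * d_a
    "d_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}

strict = preselect(toy, limit=0.95)  # drops d_const and d_b
loose = preselect(toy, limit=1.0)    # drops only d_const
```

Changing `limit` changes which descriptors reach the modeling step, which is the kind of choice whose effect the study examines. In this sketch the outcome also depends on the order in which descriptors are visited, since the first member of a correlated pair is the one retained.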

SUBMITTER: Racz A 

PROVIDER: S-EPMC6767540 | biostudies-literature | 2019 Aug

REPOSITORIES: biostudies-literature

Publications

Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR.

Rácz Anita, Bajusz Dávid, Héberger Károly

Molecular Informatics, 2019-04-04, issue 8-9


Similar Datasets

S-EPMC3805266 | biostudies-literature
S-EPMC4803518 | biostudies-literature
S-EPMC7549127 | biostudies-literature
S-EPMC7922354 | biostudies-literature
S-EPMC4902069 | biostudies-literature
S-EPMC7321456 | biostudies-literature
S-EPMC6824857 | biostudies-literature
S-EPMC4530125 | biostudies-literature
S-EPMC5801138 | biostudies-literature
S-EPMC10377864 | biostudies-literature