Dataset Information

SYBA: Bayesian estimation of synthetic accessibility of organic compounds.


ABSTRACT: SYBA (SYnthetic Bayesian Accessibility) is a fragment-based method for the rapid classification of organic compounds as easy-to-synthesize (ES) or hard-to-synthesize (HS). It is based on a Bernoulli naïve Bayes classifier that assigns SYBA score contributions to individual fragments according to their frequencies in databases of ES and HS molecules. SYBA was trained on ES molecules available in the ZINC15 database and on HS molecules generated by the Nonpher methodology. SYBA was compared with a random forest, which served as a baseline method, and with two other methods for synthetic accessibility assessment: SAScore and SCScore. When all methods are used with their suggested thresholds, SYBA improves on random forest classification, albeit marginally, and outperforms SAScore and SCScore. However, once the SAScore threshold is optimized (it changes from 6.0 to ~4.5), SAScore yields results similar to those of SYBA. Because SYBA is based solely on fragment contributions, it can be used to analyze how individual molecular parts contribute to a compound's synthetic accessibility. SYBA is publicly available at https://github.com/lich-uct/syba under the GNU General Public License.
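
The fragment-scoring idea described in the abstract can be illustrated with a minimal sketch of a Bernoulli naïve Bayes log-odds score. This is not the SYBA implementation: the function names `fit_fragment_scores` and `score_molecule`, the toy fragment labels, the Laplace smoothing parameter `alpha`, the equal class priors, and the decision threshold of 0 are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_fragment_scores(es_molecules, hs_molecules, alpha=1.0):
    """Estimate per-fragment log-odds contributions (hypothetical helper).

    Each molecule is given as a set of fragment labels. For every fragment
    the function returns a pair (score_if_present, score_if_absent).
    Laplace smoothing (alpha) is an assumption that avoids log(0).
    """
    n_es, n_hs = len(es_molecules), len(hs_molecules)
    # Count in how many molecules of each class a fragment occurs.
    es_counts = Counter(f for mol in es_molecules for f in set(mol))
    hs_counts = Counter(f for mol in hs_molecules for f in set(mol))
    scores = {}
    for frag in set(es_counts) | set(hs_counts):
        p_es = (es_counts[frag] + alpha) / (n_es + 2 * alpha)
        p_hs = (hs_counts[frag] + alpha) / (n_hs + 2 * alpha)
        scores[frag] = (
            math.log(p_es / p_hs),              # fragment present
            math.log((1 - p_es) / (1 - p_hs)),  # fragment absent
        )
    return scores


def score_molecule(fragments, scores):
    """Sum the contributions of all known fragments (equal priors assumed)."""
    present = set(fragments)
    return sum(on if frag in present else off
               for frag, (on, off) in scores.items())


# Toy training data: fragment labels are illustrative strings only.
es_train = [{"c1ccccc1", "C(=O)O"}, {"c1ccccc1", "CN"}, {"C(=O)O", "CN"}]
hs_train = [{"C1CC2CC12", "CN"}, {"C1CC2CC12", "C(=O)O"}, {"C1CC2CC12"}]

fragment_scores = fit_fragment_scores(es_train, hs_train)
easy_score = score_molecule({"c1ccccc1", "CN"}, fragment_scores)
hard_score = score_molecule({"C1CC2CC12"}, fragment_scores)
print(f"easy-like molecule: {easy_score:.2f}")
print(f"hard-like molecule: {hard_score:.2f}")
```

In this sketch a positive total suggests an ES-like molecule and a negative total an HS-like one. Because the total is a plain sum, each fragment's term can be inspected on its own, which mirrors the per-fragment analysis mentioned in the abstract.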

SUBMITTER: Vorsilak M 

PROVIDER: S-EPMC7238540 | biostudies-literature | 2020 May

REPOSITORIES: biostudies-literature

Publications

SYBA: Bayesian estimation of synthetic accessibility of organic compounds.

Voršilák M, Kolář M, Čmelo I, Svozil D

Journal of Cheminformatics, 2020-05-20, 1


Similar Datasets

2024-07-18 | MODEL2407180002 | BioModels
2024-07-18 | MODEL2407180004 | BioModels
| S-EPMC3225829 | biostudies-other
| S-EPMC8479809 | biostudies-literature
| S-EPMC10864934 | biostudies-literature
2024-03-26 | GSE234010 | GEO
| S-EPMC7664059 | biostudies-literature
| S-EPMC8486869 | biostudies-literature
| S-EPMC3855595 | biostudies-literature
| S-EPMC3745480 | biostudies-other