
Dataset Information


Assigning the Origin of Microbial Natural Products by Chemical Space Map and Machine Learning.


ABSTRACT: Microbial natural products (NPs) are an important source of drugs; however, their structural diversity remains poorly understood. Here we used our recently reported MinHashed Atom Pair fingerprint with a diameter of four bonds (MAP4), a fingerprint suitable for molecules of very different sizes, to analyze the Natural Products Atlas (NPAtlas), a database of 25,523 NPs of bacterial or fungal origin. To visualize NPAtlas by MAP4 similarity, we used the dimensionality reduction method tree map (TMAP). The resulting interactive map organizes molecules by physico-chemical properties and by compound families such as peptides and glycosides. Remarkably, the map separates bacterial and fungal NPs from one another, revealing that these two compound families are intrinsically different despite their related biosynthetic pathways. We used these differences to train a machine learning model capable of distinguishing between NPs of bacterial and fungal origin.
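
The MinHash idea behind a fingerprint such as MAP4 can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names `minhash_signature` and `estimated_jaccard` are hypothetical, and the "atom-pair" strings are toy placeholders rather than real MAP4 output. It only shows how two sets of substructure strings can be compared through fixed-length signatures.

```python
import hashlib


def minhash_signature(shingles, num_perm=128):
    """Return a MinHash signature for a non-empty set of strings.

    For each of `num_perm` salted hash functions, keep the smallest
    64-bit hash value observed over all items in the set.
    """
    if not shingles:
        raise ValueError("shingles must be a non-empty set of strings")
    signature = []
    for seed in range(num_perm):
        # A different salt per seed gives an independent hash function.
        salt = seed.to_bytes(8, "little")
        min_value = min(
            int.from_bytes(
                hashlib.blake2b(
                    item.encode("utf-8"), digest_size=8, salt=salt
                ).digest(),
                "big",
            )
            for item in shingles
        )
        signature.append(min_value)
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Toy placeholder strings (assumption): "environment|distance|environment".
    mol_a = {"C|1|C", "C|2|O", "C|3|N", "O|4|N"}
    mol_b = {"C|1|C", "C|2|O", "C|3|N", "N|2|N"}
    sig_a = minhash_signature(mol_a, num_perm=256)
    sig_b = minhash_signature(mol_b, num_perm=256)
    # The exact Jaccard similarity of the two toy sets is 3/5.
    print(estimated_jaccard(sig_a, sig_b))
```

Because every signature has the same length regardless of how many strings a molecule produces, small and large molecules can be compared on equal footing, which matches the abstract's description of a fingerprint suited to very different molecule sizes.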

SUBMITTER: Capecchi A 

PROVIDER: S-EPMC7600738 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature


Publications

Assigning the Origin of Microbial Natural Products by Chemical Space Map and Machine Learning.

Alice Capecchi, Jean-Louis Reymond

Biomolecules, 2020-09-28; 10



Similar Datasets

| S-EPMC7603480 | biostudies-literature
| S-EPMC2696019 | biostudies-literature
| S-EPMC8169684 | biostudies-literature
| S-EPMC7659707 | biostudies-literature
| S-EPMC1297657 | biostudies-literature
| S-EPMC8153233 | biostudies-literature
| S-EPMC8238997 | biostudies-literature
| S-EPMC6726486 | biostudies-literature
2019-01-19 | GSE125362 | GEO