Dataset Information

SeMPI 2.0 - A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases.

ABSTRACT: Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products with resolved structures, the encoding biosynthetic gene clusters have not been identified yet. Of those secondary metabolites, the scaffolds of nonribosomal peptides and polyketides (type I modular) can be predicted due to their building block-like assembly. SeMPI v2 provides a comprehensive prediction pipeline, which includes the screening of the scaffold in publicly available natural compound databases. The screening algorithm was designed to detect homologous structures even for partial, incomplete clusters. The pipeline allows linking of gene clusters to known natural products and therefore also provides a metric to estimate the novelty of the cluster if a matching scaffold cannot be found. Whereas currently available tools attempt to provide comprehensive information about a wide range of gene clusters, SeMPI v2 aims to focus on precise predictions. Therefore, the cluster detection algorithm, including building block generation and domain substrate prediction, was thoroughly refined and benchmarked, to provide high-quality scaffold predictions. In a benchmark based on 559 gene clusters, SeMPI v2 achieved comparable or better results than antiSMASH v5. Additionally, the SeMPI v2 web server provides features that can help to further investigate a submitted gene cluster, such as the incorporation of a genome browser, and the possibility to modify a predicted scaffold in a workbench before the database screening.

SUBMITTER: Zierep PF

PROVIDER: S-EPMC7823522 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature

Publications

SeMPI 2.0 - A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases.

Zierep PF, Ceci AT, Dobrusin I, Rockwell-Kollmann SC, Günther S

Metabolites, 2020-12-29, issue 1

PMID: 33383692

Similar Datasets

Project description: NRPS-PKS is web-based software for analysing large multi-enzymatic, multi-domain megasynthases that are involved in the biosynthesis of pharmaceutically important natural products such as cyclosporin, rifamycin and erythromycin. NRPS-PKS has been developed based on a comprehensive analysis of the sequence and structural features of several experimentally characterized biosynthetic gene clusters. The results of these analyses have been organized as four integrated searchable databases for elucidating domain organization and substrate specificity of nonribosomal peptide synthetases and three types of polyketide synthases. These databases work as the backend of NRPS-PKS and provide the knowledge base for predicting domain organization and substrate specificity of uncharacterized NRPS/PKS clusters. Benchmarking on a large set of biosynthetic gene clusters has demonstrated that, apart from correct identification of NRPS and PKS domains, NRPS-PKS can also predict specificities of adenylation and acyltransferase domains with reasonably high accuracy. These features of NRPS-PKS make it a valuable resource for identification of natural products biosynthesized by NRPS/PKS gene clusters found in newly sequenced genomes. The training and test sets of gene clusters included in NRPS-PKS correlate information on 307 open reading frames, 2223 functional protein domains, 68 starter/extender precursors and their specific recognition motifs, and also the chemical structure of 101 natural products from four different families. NRPS-PKS is a unique resource which provides a user-friendly interface for correlating chemical structures of natural products with the domains and modules in the corresponding nonribosomal peptide synthetases or polyketide synthases. It also provides guidelines for domain/module swapping as well as site-directed mutagenesis experiments to engineer biosynthesis of novel natural products. NRPS-PKS can be accessed at http://www.nii.res.in/nrps-pks.html.

Project description: BACKGROUND: Filamentous fungi produce a vast amount of bioactive secondary metabolites (SMs) synthesized by, e.g., hybrid polyketide synthase-nonribosomal peptide synthetase enzymes (PKS-NRPS; NRPS-PKS). While their domain structure suggests a common ancestor with other SM proteins, their evolutionary origin and dynamics in fungi are still unclear. Recent rational engineering approaches highlighted the possibility of reassembling hybrids into chimeras, suggesting molecular recombination as a diversifying mechanism. RESULTS: Phylogenetic analysis of hybrids in 37 species, spanning 9 sections of Aspergillus and Penicillium chrysogenum, let us describe their dynamics throughout the genus Aspergillus. The tree topology indicates that three groups of PKS-NRPS as well as one group of NRPS-PKS hybrids developed independently from each other. Comparison to other SM genes led to the conclusion that hybrids in Aspergilli have several PKS ancestors; in contrast, hybrids are monophyletic when compared to available NRPS genes, with the exception of a small group of NRPSs. Our analysis also revealed that certain NRPS-likes are derived from NRPSs, suggesting that the NRPS/NRPS-like relationship is dynamic and proteins can diverge from one function to another. An extended phylogenetic analysis including bacterial and fungal taxa revealed multiple ancestors of hybrids. Homologous hybrids are present in all sections, which suggests frequent horizontal gene transfer between genera and a finite number of hybrids in fungi. CONCLUSION: Phylogenetic distances between hybrids provide us with evidence for their evolution: large inter-group distances indicate multiple independent events leading to the generation of hybrids, while short intra-group distances of hybrids from different taxonomic sections indicate frequent horizontal gene transfer. Our results are further supported by adding bacterial and fungal genera. Presence of related hybrid genes in all Ascomycetes suggests a frequent horizontal gene transfer between genera and a finite diversity of hybrids, also explaining their scarcity. The provided insights into relations of hybrids and other SM genes will serve in rational design of new hybrid enzymes.
