Dataset Information

CASSIS and SMIPS: promoter-based prediction of secondary metabolite gene clusters in eukaryotic genomes.


ABSTRACT: Secondary metabolites (SM) are structurally diverse natural products of high pharmaceutical importance. Genes involved in their biosynthesis are often organized in clusters, i.e., are co-localized and co-expressed. In silico cluster prediction in eukaryotic genomes remains problematic mainly due to the high variability of the clusters' content and lack of other distinguishing sequence features. We present Cluster Assignment by Islands of Sites (CASSIS), a method for SM cluster prediction in eukaryotic genomes, and Secondary Metabolites by InterProScan (SMIPS), a tool for genome-wide detection of SM key enzymes ('anchor' genes): polyketide synthases, non-ribosomal peptide synthetases and dimethylallyl tryptophan synthases. Unlike other tools based on protein similarity, CASSIS exploits the idea of co-regulation of the cluster genes, which assumes the existence of common regulatory patterns in the cluster promoters. The method searches for 'islands' of enriched cluster-specific motifs in the vicinity of anchor genes. It was validated in a series of cross-validation experiments and showed high sensitivity and specificity. CASSIS and SMIPS are freely available at https://sbi.hki-jena.de/cassis. Contact: thomas.wolf@leibniz-hki.de or ekaterina.shelest@leibniz-hki.de. Supplementary data are available at Bioinformatics online.

SUBMITTER: Wolf T 

PROVIDER: S-EPMC4824125 | biostudies-literature | 2016 Apr

REPOSITORIES: biostudies-literature

Publications

CASSIS and SMIPS: promoter-based prediction of secondary metabolite gene clusters in eukaryotic genomes.

Wolf T, Shelest V, Nath N, Shelest E

Bioinformatics (Oxford, England), 2015-12-09, issue 8


Motivation: Secondary metabolites (SM) are structurally diverse natural products of high pharmaceutical importance. Genes involved in their biosynthesis are often organized in clusters, i.e., are co-localized and co-expressed. In silico cluster prediction in eukaryotic genomes remains problematic mainly due to the high variability of the clusters' content and lack of other distinguishing sequence features. Results: We present Cluster Assignment by Islands of Sites (CASSIS), a method ...

Similar Datasets

| S-EPMC3538241 | biostudies-literature
| S-EPMC8070225 | biostudies-literature
| S-EPMC4338041 | biostudies-literature
| S-EPMC7719232 | biostudies-literature
| S-EPMC5561558 | biostudies-literature
| S-EPMC3044413 | biostudies-other
| S-EPMC3818861 | biostudies-other
| S-EPMC8574697 | biostudies-literature
| S-EPMC8336260 | biostudies-literature
| S-EPMC6950998 | biostudies-literature