CASSIS and SMIPS: promoter-based prediction of secondary metabolite gene clusters in eukaryotic genomes.
ABSTRACT: Secondary metabolites (SM) are structurally diverse natural products of high pharmaceutical importance. Genes involved in their biosynthesis are often organized in clusters, i.e., are co-localized and co-expressed. In silico cluster prediction in eukaryotic genomes remains problematic mainly due to the high variability of the clusters' content and lack of other distinguishing sequence features. We present Cluster Assignment by Islands of Sites (CASSIS), a method for SM cluster prediction in eukaryotic genomes, and Secondary Metabolites by InterProScan (SMIPS), a tool for genome-wide detection of SM key enzymes ('anchor' genes): polyketide synthases, non-ribosomal peptide synthetases and dimethylallyl tryptophan synthases. Unlike other tools based on protein similarity, CASSIS exploits the idea of co-regulation of the cluster genes, which assumes the existence of common regulatory patterns in the cluster promoters. The method searches for 'islands' of enriched cluster-specific motifs in the vicinity of anchor genes. It was validated in a series of cross-validation experiments and showed high sensitivity and specificity. CASSIS and SMIPS are freely available at https://sbi.hki-jena.de/cassis. Contact: thomas.wolf@leibniz-hki.de or ekaterina.shelest@leibniz-hki.de. Supplementary data are available at Bioinformatics online.
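The 'island' idea in the abstract can be illustrated with a minimal sketch. This is not the CASSIS implementation: it assumes motif occurrences have already been counted per promoter, and the function name `find_island`, the `hits` list and the `max_gap` parameter are hypothetical names introduced only for illustration. Starting from an anchor gene, the island is extended in both directions as long as neighbouring promoters carry the motif, tolerating a small number of consecutive promoters without a hit.

```python
from typing import List, Tuple


def find_island(hits: List[int], anchor: int, max_gap: int = 1) -> Tuple[int, int]:
    """Return (start, end) inclusive gene indices of the island around `anchor`.

    hits[i] is the (assumed, precomputed) number of motif occurrences in the
    promoter of gene i, with genes ordered by chromosomal position.
    The island grows outward from the anchor and tolerates up to `max_gap`
    consecutive promoters without a motif hit.
    """
    if not 0 <= anchor < len(hits):
        raise ValueError("anchor index out of range")

    def extend(step: int) -> int:
        edge = anchor  # outermost promoter with a hit so far (or the anchor)
        gap = 0        # consecutive promoters without a hit
        i = anchor + step
        while 0 <= i < len(hits) and gap <= max_gap:
            if hits[i] > 0:
                edge = i
                gap = 0
            else:
                gap += 1
            i += step
        return edge

    return extend(-1), extend(+1)


# Toy example: motif counts for 12 consecutive promoters, anchor gene at index 5.
toy_hits = [0, 1, 0, 0, 2, 1, 3, 0, 1, 0, 0, 1]
island = find_island(toy_hits, anchor=5, max_gap=1)
```

A larger `max_gap` merges more distant promoters into the same island, at the cost of including genes whose promoters lack the motif; the real method's treatment of gaps and motif enrichment may differ.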
SUBMITTER: Wolf T
PROVIDER: S-EPMC4824125 | biostudies-literature | 2016 Apr
REPOSITORIES: biostudies-literature