Dataset Information

APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins.


ABSTRACT: RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the large-scale identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in eukaryotic species such as human and yeast. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is more limited because of the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains, and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT outperformed other existing tools for the sequence-based prediction of RBPs on various datasets, achieving an average sensitivity and specificity of 0.90 and 0.91, respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot.
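
For readers unfamiliar with the two performance metrics quoted in the abstract, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from binary predictions. It is not taken from APRICOT's code base; the function name `sensitivity_specificity` and the toy labels are assumptions made purely for illustration, and the numbers bear no relation to the benchmark results reported above.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    y_true: iterable of 0/1 ground-truth labels (1 = RBP, 0 = non-RBP).
    y_pred: iterable of 0/1 predicted labels, same length as y_true.
    Hypothetical helper for illustration; not part of APRICOT.
    """
    y_true = list(y_true)
    y_pred = list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    # Sensitivity: fraction of true RBPs that were predicted as RBPs.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: fraction of true non-RBPs that were predicted as non-RBPs.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy example with made-up labels (4 positives, 6 negatives).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

An average over several datasets, as reported in the abstract, would simply be the mean of the per-dataset values returned by such a function.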

SUBMITTER: Sharan M 

PROVIDER: S-EPMC5499795 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature

Publications

APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins.

Malvika Sharan, Konrad U. Förstner, Ana Eulalio, Jörg Vogel

Nucleic Acids Research, 2017-06-01, issue 11


Similar Datasets

2024-07-05 | GSE237017 | GEO
| S-EPMC6247937 | biostudies-literature
| S-EPMC4041461 | biostudies-literature
| S-EPMC1147777 | biostudies-other
| S-EPMC7226056 | biostudies-literature
| S-EPMC7072731 | biostudies-literature
| S-EPMC5117216 | biostudies-literature
| S-EPMC7443297 | biostudies-literature
| S-EPMC7880356 | biostudies-literature
| S-EPMC8361134 | biostudies-literature