Dataset Information

APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins.

ABSTRACT: RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the large-scale identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in eukaryotic species such as human and yeast. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer because of the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains, and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT performed better on various datasets than other existing tools for the sequence-based prediction of RBPs, achieving an average sensitivity and specificity of 0.90 and 0.91, respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot.
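To make the idea of scoring a sequence against a position-specific scoring matrix concrete, the following is a minimal sketch of a sliding-window scan. It is not APRICOT's implementation: the matrix values, the three-residue motif, the default penalty and the function names (`score_window`, `best_hit`) are all hypothetical and chosen only for illustration.

```python
# Toy log-odds PSSM for a hypothetical 3-residue motif. The values are
# invented for illustration and are not taken from APRICOT or any
# domain database.
TOY_PSSM = [
    {"R": 2.0, "K": 1.5, "G": -1.0},
    {"G": 2.0, "A": 0.5},
    {"G": 1.8, "S": 0.7},
]
DEFAULT_SCORE = -2.0  # assumed penalty for residues missing from a column


def score_window(window, pssm=TOY_PSSM):
    """Sum the per-position scores of one window as wide as the matrix."""
    return sum(col.get(res, DEFAULT_SCORE) for res, col in zip(window, pssm))


def best_hit(sequence, pssm=TOY_PSSM):
    """Return (start, score) of the best-scoring window, or None if the
    sequence is shorter than the matrix."""
    width = len(pssm)
    if len(sequence) < width:
        return None
    scored = [
        (score_window(sequence[i:i + width], pssm), i)
        for i in range(len(sequence) - width + 1)
    ]
    # max() keeps the first window in case of tied scores.
    top_score, top_start = max(scored, key=lambda pair: pair[0])
    return top_start, top_score


if __name__ == "__main__":
    print(best_hit("MARGGS"))
```

In this toy case the window starting at index 2 (`RGG`) scores highest. A real pipeline would use curated domain models and a statistical significance threshold rather than a raw maximum.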

SUBMITTER: Sharan M

PROVIDER: S-EPMC5499795 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature


Publications

APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins.

Sharan M, Förstner KU, Eulalio A, Vogel J

Nucleic Acids Research, 2017-06-01, issue 11


PMID: 28334975

Similar Datasets

Project description: RNA-binding proteins (RBPs) play important roles in many cancer types. However, RBPs have not been thoroughly and systematically studied in gliomas. Global analysis of the functional impact of RBPs will provide a better understanding of gliomagenesis and new insights into glioma therapy. In this study, we integrated a list of human RBPs from six sources (Gerstberger, SONAR, Gene Ontology project, Poly(A) binding protein, CARIC, and XRNAX), which covered 4127 proteins with RNA-binding activity. The RNA sequencing data were downloaded from The Cancer Genome Atlas (TCGA) (n = 699) and the Chinese Glioma Genome Atlas (CGGA) (n = 325 + 693). We examined the differentially expressed genes (DEGs) using the R package DESeq2, and performed a weighted gene co-expression network analysis (WGCNA) of RBPs. Furthermore, survival analysis was performed based on univariate and multivariate Cox proportional hazards regression models. In the WGCNA analysis, we identified a key module involved in the overall survival (OS) of glioblastomas. Survival analysis revealed that eight RBPs (PTRF, FNDC3B, SLC25A43, ZC3H12A, LRRFIP1, HSP90B1, HSPA5, and BNC2) are significantly associated with the survival of glioblastoma patients. Another 693 patients within the CGGA database were used to validate the findings. Additionally, 3564 RBPs were classified into canonical and non-canonical RBPs depending on the domains that they contain, and non-canonical RBPs account for the majority (72.95%). The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that some non-canonical RBPs may have functions in glioma. Finally, we found that knockdown of either of the non-canonical RBPs PTRF or FNDC3B alone can significantly inhibit the proliferation of LN229 and U251 cells. In addition, RNA immunoprecipitation (RIP) analysis indicated that PTRF may regulate cell growth and death-related pathways to maintain tumor cell growth.
In conclusion, our findings presented an integrated view to assess the potential death risks of glioblastoma at a molecular level, based on the expression of RBPs. More importantly, we identified non-canonical RNA-binding proteins PTRF and FNDC3B, showing them to be potential prognostic biomarkers for glioblastoma.

Project description: An azidophenacyl derivative of a chemically synthesized consensus signal peptide has been prepared. The peptide, when photoactivated in the presence of rough or high-salt-stripped microsomes from pancreas, leads to inhibition of their activity in cotranslational processing of secretory pre-proteins translated from their mRNA in vitro. The peptide binds specifically with high affinity to components in the microsomal membranes from pancreas and liver, and photoreaction of a radioactive form of the azidophenacyl derivative leads to covalent linkage to yield two closely related radiolabelled proteins of Mr about 45,000. These proteins are integrated into the membrane, with large 30,000-Mr domains embedded into the phospholipid bilayer to which the signal peptide binds. A smaller, endopeptidase-sensitive domain is exposed on the cytoplasmic surface of the microsomal vesicles. The specificity and selectivity of the binding of azidophenacyl-derivatized consensus signal peptide were demonstrated by concentration-dependent inhibition of photolabelling by the 'cold' synthetic consensus signal peptide and by a natural internal signal sequence cleaved and isolated from ovalbumin. The properties of the labelled 45,000-Mr protein-signal peptide complexes, i.e. mass, pI, ease of dissociation from the membrane by detergent or salts and immunological properties, distinguish them from other proteins, e.g. subunits of signal recognition particle, docking protein and signal peptidase, already known to be involved in targeting and processing of nascent secretory proteins at the rough endoplasmic reticulum membrane. Although the 45,000-Mr signal peptide binding protein displays properties similar to those of the signal peptidase, a component of the endoplasmic reticulum, the azido-derivatized consensus signal peptide does not interact with it.
It is proposed that the endoplasmic reticulum proteins with which the azidophenacyl-derivatized consensus signal peptide interacts to yield the 45,000-Mr adducts may act as receptors for signals in nascent secretory pre-proteins in transduction of changes in the endoplasmic reticulum which bring about translocation of secretory protein across the membrane.

