Dataset Information

Mining RNA-seq data for infections and contaminations.

ABSTRACT: RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g. due to missing genome sequences) of species/strains can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.

SUBMITTER: Bonfert T

PROVIDER: S-EPMC3760913 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

Mining RNA-seq data for infections and contaminations.

Thomas Bonfert, Gergely Csaba, Ralf Zimmer, Caroline C. Friedel

PLoS ONE, 2013-09-03, issue 9

PMID: 24019895

Similar Datasets

Massive mining of publicly available RNA-seq data from human and mouse. | S-EPMC5893633 | biostudies-literature

DRscDB: A single-cell RNA-seq resource for data mining and data comparison across species. | S-EPMC8085783 | biostudies-literature

Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. | S-EPMC8415361 | biostudies-literature

Screening human cell lines for viral infections applying RNA-Seq data analysis. | S-EPMC6328144 | biostudies-literature

Improving the performance of single-cell RNA-seq data mining based on relative expression orderings. | S-EPMC9851298 | biostudies-literature

Tomato RNA-seq Data Mining Reveals the Taxonomic and Functional Diversity of Root-Associated Microbiota. | S-EPMC7022885 | biostudies-literature

Stratification of gene coexpression patterns and GO function mining for a RNA-Seq data series. | S-EPMC4052503 | biostudies-literature

RNA-Seq Data-Mining Allows the Discovery of Two Long Non-Coding RNA Biomarkers of Viral Infection in Humans. | S-EPMC7215422 | biostudies-literature

RNA-Seq data mining: downregulation of NeuroD6 serves as a possible biomarker for Alzheimer's disease brains. | S-EPMC4274867 | biostudies-literature

A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining. | S-EPMC6696874 | biostudies-literature