Dataset Information

Identification and removal of sequencing artifacts produced by mispriming during reverse transcription in multiple RNA-seq technologies.

ABSTRACT: The quality of RNA sequencing data relies on specific priming by the primer used for reverse transcription (RT-primer). Nonspecific annealing of the RT-primer to the RNA template (RT mispriming) can generate reads with incorrect cDNA ends and can cause misinterpretation of data. This kind of artifact in RNA-seq-based technologies is underappreciated, and no adequate tools currently exist to remove such artifacts computationally from published data sets. We show that mispriming can occur with as few as two bases of complementarity at the 3' end of the primer followed by intermittent regions of complementarity. We also provide a computational pipeline that identifies cDNA reads produced from RT mispriming, allowing users to filter them out from any aligned data set. Using this analysis pipeline, we identify thousands of mispriming events in a dozen published data sets from diverse technologies including short RNA-seq, total/mRNA-seq, HITS-CLIP, and GRO-seq. We further show how RT mispriming can lead to misinterpretation of data. In addition to providing a solution to computationally remove RT-misprimed reads, we also propose an experimental solution to completely avoid RT mispriming by performing RNA-seq using thermostable group II intron-derived reverse transcriptase (TGIRT-seq).
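The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes that a misprimed read can be recognized when the template sequence immediately downstream of the read's 3' end matches the first bases of the 3' adapter (the sequence the RT-primer is designed to anneal to). The function names, the `min_match` parameter, and the placeholder adapter `TGGAATTC` are hypothetical and chosen only for illustration; the default of two bases follows the minimum complementarity reported in the abstract, and the "intermittent regions of complementarity" it mentions are not modeled here.

```python
def adapter_prefix_match(downstream: str, adapter: str) -> int:
    """Count how many leading bases of `downstream` match the 5' end of `adapter`."""
    matched = 0
    for base, adapter_base in zip(downstream.upper(), adapter.upper()):
        if base != adapter_base:
            break
        matched += 1
    return matched


def is_candidate_mispriming(downstream: str, adapter: str, min_match: int = 2) -> bool:
    """Flag a read when the sequence downstream of its 3' end resembles the adapter.

    `min_match` is an illustrative threshold, not a value taken from the pipeline.
    """
    return adapter_prefix_match(downstream, adapter) >= min_match


def filter_reads(reads, adapter: str, min_match: int = 2):
    """Keep only (read_id, downstream_seq) pairs that are not flagged."""
    return [
        (read_id, downstream)
        for read_id, downstream in reads
        if not is_candidate_mispriming(downstream, adapter, min_match)
    ]


# Hypothetical input: read IDs paired with the template sequence just
# downstream of each aligned read end (extracted from the reference).
ADAPTER = "TGGAATTC"  # placeholder adapter, for illustration only
reads = [
    ("read_1", "TGCCAGTA"),  # starts with "TG", the first two adapter bases
    ("read_2", "ACGTTGCA"),  # no match at the first base
    ("read_3", "TGGAACGT"),  # five-base match to the adapter start
]
kept = filter_reads(reads, ADAPTER)
print([read_id for read_id, _ in kept])
```

In practice the downstream sequence would have to be extracted from the reference in a strand-aware way for each aligned read; that step is omitted here.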

SUBMITTER: Shivram H

PROVIDER: S-EPMC6097653 | biostudies-literature | 2018 Sep

REPOSITORIES: biostudies-literature

Publications

Identification and removal of sequencing artifacts produced by mispriming during reverse transcription in multiple RNA-seq technologies.

Shivram H, Iyer VR

RNA (New York, N.Y.), 2018-06-27, issue 9

PMID: 29950518

Similar Datasets

Sequencing artifacts produced by mispriming during reverse transcription in multiple RNA-seq technologies
2018-06-26 | GSE85163 | GEO

Direct long-read RNA sequencing identifies a subset of questionable exitrons likely arising from reverse transcription artifacts.
S-EPMC8240250 | biostudies-literature

LAST-seq: single-cell RNA sequencing by direct amplification of single-stranded RNA without prior reverse transcription and second-strand synthesis.
S-EPMC10413806 | biostudies-literature

The reverse transcription signature of N-1-methyladenosine in RNA-Seq is sequence dependent.
S-EPMC4787781 | biostudies-literature

circFL-seq reveals full-length circular RNAs with rolling circular reverse transcription and nanopore sequencing.
S-EPMC8550772 | biostudies-literature

Rolling circle reverse transcription enables high fidelity nanopore sequencing of small RNA.
S-EPMC9550094 | biostudies-literature

Massively parallel sequencing, aCGH, and RNA-Seq technologies provide a comprehensive molecular diagnosis of Fanconi anemia.
S-EPMC3668494 | biostudies-literature

Noise regularization removes correlation artifacts in single-cell RNA-seq data preprocessing.
S-EPMC7961184 | biostudies-literature

Profiling of RNA-binding protein binding sites by in situ reverse transcription-based sequencing.
S-EPMC10864177 | biostudies-literature

Zinc finger function of HIV-1 nucleocapsid protein is required for removal of 5'-terminal genomic RNA fragments: a paradigm for RNA removal reactions in HIV-1 reverse transcription.
S-EPMC3578084 | biostudies-literature