Dataset Information

On the optimal trimming of high-throughput mRNA sequence data.


ABSTRACT: The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome-level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that gentler trimming, specifically of those nucleotides whose Phred score is <2 or <5, is optimal for most studies across a wide variety of metrics.
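To make the Phred thresholds in the abstract concrete, the following is a minimal sketch of threshold-based trimming at the 3' end of a read. It is an illustration only, not the procedure used in the study: the function names (`phred_scores`, `trim_trailing`, `error_probability`) are hypothetical, and the quality string is assumed to use the Phred+33 ASCII encoding.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Base-call error probability implied by a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def trim_trailing(seq, quality, threshold, offset=33):
    """Remove bases from the 3' end while their Phred score is below threshold.

    Hypothetical helper for illustration; real trimmers offer further modes
    (for example sliding windows) that are not modeled here.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality string must have equal length")
    scores = phred_scores(quality, offset)
    end = len(scores)
    # Walk inward from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


# Toy read: five bases at Phred 40 ('I'), then Phred 2 ('#'), 1 ('"'), 0 ('!').
read_seq = "ACGTACGT"
read_qual = 'IIIII#"!'
gentle = trim_trailing(read_seq, read_qual, threshold=2)
stricter = trim_trailing(read_seq, read_qual, threshold=5)
```

Under these assumptions, a threshold of 2 removes only the last two bases of the toy read, while a threshold of 5 also removes the Phred 2 base. By the standard Phred definition, a score of 2 corresponds to an error probability of roughly 0.63 and a score of 5 to roughly 0.32, so both thresholds discard only bases that are more likely wrong than most callers would tolerate.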

SUBMITTER: Macmanes MD 

PROVIDER: S-EPMC3908319 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

On the optimal trimming of high-throughput mRNA sequence data.

Macmanes, Matthew D.

Frontiers in Genetics, 2014-01-31


Similar Datasets

| S-EPMC5805770 | biostudies-literature
2024-07-06 | GSE179164 | GEO
| S-EPMC3983938 | biostudies-literature
| S-EPMC2911117 | biostudies-literature
| S-EPMC3295828 | biostudies-literature
| S-EPMC3494082 | biostudies-literature
| S-EPMC3088570 | biostudies-literature
| S-EPMC9881243 | biostudies-literature
| S-EPMC3706340 | biostudies-literature
2024-07-06 | GSE171317 | GEO