
Dataset Information


A Simple Guideline to Assess the Characteristics of RNA-Seq Data.


ABSTRACT: Next-generation sequencing (NGS) techniques have been used to generate various molecular maps, including genomes, epigenomes, and transcriptomes. Transcriptomes from a given cell population can be profiled via RNA-seq. However, there is no simple way to systematically assess the characteristics of RNA-seq data. In this study, we provide a simple method that can intuitively evaluate RNA-seq data using two different principal component analysis (PCA) plots. The gene expression PCA plot provides insights into the association between samples, while the transcript integrity number (TIN) score plot provides a quality map of the given RNA-seq data. With this approach, we found that RNA-seq datasets deposited in public repositories often contain a few low-quality RNA-seq samples that can lead to misinterpretations. The effect of sampling errors on differentially expressed gene (DEG) analysis was evaluated with ten RNA-seq samples from invasive ductal carcinoma tissues and three RNA-seq samples from adjacent normal tissues taken from a Korean breast cancer patient. The evaluation demonstrated that sampling errors, in which the selected samples do not represent a given population, can lead to different interpretations when conducting DEG analysis. Therefore, the proposed approach can be used to avoid sampling errors prior to RNA-seq data analysis.
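The two-plot idea in the abstract can be illustrated with a minimal sketch. It assumes a samples-by-features matrix is already available (log-scaled expression values or per-transcript TIN scores computed elsewhere) and projects the samples onto the first two principal components. The function names `pca_scores` and `flag_outliers`, the synthetic data, and the robust z-score cutoff of 3.0 are assumptions made for illustration; they are not taken from the study and are not the authors' implementation.

```python
import numpy as np


def pca_scores(matrix, n_components=2):
    """Project samples (rows) onto the top principal components via SVD."""
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)  # center each feature (column)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return scores, explained[:n_components]


def flag_outliers(scores, threshold=3.0):
    """Flag samples lying far from the median position in PCA space.

    Uses a robust z-score (median and MAD) of each sample's distance
    from the median point. The threshold is an illustrative assumption.
    """
    center = np.median(scores, axis=0)
    dist = np.linalg.norm(scores - center, axis=1)
    med = np.median(dist)
    mad = np.median(np.abs(dist - med))
    if mad == 0:
        return np.zeros(len(dist), dtype=bool)
    return (dist - med) / (1.4826 * mad) > threshold


# Synthetic TIN-like matrix: 8 samples x 200 transcripts.
# Sample index 5 is made artificially "degraded" (much lower scores).
rng = np.random.default_rng(0)
tin = rng.normal(80, 5, size=(8, 200))
tin[5] = rng.normal(40, 5, size=200)

scores, explained = pca_scores(tin)
outliers = flag_outliers(scores)
print("PC coordinates:\n", np.round(scores, 1))
print("explained variance:", np.round(explained, 2))
print("flagged samples:", np.flatnonzero(outliers))
```

The same `pca_scores` call can be applied to an expression matrix to inspect how samples group, while the TIN-based projection highlights samples whose transcript integrity differs markedly from the rest. With only a handful of samples, any automatic cutoff is fragile, so the plot itself should be inspected before excluding data.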

SUBMITTER: Son K 

PROVIDER: S-EPMC6241233 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature


Publications

A Simple Guideline to Assess the Characteristics of RNA-Seq Data.

Keunhong Son, Sungryul Yu, Wonseok Shin, Kyudong Han, Keunsoo Kang

BioMed Research International, 2018-11-04



Similar Datasets

| S-EPMC4870397 | biostudies-literature
| S-EPMC10275512 | biostudies-literature
| S-EPMC8284643 | biostudies-literature
| S-EPMC8955282 | biostudies-literature
| S-EPMC6501316 | biostudies-literature
| S-ECPF-GEOD-40890 | biostudies-other
| S-EPMC6830085 | biostudies-literature
| S-EPMC4232570 | biostudies-literature
| S-EPMC3650863 | biostudies-literature
| S-EPMC5681780 | biostudies-literature