Dataset Information

Broad distribution spectrum from Gaussian to power law appears in stochastic variations in RNA-seq data.


ABSTRACT: Gene expression levels exhibit stochastic variations among genetically identical organisms under the same environmental conditions. In many recent transcriptome analyses based on RNA sequencing (RNA-seq), variations in gene expression levels among replicates were assumed to follow a negative binomial distribution, although the physiological basis of this assumption remains unclear. In this study, RNA-seq data were obtained from Arabidopsis thaliana under eight conditions (21-27 replicates), and the characteristics of gene-dependent empirical probability density function (ePDF) profiles of gene expression levels were analyzed. For A. thaliana and Saccharomyces cerevisiae, various types of ePDF of gene expression levels were obtained that were classified as Gaussian, power law-like containing a long tail, or intermediate. These ePDF profiles were well fitted with a Gauss-power mixing distribution function derived from a simple model of a stochastic transcriptional network containing a feedback loop. The fitting function suggested that gene expression levels with long-tailed ePDFs would be strongly influenced by feedback regulation. Furthermore, the features of gene expression levels are correlated with their functions, with the levels of essential genes tending to follow a Gaussian-like ePDF while those of genes encoding nucleic acid-binding proteins and transcription factors exhibit long-tailed ePDF.

SUBMITTER: Awazu A 

PROVIDER: S-EPMC5974282 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature

Publications

Broad distribution spectrum from Gaussian to power law appears in stochastic variations in RNA-seq data.

Akinori Awazu, Takahiro Tanabe, Mari Kamitani, Ayumi Tezuka, Atsushi J. Nagano

Scientific Reports, 2018-05-29, issue 1



Similar Datasets

| S-EPMC5215546 | biostudies-literature
| S-EPMC6731627 | biostudies-literature
| S-EPMC7519452 | biostudies-literature
| S-EPMC5375979 | biostudies-literature
| S-EPMC3532103 | biostudies-other
| S-EPMC33276 | biostudies-literature
| S-EPMC3672888 | biostudies-literature
| S-EPMC3281125 | biostudies-literature
| S-EPMC5789915 | biostudies-literature
| S-EPMC5428393 | biostudies-literature