Transcriptomics

Dataset Information

RNA-Seq data for 35 controls using total RNA extracted using RNAZol


ABSTRACT: Methods routinely used to analyze RNA sequencing data focus on statistical significance and on detecting differential gene expression changes that meet a minimum two-fold change between groups. Because of the expression variability unique to RNA sequencing data, this strategy may overlook or obscure valuable information when specific genes vary widely in expression in certain samples. This paper develops tools and methods that apply variance and dispersion estimates to intra-group data in order to identify genes with expression values that diverge from the group envelope. STRING database analysis of the genes identified in this way characterizes gene affiliations involved in physiological regulatory networks that are associated with biological variability. Samples or genes identified as divergent can be judiciously evaluated prior to any standard differential analysis. A three-step process is presented for evaluating biological variability within a group in RNA sequencing data, in which gene counts were: (1) scaled to minimize heteroscedasticity; (2) rank-ordered to generate potentially divergent “trendlines” for every gene in the data set; and (3) tested with the STRING database to identify statistically significant pathway associations among the genes displaying marked trendline variability and dispersion. This approach was used to identify and portray the “trendline” profile of every gene in three test data sets. Control data from an in-house data set and two archived samples revealed that 65-70% of the sequenced genes displayed trendlines with minimal variation and dispersion across the sample group after rank-ordering the samples; such a profile is referred to as a linear trendline. Nonlinear trendlines refer to all cases where the trendline is not linear. Smaller subsets of genes within the three data sets displayed markedly skewed trendlines, wide dispersion, and variability.
STRING database analysis of these genes identified interferon-mediated response networks in 11-20% of the individuals sampled at the time of blood collection. For example, in the three control data sets, 14 to 26 genes in the defense response to virus pathway were identified in 7 individuals at false discovery rates ≤ 1.92E-15. Gene clusters involving leukocyte and neutrophil activation and degranulation pathways were also detected.
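
The first two steps of the process described in the abstract (scaling, then rank-ordering each gene's values into a trendline) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the dataset record: the log2(count + 1) scaling, the use of R² from a straight-line fit as the linearity measure, the 0.9 cutoff, and the names `trendline_summary` and `r2_cutoff`. The STRING pathway test in step (3) is not reproduced.

```python
import numpy as np

def trendline_summary(counts, r2_cutoff=0.9):
    """Summarize per-gene trendlines for a genes x samples count matrix.

    Assumptions (not from the source): log2(count + 1) scaling and an
    R^2 cutoff on a straight-line fit to decide "linear" vs "nonlinear".
    """
    # Step 1: scale counts to dampen heteroscedasticity (assumed transform).
    scaled = np.log2(np.asarray(counts, dtype=float) + 1.0)

    # Step 2: rank-order the samples within each gene to form its trendline.
    trend = np.sort(scaled, axis=1)

    # Fit a least-squares line through each trendline against sample rank.
    ranks = np.arange(trend.shape[1], dtype=float)
    rc = ranks - ranks.mean()
    tc = trend - trend.mean(axis=1, keepdims=True)
    slope = (tc @ rc) / (rc @ rc)
    ss_res = ((tc - slope[:, None] * rc) ** 2).sum(axis=1)
    ss_tot = (tc ** 2).sum(axis=1)

    # A perfectly flat trendline has no variation and is treated as linear.
    safe_tot = np.where(ss_tot > 0, ss_tot, 1.0)
    r2 = np.where(ss_tot > 0, 1.0 - ss_res / safe_tot, 1.0)

    # Dispersion proxy: range of the scaled values across the group.
    spread = trend[:, -1] - trend[:, 0]
    return trend, r2, spread, r2 >= r2_cutoff


# Toy matrix: a steadily varying gene, a gene with one divergent sample,
# and a constant gene.
counts = [
    [10, 12, 11, 13, 14, 15],
    [5, 5, 5, 5, 5, 500],
    [7, 7, 7, 7, 7, 7],
]
trend, r2, spread, is_linear = trendline_summary(counts)
```

In this toy input the second gene has a single sample far outside the group envelope, so its sorted profile bends sharply at the last rank and it falls below the assumed cutoff. Genes flagged this way would be the candidates passed on to a pathway-enrichment step.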

ORGANISM(S): Homo sapiens

PROVIDER: GSE169359 | GEO | 2021/03/23

REPOSITORIES: GEO

Similar Datasets

2022-10-16 | GSE205812 | GEO
2022-02-27 | GSE197236 | GEO
2014-04-09 | GSE56593 | GEO
2014-06-08 | GSE58201 | GEO
2023-04-11 | GSE224211 | GEO
2014-04-09 | E-GEOD-56593 | biostudies-arrayexpress
2014-06-08 | E-GEOD-58201 | biostudies-arrayexpress
| PRJNA10767 | ENA
2023-06-28 | GSE218129 | GEO
2012-12-18 | PXD000094 | Pride