Modeling the human aging transcriptome across tissues, health status, and sex.
ABSTRACT: Aging in humans is an incredibly complex biological process that leads to increased susceptibility to various diseases. Understanding which genes are associated with healthy aging can provide valuable insights into aging mechanisms and possible avenues for therapeutics to prolong healthy life. However, modeling this complex biological process requires an enormous collection of high-quality data along with cutting-edge computational methods. Here, we have compiled a large meta-analysis of gene expression data from RNA-Seq experiments available from the Sequence Read Archive. We began by reprocessing more than 6000 raw samples (including mapping, filtering, normalization, and batch correction) to generate 3060 high-quality samples spanning a large age range and multiple different tissues. We then used standard differential expression analyses and machine learning approaches to model and predict aging across the dataset, achieving an R² value of 0.96 and a root-mean-square error of 3.22 years. These models allow us to explore aging across health status, sex, and tissue and provide novel insights into possible aging processes. We also explore how preprocessing parameters affect predictions and highlight the reproducibility limits of these machine learning models. Finally, we develop an online tool for predicting the ages of human transcriptomic samples given raw gene expression counts. Together, this study provides valuable resources and insights into the transcriptomics of human aging.
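For readers unfamiliar with the two accuracy figures quoted in the abstract, the following is a minimal sketch of how R² and root-mean-square error are conventionally computed from true and predicted ages. The function names and the four sample ages are hypothetical and chosen only for illustration; they are not the study's data, and this is not necessarily how the authors evaluated their models.

```python
import math


def r_squared(true_ages, predicted_ages):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_age = sum(true_ages) / len(true_ages)
    ss_res = sum((t - p) ** 2 for t, p in zip(true_ages, predicted_ages))
    ss_tot = sum((t - mean_age) ** 2 for t in true_ages)
    return 1.0 - ss_res / ss_tot


def rmse(true_ages, predicted_ages):
    """Root-mean-square error, in the same units as the ages (years)."""
    n = len(true_ages)
    squared_errors = sum((t - p) ** 2 for t, p in zip(true_ages, predicted_ages))
    return math.sqrt(squared_errors / n)


# Hypothetical example values (assumption: not taken from the study).
true_ages = [20.0, 40.0, 60.0, 80.0]
predicted_ages = [22.0, 38.0, 63.0, 79.0]

print(round(r_squared(true_ages, predicted_ages), 3))  # 0.991
print(round(rmse(true_ages, predicted_ages), 3))       # 2.121
```

An RMSE is read in years, so a value of 3.22 means predictions are typically off by roughly three years, while R² describes the fraction of variance in age that the predictions account for.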
SUBMITTER: Shokhirev MN
PROVIDER: S-EPMC7811842 | biostudies-literature | 2021 Jan
REPOSITORIES: biostudies-literature