Dataset Information

Heritability estimation and differential analysis of count data with generalized linear mixed models in genomic sequencing studies.


ABSTRACT:

Motivation

Genomic sequencing studies, including RNA sequencing and bisulfite sequencing studies, are becoming increasingly common and increasingly large. Large genomic sequencing studies open doors for accurate molecular trait heritability estimation and powerful differential analysis. Heritability estimation and differential analysis in sequencing studies require statistical methods that properly account for the count nature of sequencing data and that are computationally efficient for large datasets.

Results

Here, we develop such a method, PQLseq (Penalized Quasi-Likelihood for sequencing count data), to enable effective and efficient heritability estimation and differential analysis using the generalized linear mixed model framework. With extensive simulations and comparisons to previous methods, we show that PQLseq is the only method currently available that can produce unbiased heritability estimates for sequencing count data. In addition, we show that PQLseq is well suited for differential analysis in large sequencing studies, providing calibrated type I error control and greater power than standard linear mixed model methods. Finally, we apply PQLseq to perform gene expression heritability estimation and differential expression analysis in a large RNA sequencing study in the Hutterites.

Availability and implementation

PQLseq is implemented as an R package with source code freely available at www.xzlab.org/software.html and https://cran.r-project.org/web/packages/PQLseq/index.html.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Sun S 

PROVIDER: S-EPMC6361238 | biostudies-literature | 2019 Feb

REPOSITORIES: biostudies-literature

Publications

Heritability estimation and differential analysis of count data with generalized linear mixed models in genomic sequencing studies.

Sun S, Zhu J, Mozaffari S, Ober C, Chen M, Zhou X

Bioinformatics (Oxford, England), 2019 Feb 1, issue 3



Similar Datasets

| S-EPMC7236949 | biostudies-literature
| S-EPMC10387571 | biostudies-literature
| S-EPMC3866838 | biostudies-literature
| S-EPMC2883299 | biostudies-literature
| S-EPMC6493759 | biostudies-literature
| S-EPMC6668092 | biostudies-literature
| S-EPMC8620156 | biostudies-literature
| S-EPMC4941438 | biostudies-literature
| S-EPMC10187526 | biostudies-literature
| S-EPMC5754233 | biostudies-literature