Dataset Information

Parametric modeling of whole-genome sequencing data for CNV identification.


ABSTRACT: Copy number variants (CNVs) constitute an important class of genetic variants in the human genome and have been shown to be associated with complex diseases. Whole-genome sequencing provides an unbiased way of identifying all the CNVs that an individual carries. In this paper, we consider parametric modeling of the read depth (RD) data from whole-genome sequencing with the aim of identifying the CNVs, including both Poisson and negative-binomial modeling of such count data. We propose a unified approach that uses a mean-matching variance stabilizing transformation to turn the relatively complicated problem of sparse segment identification for count data into a sparse segment identification problem for a sequence of Gaussian data. We apply the optimal sparse segment identification procedure to the transformed data in order to identify the CNV segments. This provides a computationally efficient approach for RD-based CNV identification. Simulation results show that this approach often yields only a small number of false CNV identifications and performs similarly to or better than other RD-based approaches in identifying the true CNVs. We demonstrate the methods using the trio data from the 1000 Genomes Project.
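
The transformation step described in the abstract can be illustrated with a minimal sketch for the Poisson case only. Everything below is an assumption made for illustration and is not taken from the paper: the function name `poisson_mean_matching_vst`, the bin size, the simulated coverage of 30, and the use of the root transform 2*sqrt(Q + 1/4), which is one commonly used mean-matching form for Poisson counts. The negative-binomial case and the segment identification procedure itself are not shown.

```python
import numpy as np


def poisson_mean_matching_vst(counts, bin_size):
    """Bin read-depth counts and apply a root-type variance stabilizing transform.

    Hypothetical helper: sums the counts in consecutive bins of `bin_size`
    positions, then maps each bin total Q to 2 * sqrt(Q + 1/4). For Poisson
    counts with a reasonably large bin mean, the output is approximately
    Gaussian with variance close to 1.
    """
    counts = np.asarray(counts, dtype=float)
    n_bins = counts.size // bin_size
    # Drop trailing positions that do not fill a complete bin.
    trimmed = counts[: n_bins * bin_size]
    binned = trimmed.reshape(n_bins, bin_size).sum(axis=1)
    return 2.0 * np.sqrt(binned + 0.25)


# Simulated read depth with no CNV: Poisson counts with mean 30 per position.
rng = np.random.default_rng(0)
rd = rng.poisson(lam=30.0, size=10_000)

# Each bin total has mean 300, so the transformed values should centre
# near 2 * sqrt(300) with variance near 1.
y = poisson_mean_matching_vst(rd, bin_size=10)
```

After this step, a deletion or duplication would appear as a short run of transformed values shifted away from the baseline mean, which is the setting that a Gaussian sparse segment identification procedure addresses.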

SUBMITTER: Vardhanabhuti S 

PROVIDER: S-EPMC4059462 | biostudies-literature | 2014 Jul

REPOSITORIES: biostudies-literature

Publications

Parametric modeling of whole-genome sequencing data for CNV identification.

Vardhanabhuti S, Jeng XJ, Wu Y, Li H

Biostatistics (Oxford, England), 2014 Jan 28, issue 3



Similar Datasets

| S-EPMC10077681 | biostudies-literature
| S-EPMC4253833 | biostudies-other
| S-EPMC3268238 | biostudies-literature
| S-EPMC5617305 | biostudies-literature
| S-EPMC5569469 | biostudies-literature
| S-EPMC8131063 | biostudies-literature
| S-EPMC6288940 | biostudies-literature
| S-EPMC5842653 | biostudies-literature
| S-EPMC5868770 | biostudies-other
| S-BSST685 | biostudies-other