Dataset Information

Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies.

ABSTRACT: Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intronic regions that consist of a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are subject to limitations given that the sizes of genetic-association regions are likely to vary across the genome and phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without the need to pre-specify a fixed window size. The proposed method controls for the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulation studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant associations while controlling for the genome-wise type I error rates. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.
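The idea of scanning windows of varying size while controlling a genome-wise error rate can be illustrated with a small Python sketch. This is not the SCANG implementation: it swaps in a plain burden-type score statistic and a permutation threshold, and the function names (`window_stats`, `scan`), the window-size range, and the number of permutations are all assumptions chosen for illustration. Permuting the phenotype while keeping the genotype matrix fixed preserves the correlation (linkage disequilibrium) among variants in the null distribution of the maximum statistic.

```python
import numpy as np

def window_stats(G, resid, l_min, l_max):
    """Squared standardized burden score for every window of l_min..l_max variants.

    G     : (n_samples, n_variants) genotype matrix
    resid : (n_samples,) mean-centered phenotype
    """
    n, p = G.shape
    Gc = G - G.mean(axis=0)                      # center each variant
    sigma2 = resid.var()
    # Prefix sums over variants: a window's burden is a difference of two columns.
    csum = np.concatenate([np.zeros((n, 1)), np.cumsum(Gc, axis=1)], axis=1)
    stats = {}
    for size in range(l_min, l_max + 1):
        for start in range(p - size + 1):
            burden = csum[:, start + size] - csum[:, start]
            var = sigma2 * (burden @ burden)
            if var > 0:                          # skip windows with no variation
                stats[(start, size)] = (burden @ resid) ** 2 / var
    return stats

def scan(G, y, l_min=2, l_max=6, n_perm=200, alpha=0.05, seed=0):
    """Return windows exceeding a permutation-based genome-wise threshold."""
    rng = np.random.default_rng(seed)
    resid = y - y.mean()
    observed = window_stats(G, resid, l_min, l_max)
    # Null distribution of the maximum statistic over all windows and sizes.
    null_max = np.array([
        max(window_stats(G, rng.permutation(resid), l_min, l_max).values())
        for _ in range(n_perm)
    ])
    threshold = np.quantile(null_max, 1 - alpha)
    hits = {w: s for w, s in observed.items() if s > threshold}
    return hits, threshold
```

Calling `scan(G, y)` returns a dictionary keyed by `(start, size)` for windows whose statistic exceeds the empirical threshold, so both the location and the size of each detected region come out of the scan rather than being fixed in advance.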

SUBMITTER: Li Z

PROVIDER: S-EPMC6507043 | biostudies-literature | 2019 May

REPOSITORIES: biostudies-literature

Publications

Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies.

Zilin Li, Xihao Li, Yaowu Liu, Jincheng Shen, Han Chen, Hufeng Zhou, Alanna C. Morrison, Eric Boerwinkle, Xihong Lin

American Journal of Human Genetics, 2019-04-12, issue 5

PMID: 30982610

Similar Datasets

Project description: Essential tremor (ET) is one of the most common movement disorders. The etiology of ET remains largely unexplained. Whole genome sequencing (WGS) is likely to be of value in understanding a large proportion of ET with Mendelian and complex disease inheritance patterns. In ET families with Mendelian inheritance patterns, WGS may lead to gene identification where WES analysis failed to identify the causative single nucleotide variant (SNV) or indel due to incomplete coverage of the entire coding region of the genome, in addition to accurate detection of larger structural variants (SVs) and copy number variants (CNVs). Alternatively, in ET families with complex disease inheritance patterns with gene x gene and gene x environment interactions, enrichment of functional rare coding and non-coding variants may explain the heritability of ET. We performed WGS in eight ET families (n = 40 individuals) enrolled in the Family Study of Essential Tremor. The analysis included filtering WGS data based on allele frequency in population databases, rare SNV and indel classification and association testing using the Mixed-Model Kernel Based Adaptive Cluster (MM-KBAC) test. A separate analysis of rare SVs and CNVs segregating within ET families was also performed. Prioritization of candidate genes identified within families was performed using Phenolyzer. WGS analysis identified candidate genes for ET in 5/8 (62.5%) of the families analyzed. WES analysis in a subset of these families in our previously published study failed to identify candidate genes. In one family, we identified a deleterious and damaging variant (c.1367G>A, p.(Arg456Gln)) in the candidate gene, CACNA1G, which encodes the pore-forming subunit of T-type Ca(2+) channels, CaV3.1, and is expressed in various motor pathways and has been previously implicated in neuronal autorhythmicity and ET. Other candidate genes identified include SLIT3, which encodes an axon guidance molecule; in three families, Phenolyzer prioritized genes that are associated with hereditary neuropathies (family A, KARS; family B, KIF5A; and family F, NTRK1). Functional studies of CACNA1G and SLIT3 suggest a role for these genes in ET disease pathogenesis.
