Ontology highlight
ABSTRACT: Motivation
Fast and accurate genotype imputation is necessary for facilitating gene-mapping studies, especially with the ever increasing numbers of both common and rare variants generated by high-throughput-sequencing experiments. However, most of the existing imputation approaches suffer from either inaccurate results or heavy computational demand.Results
In this article, aiming to perform fast and accurate genotype-imputation analysis, we propose a novel, fast and yet accurate method to impute diploid genotypes. Specifically, we extend a hidden Markov model that is widely used to describe haplotype structures. But we model hidden states onto single reference haplotypes rather than onto pairs of haplotypes. Consequently the computational complexity is linear to size of reference haplotypes. We further develop an algorithm 'merge-and-recover (MAR)' to speed up the calculation. Working on compact representation of segmental reference haplotypes, the MAR algorithm always calculates an exact form of transition probabilities regardless of partition of segments. Both simulation studies and real-data analyses demonstrated that our proposed method was comparable to most of the existing popular methods in terms of imputation accuracy, but was much more efficient in terms of computation. The MAR algorithm can further speed up the calculation by several folds without loss of accuracy. The proposed method will be useful in large-scale imputation studies with a large number of reference subjects.Availability
The implemented multi-threading software FISH is freely available for academic use at https://sites.google.com/site/lzhanghomepage/FISH.
SUBMITTER: Zhang L
PROVIDER: S-EPMC4071209 | biostudies-literature | 2014 Jul
REPOSITORIES: biostudies-literature
Zhang Lei L Pei Yu-Fang YF Fu Xiaoying X Lin Yong Y Wang Yu-Ping YP Deng Hong-Wen HW
Bioinformatics (Oxford, England) 20140310 13
<h4>Motivation</h4>Fast and accurate genotype imputation is necessary for facilitating gene-mapping studies, especially with the ever increasing numbers of both common and rare variants generated by high-throughput-sequencing experiments. However, most of the existing imputation approaches suffer from either inaccurate results or heavy computational demand.<h4>Results</h4>In this article, aiming to perform fast and accurate genotype-imputation analysis, we propose a novel, fast and yet accurate ...[more]