Dataset Information

Efficient clustering of identity-by-descent between multiple individuals.


ABSTRACT:

MOTIVATION: Most existing identity-by-descent (IBD) detection methods only consider haplotype pairs; less attention has been paid to considering multiple haplotypes simultaneously, even though IBD is an equivalence relation on haplotypes that partitions a set of haplotypes into IBD clusters. Multiple-haplotype IBD clusters may have advantages over pairwise IBD in some applications, such as IBD mapping. Existing methods for detecting multiple-haplotype IBD clusters are often computationally expensive and unable to handle large samples with thousands of haplotypes.

RESULTS: We present a clustering method, efficient multiple-IBD, which uses pairwise IBD segments to infer multiple-haplotype IBD clusters. It expands clusters from seed haplotypes by adding qualified neighbors and extends clusters across sliding windows in the genome. Our method is an order of magnitude faster than existing methods and has comparable performance with respect to the quality of clusters it uncovers. We further investigate the potential application of multiple-haplotype IBD clusters in association studies by testing for association between multiple-haplotype IBD clusters and low-density lipoprotein cholesterol in the Northern Finland Birth Cohort. Using our multiple-haplotype IBD cluster approach, we found an association with a genomic interval covering the PCSK9 gene in these data that is missed by standard single-marker association tests. Previously published studies confirm association of PCSK9 with low-density lipoprotein.

AVAILABILITY AND IMPLEMENTATION: Source code is available under the GNU Public License http://cs.au.dk/~qianyuxx/EMI/.
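The seed-and-expand idea mentioned under RESULTS can be illustrated with a small sketch. This is not the authors' implementation and does not reproduce the efficient multiple-IBD algorithm. It only shows, for a single genomic window, how clusters might be grown from a seed haplotype by adding neighbors that share pairwise IBD with enough current members. The function name `cluster_window`, the `min_density` threshold, and the rule for choosing a seed are all assumptions made for illustration. Extending clusters across sliding windows is not covered.

```python
from collections import defaultdict


def cluster_window(pairs, min_density=0.5):
    """Greedy seed-and-expand clustering of haplotypes in one window.

    pairs: iterable of (hap_a, hap_b) tuples, one per pairwise IBD call
           overlapping the window.
    min_density: hypothetical threshold (an assumption, not taken from the
           paper); a candidate joins a cluster only if it shares IBD with
           at least this fraction of the current cluster members.
    Returns a list of sets of haplotype identifiers.
    """
    # Build an undirected graph: haplotypes are nodes, IBD calls are edges.
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    unassigned = set(neighbors)
    clusters = []
    while unassigned:
        # Seed: the unassigned haplotype with the most unassigned neighbors
        # (ties broken by identifier order so the result is deterministic).
        seed = max(sorted(unassigned),
                   key=lambda h: len(neighbors[h] & unassigned))
        cluster = {seed}
        unassigned.discard(seed)

        grew = True
        while grew:
            grew = False
            # Candidates: unassigned haplotypes adjacent to any member.
            adjacent = set().union(*(neighbors[m] for m in cluster))
            for cand in sorted(adjacent & unassigned):
                shared = len(neighbors[cand] & cluster)
                if shared / len(cluster) >= min_density:
                    cluster.add(cand)
                    unassigned.discard(cand)
                    grew = True
        clusters.append(cluster)
    return clusters
```

With a strict threshold such as `min_density=1.0`, a haplotype is added only if it shares IBD with every current member, so a chain of pairwise calls is not merged into one cluster. Lower values tolerate missing pairwise calls at the cost of looser clusters.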

SUBMITTER: Qian Y 

PROVIDER: S-EPMC3967111 | biostudies-literature | 2014 Apr

REPOSITORIES: biostudies-literature

Publications

Efficient clustering of identity-by-descent between multiple individuals.

Qian Yu, Browning Brian L, Browning Sharon R

Bioinformatics (Oxford, England), 2013-12-19, issue 7


Similar Datasets

S-EPMC4402631 | biostudies-literature
S-EPMC4296155 | biostudies-literature
S-EPMC3817948 | biostudies-literature
S-EPMC5952413 | biostudies-other
S-EPMC3948483 | biostudies-literature
S-EPMC7555865 | biostudies-literature
S-EPMC3589405 | biostudies-literature
S-EPMC2427226 | biostudies-literature
S-EPMC4275846 | biostudies-literature
S-EPMC2881406 | biostudies-literature