
Dataset Information


Improved haplotype inference by exploiting long-range linking and allelic imbalance in RNA-seq datasets.


ABSTRACT: Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It introduces the observation that differential allele-specific expression can link genetic variants from the same physical chromosome, which makes it possible to use even reads that cover only individual variants. We demonstrate HapTree-X's feasibility on in-house sequenced Genome in a Bottle RNA-seq and various whole exome, genome, and 10X Genomics datasets. HapTree-X produces more complete phases (up to 25%), even in clinically important genes, and phases more variants than other methods while maintaining similar or higher accuracy and being up to 10× faster than other tools. The advantage of HapTree-X's ability to use multiple lines of evidence, as well as to phase polyploid genomes in a single integrative framework, grows substantially as the amount of diverse data increases.
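
The idea that allelic imbalance can link variants without a read spanning both can be illustrated with a toy calculation. The sketch below is not HapTree-X's actual model; it assumes a single gene with two heterozygous SNPs, one haplotype expressed at a fixed fraction `beta` of reads, and independent binomial read counts. The function names and the value of `beta` are hypothetical and chosen only for illustration.

```python
import math


def log_kernel(k, n, p):
    """Log binomial likelihood of k successes in n trials, without the
    binomial coefficient (it cancels in the likelihood ratio below)."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def phase_log_odds(ref1, alt1, ref2, alt2, beta=0.7):
    """Log-odds that the two reference alleles share a haplotype (cis)
    rather than sit on opposite haplotypes (trans).

    Assumption (illustrative only): one haplotype contributes a fraction
    `beta` of the reads at every SNP in the gene, and we do not know in
    advance which haplotype that is, so both cases are weighted equally.
    """
    n1, n2 = ref1 + alt1, ref2 + alt2
    # Reference allele on the highly (hi) or lowly (lo) expressed haplotype.
    hi1, lo1 = log_kernel(ref1, n1, beta), log_kernel(ref1, n1, 1.0 - beta)
    hi2, lo2 = log_kernel(ref2, n2, beta), log_kernel(ref2, n2, 1.0 - beta)
    cis = log_add(hi1 + hi2, lo1 + lo2)
    trans = log_add(hi1 + lo2, lo1 + hi2)
    return cis - trans


if __name__ == "__main__":
    # Both reference alleles over-represented: evidence for cis (positive).
    print(phase_log_odds(70, 30, 65, 35))
    # Imbalance points in opposite directions: evidence for trans (negative).
    print(phase_log_odds(70, 30, 35, 65))
```

A positive value favours placing the two reference alleles on the same haplotype, a negative value favours the opposite phasing, and balanced counts carry no phasing information under this toy model.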

SUBMITTER: Berger E 

PROVIDER: S-EPMC7494856 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature


Publications

Improved haplotype inference by exploiting long-range linking and allelic imbalance in RNA-seq datasets.

Berger E, Yorukoglu D, Zhang L, Nyquist SK, Shalek AK, Kellis M, Numanagić I, Berger B

Nature Communications, 2020-09-16, issue 1



Similar Datasets

| S-EPMC3530675 | biostudies-literature
| S-EPMC9113279 | biostudies-literature
| S-EPMC5039928 | biostudies-literature
| S-EPMC6454553 | biostudies-literature
| S-EPMC4230747 | biostudies-literature
| S-EPMC11343128 | biostudies-literature
| S-EPMC7346379 | biostudies-literature
| S-EPMC6385462 | biostudies-literature
| S-EPMC6576752 | biostudies-literature
| S-EPMC6715686 | biostudies-literature