
Dataset Information


Allele balance bias identifies systematic genotyping errors and false disease associations.


ABSTRACT: In recent years, next-generation sequencing (NGS) has become a cornerstone of clinical genetics and diagnostics. Many clinical applications require high precision, especially if rare events such as somatic mutations in cancer or genetic variants causing rare diseases need to be identified. Although random sequencing errors can be modeled statistically and deep sequencing minimizes their impact, systematic errors remain a problem even at high depth of coverage. Understanding their source is crucial to increase precision of clinical NGS applications. In this work, we studied the relation between recurrent biases in allele balance (AB), systematic errors, and false positive variant calls across a large cohort of human samples analyzed by whole exome sequencing (WES). We have modeled the AB distribution for biallelic genotypes in 987 WES samples in order to identify positions recurrently deviating significantly from the expectation, a phenomenon we termed allele balance bias (ABB). Furthermore, we have developed a genotype callability score based on ABB for all positions of the human exome, which detects false positive variant calls that passed state-of-the-art filters. Finally, we demonstrate the use of ABB for detection of false associations proposed by rare variant association studies. Availability: https://github.com/Francesc-Muyas/ABB.
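The idea of a position "recurrently deviating" from the expected allele balance can be illustrated with a minimal sketch. This is not the authors' model, which the abstract does not specify. It assumes that heterozygous calls should show an alternate-allele fraction near 0.5 and applies a plain two-sided binomial test per sample. The function names, the 0.5 expectation, and the `alpha` threshold are all illustrative assumptions.

```python
from math import comb


def allele_balance(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k. Intended for moderate depths only: very large n
    can overflow when the binomial coefficient is converted to float.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are counted despite float rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


def fraction_deviating(samples, alpha: float = 0.01) -> float:
    """Share of heterozygous calls at one position whose allele balance
    deviates significantly from 0.5 (hypothetical recurrence measure).

    `samples` is an iterable of (alt_reads, total_reads) pairs, one per
    sample called heterozygous at the position.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    flagged = sum(1 for alt, total in samples
                  if binomial_two_sided_p(alt, total) < alpha)
    return flagged / len(samples)


if __name__ == "__main__":
    # Made-up read counts for four samples at a single position.
    calls = [(20, 40), (2, 40), (19, 40), (38, 40)]
    for alt, total in calls:
        print(alt, total, allele_balance(alt, total),
              binomial_two_sided_p(alt, total))
    print("fraction deviating:", fraction_deviating(calls))
```

A position where a large share of samples is flagged would be a candidate for systematic bias under this simplified view. The read counts in the usage example are invented for illustration and are not taken from the study.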

SUBMITTER: Muyas F 

PROVIDER: S-EPMC6587442 | biostudies-literature | 2019 Jan

REPOSITORIES: biostudies-literature


Publications

Allele balance bias identifies systematic genotyping errors and false disease associations.

Muyas F, Bosio M, Puig A, Susak H, Domènech L, Escaramis G, Zapata L, Demidov G, Estivill X, Rabionet R, Ossowski S

Human Mutation, 2018-11-23, issue 1



Similar Datasets

| EGAS00001003027 | EGA
2016-03-21 | GSE79254 | GEO
| S-EPMC4185115 | biostudies-literature
| S-EPMC4576452 | biostudies-literature
2016-03-21 | GSE79262 | GEO
| S-EPMC7249892 | biostudies-literature
| S-EPMC7351776 | biostudies-literature
| S-EPMC5143225 | biostudies-literature
2012-03-03 | E-GEOD-36217 | biostudies-arrayexpress
| S-EPMC6033641 | biostudies-literature