Dataset Information

A robust benchmark for detection of germline large deletions and insertions.


ABSTRACT: New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
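EXAMPLE: The abstract describes using the benchmark to identify false negatives and false positives, with extra calls inside the Tier 1 regions treated as putative false positives. The sketch below illustrates that bookkeeping only. It is not the consortium's comparison method: the `SV` record, the `compare` function, the matching rule and its thresholds (`max_dist`, `min_size_ratio`) are all hypothetical names and values chosen for illustration, and the coordinates are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal record for one structural variant call."""
    chrom: str
    pos: int      # start coordinate
    svtype: str   # "INS" or "DEL"
    size: int     # length in bp


def in_regions(sv, regions):
    """True if the call starts inside any (chrom, start, end) benchmark region."""
    return any(c == sv.chrom and start <= sv.pos < end for c, start, end in regions)


def matches(a, b, max_dist=1000, min_size_ratio=0.7):
    """Assumed matching rule: same chromosome and type, nearby, similar size."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    return min(a.size, b.size) / max(a.size, b.size) >= min_size_ratio


def compare(benchmark, query, regions, min_size=50):
    """Count TP/FP/FN for calls >= min_size bp that fall inside the regions."""
    truth = [s for s in benchmark if s.size >= min_size and in_regions(s, regions)]
    calls = [s for s in query if s.size >= min_size and in_regions(s, regions)]
    unmatched = list(truth)
    tp = fp = 0
    for call in calls:
        hit = next((t for t in unmatched if matches(call, t)), None)
        if hit is None:
            fp += 1          # extra call inside the regions: putative false positive
        else:
            unmatched.remove(hit)
            tp += 1
    fn = len(unmatched)      # benchmark calls with no matching query call
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Toy data (invented for illustration, not taken from the benchmark).
regions = [("chr1", 0, 10000)]
benchmark = [SV("chr1", 1000, "DEL", 300), SV("chr1", 5000, "INS", 120)]
query = [
    SV("chr1", 1010, "DEL", 290),   # matches the benchmark deletion
    SV("chr1", 8000, "INS", 200),   # inside the regions, no benchmark match
    SV("chr2", 100, "DEL", 500),    # outside the regions, not counted
]
result = compare(benchmark, query, regions)
print(result)
```

Calls outside the regions are ignored rather than counted as false positives, which mirrors the abstract's statement that only extra calls within the benchmark regions are putative false positives.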

SUBMITTER: Zook JM

PROVIDER: S-EPMC8454654 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4372442 | biostudies-literature
| S-EPMC2989997 | biostudies-literature
| S-EPMC5549930 | biostudies-other
| S-EPMC8096211 | biostudies-literature
| S-EPMC5217656 | biostudies-literature
| S-EPMC3128616 | biostudies-literature
| S-EPMC2527138 | biostudies-literature
| S-EPMC2940567 | biostudies-literature
| S-EPMC4411138 | biostudies-literature
| S-EPMC5127803 | biostudies-literature