Dataset Information

A synthetic-diploid benchmark for accurate variant-calling evaluation.


ABSTRACT: Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.

SUBMITTER: Li H 

PROVIDER: S-EPMC6341484 | biostudies-literature | 2018 Aug

REPOSITORIES: biostudies-literature

Publications

A synthetic-diploid benchmark for accurate variant-calling evaluation.

Li H, Bloom JM, Farjoun Y, Fleharty M, Gauthier L, Neale B, MacArthur D

Nature Methods, 2018 Jul 16, issue 8


Similar Datasets

| S-EPMC6788989 | biostudies-literature
| S-EPMC11322167 | biostudies-literature
| S-EPMC7751401 | biostudies-literature
| S-EPMC6853766 | biostudies-literature
| S-EPMC8141913 | biostudies-literature
| S-EPMC10250393 | biostudies-literature
| S-EPMC11246426 | biostudies-literature
| S-EPMC9710574 | biostudies-literature
| S-EPMC7576216 | biostudies-literature
| S-EPMC7611855 | biostudies-literature