Dataset Information

A study on fast calling variants from next-generation sequencing data using decision tree.


ABSTRACT: BACKGROUND: The rapid development of next-generation sequencing (NGS) technology has continuously been raising the throughput of sequencing data. However, because no available tool is both fast and accurate, the analysis of NGS data, especially low-coverage data, remains challenging. RESULTS: We proposed a decision-tree-based variant calling algorithm. Experiments on a set of real data indicate that our algorithm achieves high accuracy and sensitivity for SNVs and indels and adapts well to low-coverage data. In particular, our algorithm is clearly faster than 3 widely used tools in our experiments. CONCLUSIONS: We implemented our algorithm in a software tool named Fuwa and applied it, together with 4 well-known variant callers (Platypus, GATK-UnifiedGenotyper, GATK-HaplotypeCaller and SAMtools), to three sequencing data sets of the well-studied sample NA12878, which were produced by whole-genome, whole-exome and low-coverage whole-genome sequencing, respectively. We also conducted additional experiments on the WGS data of 4 newly released samples that have not been used to populate dbSNP.

SUBMITTER: Li Z 

PROVIDER: S-EPMC5907718 | biostudies-literature | 2018 Apr

REPOSITORIES: biostudies-literature

Publications

A study on fast calling variants from next-generation sequencing data using decision tree.

Li Zhentang, Wang Yi, Wang Fei

BMC Bioinformatics, 2018 Apr 19 (issue 1)


Similar Datasets

| S-EPMC3907006 | biostudies-literature
| S-EPMC3493122 | biostudies-literature
| S-EPMC5324109 | biostudies-literature
| S-EPMC10794290 | biostudies-literature
| S-EPMC5528527 | biostudies-other
| S-EPMC5583360 | biostudies-literature
| S-EPMC4265454 | biostudies-literature
| S-EPMC4137768 | biostudies-literature
| S-EPMC3792961 | biostudies-literature
| S-EPMC6281872 | biostudies-literature