
Dataset Information


Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data.


ABSTRACT: The recent availability of large-scale genomic resources enables so-called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends not only on their mathematical models but also on the quality and quantity of the variants employed in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools ideally require more reads to achieve higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate calling of SNPs, particularly with low-coverage NGS data, which must be aligned to the reference genome sequences in advance. To reduce false-positive SNPs, Heap determines genotypes and calls SNPs at each site, except for sites at either end of a read or sites containing a minor allele supported by only one read. A performance comparison with existing tools showed that Heap achieved the highest F-scores with low-coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap and from our web site, http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (both last accessed 29 March 2017).
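The site-exclusion rule described in the abstract can be illustrated with a short Python sketch. This is not Heap's implementation. The read model (ungapped `(start, sequence)` pairs with 0-based coordinates), the `END_MARGIN` parameter, and the function names `usable_bases` and `call_site` are all assumptions made for illustration, and the sketch only reports non-reference alleles rather than modelling how Heap determines genotypes.

```python
from collections import Counter

END_MARGIN = 1  # assumption: number of bases ignored at each end of a read


def usable_bases(reads, site, end_margin=END_MARGIN):
    """Return the bases covering `site`, ignoring bases near either read end.

    `reads` is a list of (start, sequence) pairs describing ungapped
    alignments in 0-based reference coordinates (a simplification).
    """
    bases = []
    for start, seq in reads:
        offset = site - start
        # Keep the base only if it is not within `end_margin` of a read end.
        if end_margin <= offset < len(seq) - end_margin:
            bases.append(seq[offset])
    return bases


def call_site(reads, site, ref_base, end_margin=END_MARGIN):
    """Return sorted non-reference alleles at `site`, or None if the site is skipped.

    An empty list means the site was evaluated and no SNP was found.
    """
    counts = Counter(usable_bases(reads, site, end_margin))
    if not counts:
        return None  # no usable coverage after trimming read ends
    ranked = counts.most_common()
    # Skip the site if any minor (non-major) allele is backed by one read only.
    if any(n == 1 for _, n in ranked[1:]):
        return None
    return sorted(base for base in counts if base != ref_base)
```

For example, with reads `[(0, "ACGTA"), (0, "ACGTA"), (1, "CTTAC"), (1, "CTTAC")]` and reference base `G` at site 2, the alternative allele `T` is supported by two reads away from the read ends, so the sketch reports it. If only one `CTTAC` read were present, the site would be skipped.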

SUBMITTER: Kobayashi M 

PROVIDER: S-EPMC5737671 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature


Publications

Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data.

Kobayashi M, Ohyanagi H, Takanashi H, Asano S, Kudo T, Kajiya-Kanegae H, Nagano AJ, Tainaka H, Tokunaga T, Sazuka T, Iwata H, Tsutsumi N, Yano K

DNA Research: an international journal for rapid publication of reports on genes and genomes, 2017-08-01, issue 4



Similar Datasets

| S-EPMC1274293 | biostudies-other
| S-EPMC5972415 | biostudies-other
| S-EPMC3848615 | biostudies-literature
2024-08-14 | PXD048271 | Pride
| S-EPMC5996464 | biostudies-literature
| S-EPMC8762119 | biostudies-literature
| S-EPMC8480091 | biostudies-literature
| S-EPMC6251912 | biostudies-literature
| S-EPMC7811225 | biostudies-literature
| S-EPMC7144076 | biostudies-literature