Dataset Information

HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly.


ABSTRACT:

Summary

De novo assembly is a difficult issue for heterozygous diploid genomes. The advent of high-throughput short-read and long-read sequencing technologies provides both new challenges and potential solutions to the issue. Here, we present HaploMerger2 (HM2), an automated pipeline for rebuilding both haploid sub-assemblies from the polymorphic diploid genome assembly. It is designed to work on pre-existing diploid assemblies, which are typically created by using de novo assemblers. HM2 can process any diploid assemblies, but it is especially suitable for diploid assemblies with high heterozygosity (≥3%), which can be difficult for other tools. This pipeline also implements flexible and sensitive assembly error detection, a hierarchical scaffolding procedure and a reliable gap-closing method for haploid sub-assemblies. Using HM2, we demonstrate that two haploid sub-assemblies reconstructed from a real, highly-polymorphic diploid assembly show greatly improved continuity.

Availability and implementation

Source code, executables and the testing dataset are freely available at https://github.com/mapleforest/HaploMerger2/releases/.

Contact

hshengf2@mail.sysu.edu.cn.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Huang S 

PROVIDER: S-EPMC5870766 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature

Publications

HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly.

Huang Shengfeng, Kang Mingjing, Xu Anlong

Bioinformatics (Oxford, England), 2017-08-01, issue 16


Similar Datasets

| S-EPMC8016491 | biostudies-literature
| S-EPMC5411779 | biostudies-literature
| S-EPMC3409271 | biostudies-literature
| S-EPMC10902894 | biostudies-literature
| S-EPMC6267036 | biostudies-literature
| S-EPMC5223519 | biostudies-literature
| S-EPMC6022571 | biostudies-literature
| S-EPMC6505170 | biostudies-literature
| S-EPMC4703000 | biostudies-literature
| S-EPMC5984525 | biostudies-literature