Dataset Information

Redundans: an assembly pipeline for highly heterozygous genomes.


ABSTRACT: Many genomes display high levels of heterozygosity (i.e. the presence of different alleles at the same loci in homologous chromosomes), with those of hybrid organisms being an extreme case. Assembling highly heterozygous genomes from short sequencing reads is challenging because it is difficult to recover the different haplotypes accurately. When confronted with such genomes, the standard assembly process tends to collapse homozygous regions and report heterozygous regions as alternative contigs. The boundaries between homozygous and heterozygous regions produce multiple assembly paths that are hard to resolve, which leads to highly fragmented assemblies with a total size larger than expected. This, in turn, causes numerous problems in downstream analyses, such as fragmented gene models, wrong gene copy numbers, or broken synteny. To circumvent these problems we have developed a pipeline that deals specifically with the assembly of heterozygous genomes by introducing a step that recognises and selectively removes alternative heterozygous contigs. We tested our pipeline on simulated and naturally occurring heterozygous genomes and compared its accuracy to that of other existing tools. Our method is freely available at https://github.com/Gabaldonlab/redundans.
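
The step described in the abstract, recognising and selectively removing alternative heterozygous contigs, can be illustrated with a minimal sketch. This is not the Redundans implementation: the `Hit` record, the `reduce_contigs` function, and the two thresholds are hypothetical names and values chosen for illustration, and the pairwise contig alignments are assumed to have been computed beforehand by some aligner. The idea shown is only that a shorter contig is dropped when it aligns to a longer one with high identity over most of its length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment between two contigs (hypothetical record)."""
    query: str          # name of the aligned (candidate redundant) contig
    target: str         # name of the contig it aligns to
    identity: float     # fraction of identical bases within the alignment
    aligned_bases: int  # number of query bases covered by the alignment


def reduce_contigs(lengths, hits, min_identity=0.9, min_overlap=0.8):
    """Return contig names kept after removing likely alternative contigs.

    lengths maps contig name -> length in bases. The thresholds are
    illustrative assumptions, not values taken from the paper.
    """
    removed = set()
    # Visit the shortest queries first so each is judged against a longer contig.
    for hit in sorted(hits, key=lambda h: lengths[h.query]):
        q, t = hit.query, hit.target
        if q == t or q in removed or t in removed:
            continue
        if lengths[q] > lengths[t]:
            continue  # only the shorter contig of a pair may be removed
        overlap = hit.aligned_bases / lengths[q]
        if hit.identity >= min_identity and overlap >= min_overlap:
            removed.add(q)
    return [name for name in lengths if name not in removed]


# Toy input: c2 is almost fully contained in c1, c3 only partially matches it.
lengths = {"c1": 1000, "c2": 900, "c3": 500}
hits = [Hit("c2", "c1", 0.95, 850), Hit("c3", "c1", 0.95, 100)]
kept = reduce_contigs(lengths, hits)
print(kept)
```

In this toy input `c2` is removed because 850 of its 900 bases align to the longer `c1` at 95% identity, while `c3` is kept because only a fifth of it aligns. Keeping the longer contig of each pair is a design choice of the sketch; a real pipeline would also need to handle repeats and partial overlaps.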

SUBMITTER: Pryszcz LP 

PROVIDER: S-EPMC4937319 | biostudies-literature | 2016 Jul

REPOSITORIES: biostudies-literature

Publications

Redundans: an assembly pipeline for highly heterozygous genomes.

Pryszcz LP, Gabaldón T

Nucleic Acids Research, 29 April 2016, issue 12

Similar Datasets

| S-EPMC4120091 | biostudies-literature
| S-EPMC5310280 | biostudies-literature
| S-EPMC3441570 | biostudies-literature
| S-EPMC7247394 | biostudies-literature
| S-EPMC10696586 | biostudies-literature
| S-EPMC9852648 | biostudies-literature
| S-EPMC2865496 | biostudies-literature
| S-EPMC3268234 | biostudies-literature
| S-EPMC9719158 | biostudies-literature
| S-EPMC5406902 | biostudies-literature