
Dataset Information


BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly.


ABSTRACT:

Background

The problem of de-novo assembly for metagenomes using only long reads is gaining attention. We study whether post-processing metagenomic assemblies with the original input long reads can result in quality improvement. Previous approaches have focused on pre-processing reads and optimizing assemblers. BIGMAC takes an alternative perspective to focus on the post-processing step.

Results

Using both the assembled contigs and original long reads as input, BIGMAC first breaks the contigs at potentially mis-assembled locations and subsequently scaffolds contigs. Our experiments on metagenomes assembled from long reads show that BIGMAC can improve assembly quality by reducing the number of mis-assemblies while maintaining or increasing N50 and N75. Moreover, BIGMAC shows the largest N75 to number of mis-assemblies ratio on all tested datasets when compared to other post-processing tools.
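The N50 and N75 figures quoted above, and the ratio of N75 to mis-assembly count, can be computed from a list of contig lengths. The following is a minimal sketch, not code from BIGMAC itself: the function names `nx` and `n75_per_misassembly` are hypothetical, the mis-assembly count is assumed to come from an external evaluation tool, and returning infinity for zero mis-assemblies is an assumption made here for illustration.

```python
from typing import Iterable, List


def nx(contig_lengths: Iterable[int], fraction: float) -> int:
    """Return the Nx statistic (hypothetical helper, not part of BIGMAC).

    Nx is the length L of the contig at which the running total of contig
    lengths, taken from longest to shortest, first reaches `fraction` of
    the total assembled bases. fraction=0.5 gives N50, 0.75 gives N75.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    lengths: List[int] = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    threshold = fraction * sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= threshold:
            return length
    return lengths[-1]  # only reachable through floating-point rounding


def n75_per_misassembly(contig_lengths: Iterable[int], misassemblies: int) -> float:
    """Ratio of N75 to the number of mis-assemblies.

    Assumption: an assembly with zero mis-assemblies is scored as infinity.
    """
    if misassemblies < 0:
        raise ValueError("misassemblies must be non-negative")
    n75 = nx(contig_lengths, 0.75)
    if misassemblies == 0:
        return float("inf")
    return n75 / misassemblies


if __name__ == "__main__":
    # Toy contig lengths, invented purely for illustration.
    toy = [100, 80, 60, 40, 20]
    print("N50:", nx(toy, 0.5))
    print("N75:", nx(toy, 0.75))
    print("N75 / mis-assemblies:", n75_per_misassembly(toy, 3))
```

A higher ratio means longer contiguous sequence per reported error, which is why it suits comparisons between post-processing tools that trade contiguity against correctness.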

Conclusions

BIGMAC demonstrates the effectiveness of the post-processing approach in improving the quality of metagenomic assemblies.

SUBMITTER: Lam KK 

PROVIDER: S-EPMC5084376 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature


Publications

BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly.

Lam KK, Hall R, Clum A, Rao S

BMC Bioinformatics, 2016-10-28, issue 1



Similar Datasets

| S-EPMC8248648 | biostudies-literature
| S-EPMC8254474 | biostudies-literature
2023-10-14 | GSE215357 | GEO
2023-10-14 | GSE215355 | GEO
| PRJEB19201 | ENA
| S-EPMC5100563 | biostudies-literature
| S-EPMC6807382 | biostudies-literature
| S-EPMC3648784 | biostudies-literature
| S-EPMC6325685 | biostudies-literature