Dataset Information

ContigExtender: a new approach to improving de novo sequence assembly for viral metagenomics data.


ABSTRACT:

Background

Metagenomics is the study of microbial genomes for pathogen detection and discovery in human clinical, animal, and environmental samples via Next-Generation Sequencing (NGS). Metagenome de novo sequence assembly is a crucial analytical step in which longer contigs, ideally whole chromosomes or genomes, are formed from shorter NGS reads. However, the contigs generated by de novo assembly are often highly fragmented and rarely longer than a few kilobase pairs (kb). Therefore, a time-consuming extension process is routinely performed on the de novo assembled contigs.

Results

To facilitate this process, we propose a new tool for metagenome contig extension after de novo assembly. ContigExtender employs a novel recursive extending strategy that explores multiple extending paths to produce longer, highly accurate contigs. We demonstrate that ContigExtender outperforms existing tools on synthetic, animal, and human metagenomics datasets.
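The abstract does not describe the algorithm in detail, so the following is only a toy sketch of what "recursive extension along multiple paths" could look like, not ContigExtender's actual implementation. It assumes exact k-mer matching between the contig's right end and the reads; the function names (`next_bases`, `extend_right`) and the parameters `k`, `min_support`, and `max_len` are hypothetical and chosen for illustration.

```python
from collections import Counter


def next_bases(contig, reads, k):
    """Count the bases that follow the contig's last k characters in the reads."""
    seed = contig[-k:]
    counts = Counter()
    for read in reads:
        start = read.find(seed)
        while start != -1:
            nxt = start + k
            if nxt < len(read):
                counts[read[nxt]] += 1
            start = read.find(seed, start + 1)
    return counts


def extend_right(contig, reads, k=4, min_support=1, max_len=200):
    """Recursively extend the right end, following every supported branch.

    Returns all candidate extended contigs. max_len bounds the recursion
    so that repeats in the reads cannot extend a contig indefinitely.
    """
    if len(contig) >= max_len:
        return [contig]
    counts = next_bases(contig, reads, k)
    candidates = [base for base, n in sorted(counts.items()) if n >= min_support]
    if not candidates:
        return [contig]  # no read supports a further extension
    results = []
    for base in candidates:
        results.extend(extend_right(contig + base, reads, k, min_support, max_len))
    return results


# Hypothetical toy data: three overlapping reads and a short seed contig.
reads = ["ACGTACCT", "TACCTGGA", "CTGGATTC"]
paths = extend_right("ACGTA", reads, k=4)
```

In this sketch every supported next base opens a separate path, and a caller would still need to choose among the returned candidates, for example by length or read support. A real tool would also have to handle sequencing errors, reverse complements, and extension of the left end, none of which are modeled here.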

Conclusions

A novel software tool, ContigExtender, has been developed to assist and enhance the performance of metagenome de novo assembly. ContigExtender effectively extends contigs from a variety of sources and can be incorporated into most viral metagenomics analysis pipelines for a wide range of applications, including pathogen detection and viral discovery.

SUBMITTER: Deng Z 

PROVIDER: S-EPMC7953547 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5767027 | biostudies-literature
| S-EPMC4927370 | biostudies-literature
| S-EPMC2824677 | biostudies-literature
| S-EPMC3469330 | biostudies-literature
| S-EPMC4245761 | biostudies-literature
| S-EPMC2768983 | biostudies-literature
| S-EPMC3272011 | biostudies-literature
| S-EPMC5411778 | biostudies-literature
| S-EPMC8287296 | biostudies-literature
| S-EPMC7852260 | biostudies-literature