Dataset Information

SeedsGraph: an efficient assembler for next-generation sequencing data.


ABSTRACT: DNA sequencing technology is evolving rapidly and produces a fast-growing volume of short reads, which has led to a resurgence of research in whole-genome shotgun assembly algorithms. Our assembly algorithm begins by clustering the short reads in a cloud computing framework; the clustering groups fragments according to the similarity of the consensus long sequences they originate from. We condense each group of reads into a chain of seeds, where a seed is a substring with reads aligned to it, and build a graph from these chains. Finally, we analyze the graph to find Euler paths, assemble the reads along those paths into contigs, and lay out the contigs into scaffolds using mate-pair information. The results show that the algorithm is efficient and feasible for large read sets such as those produced by next-generation sequencing.
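
The Euler-path step mentioned in the abstract can be illustrated with a minimal sketch. This is not the SeedsGraph implementation. It swaps the paper's seed-chain graph for a plain de Bruijn-style k-mer graph and leaves out clustering, error handling, and mate-pair scaffolding. The function names (`build_graph`, `euler_path`, `spell_contig`), the value of `k`, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Build a de Bruijn-style graph (an assumed stand-in for the seed graph).

    Nodes are (k-1)-mers; each distinct k-mer contributes one directed edge
    from its prefix to its suffix.
    """
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def euler_path(graph):
    """Hierholzer's algorithm; assumes the graph actually has an Euler path."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(iter(graph))
    for u, targets in graph.items():
        if len(targets) - in_deg[u] == 1:
            start = u
            break
    edges = {u: list(targets) for u, targets in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if edges.get(u):
            stack.append(edges[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())      # dead end: emit node
    return path[::-1]


def spell_contig(path):
    """Merge consecutive overlapping nodes into one sequence."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy, error-free reads (hypothetical data) tiling one short sequence.
reads = ["ATGGCG", "GGCGTG", "CGTGCA", "TGCAAT"]
contig = spell_contig(euler_path(build_graph(reads, k=4)))
print(contig)  # ATGGCGTGCAAT
```

On real data the graph contains branches from repeats and sequencing errors, so a single Euler path rarely exists; assemblers typically emit one contig per unambiguous path and rely on mate-pair information, as the abstract describes, to order contigs into scaffolds.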

SUBMITTER: Wang C 

PROVIDER: S-EPMC4460749 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

SeedsGraph: an efficient assembler for next-generation sequencing data.

Wang Chunyu, Guo Maozu, Liu Xiaoyan, Liu Yang, Zou Quan

BMC Medical Genomics, 2015-05-29


Similar Datasets

S-EPMC4133164 | biostudies-literature
S-EPMC9891242 | biostudies-literature
S-EPMC3291790 | biostudies-literature
PXD003804 | Pride | 2017-04-03
S-EPMC4666565 | biostudies-literature
S-EPMC3244761 | biostudies-literature
S-EPMC6599308 | biostudies-literature
S-EPMC3769656 | biostudies-literature
S-EPMC3813857 | biostudies-other
S-EPMC3813878 | biostudies-other