
Dataset Information


A new strategy for genome assembly using short sequence reads and reduced representation libraries.


ABSTRACT: We have developed a novel approach for using massively parallel short-read sequencing to generate fast and inexpensive de novo genomic assemblies comparable to those generated by capillary-based methods. The ultrashort (<100 base) sequences generated by this technology pose specific biological and computational challenges for de novo assembly of large genomes. To account for this, we devised a method for experimentally partitioning the genome using reduced representation (RR) libraries prior to assembly. We use two restriction enzymes independently to create a series of overlapping fragment libraries, each containing a tractable subset of the genome. Together, these libraries allow us to reassemble the entire genome without the need for a reference sequence. As proof of concept, we applied this approach to sequence and assemble the majority of the 125-Mb Drosophila melanogaster genome. We subsequently demonstrate the accuracy of our assembly method with meaningful comparisons against the currently available D. melanogaster reference genome (dm3). The ease of assembly and accuracy for comparative genomics suggest that our approach will scale to future mammalian genome-sequencing efforts, saving both time and money without sacrificing quality.
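
The partitioning idea in the abstract can be illustrated with a small in silico digest. This is only a sketch under stated assumptions, not the authors' pipeline: the abstract does not name the two enzymes, so the EcoRI (G^AATTC) and HindIII (A^AGCTT) recognition sites below are stand-ins, the toy sequence is invented, and grouping fragments into size bins with `bin_by_size` is an assumed way of forming separate libraries. The helper names are hypothetical.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site; return half-open (start, end) fragments."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)  # cut position inside the recognition site
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def bin_by_size(fragments, bin_width):
    """Group fragments by length; each bin stands in for one hypothetical RR library."""
    libraries = {}
    for start, end in fragments:
        libraries.setdefault((end - start) // bin_width, []).append((start, end))
    return libraries


def junctions_bridged(frags_a, frags_b):
    """True if every internal cut of digest A lies strictly inside a fragment of digest B."""
    cuts = [start for start, _ in frags_a if start != 0]
    return all(any(s < c < e for s, e in frags_b) for c in cuts)


# Invented toy sequence with two sites for each stand-in enzyme.
genome = "TTGAATTCAAAAGCTTCCGGAATTCTTTTAAGCTTGG"
frags_a = digest(genome, "GAATTC", 1)  # assumed enzyme A (EcoRI-like site)
frags_b = digest(genome, "AAGCTT", 1)  # assumed enzyme B (HindIII-like site)

print(frags_a)                       # fragments from enzyme A
print(frags_b)                       # fragments from enzyme B
print(bin_by_size(frags_a, 10))      # size bins acting as separate libraries
print(junctions_bridged(frags_a, frags_b))
```

Because the two digests cut at different positions, fragments from one enzyme span the cut sites of the other, which is what lets independently assembled fragments be joined across their boundaries.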

SUBMITTER: Young AL 

PROVIDER: S-EPMC2813480 | biostudies-literature | 2010 Feb

REPOSITORIES: biostudies-literature


Publications

A new strategy for genome assembly using short sequence reads and reduced representation libraries.

Young AL, Abaan HO, Zerbino D, Mullikin JC, Birney E, Margulies EH

Genome Research, 2010 Feb 1, issue 2



Similar Datasets

| S-EPMC3268122 | biostudies-literature
| S-EPMC3092772 | biostudies-literature
| S-EPMC2848820 | biostudies-literature
| PRJEB15234 | ENA
| PRJEB8197 | ENA
| S-EPMC3158087 | biostudies-literature
| S-EPMC3995342 | biostudies-literature
| S-EPMC9022495 | biostudies-literature
| S-EPMC3557168 | biostudies-literature
| S-EPMC5173252 | biostudies-literature