
Dataset Information


Misassembly detection using paired-end sequence reads and optical mapping data.


ABSTRACT: A crucial problem in genome assembly is the discovery and correction of misassembly errors in draft genomes. We develop a method called misSEQuel that enhances the quality of draft genomes by identifying misassembly errors and their breakpoints using paired-end sequence reads and optical mapping data. Our method also fulfills the critical need for open-source computational methods for analyzing optical mapping data. We apply our method to various assemblies of the loblolly pine, Francisella tularensis, rice and budgerigar genomes. We generated and used simulated optical mapping data for loblolly pine and F. tularensis, and used real optical mapping data for rice and budgerigar. Our results demonstrate that we detect more than 54% of extensively misassembled contigs and more than 60% of locally misassembled contigs in assemblies of F. tularensis, and between 31% and 100% of extensively misassembled contigs and between 57% and 73% of locally misassembled contigs in assemblies of loblolly pine. Using the real optical mapping data, we correctly identified 75% of extensively misassembled contigs and 100% of locally misassembled contigs in rice, and 77% of extensively misassembled contigs and 80% of locally misassembled contigs in budgerigar. misSEQuel can be used as a post-processing step in combination with any genome assembler and is freely available at http://www.cs.colostate.edu/seq/.

SUBMITTER: Muggli MD 

PROVIDER: S-EPMC4542784 | biostudies-literature | 2015 Jun

REPOSITORIES: biostudies-literature


Publications

Misassembly detection using paired-end sequence reads and optical mapping data.

Martin D. Muggli, Simon J. Puglisi, Roy Ronen, Christina Boucher

Bioinformatics (Oxford, England), 2015-06-01, issue 12



Similar Datasets

| S-EPMC3614465 | biostudies-other
| S-EPMC4582294 | biostudies-literature
| S-EPMC4234483 | biostudies-literature
| S-EPMC7168855 | biostudies-literature
| S-EPMC3624798 | biostudies-literature
| S-EPMC3967115 | biostudies-literature
| S-EPMC2865866 | biostudies-literature
2010-07-13 | GSE22765 | GEO
2010-07-13 | E-GEOD-22765 | biostudies-arrayexpress
| S-EPMC5834899 | biostudies-literature