Dataset Information

Assemblathon 1: a competitive assessment of de novo short read assembly methods.


ABSTRACT: Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.

SUBMITTER: Earl D 

PROVIDER: S-EPMC3227110 | biostudies-literature | 2011 Dec

REPOSITORIES: biostudies-literature

Publications

Assemblathon 1: a competitive assessment of de novo short read assembly methods.

Earl Dent D, Bradnam Keith K, St John John J, Darling Aaron A, Lin Dawei D, Fass Joseph J, Yu Hung On Ken HO, Buffalo Vince V, Zerbino Daniel R DR, Diekhans Mark M, Nguyen Ngan N, Ariyaratne Pramila Nuwantha PN, Sung Wing-Kin WK, Ning Zemin Z, Haimel Matthias M, Simpson Jared T JT, Fonseca Nuno A NA, Birol İnanç İ, Docking T Roderick TR, Ho Isaac Y IY, Rokhsar Daniel S DS, Chikhi Rayan R, Lavenier Dominique D, Chapuis Guillaume G, Naquin Delphine D, Maillet Nicolas N, Schatz Michael C MC, Kelley David R DR, Phillippy Adam M AM, Koren Sergey S, Yang Shiaw-Pyng SP, Wu Wei W, Chou Wen-Chi WC, Srivastava Anuj A, Shaw Timothy I TI, Ruby J Graham JG, Skewes-Cox Peter P, Betegon Miguel M, Dimon Michelle T MT, Solovyev Victor V, Seledtsov Igor I, Kosarev Petr P, Vorobyev Denis D, Ramirez-Gonzalez Ricardo R, Leggett Richard R, MacLean Dan D, Xia Fangfang F, Luo Ruibang R, Li Zhenyu Z, Xie Yinlong Y, Liu Binghang B, Gnerre Sante S, MacCallum Iain I, Przybylski Dariusz D, Ribeiro Filipe J FJ, Yin Shuangye S, Sharpe Ted T, Hall Giles G, Kersey Paul J PJ, Durbin Richard R, Jackman Shaun D SD, Chapman Jarrod A JA, Huang Xiaoqiu X, DeRisi Joseph L JL, Caccamo Mario M, Li Yingrui Y, Jaffe David B DB, Green Richard E RE, Haussler David D, Korf Ian I, Paten Benedict B

Genome Research, 2011-09-16, issue 12


Similar Datasets

| S-EPMC2336801 | biostudies-literature
| S-EPMC2813482 | biostudies-literature
| S-EPMC3663818 | biostudies-literature
| S-EPMC6802306 | biostudies-literature
| S-EPMC3100316 | biostudies-literature
| S-EPMC3485621 | biostudies-literature
| S-EPMC3749127 | biostudies-literature
| S-EPMC9795473 | biostudies-literature
| S-EPMC3558281 | biostudies-literature
| S-EPMC3287467 | biostudies-literature