Dataset Information

Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes.


ABSTRACT: Amplification artifacts introduced during library preparation for the Illumina Genome Analyzer increase the likelihood that an appreciable proportion of these sequences will be duplicates and cause an uneven distribution of read coverage across the targeted sequencing regions. As a consequence, these unfavorable features result in difficulties in genome assembly and variation analysis from the short reads, particularly when the sequences are from genomes with base compositions at the extremes of high or low G+C content. Here we present an amplification-free method of library preparation, in which the cluster amplification step, rather than the PCR, enriches for fully ligated template strands, reducing the incidence of duplicate sequences, improving read mapping and single nucleotide polymorphism calling and aiding de novo assembly. We illustrate this by generating and analyzing DNA sequences from extremely (G+C)-poor (Plasmodium falciparum), (G+C)-neutral (Escherichia coli) and (G+C)-rich (Bordetella pertussis) genomes.

SUBMITTER: Kozarewa I 

PROVIDER: S-EPMC2664327 | biostudies-literature | 2009 Apr

REPOSITORIES: biostudies-literature

Publications

Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes.

Kozarewa I, Ning Z, Quail MA, Sanders MJ, Berriman M, Turner DJ

Nature Methods, 15 March 2009, issue 4


Similar Datasets

S-EPMC3312816 | biostudies-literature
S-EPMC6994068 | biostudies-literature
S-EPMC6796507 | biostudies-literature
S-EPMC3428589 | biostudies-literature
S-EPMC4864918 | biostudies-literature
S-EPMC3273786 | biostudies-literature
S-EPMC8342411 | biostudies-literature
S-EPMC4441430 | biostudies-literature
PRJEB4315 | ENA
S-EPMC5808798 | biostudies-literature