Dataset Information

Quality control of next-generation sequencing data without a reference.


ABSTRACT: Next-generation sequencing (NGS) technologies have dramatically expanded the breadth of genomics. Genome-scale data, once restricted to a small number of biomedical model organisms, can now be generated for virtually any species at remarkable speed and low cost. Yet non-model organisms often lack a suitable reference to map sequence reads against, making alignment-based quality control (QC) of NGS data more challenging than in cases where a well-assembled genome is already available. Here we show that by generating a rapid, non-optimized draft assembly of raw reads, it is possible to obtain reliable and informative QC metrics, thus removing the need for a high-quality reference. We use benchmark datasets generated from control samples across a range of genome sizes to illustrate that QC inferences made using draft assemblies are broadly equivalent to those made using a well-established reference, and we describe QC tools routinely used in our production facility to assess the quality of NGS data from non-model organisms.

SUBMITTER: Trivedi UH 

PROVIDER: S-EPMC4018527 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Quality control of next-generation sequencing data without a reference.

Urmi H Trivedi, Timothée Cézard, Stephen Bridgett, Anna Montazam, Jenna Nichols, Mark Blaxter, Karim Gharbi

Frontiers in Genetics, 2014-05-06


Similar Datasets

| S-EPMC4298064 | biostudies-literature
| PRJEB5201 | ENA
| S-EPMC2707382 | biostudies-literature
| S-EPMC7934511 | biostudies-literature
| S-EPMC3270013 | biostudies-literature
| S-EPMC4268903 | biostudies-literature
| S-EPMC6938691 | biostudies-literature
| S-EPMC8408346 | biostudies-literature
| S-EPMC6020721 | biostudies-literature
| S-EPMC7109398 | biostudies-literature