
Dataset Information


Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly.


ABSTRACT: For many researchers, next generation sequencing data holds the key to answering a category of questions previously unassailable. One of the important and challenging steps in achieving these goals is accurately assembling the massive quantity of short sequencing reads into full nucleic acid sequences. For research groups working with non-model or wild systems, short read assembly can pose a significant challenge due to the lack of pre-existing EST or genome reference libraries. While many publications describe the overall process of sequencing and assembly, few address the topic of how many and what types of reads are best for assembly. The goal of this project was to use real-world data to explore the effects of read quantity and short read quality scores on the resulting de novo assemblies. Using several samples of short reads of various sizes and qualities, we produced many assemblies in an automated manner. We observe how the properties of read length, read quality, and read quantity affect the resulting assemblies and provide some general recommendations based on our real-world data set.

SUBMITTER: Garcia TI 

PROVIDER: S-EPMC3223268 | biostudies-literature | 2012 Jan

REPOSITORIES: biostudies-literature


Publications

Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly.

Garcia TI, Shen Y, Catchen J, Amores A, Schartl M, Postlethwait J, Walter RB

Comparative biochemistry and physiology. Toxicology & pharmacology : CBP, 2011-06-01, 1



Similar Datasets

| S-EPMC3749127 | biostudies-literature
| S-EPMC3485621 | biostudies-literature
| S-EPMC3287467 | biostudies-literature
| S-EPMC6511074 | biostudies-literature
| S-EPMC2336801 | biostudies-literature
| S-EPMC3227110 | biostudies-literature
| S-EPMC2813482 | biostudies-literature
| S-EPMC3663818 | biostudies-literature
| S-EPMC6289447 | biostudies-literature
| S-EPMC3100316 | biostudies-literature