Dataset Information

In-depth comparative analysis of Illumina® MiSeq run metrics: Development of a wet-lab quality assessment tool.


ABSTRACT: Whole genome sequencing of bacterial isolates has become a daily task in many laboratories, generating incredible amounts of data. However, data acquisition is not an end in itself; the goal is to acquire high-quality data useful for understanding genetic relationships. Having a method that could rapidly determine which of the many available run metrics are the most important indicators of overall run quality and having a way to monitor these during a given sequencing run would be extremely helpful to this effect. Therefore, we compared various run metrics across 486 MiSeq runs, from five different machines. By performing a statistical analysis using principal components analysis and a K-means clustering algorithm of the metrics, we were able to validate metric comparisons among instruments, allowing for the development of a predictive algorithm, which permits one to observe whether a given MiSeq run has performed adequately. This algorithm is available in an Excel spreadsheet: that is, MiSeq Instrument & Run (In-Run) Forecast. Our tool can help verify that the quantity/quality of the generated sequencing data consistently meets or exceeds recommended manufacturer expectations. Patterns of deviation from those expectations can be used to assess potential run problems and plan preventative maintenance, which can save valuable time and funding resources.
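The abstract describes principal components analysis followed by K-means clustering of run metrics. The following is a minimal illustrative sketch of that general idea, not the authors' algorithm (their tool is distributed as an Excel spreadsheet). The metric names (cluster density, %Q30, % clusters passing filter), the six example runs, and the helper functions `standardize`, `first_component`, and `kmeans_1d` are all assumptions made up for illustration. It uses only the first principal component and a plain Lloyd-style K-means on the resulting scores.

```python
import math

def standardize(rows):
    """Z-score each column so metrics on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def first_component(z, iters=200):
    """Leading principal axis of standardized data, found by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on scalar scores; centers start at spaced quantiles."""
    s = sorted(values)
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

# Hypothetical runs: (cluster density in K/mm^2, %Q30, % clusters passing filter).
# These values are invented for the example and are not from the dataset.
runs = [
    (950, 92.0, 93.0),
    (1000, 91.0, 92.0),
    (1050, 90.5, 91.0),
    (1500, 72.0, 70.0),
    (1550, 70.0, 68.0),
    (1600, 68.0, 65.0),
]

z = standardize(runs)
pc1 = first_component(z)
scores = [sum(a * b for a, b in zip(row, pc1)) for row in z]
labels = kmeans_1d(scores, k=2)
print(labels)  # one cluster label per run
```

In this toy data the first three runs and the last three runs fall into separate clusters. Which cluster corresponds to an adequate run would still have to be judged against the manufacturer's specifications, as the abstract indicates.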

SUBMITTER: Kastanis GJ 

PROVIDER: S-EPMC6487961 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

In-depth comparative analysis of Illumina® MiSeq run metrics: Development of a wet-lab quality assessment tool.

Kastanis GJ, Santana-Quintero LV, Sanchez-Leon M, Lomonaco S, Brown EW, Allard MW

Molecular Ecology Resources, 2019-01-17, issue 2


Similar Datasets

| S-EPMC4850673 | biostudies-literature
| S-EPMC5133466 | biostudies-literature
| S-EPMC5409056 | biostudies-literature
| S-EPMC7214482 | biostudies-literature
| S-EPMC4872057 | biostudies-literature
| S-EPMC8373214 | biostudies-literature
| S-EPMC4542947 | biostudies-literature
| S-EPMC4401116 | biostudies-literature
| S-EPMC4222545 | biostudies-literature
| S-EPMC4641896 | biostudies-literature