Dataset Information

Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias.


ABSTRACT: Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.

SUBMITTER: Krishnakumar R 

PROVIDER: S-EPMC5816649 | biostudies-literature | 2018 Feb

REPOSITORIES: biostudies-literature

Publications

Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias.

Raga Krishnakumar, Anupama Sinha, Sara W Bird, Harikrishnan Jayamohan, Harrison S Edwards, Joseph S Schoeniger, Kamlesh D Patel, Steven S Branda, Michael S Bartsch

Scientific Reports, 2018-02-16, issue 1


Similar Datasets

| PRJEB8318 | ENA
| S-EPMC4907500 | biostudies-literature
| S-EPMC5610714 | biostudies-literature
| S-EPMC4374364 | biostudies-literature
| S-EPMC7002467 | biostudies-literature
| S-EPMC4226419 | biostudies-literature
| S-EPMC9122479 | biostudies-literature
| S-EPMC6279828 | biostudies-literature
| S-EPMC4730766 | biostudies-literature
| S-EPMC5598014 | biostudies-literature