Dataset Information

Data-driven noise modeling of digital DNA melting analysis enables prediction of sequence discriminating power.

ABSTRACT: The need to rapidly screen complex samples for a wide range of nucleic acid targets, like infectious diseases, remains unmet. Digital High-Resolution Melt (dHRM) is an emerging technology with potential to meet this need by accomplishing broad-based, rapid nucleic acid sequence identification. Here, we set out to develop a computational framework for estimating the resolving power of dHRM technology for defined sequence profiling tasks. By deriving noise models from experimentally generated dHRM datasets and applying these to in silico predicted melt curves, we enable the production of synthetic dHRM datasets that faithfully recapitulate real-world variations arising from sample and machine variables. We then use these datasets to identify the most challenging melt curve classification tasks likely to arise for a given application and test the performance of benchmark classifiers. This toolbox enables the in silico design and testing of broad-based dHRM screening assays and the selection of optimal classifiers. For an example application of screening common human bacterial pathogens, we show that human pathogens having the most similar sequences and melt curves are still reliably identifiable in the presence of experimental noise. Further, we find that ensemble methods outperform whole series classifiers for this task and are in some cases able to resolve melt curves with single-nucleotide resolution. Data and code available on https://github.com/lenlan/dHRM-noise-modeling. Supplementary data are available at Bioinformatics online.
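The workflow in the abstract (predict a clean melt curve, perturb it with a noise model, then classify the noisy curves) can be sketched in a few lines. This is a toy illustration and not the authors' method: the sigmoid curve shape, the two noise terms and their magnitudes, the example melting temperatures, and the nearest-reference classifier are all assumptions made for the sketch. The paper's noise models are derived from experimental dHRM data; see the linked repository for the actual code.

```python
import numpy as np

def ideal_melt_curve(temps, tm, width=1.0):
    """Idealized melt curve (assumed sigmoid): helicity fraction vs. temperature."""
    return 1.0 / (1.0 + np.exp((temps - tm) / width))

def synthesize(temps, tm, n, rng, tm_jitter=0.2, signal_sd=0.01):
    """Draw n noisy copies of the ideal curve for a target melting at tm.

    tm_jitter and signal_sd are hypothetical noise magnitudes, not fitted values.
    """
    shifts = rng.normal(0.0, tm_jitter, size=(n, 1))        # per-curve temperature offset
    clean = ideal_melt_curve(temps[None, :], tm + shifts)    # shape (n, len(temps))
    return clean + rng.normal(0.0, signal_sd, size=clean.shape)  # additive signal noise

def nearest_reference(curves, references):
    """Assign each curve to the reference curve with the smallest squared distance."""
    d = ((curves[:, None, :] - references[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
temps = np.linspace(70.0, 95.0, 251)
tms = np.array([82.0, 84.0])                                 # two hypothetical targets
references = np.stack([ideal_melt_curve(temps, tm) for tm in tms])
curves = np.concatenate([synthesize(temps, tm, 200, rng) for tm in tms])
labels = np.repeat(np.arange(len(tms)), 200)
accuracy = (nearest_reference(curves, references) == labels).mean()
print(f"accuracy: {accuracy:.3f}")
```

Moving the two values in `tms` closer together, or raising `tm_jitter`, shows how classification accuracy degrades as curves become harder to separate. That degradation is the sense in which a synthetic dataset can be used to estimate resolving power.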

SUBMITTER: Langouche L

PROVIDER: S-EPMC8016452 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature

Publications

Data-driven noise modeling of digital DNA melting analysis enables prediction of sequence discriminating power.

Langouche L, Aralar A, Sinha M, Lawrence SM, Fraley SI, Coleman TP

Bioinformatics (Oxford, England) | 2021-04-01 | 22-23

PMID: 33355665

Similar Datasets

Improving Quantitative Power in Digital PCR through Digital High-Resolution Melting.
| S-EPMC7269394 | biostudies-literature

Coherent noise enables probabilistic sequence replay in spiking neuronal networks.
| S-EPMC10153753 | biostudies-literature

Explicit DNase sequence bias modeling enables high resolution transcription factor footprint detection
2014-09-04 | E-GEOD-61105 | biostudies-arrayexpress

Detector Blur and Correlated Noise Modeling for Digital Breast Tomosynthesis Reconstruction.
| S-EPMC5772655 | biostudies-literature

Digital RNA Sequencing Minimizes Sequence-Dependent Bias and Amplification Noise with Optimized Single Molecule Barcodes
2012-01-07 | E-GEOD-34449 | biostudies-arrayexpress

Digital RNA Sequencing Minimizes Sequence-Dependent Bias and Amplification Noise with Optimized Single Molecule Barcodes
2012-01-07 | GSE34449 | GEO

Discriminating power of localized three-dimensional facial morphology.
| S-EPMC1285182 | biostudies-literature

Explicit DNase sequence bias modeling enables high resolution transcription factor footprint detection
2014-09-04 | GSE61105 | GEO

DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
| S-EPMC5870879 | biostudies-literature

Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint detection.
| S-EPMC4231734 | biostudies-literature