Dataset Information

Data-driven noise modeling of digital DNA melting analysis enables prediction of sequence discriminating power.


ABSTRACT:

Motivation

The need to rapidly screen complex samples for a wide range of nucleic acid targets, such as those associated with infectious diseases, remains unmet. Digital High-Resolution Melt (dHRM) is an emerging technology with the potential to meet this need through broad-based, rapid nucleic acid sequence identification. Here, we set out to develop a computational framework for estimating the resolving power of dHRM technology for defined sequence profiling tasks. By deriving noise models from experimentally generated dHRM datasets and applying them to in silico predicted melt curves, we enable the production of synthetic dHRM datasets that faithfully recapitulate real-world variations arising from sample and machine variables. We then use these datasets to identify the most challenging melt curve classification tasks likely to arise for a given application and to test the performance of benchmark classifiers.
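The idea of applying a noise model to a predicted melt curve can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the two-state sigmoid used as the "predicted" curve, the function names `ideal_melt_curve` and `synthetic_replicates`, and the three Gaussian noise terms (temperature shift, amplitude scaling, additive noise) with their default standard deviations. The published noise models are derived from experimental dHRM data and may look quite different.

```python
import math
import random

def ideal_melt_curve(temps, tm, width=1.0):
    """Fraction of double-stranded DNA at each temperature.

    Assumption: a simple two-state sigmoid centred on the melting
    temperature tm, standing in for an in silico predicted curve.
    """
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

def synthetic_replicates(temps, tm, n, shift_sd=0.2, scale_sd=0.05,
                         noise_sd=0.01, seed=0):
    """Generate n noisy copies of the ideal curve.

    Hypothetical noise model (illustrative only):
      - a per-curve temperature shift ~ N(0, shift_sd)
      - a per-curve amplitude scale   ~ N(1, scale_sd)
      - per-point additive noise      ~ N(0, noise_sd)
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n):
        shift = rng.gauss(0.0, shift_sd)
        scale = 1.0 + rng.gauss(0.0, scale_sd)
        base = ideal_melt_curve(temps, tm + shift)
        curves.append([scale * y + rng.gauss(0.0, noise_sd) for y in base])
    return curves

# Temperature grid from 70.0 to 95.0 degrees C in 0.1 degree steps.
temps = [70.0 + 0.1 * i for i in range(251)]
replicates = synthetic_replicates(temps, tm=84.0, n=5)
```

In a data-driven setting, the standard deviations above would be replaced by parameters estimated from measured replicate curves rather than chosen by hand.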

Results

This toolbox enables the in silico design and testing of broad-based dHRM screening assays and the selection of optimal classifiers. For an example application of screening common human bacterial pathogens, we show that even the pathogens with the most similar sequences and melt curves remain reliably identifiable in the presence of experimental noise. Further, we find that ensemble methods outperform whole-series classifiers for this task and are in some cases able to resolve melt curves with single-nucleotide resolution.
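To make the notion of a whole-series classifier concrete, the sketch below assigns each noisy curve to the nearest noise-free template by Euclidean distance over the full curve. It is an illustrative baseline only, not one of the benchmark or ensemble classifiers evaluated in the paper. The labels `species_A` and `species_B`, their melting temperatures, the sigmoid curve shape, and the noise parameters are all hypothetical.

```python
import math
import random

def melt_curve(temps, tm, width=1.0):
    """Two-state sigmoid used as a stand-in for a predicted melt curve."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

def noisy_curve(temps, tm, rng, shift_sd=0.1, noise_sd=0.01):
    """One synthetic curve: random temperature shift plus additive noise."""
    shifted = melt_curve(temps, tm + rng.gauss(0.0, shift_sd))
    return [y + rng.gauss(0.0, noise_sd) for y in shifted]

def nearest_template(curve, templates):
    """Return the label whose template is closest in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(curve, templates[label]))

temps = [70.0 + 0.1 * i for i in range(251)]

# Hypothetical targets whose melting temperatures differ by 0.5 degrees C.
tms = {"species_A": 84.0, "species_B": 84.5}
templates = {label: melt_curve(temps, tm) for label, tm in tms.items()}

# Classify 50 noisy curves per target and report the fraction correct.
rng = random.Random(1)
trials = [(label, noisy_curve(temps, tm, rng))
          for label, tm in tms.items() for _ in range(50)]
accuracy = sum(nearest_template(c, templates) == label
               for label, c in trials) / len(trials)
print(f"whole-series nearest-template accuracy: {accuracy:.2f}")
```

Narrowing the gap between the two melting temperatures, or increasing `shift_sd`, makes the task harder and gives a rough feel for how noise limits discriminating power.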

Availability

Data and code are available at https://github.com/lenlan/dHRM-noise-modeling.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Langouche L 

PROVIDER: S-EPMC8016452 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC7269394 | biostudies-literature
2014-09-04 | E-GEOD-61105 | biostudies-arrayexpress
| S-EPMC5772655 | biostudies-literature
2012-01-07 | E-GEOD-34449 | biostudies-arrayexpress
2012-01-07 | GSE34449 | GEO
| S-EPMC1285182 | biostudies-literature
2014-09-04 | GSE61105 | GEO
| S-EPMC6059929 | biostudies-literature
| S-EPMC3268301 | biostudies-literature
| S-EPMC4231734 | biostudies-literature