
Dataset Information


Prediction system for rapid identification of Salmonella serotypes based on pulsed-field gel electrophoresis fingerprints.


ABSTRACT: A classification model is presented for rapid identification of Salmonella serotypes based on pulsed-field gel electrophoresis (PFGE) fingerprints. The classification model was developed using random forest and support vector machine algorithms and was then applied to a database of 45,923 PFGE patterns, randomly selected from all submissions to CDC PulseNet from 2005 to 2010. The patterns selected included the top 20 most frequent serotypes and 12 less frequent serotypes from various sources. The prediction accuracies for the 32 serotypes ranged from 68.8% to 99.9%, with an overall accuracy of 96.0% for the random forest classification, and ranged from 67.8% to 100.0%, with an overall accuracy of 96.1% for the support vector machine classification. The prediction system improves reliability and accuracy and provides a new tool for early and fast screening and source tracking of outbreak isolates. It is especially useful to get serotype information before the conventional methods are done. Additionally, this system also works well for isolates that are serotyped as "unknown" by conventional methods, and it is useful for a laboratory where standard serotyping is not available.
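The abstract reports both a per-serotype accuracy range and an overall accuracy. A minimal sketch of how the two figures relate is given below. It assumes that per-serotype accuracy means the fraction of isolates of a given true serotype that were predicted correctly; the record does not state the exact definition used in the study. The function name `accuracy_report` and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def accuracy_report(true_labels, predicted_labels):
    """Return (per_class_accuracy, overall_accuracy) for paired label lists.

    Per-class accuracy is assumed to be: correct predictions for a serotype
    divided by the number of isolates that truly belong to that serotype.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    # Number of isolates per true serotype.
    totals = Counter(true_labels)
    # Number of correctly predicted isolates per true serotype.
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)

    per_class = {label: correct[label] / n for label, n in totals.items()}
    overall = sum(correct.values()) / len(true_labels)
    return per_class, overall


# Hypothetical toy labels for illustration only (not results from the study).
true_labels = ["Enteritidis"] * 4 + ["Typhimurium"] * 4 + ["Newport"] * 2
predicted_labels = (
    ["Enteritidis"] * 4
    + ["Typhimurium"] * 3 + ["Newport"]
    + ["Newport", "Typhimurium"]
)

per_class, overall = accuracy_report(true_labels, predicted_labels)
for serotype, acc in per_class.items():
    print(f"{serotype}: {acc:.1%}")
print(f"overall: {overall:.1%}")
```

Because the overall figure weights every isolate equally, frequent serotypes dominate it. This is consistent with an overall accuracy near 96% alongside a lowest per-serotype accuracy below 70%.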

SUBMITTER: Zou W 

PROVIDER: S-EPMC3347130 | biostudies-literature | 2012 May

REPOSITORIES: biostudies-literature


Publications

Prediction system for rapid identification of Salmonella serotypes based on pulsed-field gel electrophoresis fingerprints.

Wen Zou, Wei-Jiun Lin, Kelley B. Hise, Hung-Chia Chen, Christine Keys, James J. Chen

Journal of Clinical Microbiology, 2012-02-29, issue 5



Similar Datasets

| S-EPMC2937721 | biostudies-literature
| S-EPMC3132108 | biostudies-literature
| S-EPMC84923 | biostudies-literature
| S-EPMC2832379 | biostudies-literature
| S-EPMC1153745 | biostudies-literature
| S-EPMC1829001 | biostudies-literature
| S-EPMC5832666 | biostudies-literature
| S-EPMC3865863 | biostudies-literature
| S-EPMC3165606 | biostudies-literature
| S-EPMC3774442 | biostudies-literature