Dataset Information

Assessment of BOLD and GenBank - Their accuracy and reliability for the identification of biological materials.


ABSTRACT: Taxonomic identification of biological materials can be achieved through DNA barcoding, where an unknown "barcode" sequence is compared to a reference database. In many disciplines, obtaining accurate taxonomic identifications can be imperative (e.g., evolutionary biology, food regulatory compliance, forensics). The Barcode of Life DataSystems (BOLD) and GenBank are the main public repositories of DNA barcode sequences. In this study, an assessment of the accuracy and reliability of sequences in these databases was performed. To achieve this, 1) curated reference materials for plants, macro-fungi and insects were obtained from national collections, 2) relevant barcode sequences (rbcL, matK, trnH-psbA, ITS and COI) from these reference samples were generated and used for searching against both databases, and 3) optimal search parameters were determined that ensure the best match to the known species in either database. GenBank outperformed BOLD for species-level identification of insect taxa (53% and 35%, respectively), while the two databases performed comparably for plants (~81%) and macro-fungi (~57%). Results illustrated that using a multi-locus barcode approach increased identification success. This study outlines the utility of the BLAST search tool in GenBank and the BOLD identification engine for taxonomic identifications and identifies some precautions needed when using public sequence repositories in applied scientific disciplines.
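
The species-level success percentages reported in the abstract can be read as the share of reference queries whose best database match carries the known species name. The following is a minimal sketch of that calculation, not the authors' pipeline: the record fields (`database`, `group`, `known_species`, `top_hit_species`), the function name `identification_success`, and the example records are all hypothetical and do not come from the study.

```python
from collections import defaultdict


def identification_success(records):
    """Percentage of queries whose top hit matches the known species,
    grouped by (database, taxon group).

    `records` is a hypothetical structure: one dict per query, holding the
    database searched, the taxon group, the curated species name, and the
    species name of the best-matching database entry.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        key = (rec["database"], rec["group"])
        totals[key] += 1
        # Compare names case-insensitively, ignoring surrounding whitespace.
        known = rec["known_species"].strip().lower()
        top_hit = rec["top_hit_species"].strip().lower()
        if known == top_hit:
            correct[key] += 1
    return {key: 100.0 * correct[key] / totals[key] for key in totals}


# Made-up records for illustration only (not data from the study).
records = [
    {"database": "GenBank", "group": "insects",
     "known_species": "Apis mellifera", "top_hit_species": "Apis mellifera"},
    {"database": "GenBank", "group": "insects",
     "known_species": "Musca domestica", "top_hit_species": "Musca domestica"},
    {"database": "BOLD", "group": "insects",
     "known_species": "Apis mellifera", "top_hit_species": "Apis mellifera"},
    {"database": "BOLD", "group": "insects",
     "known_species": "Musca domestica", "top_hit_species": "Musca autumnalis"},
]

rates = identification_success(records)
for (database, group), pct in sorted(rates.items()):
    print(f"{database} / {group}: {pct:.0f}% species-level matches")
```

A real comparison would also need to handle synonyms, ties between equally scoring hits, and the search-parameter thresholds the study tuned, none of which this sketch attempts.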

SUBMITTER: Meiklejohn KA 

PROVIDER: S-EPMC6584008 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Assessment of BOLD and GenBank - Their accuracy and reliability for the identification of biological materials.

Meiklejohn KA, Damaso N, Robertson JM

PLoS ONE, 2019-06-19, issue 6


Similar Datasets

| S-EPMC7162515 | biostudies-literature
| S-EPMC7873228 | biostudies-literature
2014-06-20 | E-MTAB-1166 | biostudies-arrayexpress
| S-EPMC3890684 | biostudies-literature
| S-EPMC9043013 | biostudies-literature
| S-EPMC8273191 | biostudies-literature
| S-EPMC9322037 | biostudies-literature
| S-EPMC3606546 | biostudies-literature
| S-EPMC9849057 | biostudies-literature
| S-EPMC7075771 | biostudies-literature