Dataset Information

CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies.


ABSTRACT: BACKGROUND: Current taxonomic classification tools use exact string matching algorithms, which are effective for data from next-generation sequencing technology. However, the unique error patterns of third-generation sequencing (TGS) technologies can reduce the accuracy of these programs. RESULTS: We developed a Classification tool using Discriminative K-mers and Approximate Matching algorithm (CDKAM). The approximate matching method is used to search for k-mers and comprises two phases: a quick mapping phase and a dynamic programming phase. CDKAM was compared with existing methods on simulated datasets as well as real TGS datasets. CDKAM performed better in many aspects, especially when classifying TGS data with an average length of 1000-1500 bases. CONCLUSIONS: CDKAM is an effective program with higher accuracy and lower memory requirements for TGS metagenome sequence classification. It achieves high species-level accuracy.
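
The two-phase idea mentioned in the abstract (a quick lookup followed by dynamic programming) can be illustrated with a minimal sketch. This is not CDKAM's implementation: the function names `edit_distance` and `classify_kmer`, the dictionary-based index, the `max_dist` threshold, and the toy k-mers are all assumptions made for illustration. The sketch uses an exact hash lookup as a stand-in for the quick mapping phase and a plain Levenshtein distance as the dynamic programming phase.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def classify_kmer(kmer: str, index: Dict[str, str],
                  max_dist: int = 1) -> Optional[str]:
    """Return the taxon label for a k-mer, tolerating up to max_dist edits.

    index is a hypothetical mapping from discriminative k-mers to taxon labels.
    """
    # Phase 1 (stand-in for quick mapping): exact hash lookup.
    if kmer in index:
        return index[kmer]
    # Phase 2: dynamic programming against each reference k-mer,
    # keeping the closest one that is within max_dist edits.
    best_taxon, best_dist = None, max_dist + 1
    for ref, taxon in index.items():
        d = edit_distance(kmer, ref)
        if d < best_dist:
            best_taxon, best_dist = taxon, d
    return best_taxon


# Toy index with made-up k-mers and labels.
toy_index = {"ACGTACGT": "taxon_A", "TTTTGGGG": "taxon_B"}
print(classify_kmer("ACGTACGA", toy_index))  # one substitution away from a taxon_A k-mer
```

The linear scan in the second phase is only practical for a toy index; a real tool would need a faster way to narrow the candidates before running dynamic programming.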

SUBMITTER: Bui VK 

PROVIDER: S-EPMC7576720 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies.

Bui Van-Kien, Wei Chaochun

BMC Bioinformatics, 2020-10-20, issue 1



Similar Datasets

| S-EPMC4428112 | biostudies-literature
| S-EPMC4401624 | biostudies-literature
| S-EPMC8246400 | biostudies-literature
| S-EPMC4186660 | biostudies-literature
| S-EPMC6573793 | biostudies-other
| S-EPMC2648743 | biostudies-literature
| S-EPMC3712219 | biostudies-literature
| S-EPMC4364207 | biostudies-literature
| S-EPMC5704239 | biostudies-literature