Dataset Information

eCAMBer: efficient support for large-scale comparative analysis of multiple bacterial strains.


ABSTRACT:

Background

Inconsistencies are often observed in the genome annotations of bacterial strains. Moreover, these inconsistencies are often not reflected by sequence discrepancies, but are caused by wrongly annotated gene starts as well as mis-identified gene presence. Thus, tools are needed for improving annotation consistency and accuracy among sets of bacterial strain genomes.

Results

We have developed eCAMBer, a tool for efficiently supporting comparative analysis of multiple bacterial strains within the same species. eCAMBer is a highly optimized revision of our earlier tool, CAMBer, scaling it up for significantly larger datasets comprising hundreds of bacterial strains. eCAMBer works in two phases. First, it transfers gene annotations among all considered bacterial strains. In this phase, it also identifies homologous gene families and annotation inconsistencies. Second, eCAMBer tries to improve the quality of annotations by resolving the gene start inconsistencies and filtering out gene families arising from annotation errors propagated in the previous phase.
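
As a rough illustration of what detecting and resolving a gene start inconsistency within one homologous gene family could look like, here is a minimal sketch. It is not eCAMBer's actual procedure: the majority-vote rule, the tie-break toward the longer ORF, the function names, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def find_start_inconsistencies(families):
    """Return ids of gene families whose members disagree on the annotated start."""
    return sorted(
        fid for fid, starts in families.items() if len(set(starts.values())) > 1
    )


def resolve_by_majority(starts):
    """Pick the most common start offset (assumed rule, not eCAMBer's).

    Ties are broken in favour of the largest offset, i.e. the longest ORF.
    """
    counts = Counter(starts.values())
    offset, _ = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return offset


# Hypothetical toy data: family id -> {strain: start offset upstream of the shared stop codon}
families = {
    "fam1": {"strainA": 900, "strainB": 900, "strainC": 870},
    "fam2": {"strainA": 450, "strainB": 450, "strainC": 450},
}

inconsistent = find_start_inconsistencies(families)
resolved = {fid: resolve_by_majority(families[fid]) for fid in inconsistent}
print(inconsistent, resolved)
```

Here only the hypothetical family fam1 is flagged, because strainC's annotated start differs from the other two strains; a real tool would also need evidence beyond a simple vote before changing an annotation.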

Conclusions

eCAMBer efficiently identifies and resolves annotation inconsistencies among closely related bacterial genomes. It outperforms other competing tools both in terms of running time and accuracy of produced annotations. Software, user manual, and case study results are available at the project website: http://bioputer.mimuw.edu.pl/ecamber.

SUBMITTER: Wozniak M 

PROVIDER: S-EPMC4023553 | biostudies-literature | 2014 Mar

REPOSITORIES: biostudies-literature

Publications

eCAMBer: efficient support for large-scale comparative analysis of multiple bacterial strains.

Michal Wozniak, Limsoon Wong, Jerzy Tiuryn

BMC Bioinformatics, 2014-03-05


Similar Datasets

| S-EPMC3194237 | biostudies-literature
| S-EPMC4510541 | biostudies-literature
| S-EPMC5860313 | biostudies-literature
| S-EPMC11003185 | biostudies-literature
| S-EPMC8190105 | biostudies-literature
| S-EPMC1187872 | biostudies-literature
| S-EPMC8690886 | biostudies-literature
| S-EPMC1298293 | biostudies-literature
| S-EPMC7229196 | biostudies-literature
| S-EPMC5100282 | biostudies-literature