
Dataset Information


Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics.


ABSTRACT: A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures as an alternative method for large-scale functional comparative analysis. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity for identifying groups of functionally equivalent proteins within and across taxonomic boundaries, and that they are suitable for large-scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.
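
The idea of grouping proteins by domain architecture can be illustrated with a minimal sketch. It is not the authors' pipeline: the input format, the function name `group_by_architecture`, the protein identifiers, and the domain labels (`DOM_A`, `DOM_B`) are all hypothetical, and the architecture is assumed here to be the ordered tuple of domain identifiers sorted by start position. The point is only that grouping becomes a single pass with a dictionary lookup per protein, with no pairwise alignment between genomes.

```python
from collections import defaultdict


def group_by_architecture(proteins):
    """Group protein IDs that share the same domain architecture.

    `proteins` maps a protein ID to a list of (start, domain_id) tuples.
    Assumption: an architecture is the tuple of domain IDs ordered by
    start position; other definitions are possible.
    """
    groups = defaultdict(list)
    for protein_id, domains in proteins.items():
        # Sort annotations by start position, then keep only the domain IDs.
        architecture = tuple(domain_id for _, domain_id in sorted(domains))
        groups[architecture].append(protein_id)
    return dict(groups)


# Toy, made-up annotations for proteins from two hypothetical genomes.
proteins = {
    "genomeA|p1": [(120, "DOM_B"), (5, "DOM_A")],
    "genomeB|p7": [(3, "DOM_A"), (140, "DOM_B")],
    "genomeB|p9": [(10, "DOM_B")],
}

groups = group_by_architecture(proteins)
for architecture, members in groups.items():
    print(architecture, members)
```

In this toy input, the first two proteins land in the same group because they carry the same domains in the same order, even though the annotations were listed in a different order and come from different genomes.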

SUBMITTER: Koehorst JJ 

PROVIDER: S-EPMC5031134 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature


Publications

Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics.

Koehorst JJ, Saccenti E, Schaap PJ, Martins Dos Santos VAP, Suarez-Diez M

F1000Research, 2016-08-15



Similar Datasets

| S-EPMC3463605 | biostudies-other
| S-EPMC2394858 | biostudies-literature
| S-EPMC6472370 | biostudies-literature
| S-EPMC3942035 | biostudies-literature
| S-EPMC2957689 | biostudies-literature
| S-EPMC5674928 | biostudies-literature
| S-EPMC10516370 | biostudies-literature
| S-EPMC1500816 | biostudies-literature
| S-EPMC6528568 | biostudies-literature
| S-EPMC3091305 | biostudies-literature