Dataset Information

BranchClust: a phylogenetic algorithm for selecting gene families.


ABSTRACT:

Background

Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The former are easy to automate, but they usually either do not distinguish between paralogs and orthologs or restrict the number of taxa. Phylogenetic methods are often based on reconciling a gene tree with a known rooted species tree; a limitation of this approach, especially in the case of prokaryotes, is that the species tree is often unknown and that analyses of single gene families frequently leave the branching order between related organisms unresolved.

Results

Here we describe an algorithm that automatically selects orthologous genes from different species in a phylogenetic tree, for any number of taxa. The algorithm distinguishes complete families (containing all taxa) from incomplete families (not containing all taxa) and recognizes in- and outparalogs. The BranchClust algorithm is implemented in Perl, uses the BioPerl module for parsing trees, and is freely available at http://bioinformatics.org/branchclust.
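
The distinction between complete and incomplete families can be illustrated with a minimal sketch. This is not the BranchClust implementation (which is written in Perl and operates on phylogenetic trees); it only assumes that a candidate family has already been extracted as a list of (taxon, gene) pairs. The function name `classify_family`, the taxon labels, and the gene identifiers are all hypothetical.

```python
from collections import Counter


def classify_family(genes, all_taxa):
    """Summarize one candidate family.

    genes    -- list of (taxon, gene_id) tuples assigned to the family
    all_taxa -- every taxon in the analysis
    """
    # Count how many genes each taxon contributes to this family.
    counts = Counter(taxon for taxon, _ in genes)
    # Taxa with no representative make the family incomplete.
    missing = sorted(set(all_taxa) - set(counts))
    # Taxa with more than one gene are candidates for paralog inspection.
    extra = sorted(t for t, n in counts.items() if n > 1)
    return {
        "complete": not missing,
        "missing_taxa": missing,
        "taxa_with_extra_copies": extra,
    }


# Hypothetical toy data: three taxa, two candidate families.
all_taxa = ["A", "B", "C"]
family1 = [("A", "a1"), ("B", "b1"), ("C", "c1"), ("C", "c2")]
family2 = [("A", "a2"), ("B", "b2")]

print(classify_family(family1, all_taxa))
print(classify_family(family2, all_taxa))
```

In this toy example the first family contains every taxon and has an additional copy from taxon C, while the second family lacks taxon C. Deciding whether an additional copy is an in- or outparalog requires the tree itself, which this sketch does not model.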

Conclusion

BranchClust outperforms the reciprocal best BLAST hit method by selecting more sets of putatively orthologous genes. In the test cases examined, the correctness of the selected families and of the identified in- and outparalogs was confirmed by inspection of the pertinent phylogenetic trees.
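
For readers unfamiliar with the baseline, the reciprocal best hit idea can be sketched for two genomes as follows. This is a generic illustration, not the procedure used in the paper: the score dictionaries stand in for parsed BLAST results, and the function names, gene identifiers, and scores are invented for the example.

```python
def best_hits(scores):
    """Return the highest-scoring subject for each query.

    scores -- dict mapping (query, subject) -> similarity score
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pair genes that are each other's best hit in both directions."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores between two genes of genome A and two of genome B.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150,
          ("a2", "b1"): 120, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 200, ("b1", "a2"): 120,
          ("b2", "a1"): 150, ("b2", "a2"): 90}

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

With these toy scores only a1 and b1 are paired; a2 and b2 each have a best hit that is not reciprocated, so they are left unassigned. This is one way a purely score-based criterion can yield fewer sets than a tree-based selection.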

SUBMITTER: Poptsova MS 

PROVIDER: S-EPMC1853112 | biostudies-literature | 2007 Apr

REPOSITORIES: biostudies-literature

Publications

BranchClust: a phylogenetic algorithm for selecting gene families.

Poptsova MS, Gogarten JP

BMC Bioinformatics, 2007-04-10



Similar Datasets

| S-EPMC8096466 | biostudies-literature
| S-EPMC2826416 | biostudies-literature
| S-EPMC1347480 | biostudies-literature
| S-EPMC7228096 | biostudies-literature
| S-EPMC3985644 | biostudies-literature
| S-EPMC2235844 | biostudies-literature
| S-EPMC2566581 | biostudies-literature
| S-EPMC3505183 | biostudies-literature
| S-EPMC184357 | biostudies-literature
| S-EPMC7779534 | biostudies-literature