Dataset Information

Simple chained guide trees give high-quality protein multiple sequence alignments.


ABSTRACT: Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random.
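
To make the idea of a chained guide tree concrete, here is a minimal sketch in Python. It assumes a chained tree is the fully imbalanced ("caterpillar") topology in which each sequence is added, one at a time, to the alignment of everything before it, and it writes that topology in Newick notation. The function name `chained_guide_tree` and the `shuffle` and `seed` parameters are hypothetical, chosen for illustration; this is not the authors' code.

```python
import random


def chained_guide_tree(names, shuffle=False, seed=None):
    """Return a chained (fully imbalanced) guide tree in Newick format.

    Hypothetical helper for illustration only: each sequence is joined,
    in list order, to the subtree holding all earlier sequences.
    """
    names = list(names)
    if not names:
        raise ValueError("need at least one sequence name")
    if shuffle:
        # Random ordering of the leaves; the abstract reports that the
        # accuracy gain holds even in this case.
        random.Random(seed).shuffle(names)
    tree = names[0]
    for name in names[1:]:
        # Wrap the tree built so far and attach the next sequence to it.
        tree = f"({tree},{name})"
    return tree + ";"


print(chained_guide_tree(["seq1", "seq2", "seq3", "seq4"]))
# (((seq1,seq2),seq3),seq4);
```

Building the tree takes a single pass over the sequence names and needs no pairwise distance calculations, which is consistent with the abstract's point that these are the fastest and simplest guide trees to construct. Some progressive aligners can read a user-supplied Newick guide tree, so a string like this could in principle be passed to one, though the exact option depends on the tool.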

SUBMITTER: Boyce K 

PROVIDER: S-EPMC4115562 | biostudies-literature | 2014 Jul

REPOSITORIES: biostudies-literature

Publications

Simple chained guide trees give high-quality protein multiple sequence alignments.

Kieran Boyce, Fabian Sievers, Desmond G. Higgins

Proceedings of the National Academy of Sciences of the United States of America, 2014-07-07, issue 29


Similar Datasets

S-EPMC3532078 | biostudies-literature
S-EPMC3598851 | biostudies-literature
S-EPMC5079479 | biostudies-literature
S-EPMC3261699 | biostudies-literature
S-EPMC2893182 | biostudies-literature
S-EPMC1687212 | biostudies-literature
S-EPMC1948021 | biostudies-literature
S-EPMC7297217 | biostudies-literature
S-EPMC1463900 | biostudies-literature