Dataset Information

An algorithm for progressive multiple alignment of sequences with insertions.


ABSTRACT: Dynamic programming algorithms are guaranteed to find the optimal alignment between two sequences. For more than a few sequences, exact algorithms become computationally impractical, and progressive algorithms that iterate pairwise alignments are widely used. These heuristic methods have a serious drawback: pairwise algorithms do not differentiate insertions from deletions and so end up penalizing single insertion events multiple times. Such an unrealistically high penalty for insertions typically results in overmatching of sequences and an underestimation of the number of insertion events. We describe a modification of the traditional alignment algorithm that can distinguish insertion from deletion and avoid repeated penalization of insertions, and we illustrate this method with a pair hidden Markov model that uses an evolutionary scoring function. In comparison with a traditional progressive alignment method, our algorithm infers a greater number of insertion events and creates gaps that are phylogenetically consistent but spatially less concentrated. Our results suggest that some insertion/deletion "hot spots" may actually be artifacts of traditional alignment algorithms.
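As a rough illustration of the pairwise step the abstract refers to, the sketch below computes an optimal global alignment score by dynamic programming (Needleman-Wunsch style) with a linear gap penalty. It is not the authors' pair hidden Markov model or their evolutionary scoring function; the function name `nw_score` and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen only for this example.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b under a linear gap penalty.

    Illustrative scoring values only; not the paper's pair HMM.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] opposite a gap in b
                curr[j - 1] + gap,  # b[j-1] opposite a gap in a
            ))
        prev = curr
    return prev[-1]
```

Note that a gap costs the same whichever sequence it falls in, so `nw_score(a, b)` equals `nw_score(b, a)`: the pairwise score alone cannot say whether a gap reflects an insertion in one sequence or a deletion in the other. When such alignments are iterated along a guide tree, a single insertion can therefore be charged again at each later step, which is the drawback the abstract describes.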

SUBMITTER: Löytynoja A

PROVIDER: S-EPMC1180752 | biostudies-literature | 2005 Jul

REPOSITORIES: biostudies-literature

Publications

An algorithm for progressive multiple alignment of sequences with insertions.

Löytynoja A, Goldman N

Proceedings of the National Academy of Sciences of the United States of America, 2005-07-06, issue 30

Similar Datasets

S-EPMC1904245 | biostudies-literature
S-EPMC4599319 | biostudies-literature
S-EPMC6151001 | biostudies-literature
S-EPMC2478692 | biostudies-literature
S-EPMC7859483 | biostudies-literature
S-EPMC546147 | biostudies-literature
S-EPMC2366961 | biostudies-literature
S-EPMC5042171 | biostudies-literature
S-EPMC2722656 | biostudies-literature
S-EPMC4179140 | biostudies-literature