Dataset Information

Significance of gapped sequence alignments.


ABSTRACT: Measuring the statistical significance of extreme sequence alignment scores is key to many important applications, but it is difficult. To precisely approximate alignment score significance, we draw random samples directly from a well-chosen importance-sampling probability distribution. We apply our technique to pairwise local sequence alignment of nucleic acid and amino acid sequences of length up to 1000. For instance, using a BLOSUM62 scoring system for local sequence alignment, we compute that the p-value of a score of 6000 for the alignment of two sequences of length 1000 is (3.4 +/- 0.3) x 10^(-1314). Further, we show that the extreme value significance statistic for the local alignment model that we examine does not follow a Gumbel distribution. A web server for this application is available at http://bayesweb.wadsworth.org/alignmentSignificanceV1/.
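
The general idea of importance sampling for a very small p-value can be illustrated with a toy problem. The sketch below is not the authors' alignment method: it assumes a standard normal variable in place of an alignment score and a shifted normal N(t, 1) as the importance-sampling distribution. The function names tail_prob_importance_sampling and exact_tail_prob are hypothetical and chosen only for this illustration. It estimates P(X > t) together with a standard error, in the same "estimate +/- error" form as the p-value quoted in the abstract.

```python
import math
import random


def tail_prob_importance_sampling(t, n_samples=100_000, seed=0):
    """Estimate P(X > t) for X ~ N(0, 1) by importance sampling.

    Toy assumption: the proposal distribution is N(t, 1), so that about
    half of the draws land in the rare region {x > t}.
    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        y = rng.gauss(t, 1.0)  # draw from the proposal N(t, 1)
        if y > t:
            # Likelihood ratio p(y) / q(y) for target N(0, 1), proposal N(t, 1).
            w = math.exp(-t * y + 0.5 * t * t)
        else:
            w = 0.0  # draw is outside the event of interest
        total += w
        total_sq += w * w
    mean = total / n_samples
    var = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(var / n_samples)


def exact_tail_prob(t):
    """Exact upper tail of the standard normal, used here only as a check."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


if __name__ == "__main__":
    estimate, stderr = tail_prob_importance_sampling(6.0)
    print(f"importance sampling: {estimate:.3e} +/- {stderr:.1e}")
    print(f"exact tail:          {exact_tail_prob(6.0):.3e}")
```

Drawing directly from N(0, 1) would almost never produce a value above t = 6, so a naive Monte Carlo estimate with the same number of samples would typically be zero. Sampling where the rare event is common and reweighting each draw by the likelihood ratio is what makes the estimate usable. For probabilities as small as the one in the abstract, the weights would underflow double precision, so a real implementation would presumably need to accumulate them in log space.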

SUBMITTER: Newberg LA 

PROVIDER: S-EPMC2737730 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4102394 | biostudies-literature
| S-EPMC6693267 | biostudies-literature
| S-EPMC1538804 | biostudies-literature
| S-EPMC1948021 | biostudies-literature
| S-EPMC1933219 | biostudies-literature
| S-EPMC3573025 | biostudies-literature
| S-EPMC6994045 | biostudies-literature
| S-EPMC2850363 | biostudies-literature