Dataset Information

Finding Protein and Nucleotide Similarities with FASTA.


ABSTRACT: The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons.

SUBMITTER: Pearson WR 

PROVIDER: S-EPMC5072362 | biostudies-literature | 2016 Mar

REPOSITORIES: biostudies-literature

Publications

Finding Protein and Nucleotide Similarities with FASTA.

Pearson WR

Current Protocols in Bioinformatics, 2016 Mar 24


Similar Datasets

| S-EPMC168977 | biostudies-literature
| S-EPMC140518 | biostudies-literature
| S-EPMC6890405 | biostudies-literature
| S-EPMC11362251 | biostudies-literature
| S-EPMC3866555 | biostudies-other
| S-EPMC5598423 | biostudies-literature
| S-EPMC3493650 | biostudies-literature
| S-EPMC4908341 | biostudies-literature
| S-EPMC137493 | biostudies-literature
| S-EPMC2896118 | biostudies-literature