Dataset Information

Finding Protein and Nucleotide Similarities with FASTA.

ABSTRACT: The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons.

SUBMITTER: Pearson WR

PROVIDER: S-EPMC5072362 | biostudies-literature | 2016 Mar

REPOSITORIES: biostudies-literature

Publications

Finding Protein and Nucleotide Similarities with FASTA.

Pearson WR

Current Protocols in Bioinformatics, 2016-03-24

PMID: 27010337

Similar Datasets

LGA: A method for finding 3D similarities in protein structures.
| S-EPMC168977 | biostudies-literature

Finding weak similarities between proteins by sequence profile comparison.
| S-EPMC140518 | biostudies-literature

MFCompress: a compression tool for FASTA and multi-FASTA data.
| S-EPMC3866555 | biostudies-literature

Idiopathic Pulmonary Fibrosis and Lung Cancer: Finding Similarities within Differences.
| S-EPMC6890405 | biostudies-literature

Finding Similarities in Differences Between Autistic Adults: Two Replicated Subgroups.
| S-EPMC11362251 | biostudies-literature

PDBspheres: a method for finding 3D similarities in local regions in proteins.
| S-EPMC9549786 | biostudies-literature

Harmonization of maternal balanced energy-protein supplementation studies for individual participant data (IPD) meta-analyses - finding and creating similarities in variables and data collection.
| S-EPMC9919738 | biostudies-literature

1:1 FASTA update: Using the power of E-values in FASTA to detect potential allergen cross-reactivity.
| S-EPMC5598423 | biostudies-literature

Similarities between protein folding and granular jamming.
| S-EPMC3493650 | biostudies-literature

Finding correct protein-protein docking models using ProQDock.
| S-EPMC4908341 | biostudies-literature