Dataset Information

PairsDB atlas of protein sequence space.


ABSTRACT: Sequence similarity/database searching is a cornerstone of molecular biology. PairsDB is a database intended to make exploring protein sequences and their similarity relationships quick and easy. Behind PairsDB is a comprehensive collection of protein sequences and BLAST and PSI-BLAST alignments between them. Instead of running BLAST or PSI-BLAST individually on each request, results are retrieved instantaneously from a database of pre-computed alignments. Filtering options allow you to find a set of sequences satisfying a set of criteria: for example, all human proteins with solved structure and without transmembrane segments. PairsDB is continually updated and covers all sequences in UniProt. The data is stored in a MySQL relational database. Data files will be made available for download at ftp://nic.funet.fi/pub/sci/molbio. PairsDB can also be accessed interactively at http://pairsdb.csc.fi. PairsDB data is a valuable platform for building various downstream automated analysis pipelines. For example, the graph of all-against-all similarity relationships is the starting point for clustering protein families, delineating domains, improving alignment accuracy by consistency measures, and defining orthologous genes. Moreover, query-anchored stacked sequence alignments, profiles and consensus sequences are useful in studies of sequence conservation patterns for clues about possible functional sites.
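
To illustrate how a graph of all-against-all similarity relationships can serve as a starting point for clustering protein families, the following is a minimal sketch, not PairsDB's own method. It assumes the pre-computed alignments have already been exported as (query, hit, E-value) tuples; the identifiers, the E-value cutoff and the function name cluster_by_similarity are hypothetical. It uses plain single-linkage clustering (connected components found with union-find), which is the simplest possible choice and is likely cruder than what a real family-clustering pipeline would use.

```python
from collections import defaultdict


def cluster_by_similarity(pairs, max_evalue=1e-5):
    """Group sequences into connected components of the similarity graph.

    pairs: iterable of (query_id, hit_id, evalue) tuples (hypothetical export
    format; the actual PairsDB schema is not described in the abstract).
    max_evalue: assumed cutoff; an edge is kept only if evalue <= max_evalue.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for query, hit, evalue in pairs:
        root_query, root_hit = find(query), find(hit)
        if evalue <= max_evalue and root_query != root_hit:
            parent[root_hit] = root_query  # merge the two components

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Made-up example data: five sequences, four pairwise hits.
pairs = [
    ("P1", "P2", 1e-30),
    ("P2", "P3", 1e-12),
    ("P4", "P5", 1e-8),
    ("P3", "P4", 0.5),  # too weak to link the two groups at the default cutoff
]
families = cluster_by_similarity(pairs)
print(families)
```

Single linkage tends to chain unrelated multi-domain proteins into one cluster, which is one reason the abstract lists domain delineation as a separate downstream task; the sketch is meant only to show how the similarity graph is traversed.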

SUBMITTER: Heger A 

PROVIDER: S-EPMC2238971 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5738032 | biostudies-literature
2019-08-05 | GSE120789 | GEO
| S-EPMC6070207 | biostudies-literature
| S-EPMC169885 | biostudies-literature
2019-09-20 | GSE120780 | GEO
2019-07-27 | GSE120786 | GEO
| S-EPMC3495008 | biostudies-literature
| S-EPMC7777073 | biostudies-literature
2019-09-20 | GSE128611 | GEO
| S-EPMC4908355 | biostudies-literature