Dataset Information

The InterPro protein families database: the classification resource after 15 years.


ABSTRACT: The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.

SUBMITTER: Mitchell A 

PROVIDER: S-EPMC4383996 | biostudies-literature | 2015 Jan

REPOSITORIES: biostudies-literature

Publications

The InterPro protein families database: the classification resource after 15 years.

Alex Mitchell, Hsin-Yu Chang, Louise Daugherty, Matthew Fraser, Sarah Hunter, Rodrigo Lopez, Craig McAnulla, Conor McMenamin, Gift Nuka, Sebastien Pesseat, Amaia Sangrador-Vegas, Maxim Scheremetjew, Claudia Rato, Siew-Yit Yong, Alex Bateman, Marco Punta, Teresa K Attwood, Christian J A Sigrist, Nicole Redaschi, Catherine Rivoire, Ioannis Xenarios, Daniel Kahn, Dominique Guyot, Peer Bork, Ivica Letunic, Julian Gough, Matt Oates, Daniel Haft, Hongzhan Huang, Darren A Natale, Cathy H Wu, Christine Orengo, Ian Sillitoe, Huaiyu Mi, Paul D Thomas, Robert D Finn

Nucleic Acids Research, 2014-11-26, Database issue


Similar Datasets

| S-EPMC7778928 | biostudies-literature
| S-EPMC29841 | biostudies-literature
| S-EPMC3170169 | biostudies-literature
| S-EPMC6323941 | biostudies-literature
| S-EPMC2448387 | biostudies-other
| S-EPMC308855 | biostudies-literature
| S-EPMC3965110 | biostudies-literature
| S-EPMC2238907 | biostudies-literature
| S-EPMC2808889 | biostudies-literature
| S-EPMC102420 | biostudies-literature