Dataset Information

An expanded evaluation of protein function prediction methods shows an improvement in accuracy.


ABSTRACT:

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

SUBMITTER: Jiang Y 

PROVIDER: S-EPMC5015320 | biostudies-literature | 2016 Sep

REPOSITORIES: biostudies-literature

Publications

An expanded evaluation of protein function prediction methods shows an improvement in accuracy.

Jiang Yuxiang Y   Oron Tal Ronnen TR   Clark Wyatt T WT   Bankapur Asma R AR   D'Andrea Daniel D   Lepore Rosalba R   Funk Christopher S CS   Kahanda Indika I   Verspoor Karin M KM   Ben-Hur Asa A   Koo Da Chen Emily da CE   Penfold-Brown Duncan D   Shasha Dennis D   Youngs Noah N   Bonneau Richard R   Lin Alexandra A   Sahraeian Sayed M E SM   Martelli Pier Luigi PL   Profiti Giuseppe G   Casadio Rita R   Cao Renzhi R   Zhong Zhaolong Z   Cheng Jianlin J   Altenhoff Adrian A   Skunca Nives N   Dessimoz Christophe C   Dogan Tunca T   Hakala Kai K   Kaewphan Suwisa S   Mehryary Farrokh F   Salakoski Tapio T   Ginter Filip F   Fang Hai H   Smithers Ben B   Oates Matt M   Gough Julian J   Törönen Petri P   Koskinen Patrik P   Holm Liisa L   Chen Ching-Tai CT   Hsu Wen-Lian WL   Bryson Kevin K   Cozzetto Domenico D   Minneci Federico F   Jones David T DT   Chapman Samuel S   Bkc Dukka D   Khan Ishita K IK   Kihara Daisuke D   Ofer Dan D   Rappoport Nadav N   Stern Amos A   Cibrian-Uhalte Elena E   Denny Paul P   Foulger Rebecca E RE   Hieta Reija R   Legge Duncan D   Lovering Ruth C RC   Magrane Michele M   Melidoni Anna N AN   Mutowo-Meullenet Prudence P   Pichler Klemens K   Shypitsyna Aleksandra A   Li Biao B   Zakeri Pooya P   ElShal Sarah S   Tranchevent Léon-Charles LC   Das Sayoni S   Dawson Natalie L NL   Lee David D   Lees Jonathan G JG   Sillitoe Ian I   Bhat Prajwal P   Nepusz Tamás T   Romero Alfonso E AE   Sasidharan Rajkumar R   Yang Haixuan H   Paccanaro Alberto A   Gillis Jesse J   Sedeño-Cortés Adriana E AE   Pavlidis Paul P   Feng Shou S   Cejuela Juan M JM   Goldberg Tatyana T   Hamp Tobias T   Richter Lothar L   Salamov Asaf A   Gabaldon Toni T   Marcet-Houben Marina M   Supek Fran F   Gong Qingtian Q   Ning Wei W   Zhou Yuanpeng Y   Tian Weidong W   Falda Marco M   Fontana Paolo P   Lavezzo Enrico E   Toppo Stefano S   Ferrari Carlo C   Giollo Manuel M   Piovesan Damiano D   Tosatto Silvio C E SC   Del Pozo Angela A   Fernández José M JM   Maietta Paolo P   Valencia Alfonso A   Tress Michael L ML   Benso Alfredo A   Di Carlo Stefano S   Politano Gianfranco G   Savino Alessandro A   Rehman Hafeez Ur HU   Re Matteo M   Mesiti Marco M   Valentini Giorgio G   Bargsten Joachim W JW   van Dijk Aalt D J AD   Gemovic Branislava B   Glisic Sanja S   Perovic Vladmir V   Veljkovic Veljko V   Veljkovic Nevena N   Almeida-E-Silva Danillo C DC   Vencio Ricardo Z N RZ   Sharan Malvika M   Vogel Jörg J   Kansakar Lakesh L   Zhang Shanshan S   Vucetic Slobodan S   Wang Zheng Z   Sternberg Michael J E MJ   Wass Mark N MN   Huntley Rachael P RP   Martin Maria J MJ   O'Donovan Claire C   Robinson Peter N PN   Moreau Yves Y   Tramontano Anna A   Babbitt Patricia C PC   Brenner Steven E SE   Linial Michal M   Orengo Christine A CA   Rost Burkhard B   Greene Casey S CS   Mooney Sean D SD   Friedberg Iddo I   Radivojac Predrag P

Genome biology 20160907 1

Similar Datasets

| S-EPMC4148315 | biostudies-literature
| S-EPMC2375131 | biostudies-literature
| S-EPMC4074043 | biostudies-literature
| S-EPMC3584181 | biostudies-literature
| S-EPMC4570743 | biostudies-literature
| S-EPMC3547882 | biostudies-literature
| S-EPMC3512156 | biostudies-literature
| S-EPMC3287165 | biostudies-literature
| S-EPMC7011869 | biostudies-literature
| S-EPMC9235490 | biostudies-literature