Dataset Information

Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning.


ABSTRACT:

Motivation

Protein contact prediction is important for the study of protein structure and function. Both evolutionary coupling (EC) analysis and supervised machine learning methods have been developed for this task, and they draw on different information sources. However, contact prediction remains challenging, especially for proteins without a large number of sequence homologs.

Results

This article presents a group graphical lasso (GGL) method for contact prediction that integrates joint multi-family EC analysis and supervised learning to improve accuracy on proteins without many sequence homologs. Unlike existing single-family EC analysis, which uses residue coevolution information from the target protein family only, our joint EC analysis uses residue coevolution in both the target family and its related families, which may have divergent sequences but similar folds. To implement this, we model a set of related protein families using Gaussian graphical models and then coestimate their parameters by maximum likelihood, subject to the constraint that these parameters be similar to some degree. Our GGL method can also integrate supervised learning methods to further improve accuracy. Experiments show that our method outperforms existing methods on proteins without thousands of sequence homologs, and that it performs better on both conserved and family-specific contacts.
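
The coestimation idea described above can be illustrated with a minimal sketch of a generic group graphical lasso objective. This is not the authors' implementation: the function name `ggl_objective`, the single penalty weight `lam`, and the assumption that every family shares the same set of aligned variables are all hypothetical simplifications. Each family k contributes a Gaussian negative log-likelihood term, and a group penalty ties together the same off-diagonal entry across families so that the estimated coupling patterns stay similar.

```python
import numpy as np

def ggl_objective(precisions, covariances, lam):
    """Penalized negative log-likelihood for K Gaussian graphical models.

    precisions  : list of K (p x p) precision matrices, one per family
    covariances : list of K (p x p) empirical covariance matrices
    lam         : weight of the group penalty (hypothetical parameter)

    Assumes all families are aligned to the same p variables.
    """
    nll = 0.0
    for omega, s in zip(precisions, covariances):
        sign, logdet = np.linalg.slogdet(omega)
        if sign <= 0:
            # log det is undefined for a non-positive determinant
            return np.inf
        # Gaussian negative log-likelihood, up to constants and scaling
        nll += np.trace(s @ omega) - logdet

    stacked = np.stack(precisions)                  # shape (K, p, p)
    # L2 norm of each (i, j) entry taken across the K families
    group_norms = np.sqrt((stacked ** 2).sum(axis=0))
    off_diag = ~np.eye(stacked.shape[1], dtype=bool)
    # Group penalty: drives an (i, j) coupling to zero in all families together
    return nll + lam * group_norms[off_diag].sum()

# Usage: two families with identical toy precision matrices
omega = np.array([[1.0, 0.5], [0.5, 1.0]])
value = ggl_objective([omega, omega], [np.eye(2), np.eye(2)], lam=1.0)
```

Because the penalty acts on the norm of each entry across families rather than on each family separately, a coupling is encouraged to be either present in all families or absent in all of them, which is one way to express the "similar to some degree" constraint.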

Availability and implementation

See http://raptorx.uchicago.edu/ContactMap/ for a web server implementing the method.

Contact

j3xu@ttic.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Ma J 

PROVIDER: S-EPMC4838177 | biostudies-literature | 2015 Nov

REPOSITORIES: biostudies-literature

Publications

Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning.

Jianzhu Ma, Sheng Wang, Zhiyong Wang, Jinbo Xu

Bioinformatics (Oxford, England), 2015-08-14, issue 21


Similar Datasets

| S-EPMC5820155 | biostudies-literature
| S-EPMC3036623 | biostudies-literature
| S-EPMC4206277 | biostudies-literature
| S-EPMC5871922 | biostudies-literature
| S-EPMC6929375 | biostudies-literature
| S-EPMC7577475 | biostudies-literature
| S-EPMC7161351 | biostudies-literature
| S-EPMC6851476 | biostudies-literature
| S-EPMC7397036 | biostudies-literature
| S-EPMC6401133 | biostudies-literature