Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains.
Ontology highlight
ABSTRACT: A variety of protein domain predictors were developed to predict protein domain boundaries in recent years, but most of them cannot predict discontinuous domains. Considering nearly 40% of multidomain proteins contain one or more discontinuous domains, we have developed DomEx to enable domain boundary predictors to detect discontinuous domains by assembling the continuous domain segments. Discontinuous domains are predicted by matching the sequence profile of concatenated continuous domain segments with the profiles from a single-domain library derived from SCOP and CATH, and Pfam. Then the matches are filtered by similarity to library templates, a symmetric index score and a profile-profile alignment score. DomEx recalled 32.3% discontinuous domains with 86.5% precision when tested on 97 non-homologous protein chains containing 58 continuous and 99 discontinuous domains, in which the predicted domain segments are within ±20 residues of the boundary definitions in CATH 3.5. Compared with our recently developed predictor, ThreaDom, which is the state-of-the-art tool to detect discontinuous-domains, DomEx recalled 26.7% discontinuous domains with 72.7% precision in a benchmark with 29 discontinuous-domain chains, where ThreaDom failed to predict any discontinuous domains. Furthermore, combined with ThreaDom, the method ranked number one among 10 predictors. The source code and datasets are available at https://github.com/xuezhidong/DomEx.
SUBMITTER: Xue Z
PROVIDER: S-EPMC4621036 | biostudies-literature | 2015
REPOSITORIES: biostudies-literature
ACCESS DATA