Dataset Information

Identification of Intrinsically Disordered Proteins and Regions by Length-Dependent Predictors Based on Conditional Random Fields.


ABSTRACT: Accurate identification of intrinsically disordered proteins/regions (IDPs/IDRs) is critical for predicting protein structure and function. Previous studies have shown that IDRs of different lengths have different characteristics, and several classification-based predictors have been proposed for predicting different types of IDRs. Compared with these classification-based predictors, the previously proposed predictor IDP-CRF, a sequence labeling model based on conditional random fields (CRFs), exhibits state-of-the-art performance for predicting IDPs/IDRs. Motivated by these methods, we propose a predictor called IDP-FSP, an ensemble of three CRF-based predictors: IDP-FSP-L, IDP-FSP-S, and IDP-FSP-G. These three predictors are designed to predict long, short, and generic disordered regions, respectively, and they are constructed from different features. To the best of our knowledge, IDP-FSP is the first predictor that combines a sequence labeling algorithm with length-dependent modeling of IDRs. Experimental results on two independent test datasets show that IDP-FSP achieves predictive performance better than or at least comparable to that of 26 existing state-of-the-art methods in this field, demonstrating the effectiveness of IDP-FSP.

SUBMITTER: Liu Y 

PROVIDER: S-EPMC6626971 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC6164615 | biostudies-literature
| S-EPMC9250585 | biostudies-literature
| S-EPMC3949125 | biostudies-literature
| S-EPMC6954741 | biostudies-literature
| S-EPMC6699704 | biostudies-literature
| S-EPMC3355724 | biostudies-literature
| S-EPMC5429364 | biostudies-literature
| S-EPMC10585352 | biostudies-literature
| S-EPMC6675126 | biostudies-literature
| S-EPMC3075556 | biostudies-literature