Dataset Information

DFLpred: High-throughput prediction of disordered flexible linker regions in protein sequences.


ABSTRACT:

Motivation

Disordered flexible linkers (DFLs) are disordered regions that serve as flexible linkers/spacers in multi-domain proteins or between structured constituents in domains. They are different from flexible linkers/residues because they are disordered and longer. Availability of experimentally annotated DFLs provides an opportunity to build high-throughput computational predictors of these regions from protein sequences. To date, there are no computational methods that directly predict DFLs and they can be found only indirectly by filtering predicted flexible residues with predictions of disorder.

Results

We conceptualized, developed and empirically assessed a first-of-its-kind sequence-based predictor of DFLs, DFLpred. This method outputs the propensity to form DFLs for each residue in the input sequence. DFLpred uses a small set of empirically selected features that quantify propensities to form certain secondary structures, disordered regions and structured regions, which are processed by a fast linear model. Our high-throughput predictor can be used on the whole-proteome scale; it needs <1 h to predict an entire proteome on a single CPU. When assessed on an independent test dataset with low sequence-identity proteins, it secures an area under the receiver operating characteristic curve equal to 0.715 and outperforms existing alternatives that include methods for the prediction of flexible linkers, flexible residues, intrinsically disordered residues and various combinations of these methods. Prediction on the complete human proteome reveals that about 10% of proteins have a large content of over 30% DFL residues. We also estimate that about 6000 DFL regions are long, with ≥30 consecutive residues.
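The two proteome-level statistics quoted above (proteins with over 30% DFL residues, and long regions of at least 30 consecutive residues) can be derived from per-residue propensities along the following lines. This is only an illustrative sketch, not the DFLpred implementation: the function names, the list-of-floats input format and the binarization cutoff of 0.5 are assumptions introduced here, and the cutoff actually used by DFLpred is not stated in this record.

```python
from typing import List, Tuple


def dfl_content(propensities: List[float], threshold: float) -> float:
    """Fraction of residues whose propensity is at or above the threshold.

    The threshold is a hypothetical cutoff supplied by the caller.
    """
    if not propensities:
        return 0.0
    flagged = sum(1 for p in propensities if p >= threshold)
    return flagged / len(propensities)


def long_dfl_regions(
    propensities: List[float], threshold: float, min_length: int = 30
) -> List[Tuple[int, int]]:
    """Find runs of at least min_length consecutive flagged residues.

    Returns (start, end) pairs, 0-based with an exclusive end index.
    """
    regions: List[Tuple[int, int]] = []
    run_start = None
    for i, p in enumerate(propensities):
        if p >= threshold:
            if run_start is None:
                run_start = i  # a new run of flagged residues begins
        else:
            if run_start is not None and i - run_start >= min_length:
                regions.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(propensities) - run_start >= min_length:
        regions.append((run_start, len(propensities)))
    return regions


# Toy sequence of 100 residues with one 35-residue high-propensity stretch.
scores = [0.1] * 10 + [0.9] * 35 + [0.1] * 55
content = dfl_content(scores, threshold=0.5)
regions = long_dfl_regions(scores, threshold=0.5)
print(content > 0.30, regions)
```

In this toy case the protein would count toward both statistics: more than 30% of its residues are flagged, and it contains one region of 30 or more consecutive flagged residues.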

Availability and implementation

http://biomine.ece.ualberta.ca/DFLpred/

Contact

lkurgan@vcu.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Meng F 

PROVIDER: S-EPMC4908364 | biostudies-literature | 2016 Jun

REPOSITORIES: biostudies-literature

Publications

DFLpred: High-throughput prediction of disordered flexible linker regions in protein sequences.

Fanchi Meng, Lukasz Kurgan

Bioinformatics (Oxford, England), 2016-06-01, issue 12


Similar Datasets

| S-EPMC1863424 | biostudies-literature
| S-EPMC2671142 | biostudies-literature
| S-EPMC4263755 | biostudies-literature
| S-EPMC5949888 | biostudies-literature
| S-EPMC1933209 | biostudies-literature
| S-EPMC4605291 | biostudies-literature
| S-EPMC2998629 | biostudies-literature
| S-EPMC9825772 | biostudies-literature
| S-EPMC4788683 | biostudies-literature
| S-EPMC4876815 | biostudies-literature