Dataset Information

Tuning intrinsic disorder predictors for virus proteins.

ABSTRACT: Many virus-encoded proteins have intrinsically disordered regions that lack a stable, folded three-dimensional structure. These disordered proteins often play important functional roles in virus replication, such as down-regulating host defense mechanisms. With the widespread availability of next-generation sequencing, the number of new virus genomes with predicted open reading frames is rapidly outpacing our capacity for directly characterizing protein structures through crystallography. Hence, computational methods for structural prediction play an important role. A large number of predictors focus on the problem of classifying residues into ordered and disordered regions, and these methods tend to be validated on a diverse training set of proteins from eukaryotes, prokaryotes, and viruses. In this study, we investigate whether some predictors outperform others in the context of virus proteins and compare our findings with data from non-viral proteins. We evaluate the prediction accuracy of 21 methods, many of which are only available as web applications, on a curated set of 126 proteins encoded by viruses. Furthermore, we apply a random forest classifier to these predictor outputs. Based on cross-validation experiments, this ensemble approach confers a substantial improvement in accuracy, e.g., a mean 36 per cent gain in Matthews correlation coefficient. Lastly, we apply the random forest predictor to severe acute respiratory syndrome coronavirus 2 ORF6, an accessory gene that encodes a short (61 AA) and moderately disordered protein that inhibits the host innate immune response. We show that disorder prediction methods perform differently for viral and non-viral proteins, and that an ensemble approach can yield more robust and accurate predictions.
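
The abstract reports accuracy as the Matthews correlation coefficient (MCC) over per-residue ordered/disordered calls. As a minimal sketch of how that metric is computed from a confusion matrix, the Python below scores one prediction against a reference labelling. The function name and the example label vectors are hypothetical illustrations and are not taken from the study's data or code.

```python
from math import sqrt


def matthews_corrcoef(truth, predicted):
    """Return the MCC for two equal-length sequences of 0/1 labels.

    Here 1 marks a disordered residue and 0 an ordered residue.
    """
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # disordered, called disordered
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ordered, called ordered
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ordered, called disordered
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # disordered, called ordered

    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical per-residue labels for an 8-residue peptide (illustration only).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 0, 1, 1]

print(matthews_corrcoef(truth, predicted))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it remains informative when disordered residues are much rarer than ordered ones.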

SUBMITTER: Almog G

PROVIDER: S-EPMC7882063 | biostudies-literature | 2021 Jan

REPOSITORIES: biostudies-literature

Publications

Tuning intrinsic disorder predictors for virus proteins.

Gal Almog, Abayomi S. Olabode, Art F. Y. Poon

Virus Evolution, 2021 Jan 25, issue 1

PMID: 33614158

Similar Datasets

Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. | S-EPMC10782001 | biostudies-literature

Functional correlations of respiratory syncytial virus proteins to intrinsic disorder. | S-EPMC6464112 | biostudies-literature

Intrinsic Disorder in the Host Proteins Entrapped in Rabies Virus Particles. | S-EPMC11209445 | biostudies-literature

Unreported intrinsic disorder in proteins: Disorder emergency room. | S-EPMC5314879 | biostudies-literature

Intrinsic Disorder in Tetratricopeptide Repeat Proteins. | S-EPMC7279152 | biostudies-literature

Enrichment patterns of intrinsic disorder in proteins. | S-EPMC9842814 | biostudies-literature

Intrinsic disorder in measles virus nucleocapsids. | S-EPMC3116414 | biostudies-literature

bHLH-PAS Proteins: Their Structure and Intrinsic Disorder. | S-EPMC6695611 | biostudies-literature

Intrinsic Disorder in Proteins with Pathogenic Repeat Expansions. | S-EPMC6149999 | biostudies-literature

Intrinsic Disorder-Based Design of Stable Globular Proteins. | S-EPMC7022990 | biostudies-literature