
Dataset Information


Charge and hydrophobicity are key features in sequence-trained machine learning models for predicting the biophysical properties of clinical-stage antibodies.


ABSTRACT: Improved understanding of the properties that mediate protein solubility and resistance to aggregation is important for developing biopharmaceuticals, and more generally in biotechnology and synthetic biology. Recent acquisition of large datasets for antibody biophysical properties enables the search for predictive models. In this report, machine learning methods are used to derive models for 12 biophysical properties. A physicochemical perspective is maintained in analysing the models, leading to the observation that models cluster largely according to charge (cross-interaction measurements) and hydrophobicity (self-interaction methods). These two properties also overlap in some cases, for example in a new interpretation of variation in hydrophobic interaction chromatography. Since the models are developed from differences between antibody variable loops, the next stage is to extend the models to more diverse protein sets. Availability: The web application for the sequence-based algorithms is available on the protein-sol webserver, at https://protein-sol.manchester.ac.uk/abpred, with models and virtualisation software available at https://protein-sol.manchester.ac.uk/software.
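As a rough illustration of the two feature families named in the abstract, the sketch below computes a crude net charge and a mean hydropathy directly from an amino acid sequence. It is not the authors' feature set or model: the Kyte-Doolittle scale, the function names, and the example fragment are all assumptions chosen for illustration, and the charge estimate ignores histidine, termini, and pH.

```python
# Illustrative sequence-derived features (assumed, not taken from the paper).
# Kyte-Doolittle hydropathy scale: one common choice of hydrophobicity index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Crude net charge: +1 per K/R, -1 per D/E (His, termini, pH ignored)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues in the sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def features(seq: str) -> dict:
    """Bundle both descriptors for one sequence (upper-cased first)."""
    seq = seq.upper()
    return {
        "net_charge": net_charge(seq),
        "mean_hydropathy": mean_hydropathy(seq),
    }


if __name__ == "__main__":
    # Hypothetical loop-like fragment, made up for this example.
    print(features("ARDYYGSSYFDY"))
```

Descriptors like these could serve as inputs to a regression or classification model; which features the published models actually weight most heavily should be checked against the paper and the protein-sol software.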

SUBMITTER: Hebditch M 

PROVIDER: S-EPMC6967001 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

Charge and hydrophobicity are key features in sequence-trained machine learning models for predicting the biophysical properties of clinical-stage antibodies.

Hebditch Max, Warwicker Jim

PeerJ, 2019-12-18



Similar Datasets

| S-EPMC3945335 | biostudies-literature
| S-EPMC8801043 | biostudies-literature
| S-EPMC3188480 | biostudies-literature
| S-EPMC7820740 | biostudies-literature
| S-EPMC7181484 | biostudies-literature
| S-EPMC4602141 | biostudies-literature
| S-EPMC5634454 | biostudies-literature