Dataset Information

Machine learning-guided discovery and design of non-hemolytic peptides.

ABSTRACT: Reducing hurdles to clinical trials without compromising the therapeutic promise of peptide candidates is an essential step in peptide-based drug design. Machine-learning models are cost-effective and time-saving tools for predicting biological activities from primary sequences. Their limitations lie in the diversity of peptide sequences and biological information represented within these models. Additional outlier detection methods are needed to set the boundaries for reliable predictions, that is, the applicability domain. Antimicrobial peptides (AMPs) constitute an extensive library of peptides offering promising avenues against antibiotic-resistant infections. Most AMPs in clinical trials are administered topically because of their hemolytic toxicity. Here we developed machine learning models and outlier detection methods that ensure robust predictions for the discovery of AMPs and the design of novel peptides with reduced hemolytic activity. Our best models, gradient boosting classifiers, predicted the hemolytic nature of any peptide sequence with 95-97% accuracy. Nearly 70% of AMPs were predicted to be hemolytic. Applying multivariate outlier detection models, we found that 273 AMPs (~9%) could not be predicted reliably. Our combined approach led to the discovery of 34 high-confidence non-hemolytic natural AMPs, the de novo design of 507 non-hemolytic peptides, and guidelines for non-hemolytic peptide design.
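
The applicability-domain idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract mentions multivariate outlier detection but does not name the descriptors or detectors used, so this example swaps in a much simpler per-descriptor z-score bound. The three descriptors (length, mean Kyte-Doolittle hydropathy, and a crude net charge that counts only K/R and D/E), the function names, the threshold `z_max`, and the toy training sequences are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Toy, AMP-like training sequences (illustrative only, not the study's data).
TRAIN = [
    "GIGKFLKKAKKFGKAFVKILKK",
    "KWKLFKKIEKVGQNIRDGIIKA",
    "GLFDIIKKIAESF",
    "FLPLIGRVLSGIL",
    "ILPWKWPWWPWRR",
    "KKLLKKLLKKLLKK",
]


def descriptors(seq):
    """Return (length, mean hydropathy, crude net charge) for a peptide.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = seq.upper()
    hydropathy = mean(KD[aa] for aa in seq)
    # Crude charge: +1 per K/R, -1 per D/E; His and termini are ignored.
    charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")
    return (float(len(seq)), hydropathy, float(charge))


def fit_domain(train_seqs):
    """Store the mean and population std of each descriptor over the training set."""
    columns = zip(*(descriptors(s) for s in train_seqs))
    return [(mean(col), pstdev(col)) for col in columns]


def in_domain(seq, domain, z_max=3.0):
    """True if every descriptor lies within z_max std of the training mean."""
    for value, (mu, sd) in zip(descriptors(seq), domain):
        if sd == 0:
            # Degenerate descriptor: only an exact match counts as in-domain.
            if value != mu:
                return False
        elif abs(value - mu) / sd > z_max:
            return False
    return True


if __name__ == "__main__":
    domain = fit_domain(TRAIN)
    for query in ["GIGKFLKKAKKFGKAFVKILKK", "D" * 60]:
        status = "in domain" if in_domain(query, domain) else "outside domain"
        print(query[:12], status)
```

In a workflow like the one the abstract describes, a classifier's prediction would be reported only for sequences that pass such a check; sequences outside the domain would be flagged as not reliably predictable.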

SUBMITTER: Plisson F

PROVIDER: S-EPMC7538962 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

Machine learning-guided discovery and design of non-hemolytic peptides.

Plisson F, Ramírez-Sánchez O, Martínez-Hernández C

Scientific Reports, 2020 Oct 6, issue 1

PMID: 33024236

Similar Datasets

Machine Learning Guided Discovery of Non-Hemolytic Membrane Disruptive Anticancer Peptides. | S-EPMC9541320 | biostudies-literature

Machine learning designs non-hemolytic antimicrobial peptides. | S-EPMC8285431 | biostudies-literature

AMPGAN v2: Machine Learning-Guided Design of Antimicrobial Peptides. | S-EPMC8281497 | biostudies-literature

Machine learning assisted design of highly active peptides for drug discovery. | S-EPMC4388847 | biostudies-literature

Machine learning guided aptamer refinement and discovery. | S-EPMC8062585 | biostudies-literature

Machine-Learning-Guided Discovery of Electrochemical Reactions. | S-EPMC9756344 | biostudies-literature

Design, synthesis, and biological evaluation of stable β6.3-Helices: Discovery of non-hemolytic antibacterial peptides. | S-EPMC8366898 | biostudies-literature

Machine-learning guided discovery of a new thermoelectric material. | S-EPMC6391459 | biostudies-literature

Aerodynamics-guided machine learning for design optimization of electric vehicles. | S-EPMC11579422 | biostudies-literature

Machine-Learning Guided Discovery of Bioactive Inhibitors of PD1-PDL1 Interaction. | S-EPMC9145945 | biostudies-literature