ABSTRACT: Summary
ManyFold is a flexible library for protein structure prediction with deep learning that (i) supports models that take multiple sequence alignments (MSAs), protein language model (pLM) embeddings, or both as inputs, (ii) allows inference with existing models (AlphaFold and OpenFold), (iii) is fully trainable, allowing for both fine-tuning and the training of new models from scratch, and (iv) is written in JAX to support efficient batched operation in distributed settings. A proof-of-concept pLM-based model, pLMFold, is trained from scratch and obtains reasonable results with lower computational overhead than AlphaFold.
Availability and implementation
The source code for ManyFold, the validation dataset and a small sample of training data are available at https://github.com/instadeepai/manyfold.
Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Villegas-Morcillo A
PROVIDER: S-EPMC9825755 | biostudies-literature | 2023 Jan
REPOSITORIES: biostudies-literature
Amelia Villegas-Morcillo, Louis Robinson, Arthur Flajolet, Thomas D Barrett
Bioinformatics (Oxford, England), 2023-01-01, issue 1