Dataset Information

Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences

ABSTRACT: More than 50 human diseases are characterized by the deposition of specific protein aggregates in the form of insoluble amyloid fibrils. However, only a very small number of proteins are known to form amyloids with high propensity, limiting our ability to understand, predict and engineer amyloid aggregation from sequence. Here we use a massively parallel assay to quantify the amyloid nucleation propensity of >100,000 random 20 amino acid sequences. Approximately 5% of assayed random sequences nucleate the formation of aggregates, generating a very large and diverse training dataset from which to train models to predict amyloid nucleation. We use this dataset to train CANYA, a convolution-attention hybrid neural network that predicts the propensity of any primary sequence to form amyloids. CANYA outperforms previous predictors of protein aggregation on additional random sequences and out-of-sample datasets including human disease-causing amyloids, with very stable performance across diverse prediction tasks. We adapt and extend recent advances in interpretability of genomic neural networks to elucidate CANYA’s decision-making process and learned grammar and to provide mechanistic insights into amyloid formation. Our results demonstrate the power of massive experimental random sequence-space exploration and provide an interpretable and robust neural network model for understanding, predicting and designing amyloid-forming proteins.

ORGANISM(S): Saccharomyces cerevisiae

PROVIDER: GSE268261 | GEO | 2024/07/17

REPOSITORIES: GEO

Similar Datasets

Massively parallel quantification of mutational impact on IAPP amyloid formation

Project description:Amyloid fibrils formed by the islet amyloid polypeptide (IAPP) cause pancreatic beta-cell damage, resulting in reduced insulin secretion and Type 2 diabetes (T2D). Variations in the primary amino acid sequence of IAPP can influence its aggregation rate, and animals expressing IAPP variants that do not form amyloids do not develop T2D. Conversely, specific single amino-acid changes in IAPP are enough to accelerate its aggregation rate. Understanding how mutations impact IAPP aggregation can provide mechanistic insight into the process of pathogenic amyloid formation by this peptide and help preventively identify mutations that may contribute to the risk of developing T2D. Here, we employ deep mutational scanning to measure the ability to nucleate amyloids for 1663 IAPP variants, including substitutions, insertions, truncations and deletions, and identify variants that increase amyloid formation in all mutation classes. Our results point to a continuous stretch of residues (15-32) that is likely structured in IAPP amyloids and that matches the core of the early aggregated species formed by IAPP in vitro. Inside this region, mutations have a more drastic effect in the 21-27 NNFGAIL segment, suggesting tighter structural constraints for this stretch in IAPP amyloids. Finally, by comparing this mutational atlas to that of another amyloid, amyloid beta (Aβ42), the peptide that aggregates in Alzheimer’s disease, we find that the effects of mutations that slow down nucleation correlate between the two amyloids, but that for mutations that accelerate nucleation a single amyloid dataset cannot be used to predict mutational effects in the other.

2025-05-20 | GSE281555 | GEO

Deep mutagenesis reveals the distinct mutational landscape of ADan and ABri amyloid nucleation

Project description:Different forms of dementia are caused by stop-loss mutations in the ITM2B gene, also known as Bri2, which result in the expression of 34 amino acid long peptides that accumulate as amyloids in human brains. To gather mechanistic insights into the formation of amyloids by two of these peptides, ADan and ABri - hallmarks of Danish and British dementia respectively - we employed saturation mutagenesis combined with a massively parallel selection assay that reports on amyloid nucleation. Our results reveal that ADan aggregates into amyloids remarkably faster than both the unextended peptide Bri2 and the extended ABri sequence. The complete mutational landscape of ADan reveals asparagines and charged residues as key players in the nucleation process, in addition to aliphatic residues within positions 20-25. Moreover, we show that extending Bri2 with just two specific residues is enough to generate a novel amyloid core, which we suggest builds the structured core of ADan fibrils. On the other hand, only a handful of mutations can boost the ability of ABri to nucleate amyloids, including an SNV replacing the Bri2 stop codon with a Cys codon. Overall, the remarkably different aggregation profiles and mutational landscapes of the two peptides suggest that different mechanisms underlie disease in Danish and British dementia and highlight the importance of accurately measuring the impact of stop extension mutations for these and other sequences across the genome.

2023-10-09 | GSE244612 | GEO

Amyloids “at the border”: deep mutagenesis and random sequence extension reveal an incomplete amyloid-forming motif in Bri2 that turns amyloidogenic upon C-terminal extension

Project description:Stop-loss mutations cause over twenty different diseases. Stop-loss mutations can have multiple consequences that are, however, hard to predict. Stop-loss in ITM2B/BRI2 results in C-terminal extension of the encoded protein and, upon furin cleavage, in the production of two 34 amino acid long peptides, ADan and ABri, that accumulate as amyloids in the brains of patients affected by familial Danish and British dementia. To systematically explore the consequences of Bri2 C-terminal extension, here we measure amyloid formation for 676 ADan substitutions and identify the region that forms the putative amyloid core of ADan fibrils, located between positions 20 and 26, where stop-loss occurs. Moreover, we measure amyloid formation for ~18,000 random C-terminal extensions of Bri2 and find that ~32% of these sequences can nucleate amyloids. We find that the amino acid composition of these nucleating sequences varies with peptide length and that short extensions of 2 specific amino acids (aliphatics, aromatics and cysteines) are sufficient to generate novel amyloid cores. Overall, our results show that the C-terminus of Bri2 contains an incomplete amyloid motif that can turn amyloidogenic upon extension. C-terminal extension with de novo formation of amyloid motifs may thus be a widespread pathogenic mechanism resulting from stop-loss, highlighting the importance of determining the impact of these mutations for other sequences across the genome.

2024-08-02 | GSE270792 | GEO

Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences

Project description:Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences

| PRJNA1115911 | ENA

Masel2000 - Drugs to stop prion aggregates and other amyloids

Project description:Masel2000 - Drugs to stop prion aggregates and other amyloids. Encoded non-curated model. Issues: missing initial concentration for species y, yb and z; figures not reproducible. This model is described in the article: Designing drugs to stop the formation of prion aggregates and other amyloids. Masel J, Jansen VA. Biophys. Chem. 2000 Dec; 88(1-3): 47-59. Abstract: Amyloid protein aggregates are implicated in many neurodegenerative diseases, including Alzheimer's disease and the prion diseases. Therapeutics to block amyloid formation are often tested in vitro, but it is not clear how to extrapolate from these experiments to a clinical setting, where the effective drug dose may be much lower. Here we address this question using a theoretical kinetic model to calculate the growth rate of protein aggregates as a function of the dose of each of three categories of drug. We find that therapeutics which block the growing ends of amyloids are the most promising, as alternative strategies may be ineffective or even accelerate amyloid formation at low drug concentrations. Our mathematical model can be used to identify and optimise an end-blocking drug in vitro. Our model also suggests an alternative explanation for data previously thought to prove the existence of an entity known as protein X. This model is hosted on BioModels Database and identified by: MODEL1410310000. To cite BioModels Database, please use: BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. To the extent possible under law, all copyright and related or neighbouring rights to this encoded model have been dedicated to the public domain worldwide. Please refer to CC0 Public Domain Dedication for more information.

2015-04-15 | MODEL1410310000 | BioModels

Proteomics analysis of amyloid corneal aggregates from lattice corneal dystrophy patients

Project description:TGFBI-associated corneal dystrophies (CD) are a group of inherited protein folding disorders linked to mutations in the TGFBI gene. The resultant mutant protein (TGFBIp) is deposited as insoluble protein aggregates in various layers of the cornea, leading to corneal opacity and poor vision. Depending on the type of mutation, the deposits may be classified as amyloid fibrillar type, amorphous globular aggregates or a mixed form of both fibrils and amorphous aggregates. However, the molecular mechanism of the mutant-induced amyloidosis is not fully understood. This study aims to characterize truncated peptides enriched in the amyloid aggregates and to identify the protein composition of the corneal aggregates derived from dystrophic patients using LC-MS/MS, and to compare the data with normal control cornea. We have identified several amyloid-associated proteins, non-fibrillar amyloid-associated proteins and TGFBIp as the major components of the corneal deposits. The results suggest that Apolipoprotein A-IV, Apolipoprotein E and Serine protease HtrA1 are significantly enriched in the corneal deposits compared to the normal cornea. Comparative analysis of peptides from corneal deposits of patient and control identified several peptides of TGFBIp which are enriched in patient tissue and may form the core of corneal amyloids. Most of the peptides represent the 4th FAS-1 domain of the protein. Biophysical studies of two such peptides (G515DNRFSMLVAAIQSAGLTETLNR533 and Y571HIGDEILVSGGIGALVR588) demonstrate that they readily form amyloid fibrils under physiological conditions, confirming their intrinsic propensity to form amyloid fibrils. The identification of proteins involved in other protein misfolding disorders, together with the identification of peptides from TGFBIp which form the amyloid core, highlights that the mechanisms of amyloid formation may share common molecular pathways.

2022-02-28 | PXD006640 | Pride

Energetic portrait of the amyloid beta transition state

Project description:In this study we use massively parallel combinatorial mutagenesis, a kinetic selection assay, and machine learning to better understand the nucleation reaction of amyloid beta (Aβ42), the protein that aggregates as a hallmark of Alzheimer’s disease (AD) and is mutated to cause familial AD.

2024-07-26 | GSE269461 | GEO

An atlas of amyloid aggregation: the impact of substitutions, insertions, deletions and truncations on amyloid beta fibril nucleation

Project description:Multiplexed assays of variant effects (MAVEs) guide clinical variant interpretation and reveal disease mechanisms. To date, MAVEs have focussed on a single mutation type - amino acid (AA) substitutions - despite the diversity of coding variants that cause disease. Here we use Deep Indel Mutagenesis (DIM) to generate the first comprehensive atlas of diverse variant effects for a disease protein, amyloid beta (Aβ), that aggregates in Alzheimer’s disease (AD) and is mutated in familial AD (fAD). The atlas identifies known fAD variants and many mutations beyond substitutions that accelerate Aβ aggregation. Truncations, substitutions, insertions, single- and multi-AA deletions differ in their propensity to enhance or impair aggregation, but likely pathogenic variants from all classes are strongly enriched in the polar N-terminus of Aβ. This first comparative atlas for any disease gene highlights the importance of including diverse mutation types in MAVEs and provides important mechanistic insights into amyloid nucleation.

2022-10-27 | GSE193837 | GEO

p53 amyloid induced cellular transformation and tumor formation in the mouse xenograft model

Project description:p53 amyloid formation is predicted to be involved in cancer initiation, but direct evidence of how altered p53 acts as an oncogene is lacking. Cells with p53 amyloids show enhanced survival and apoptotic resistance, with increased proliferation and migration rates. Proteomic profiling of cells containing p53 aggregates suggests that p53 amyloid formation triggers aberrant expression of pro-oncogenes while downregulating tumor-suppressive genes. We propose that wild-type p53 amyloid formation can potentially contribute to the initiation of tumor development.

2022-07-16 | PXD019498 | Pride

Clearance of an amyloid-like translational repressor is governed by 14-3-3 proteins

Project description:Amyloids are fibrous protein aggregates associated with age-related diseases. While these aggregates are typically described as irreversible and pathogenic, some cells utilize reversible amyloid-like structures that serve important functions. The RNA-binding protein Rim4 forms amyloid-like assemblies that are essential for translational control during S. cerevisiae meiosis. Rim4 amyloid-like assemblies are disassembled in a phosphorylation-dependent manner at meiosis II onset. By investigating Rim4 clearance, we elucidate co-factors that mediate clearance of amyloid-like assemblies in a physiological setting. We demonstrate that yeast 14-3-3 proteins bind to Rim4 assemblies and facilitate their subsequent phosphorylation and timely clearance. Furthermore, distinct 14-3-3 proteins play non-redundant roles in facilitating phosphorylation and clearance of amyloid-like Rim4. Additionally, we find that 14-3-3 proteins contribute to global protein aggregate homeostasis. Based on the role of 14-3-3 proteins in aggregate homeostasis and their interactions with disease-associated assemblies, we propose that these proteins may protect against pathological protein aggregates.

2022-04-01 | MSV000089188 | MassIVE
