DynaDom: structure-based prediction of T cell receptor inter-domain and T cell receptor-peptide-MHC (class I) association angles.
ABSTRACT: T cell receptor (TCR) molecules are involved in the adaptive immune response as they distinguish between self- and foreign-peptides, presented in major histocompatibility complex molecules (pMHC). Former studies showed that the association angles of the TCR variable domains (Vα/Vβ) can differ significantly and change upon binding to the pMHC complex. These changes can be described as a rotation of the domains around a general Center of Rotation, characterized by the interaction of two highly conserved glutamine residues. We developed a computational method, DynaDom, for the prediction of TCR Vα/Vβ inter-domain and TCR/pMHC orientations in TCRpMHC complexes, which allows predicting the orientation of multiple protein-domains. In addition, we implemented a new approach to predict the correct orientation of the carboxamide endgroups in glutamine and asparagine residues, which can also be used as an external, independent tool. The approach was evaluated for the remodeling of 75 and 53 experimental structures of TCR and TCRpMHC (class I) complexes, respectively. We show that the DynaDom method predicts the correct orientation of the TCR Vα/Vβ angles in 96 and 89% of the cases, for the poses with the best RMSD and best interaction energy, respectively. For the concurrent prediction of the TCR Vα/Vβ and pMHC orientations, the respective rates reached 74 and 72%. Through an exhaustive analysis, we could show that the pMHC placement can be further improved by a straightforward, yet very time intensive extension of the current approach. The results obtained in the present remodeling study prove the suitability of our approach for interdomain-angle optimization. In addition, the high prediction rate obtained specifically for the energetically highest ranked poses further demonstrates that our method is a powerful candidate for blind prediction. Therefore it should be well suited as part of any accurate atomistic modeling pipeline for TCRpMHC complexes and potentially other large molecular assemblies.
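The abstract describes domain reorientation as a rotation about a Center of Rotation and scores poses by RMSD. The following is a minimal sketch of those two geometric operations only, not the DynaDom implementation: the function names `rotate_about_center` and `rmsd`, the choice of rotation axis, and the coordinate representation (tuples of x, y, z) are all assumptions made for illustration.

```python
import math

def rotate_about_center(points, center, axis, angle_deg):
    """Rotate 3D points by angle_deg around an axis passing through center.

    Hypothetical helper using Rodrigues' rotation formula; the axis is
    normalized internally, so any non-zero vector can be passed.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for p in points:
        # Shift so that the center of rotation becomes the origin.
        vx, vy, vz = p[0] - center[0], p[1] - center[1], p[2] - center[2]
        # Cross product k x v and dot product k . v.
        cx = ky * vz - kz * vy
        cy = kz * vx - kx * vz
        cz = kx * vy - ky * vx
        dot = kx * vx + ky * vy + kz * vz
        # Rodrigues: v*cos + (k x v)*sin + k*(k . v)*(1 - cos).
        rx = vx * c + cx * s + kx * dot * (1.0 - c)
        ry = vy * c + cy * s + ky * dot * (1.0 - c)
        rz = vz * c + cz * s + kz * dot * (1.0 - c)
        # Shift back to the original frame.
        rotated.append((rx + center[0], ry + center[1], rz + center[2]))
    return rotated

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists.

    No superposition is performed; both sets are assumed to share a frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (x - y) ** 2
        for p, q in zip(coords_a, coords_b)
        for x, y in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))
```

In a remodeling setting, one would rotate the atoms of one domain through a range of angles and compare each resulting pose against the experimental coordinates with `rmsd`; how DynaDom samples angles and evaluates interaction energy is not specified here.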
Project description:The regulatory and effector functions of T cells are initiated by the binding of their cell-surface T cell receptor (TCR) to peptides presented by major histocompatibility complex (MHC) proteins on other cells. The specificity of TCR:peptide-MHC interactions, thus, underlies nearly all adaptive immune responses. Despite intense interest, generalizable predictive models of TCR:peptide-MHC specificity remain out of reach; two key barriers are the diversity of TCR recognition modes and the paucity of training data. Inspired by recent breakthroughs in protein structure prediction achieved by deep neural networks, we evaluated structural modeling as a potential avenue for prediction of TCR epitope specificity. We show that a specialized version of the neural network predictor AlphaFold can generate models of TCR:peptide-MHC interactions that can be used to discriminate correct from incorrect peptide epitopes with substantial accuracy. Although much work remains to be done for these predictions to have widespread practical utility, we are optimistic that deep learning-based structural modeling represents a path to generalizable prediction of TCR:peptide-MHC interaction specificity.
Project description:Establishment of peptide binding to Major Histocompatibility Complex class I (MHCI) is a crucial step in the development of subunit vaccines and prediction of such binding could greatly reduce costs and accelerate the experimental process of identifying immunogenic peptides. Many methods have been applied to the prediction of peptide-MHCI binding, with some achieving outstanding performance. Because of the experimental methods used to measure binding or affinity between peptides and MHCI molecules, however, available datasets are enriched for nonbinders, and thus highly unbalanced. Although there is no consensus on the ideal class distribution for training sets, extremely unbalanced datasets can be detrimental to the performance of prediction algorithms. We have developed a decision-theoretic framework to construct cost-sensitive trees to predict peptide-MHCI binding and have used them to 1) assess the impact of the training data's class distribution on classifier accuracy, and 2) compare resampling and cost-sensitive methods as approaches to compensate for training data imbalance. Our results confirm that highly unbalanced training sets can reduce the accuracy of classifier predictions and show that, in the peptide-MHCI binding context, resampling methods do not improve the classifier performance. In contrast, cost-sensitive methods significantly improve accuracy of decision trees. Finally, we propose the use of a training scheme that, when the training set is enriched for nonbinders, consistently improves the overall classifier accuracy compared to cost-insensitive classifiers and, in particular, increases the sensitivity of the classifiers. This method minimizes the expected classification cost for large datasets. Our method consistently improves the performance of decision trees in predicting peptide-MHC class I binding by using cost-balancing techniques to compensate for the imbalance in the training dataset.
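The idea of minimizing expected classification cost can be illustrated with a generic decision-theoretic rule for labeling a single tree leaf. This is a sketch of the general principle, not the authors' framework: the function names, the class labels, and the cost parameters `cost_fn` (missing a binder) and `cost_fp` (calling a nonbinder a binder) are hypothetical.

```python
def min_cost_label(n_binders, n_nonbinders, cost_fn, cost_fp):
    """Label a leaf with the class that has the lower expected cost.

    Hypothetical helper: the binder probability is estimated from the
    training counts that reached the leaf.
    """
    total = n_binders + n_nonbinders
    if total == 0:
        raise ValueError("leaf must contain at least one training example")
    p_binder = n_binders / total
    # Predicting "nonbinder" is wrong exactly when the peptide is a binder.
    expected_cost_nonbinder = p_binder * cost_fn
    # Predicting "binder" is wrong exactly when the peptide is a nonbinder.
    expected_cost_binder = (1.0 - p_binder) * cost_fp
    if expected_cost_binder < expected_cost_nonbinder:
        return "binder"
    return "nonbinder"

def binder_threshold(cost_fn, cost_fp):
    """Binder probability above which predicting "binder" is cheaper."""
    return cost_fp / (cost_fp + cost_fn)
```

With equal costs, a leaf holding 2 binders and 8 nonbinders is labeled as nonbinder; raising the cost of a missed binder lowers the threshold, so the same leaf can flip to binder. This is one way a cost-sensitive scheme can raise sensitivity on data enriched for nonbinders.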
Project description:MOTIVATION: The computational modeling of peptide display by class I major histocompatibility complexes (MHCs) is essential for peptide-based therapeutics design. Existing computational methods for peptide display focus on modeling the peptide-MHC binding affinity. However, such models are not able to characterize the sequence features for the other cellular processes in the peptide display pathway that determine MHC ligand selection. RESULTS: We introduce a semi-supervised model, DeepLigand, that outperforms the state-of-the-art models in MHC Class I ligand prediction. DeepLigand combines a peptide language model and peptide binding affinity prediction to score MHC class I peptide presentation. The peptide language model characterizes sequence features that correspond to secondary factors in MHC ligand selection other than binding affinity. The peptide embedding is learned by pre-training on natural ligands, and can discriminate between ligands and non-ligands in the absence of binding affinity prediction. Although conventional affinity-based models fail to classify peptides with moderate affinities, DeepLigand discriminates ligands from non-ligands with consistently high accuracy. AVAILABILITY AND IMPLEMENTATION: We make DeepLigand available at https://github.com/gifford-lab/DeepLigand. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Project description:T cell receptors (TCRs) are immune proteins that specifically bind to antigenic molecules, which are often foreign peptides presented by major histocompatibility complex proteins (pMHCs), playing a key role in the cellular immune response. To advance our understanding and modeling of this dynamic immunological event, we assembled a protein-protein docking benchmark consisting of 20 structures of crystallized TCR/pMHC complexes for which unbound structures exist for both TCR and pMHC. We used our benchmark to compare predictive performance using several flexible and rigid backbone TCR/pMHC docking protocols. Our flexible TCR docking algorithm, TCRFlexDock, improved predictive success over the fixed backbone protocol, leading to near-native predictions for 80% of the TCR/pMHC cases among the top 10 models, and 100% of the cases in the top 30 models. We then applied TCRFlexDock to predict the two distinct docking modes recently described for a single TCR bound to two different antigens, and tested several protein modeling scoring functions for prediction of TCR/pMHC binding affinities. This algorithm and benchmark should enable future efforts to predict and design uncharacterized TCR/pMHC complexes.
Project description:Binding of peptides to MHC class I (MHC-I) molecules is the most selective event in the processing and presentation of Ags to CTL, and insights into the mechanisms that govern peptide-MHC-I binding should facilitate our understanding of CTL biology. Peptide-MHC-I interactions have traditionally been quantified by the strength of the interaction, that is, the binding affinity, yet it has been shown that the stability of the peptide-MHC-I complex is a better correlate of immunogenicity compared with binding affinity. In this study, we have experimentally analyzed peptide-MHC-I complex stability of a large panel of human MHC-I allotypes and generated a body of data sufficient to develop a neural network-based pan-specific predictor of peptide-MHC-I complex stability. Integrating the neural network predictors of peptide-MHC-I complex stability with state-of-the-art predictors of peptide-MHC-I binding is shown to significantly improve the prediction of CTL epitopes. The method is publicly available at http://www.cbs.dtu.dk/services/NetMHCstabpan.
Project description:Purpose of review: The molecular and cellular mechanisms that underlie allorecognition of MHC class II molecules have been the subject of much debate and experimentation in recent decades. In this review, we discuss several aspects of MHC class II structure, peptide acquisition and TcR-MHC-peptide interactions that have particular relevance to recognition of cells bearing allogeneic class II molecules. Recent findings: First, MHC polymorphism is heavily biased toward those amino acids that influence stable peptide binding by MHC class II. Second, the peptide repertoire presented by class II molecules is highly diverse and can be edited substantially by the molecular catalyst HLA-DM and by tissue-specific expression of HLA-DO, stress and cytokines. Third, T-cell receptor docking onto MHC peptide consistently involves substantial contacts with the bound peptide in the MHC class II molecule. Finally, there is increasing evidence that T-cell recognition of MHC is, in part, germline encoded through T-cell-receptor V region contacts with MHC class II alpha helices. Summary: Together, these conclusions support the view that allorecognition of MHC class II molecules is likely to parallel key aspects of conventional CD4 T-cell recognition, with allele-dependent variation in peptide representation accounting in large part for the high precursor frequency of alloreactive CD4 T cells.
Project description:Protein antigens and their specific epitopes are formulation targets for epitope-based vaccines. A number of prediction servers are available for identification of peptides that bind major histocompatibility complex class I (MHC-I) molecules. The lack of standardized methodology and large number of human MHC-I molecules make the selection of appropriate prediction servers difficult. This study reports a comparative evaluation of thirty prediction servers for seven human MHC-I molecules. Of 147 individual predictors, 39 have shown excellent, 47 good, 33 marginal, and 28 poor ability to classify binders from non-binders. The classifiers for HLA-A*0201, A*0301, A*1101, B*0702, B*0801, and B*1501 have excellent, and for A*2402 moderate classification accuracy. Sixteen prediction servers predict peptide binding affinity to MHC-I molecules with high accuracy; correlation coefficients ranging from r = 0.55 (B*0801) to r = 0.87 (A*0201). Non-linear predictors outperform matrix-based predictors. Most predictors can be improved by non-linear transformations of their raw prediction scores. The best predictors of peptide binding are also best in prediction of T-cell epitopes. We propose a new standard for MHC-I binding prediction - a common scale for normalization of prediction scores, applicable to both experimental and predicted data. The results of this study provide assistance to researchers in selection of most adequate prediction tools and selection criteria that suit the needs of their projects.
Project description:T-cell receptors can recognize foreign peptides bound to major histocompatibility complex (MHC) class-I proteins, and thus trigger the adaptive immune response. Therefore, identifying peptides that can bind to MHC class-I molecules plays a vital role in the design of peptide vaccines. Many computational methods, for example, the state-of-the-art allele-specific method MHCflurry, have been developed to predict the binding affinities between peptides and MHC molecules. In this manuscript, we develop two allele-specific Convolutional Neural Network-based methods named ConvM and SpConvM to tackle the binding prediction problem. Specifically, we formulate the problem as optimizing the rankings of peptide-MHC bindings via ranking-based learning objectives. Such optimization is more robust and tolerant to the measurement inaccuracy of binding affinities, and therefore enables more accurate prioritization of binding peptides. In addition, we develop a new position encoding method in ConvM and SpConvM to better identify the most important amino acids for the binding events. We conduct a comprehensive set of experiments using the latest Immune Epitope Database (IEDB) datasets. Our experimental results demonstrate that our models significantly outperform the state-of-the-art methods including MHCflurry with an average percentage improvement of 6.70% on AUC and 17.10% on ROC5 across 128 alleles.
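A ranking-based objective penalizes mis-ordered pairs of peptides rather than errors in the predicted affinity values themselves, which is why it tolerates noisy measurements. The sketch below uses a generic pairwise hinge loss as a stand-in; it is not the objective used by ConvM or SpConvM, and the function name, the `margin` parameter, and the convention that a larger `binding_strength` means a stronger binder are assumptions.

```python
def pairwise_hinge_loss(scores, binding_strength, margin=1.0):
    """Mean hinge loss over all pairs where peptide i binds more strongly than j.

    Hypothetical helper: a pair contributes zero loss once the model scores
    the stronger binder at least `margin` above the weaker one.
    """
    if len(scores) != len(binding_strength):
        raise ValueError("scores and binding_strength must be equally long")
    total_loss = 0.0
    n_pairs = 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if binding_strength[i] > binding_strength[j]:
                # Only the order of the measured values is used, not their size.
                total_loss += max(0.0, margin - (scores[i] - scores[j]))
                n_pairs += 1
    return total_loss / n_pairs if n_pairs else 0.0
```

Because only the relative order of measured values enters the loss, rescaling or moderately perturbing the measurements leaves it unchanged as long as the ordering is preserved.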
Project description:Background: Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training on large amounts of experimental data. However, many machine learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. A deep convolutional neural network (DCNN) is a deep learning method inspired by the visual recognition process of the animal brain and is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions are encoded into image-like array (ILA) data, a DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that a DCNN is able to not only reliably predict peptide-MHC binding, but also sensitively detect locally-clustered interactions. Results: Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. The ConvMHC web server is accessible at http://jumong.kaist.ac.kr:8080/convmhc. Conclusions: We developed a novel method for peptide-HLA-I binding predictions using a DCNN trained on ILA data that encode peptide binding data and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through independent evaluation on the latest IEDB benchmark datasets. Our approaches can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
Project description:Motivation: The binding of a peptide antigen to a Class I major histocompatibility complex (MHC) protein is part of a key process that lets the immune system recognize an infected cell or a cancer cell. This mechanism enabled the development of peptide-based vaccines that can activate the patient's immune response to treat cancers. Hence, the ability to accurately predict peptide-MHC binding is an essential component for prioritizing the best peptides for each patient. However, peptide-MHC binding experimental data for many MHC alleles are still lacking, which limits the accuracy of existing prediction models. Results: In this study, we presented an improved version of MHCSeqNet that utilized sub-word-level peptide features, a 3D structure embedding for MHC alleles, and an expanded training dataset to achieve better generalizability on MHC alleles with small amounts of data. Visualization of MHC allele embeddings confirms that the model was able to group alleles with similar binding specificity, including those with no peptide ligand in the training dataset. Furthermore, an external evaluation suggests that MHCSeqNet2 can improve the prioritization of T cell epitopes for MHC alleles with small amounts of training data. Availability and implementation: The source code and installation instructions for MHCSeqNet2 are available at https://github.com/cmb-chula/MHCSeqNet2.