Deep learning model of somatic hypermutation reveals importance of sequence context beyond hotspot targeting.
ABSTRACT: B cells undergo somatic hypermutation (SHM) of the immunoglobulin (Ig) variable region to generate high-affinity antibodies. SHM relies on the activity of activation-induced deaminase (AID), which deaminates C to U, preferentially at WRC (W=A/T, R=A/G) hotspots. Polymerase η contributes further downstream mutations at WA hotspots. Computational models of SHM can describe the probability of the mutations that are essential for vaccine responses. Previous studies using short subsequences (k-mers) failed to explain why the same k-mer can show divergent mutability. We developed the DeepSHM (Deep learning on SHM) model using k-mers of size 5-21, improving accuracy over previous models. Interpretation of DeepSHM identified an extended WWRCT motif with particularly high mutability. Increased mutability was further associated with lower surrounding G content. Our model also discovered a conserved AGYCTGGGGG (Y=C/T) motif within FW1 of IGHV3 family genes with unusually high T>G substitution rates. Thus, a wider sequence context increases predictive power and identifies features that drive mutational targeting.
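The motif notation in the abstract (WRC, WA, WWRCT, AGYCTGGGGG) and the idea of a k-mer window around a site can be illustrated with a minimal Python sketch. This is not the DeepSHM model or its preprocessing; the function names (`find_motif`, `centered_kmer`, `g_content`), the N-padding at sequence ends, and the toy sequence are all assumptions made for illustration only.

```python
import re

# IUPAC codes that appear in the abstract, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",  # W = A/T
    "R": "[AG]",  # R = A/G
    "Y": "[CT]",  # Y = C/T
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'WRC' into a regex string."""
    return "".join(IUPAC[ch] for ch in motif)


def find_motif(seq, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead makes the regex engine report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def centered_kmer(seq, pos, k, pad="N"):
    """Return the k-mer of odd size k centered on position pos.

    Positions near the ends are padded with 'N' (an assumption of this
    sketch, not something stated in the abstract).
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that the window has a center")
    half = k // 2
    padded = pad * half + seq + pad * half
    return padded[pos:pos + k]


def g_content(kmer):
    """Fraction of G bases in a window."""
    return kmer.count("G") / len(kmer)


if __name__ == "__main__":
    toy = "TTAGCTAACGT"  # made-up sequence, not real Ig data
    print("WRC starts:  ", find_motif(toy, "WRC"))
    print("WA starts:   ", find_motif(toy, "WA"))
    print("WWRCT starts:", find_motif(toy, "WWRCT"))
    window = centered_kmer(toy, 4, 5)
    print("5-mer at position 4:", window, "G content:", g_content(window))
```

Widening `k` from 5 toward 21 in `centered_kmer` gives the kind of larger sequence context the abstract refers to; how such windows are encoded and fed to a network is not described here and is left out of the sketch.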
SUBMITTER: Tang C
PROVIDER: S-EPMC8749460 | biostudies-literature
REPOSITORIES: biostudies-literature