Deep learning model of somatic hypermutation reveals importance of sequence context beyond targeting of AID and Polη hotspots

2021 ◽  
Author(s):  
Catherine Tang ◽  
Artem Krantsevich ◽  
Thomas MacCarthy

B-cells undergo somatic hypermutation (SHM) of the Immunoglobulin (Ig) variable region to generate high-affinity antibodies. SHM relies on the activity of activation-induced deaminase (AID), which mutates C>U preferentially targeting WRC (W=A/T, R=A/G) hotspots. Downstream mutations at WA Polymerase η hotspots contribute further mutations. Computational models of SHM can describe the probability of mutations essential for vaccine responses. Previous studies using short subsequences (k-mers) failed to explain divergent mutability for the same k-mer. We developed the DeepSHM (Deep learning on SHM) model using k-mers of size 5-21, improving accuracy over previous models. Interpretation of DeepSHM identified an extended DWRCT (D=A/G/T) motif with particularly high mutability. Increased mutability was further associated with lower surrounding G content. Our model also discovered a conserved AGYCTGGGGG (Y=C/T) motif within FW1 of IGHV3 family genes with unusually high T>G substitution rates. Thus, a wider sequence context increases predictive power and identifies novel features that drive mutational targeting.
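The hotspot definitions in this abstract (WRC, WA, and the extended DWRCT motif) can be illustrated with a short motif scan. This is a minimal sketch and not the DeepSHM model itself: the function names `motif_to_regex`, `find_motif`, and `kmer_around` are hypothetical, the example sequence is made up, and the odd window size is an assumption chosen so that the k-mer is centered on one base.

```python
import re

# IUPAC codes as defined in the abstract: W = A/T, R = A/G, D = A/G/T, Y = C/T.
IUPAC = {"W": "[AT]", "R": "[AG]", "D": "[AGT]", "Y": "[CT]"}


def motif_to_regex(motif):
    """Translate a motif such as 'WRC' into a regular expression string."""
    return "".join(IUPAC.get(base, base) for base in motif)


def find_motif(seq, motif):
    """Return 0-based start positions of every match, including overlapping ones."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def kmer_around(seq, pos, k):
    """Return the k-mer centered on pos, or None if the window leaves the sequence.

    Assumption: k is odd, so the window has a single central base.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so the window has a central base")
    half = k // 2
    if pos - half < 0 or pos + half >= len(seq):
        return None
    return seq[pos - half : pos + half + 1]


if __name__ == "__main__":
    seq = "TAGCTAAGCA"  # made-up example sequence, not real Ig data
    for motif in ("WRC", "WA", "DWRCT"):
        print(motif, find_motif(seq, motif))
    # The mutated C of a WRC hotspot sits two bases after the motif start.
    for start in find_motif(seq, "WRC"):
        print(start + 2, kmer_around(seq, start + 2, 5))
```

A scan like this only marks where the motifs occur. According to the abstract, the same short k-mer can show divergent mutability, which is why the model draws on wider windows of up to 21 bases.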

2020 ◽  
Author(s):  
Jiarui Feng ◽  
Amanda Zeng ◽  
Yixin Chen ◽  
Philip Payne ◽  
Fuhai Li

Abstract Uncovering signaling links or cascades among proteins that potentially regulate tumor development and drug response is one of the most critical and challenging tasks in cancer molecular biology. Inhibition of the targets on the core signaling cascades can be effective as novel cancer treatment regimens. However, signaling cascade inference remains an open problem, and there is a lack of effective computational models. The widely used gene co-expression network analysis (which does not capture direct signaling cascades) and shortest-path-based protein-protein interaction (PPI) network analysis (which yields too many interactions and does not consider the sparsity of signaling cascades) were not specifically designed to predict direct and sparse signaling cascades. To resolve these challenges, we proposed a novel deep learning model, deepSignalingLinkNet, to predict signaling cascades by integrating transcriptomics data and copy number data of a large set of cancer samples with the protein-protein interactions (PPIs) via a novel deep graph neural network model. Unlike existing models, the proposed deep learning model was trained using the curated KEGG signaling pathways to identify the informative omics and PPI topology features in a data-driven manner to predict the potential signaling cascades. The validation results indicated the feasibility of signaling cascade prediction using the proposed deep learning model. Moreover, the trained model can potentially predict the signaling cascades among new proteins by transferring the patterns learned on the curated signaling pathways. The code is available at: https://github.com/fuhaililab/deepSignalingPathwayPrediction.


2021 ◽  
Vol 118 (50) ◽  
pp. e2114743118
Author(s):  
Guojun Yu ◽  
Yongwei Zhang ◽  
Varun Gupta ◽  
Jinghang Zhang ◽  
Thomas MacCarthy ◽  
...  

The H3.3 histone variant and its chaperone HIRA are involved in active transcription, but their detailed roles in regulating somatic hypermutation (SHM) of immunoglobulin variable regions in human B cells are not yet fully understood. In this study, we show that the knockout (KO) of HIRA significantly decreased SHM and changed the mutation pattern of the variable region of the immunoglobulin heavy chain (IgH) in the human Ramos B cell line without changing the levels of activation-induced deaminase and other major proteins known to be involved in SHM. Except for H3K79me2/3 and Spt5, many factors related to active transcription, including H3.3, were substantively decreased in HIRA KO cells, and this was accompanied by decreased nascent transcription in the IgH locus. The abundance of ZMYND11 that specifically binds to H3.3K36me3 on the IgH locus was also reduced in the HIRA KO. Somewhat surprisingly, HIRA loss increased the chromatin accessibility of the IgH V region locus. Furthermore, stable expression of ectopic H3.3G34V and H3.3G34R mutants that inhibit both the trimethylation of H3.3K36 and the recruitment of ZMYND11 significantly reduced SHM in Ramos cells, while the H3.3K79M did not. Consistent with the HIRA KO, the H3.3G34V mutant also decreased the occupancy of various elongation factors and of ZMYND11 on the IgH variable and downstream switching regions. Our results reveal an unrecognized role of HIRA and the H3.3K36me3 modification in SHM and extend our knowledge of how transcription-associated chromatin structure and accessibility contribute to SHM in human B cells.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jiarui Feng ◽  
Jennifer Lee ◽  
Zachary A. Vesoulis ◽  
Fuhai Li

Abstract Mortality remains an exceptional burden of extremely preterm birth. Current clinical mortality prediction scores are calculated using a few static variable measurements, such as gestational age, birth weight, temperature, and blood pressure at admission. While these models do provide some insight, numerical and time-series vital sign data are also available for preterm babies admitted to the NICU and may provide greater insight into outcomes. Computational models that predict the mortality risk of preterm birth in the NICU by integrating vital sign data and static clinical variables in real time may be clinically helpful and potentially superior to static prediction models. However, there is a lack of established computational models for this specific task. In this study, we developed a novel deep learning model, DeepPBSMonitor (Deep Preterm Birth Survival Risk Monitor), to predict the mortality risk of preterm infants during initial NICU hospitalization. The proposed deep learning model can effectively integrate time-series vital sign data and fixed variables while resolving the influence of noise and imbalanced data. The proposed model was evaluated and compared with other approaches using data from 285 infants. Results showed that the DeepPBSMonitor model outperforms other approaches, with an accuracy, recall, and AUC score of 0.888, 0.780, and 0.897, respectively. In conclusion, the proposed model has demonstrated efficacy in predicting the real-time mortality risk of preterm infants in initial NICU hospitalization.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bo Pan ◽  
Wei Zheng

Emotion recognition plays an important role in the field of human-computer interaction (HCI). Automatic emotion recognition based on EEG is an important topic in brain-computer interface (BCI) applications. Deep learning is now widely used in EEG emotion recognition and has achieved remarkable results. However, due to the cost of data collection, most EEG datasets contain only a small amount of EEG data, and their sample categories are unbalanced. These problems make it difficult for a deep learning model to predict the emotional state. In this paper, we propose a new sample generation method using generative adversarial networks to solve the problem of EEG sample shortage and sample category imbalance. In experiments, we explore the performance of emotion recognition with the frequency band correlation and frequency band separation computational models before and after data augmentation on standard EEG-based emotion datasets. Our experimental results show that data augmentation with generative adversarial networks can effectively improve the performance of emotion recognition based on the deep learning model. We also find that the frequency band correlation deep learning model is more conducive to emotion recognition.


2021 ◽  
Vol 14 ◽  
Author(s):  
Joshua S. Rule ◽  
Maximilian Riesenhuber

Humans quickly and accurately learn new visual concepts from sparse data, sometimes just a single example. The impressive performance of artificial neural networks which hierarchically pool afferents across scales and positions suggests that the hierarchical organization of the human visual system is critical to its accuracy. These approaches, however, require orders of magnitude more examples than human learners. We used a benchmark deep learning model to show that the hierarchy can also be leveraged to vastly improve the speed of learning. We specifically show how previously learned but broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples; reusing visual representations from earlier in the visual hierarchy, as in prior approaches, requires significantly more examples to perform comparably. These results suggest techniques for learning even more efficiently and provide a biologically plausible way to learn new visual concepts from few examples.


2021 ◽  
Vol 17 (9) ◽  
pp. e1009323
Author(s):  
Guojun Yu ◽  
Yingru Wu ◽  
Zhi Duan ◽  
Catherine Tang ◽  
Haipeng Xing ◽  
...  

The B cells in our body generate protective antibodies by introducing somatic hypermutations (SHM) into the variable region of immunoglobulin genes (IgVs). The mutations are generated by activation-induced deaminase (AID), which converts cytosine to uracil in single-stranded DNA (ssDNA) generated during transcription. Attempts have been made to correlate SHM with ssDNA using bisulfite to chemically convert cytosines that are accessible in the intact chromatin of mutating B cells. These studies have been complicated by the use of different definitions of “bisulfite accessible regions” (BARs). Recently, deep sequencing has provided much larger datasets of such regions, but computational methods are needed to enable this analysis. Here we leveraged the deep-sequencing approach with unique molecular identifiers and developed a novel Hidden Markov Model based Bayesian Segmentation algorithm to characterize the ssDNA regions in the IGHV4-34 gene of the human Ramos B cell line. Combining hierarchical clustering and our new Bayesian model, we identified recurrent BARs in certain subregions of both top and bottom strands of this gene. Using this new system, we found that the average size of BARs is about 15 bp. We also identified potential G-quadruplex DNA structures in this gene and found that the BARs co-locate with G-quadruplex structures in the opposite strand. Various correlation analyses showed no direct site-to-site relationship between the bisulfite accessible ssDNA and all sites of SHM, but most of the highly AID-mutated sites are within 15 bp of a BAR. In summary, we developed a novel platform to study single-stranded DNA in chromatin at base pair resolution that reveals potential relationships among BARs, SHM and G-quadruplexes. This platform could be applied to genome-wide studies in the future.
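The proximity statement in this abstract (mutated sites lying within 15 bp of a BAR) amounts to a nearest-interval distance check. The sketch below is illustrative only and is not the authors' Bayesian segmentation method: the function names and all coordinates are hypothetical, and BARs are assumed to be closed intervals on a single strand.

```python
def distance_to_interval(pos, start, end):
    """Distance in bp from pos to the closed interval [start, end]; 0 if inside."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def fraction_near(sites, intervals, max_dist=15):
    """Fraction of sites whose nearest interval lies within max_dist bp."""
    if not sites or not intervals:
        raise ValueError("sites and intervals must both be non-empty")
    near = sum(
        1
        for site in sites
        if min(distance_to_interval(site, a, b) for a, b in intervals) <= max_dist
    )
    return near / len(sites)


if __name__ == "__main__":
    bars = [(100, 114), (200, 214)]      # hypothetical BAR coordinates
    mutated = [90, 120, 150, 229, 230]   # hypothetical mutated positions
    print(fraction_near(mutated, bars))
```

The 15 bp threshold is taken from the abstract. Everything else, including how sites and BARs would be mapped to a common coordinate system, is left open here.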


2019 ◽  
Vol 47 (14) ◽  
pp. 7418-7429 ◽  
Author(s):  
Sandra Tepper ◽  
Oliver Mortusewicz ◽  
Ewelina Członka ◽  
Amanda Bello ◽  
Angelika Schmidt ◽  
...  

Abstract Affinity maturation of the humoral immune response depends on somatic hypermutation (SHM) of immunoglobulin (Ig) genes, which is initiated by targeted lesion introduction by activation-induced deaminase (AID), followed by error-prone DNA repair. Stringent regulation of this process is essential to prevent genetic instability, but no negative feedback control has been identified to date. Here we show that poly(ADP-ribose) polymerase-1 (PARP-1) is a key factor restricting AID activity during somatic hypermutation. Poly(ADP-ribose) (PAR) chains formed at DNA breaks trigger AID-PAR association, thus preventing excessive DNA damage induction at sites of AID action. Accordingly, AID activity and somatic hypermutation at the Ig variable region is decreased by PARP-1 activity. In addition, PARP-1 regulates DNA lesion processing by affecting strand biased A:T mutagenesis. Our study establishes a novel function of the ancestral genome maintenance factor PARP-1 as a critical local feedback regulator of both AID activity and DNA repair during Ig gene diversification.


2020 ◽  
Author(s):  
Heming Zhang ◽  
Jiarui Feng ◽  
Amanda Zeng ◽  
Philip Payne ◽  
Fuhai Li

Abstract Drug combinations targeting multiple targets/pathways are believed to be able to reduce drug resistance. Computational models are essential for novel drug combination discovery. In this study, we proposed a new simplified deep learning model, DeepSignalingSynergy, for drug combination prediction. Compared with existing models that use a large number of chemical-structure and genomics features in densely connected layers, we built the model on a small set of cancer signaling pathways, which can mimic the integration of multi-omics data and drug target/mechanism in a more biologically meaningful and explainable manner. The evaluation results of the model using the NCI ALMANAC drug combination screening data indicated the feasibility of drug combination prediction using a small set of signaling pathways. Interestingly, the model analysis suggested the importance of heterogeneity of the 46 signaling pathways, which indicates that some new signaling pathways should be targeted to discover novel synergistic drug combinations.

