Attention please: modeling global and local context in glycan structure-function relationships

2021 ◽  
Author(s):  
Bowen Dai ◽  
Daniel E Mattox ◽  
Chris Bailey-Kellogg

Glycans are found across the tree of life with remarkable structural diversity enabling critical contributions to diverse biological processes, ranging from facilitating host-pathogen interactions to regulating mitosis & DNA damage repair. While functional motifs within glycan structures are largely responsible for mediating interactions, the contexts in which the motifs are presented can drastically impact these interactions and their downstream effects. Here, we demonstrate the first deep learning method to represent both local and global context in the study of glycan structure-function relationships. Our method, glyBERT, encodes glycans with a branched biochemical language and employs an attention-based deep language model to learn biologically relevant glycan representations focused on the most important components within their global structures. Applying glyBERT to a variety of prediction tasks confirms the value of capturing rich context-dependent patterns in this attention-based model: the same monosaccharides and glycan motifs are represented differently in different contexts and thereby enable improved predictive performance relative to the previous state-of-the-art approaches. Furthermore, glyBERT supports generative exploration of context-dependent glycan structure-function space, moving from one glycan to "nearby" glycans so as to maintain or alter predicted functional properties. In a case study application to altering glycan immunogenicity, this generative process reveals the learned contextual determinants of immunogenicity while yielding both known and novel, realistic glycan structures with altered predicted immunogenicity. In summary, modeling the context dependence of glycan motifs is critical for investigating overall glycan functionality and can enable further exploration of glycan structure-function space to inform new hypotheses and synthetic efforts.

Author(s):  
Ryo Nishikimi ◽  
Eita Nakamura ◽  
Masataka Goto ◽  
Kazuyoshi Yoshii

This paper describes an automatic singing transcription (AST) method that estimates a human-readable musical score of a sung melody from an input music signal. Because of the considerable pitch and temporal variation of a singing voice, a naive cascading approach that estimates an F0 contour and quantizes it with estimated tatum times cannot avoid many pitch and rhythm errors. To solve this problem, we formulate a unified generative model of a music signal that consists of a semi-Markov language model representing the generative process of latent musical notes conditioned on musical keys and an acoustic model based on a convolutional recurrent neural network (CRNN) representing the generative process of an observed music signal from the notes. The resulting CRNN-HSMM hybrid model enables us to estimate the most-likely musical notes from a music signal with the Viterbi algorithm, while leveraging both the grammatical knowledge about musical notes and the expressive power of the CRNN. The experimental results showed that the proposed method outperformed the conventional state-of-the-art method and the integration of the musical language model with the acoustic model has a positive effect on the AST performance.
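As a rough illustration of why decoding with a language model can beat frame-by-frame quantization, the following minimal sketch runs Viterbi decoding on a plain two-state HMM. It is not the authors' CRNN-HSMM: the semi-Markov duration modeling and the CRNN acoustic scores are replaced here by a hand-written likelihood table, and every name and number (`viterbi`, `obs_loglik`, the toy probabilities) is a hypothetical chosen for the example.

```python
import math


def viterbi(obs_loglik, log_init, log_trans):
    """Most likely state path for a plain HMM (simplified stand-in for an HSMM).

    obs_loglik[t][s]: log-likelihood of frame t under state s
    log_init[s]:      log initial probability of state s
    log_trans[r][s]:  log probability of moving from state r to state s
    """
    n_states = len(log_init)
    # delta[s]: best log score of any path that ends in state s at the current frame
    delta = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(obs_loglik)):
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda r: delta[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_delta.append(delta[best_prev] + log_trans[best_prev][s] + obs_loglik[t][s])
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def to_log(table):
    return [[math.log(p) for p in row] for row in table]


# Toy setup (assumed values): two "notes" with sticky transitions.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = to_log([[0.9, 0.1], [0.1, 0.9]])
# The middle frame weakly favours state 1, e.g. a brief pitch fluctuation.
obs_loglik = to_log([[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]])

print(viterbi(obs_loglik, log_init, log_trans))  # → [0, 0, 0]
```

A frame-wise argmax over `obs_loglik` would label the middle frame as state 1, whereas the transition prior keeps the decoded path on state 0 throughout; this is the kind of smoothing a note-level language model contributes.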


Author(s):  
Yuta Ojima ◽  
Eita Nakamura ◽  
Katsutoshi Itoyama ◽  
Kazuyoshi Yoshii

This paper describes automatic music transcription with chord estimation for music audio signals. We focus on the fact that concurrent structures of musical notes such as chords form the basis of harmony and are considered for music composition. Since chords and musical notes are deeply linked with each other, we propose joint pitch and chord estimation based on a Bayesian hierarchical model that consists of an acoustic model representing the generative process of a spectrogram and a language model representing the generative process of a piano roll. The acoustic model is formulated as a variant of non-negative matrix factorization that has binary variables indicating a piano roll. The language model is formulated as a hidden Markov model that has chord labels as the latent variables and emits a piano roll. The sequential dependency of a piano roll can be represented in the language model. Both models are integrated through a piano roll in a hierarchical Bayesian manner. All the latent variables and parameters are estimated using Gibbs sampling. The experimental results showed the great potential of the proposed method for unified music transcription and grammar induction.
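The language-model half of this description can be sketched as a small generative process: chord labels follow a Markov chain, and each chord emits one binary piano-roll frame. The sketch below is only an assumption-laden illustration, not the paper's model. Independent per-pitch Bernoulli emissions, the uniform initial chord, and all names (`sample_piano_roll`, `chord_trans`, `pitch_prob`) are hypothetical, and the NMF acoustic model and Gibbs sampling inference are not shown.

```python
import random


def sample_piano_roll(n_frames, chord_trans, pitch_prob, rng):
    """Sample chord labels from a Markov chain, then a binary piano roll.

    chord_trans[c][d]: probability of moving from chord c to chord d
    pitch_prob[c][k]:  probability that pitch k is active under chord c
    """
    n_chords = len(chord_trans)
    chords, roll = [], []
    chord = rng.randrange(n_chords)  # uniform initial chord (an assumption)
    for _ in range(n_frames):
        chords.append(chord)
        # Each pitch is switched on independently given the current chord.
        roll.append([1 if rng.random() < p else 0 for p in pitch_prob[chord]])
        chord = rng.choices(range(n_chords), weights=chord_trans[chord])[0]
    return chords, roll


# Toy parameters (assumed): two chords over four pitches.
chord_trans = [[0.8, 0.2], [0.3, 0.7]]
pitch_prob = [[0.9, 0.1, 0.9, 0.1], [0.1, 0.9, 0.1, 0.9]]

chords, roll = sample_piano_roll(8, chord_trans, pitch_prob, random.Random(0))
for c, frame in zip(chords, roll):
    print(c, frame)
```

Inference in the described method runs in the opposite direction, recovering the chords and the piano roll from an observed spectrogram; the forward sampler only makes the assumed dependency structure concrete.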


2011 ◽  
Vol 39 (suppl) ◽  
pp. W357-W361 ◽  
Author(s):  
M. Fischer ◽  
Q. C. Zhang ◽  
F. Dey ◽  
B. Y. Chen ◽  
B. Honig ◽  
...  

2020 ◽  
Author(s):  
Usman Naseem ◽  
Matloob Khushi ◽  
Vinay Reddy ◽  
Sakthivel Rajendran ◽  
Imran Razzak ◽  
...  

Abstract Background: In recent years, the growing volume of biomedical documents, coupled with advances in natural language processing algorithms, has driven an exponential increase in research on biomedical named entity recognition (BioNER). However, BioNER is challenging because (i) training data are often limited, (ii) an entity can refer to multiple types and concepts depending on its context, and (iii) the text relies heavily on acronyms that are sub-domain specific. Existing BioNER approaches often neglect these issues and directly adopt state-of-the-art (SOTA) models trained on general corpora, which often yields unsatisfactory results. Results: We propose biomedical ALBERT (A Lite Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), BioALBERT, an effective domain-specific language model pre-trained on a huge biomedical corpus and designed to capture biomedical context-dependent NER. We adopted the self-supervised loss function used in ALBERT, which targets modelling inter-sentence coherence, to better learn context-dependent representations, and incorporated parameter reduction strategies to minimise memory usage and speed up training in BioNER. In our experiments, BioALBERT outperformed comparative SOTA BioNER models on eight biomedical NER benchmark datasets with four different entity types. Performance increased for (i) disease-type corpora by 7.47% (NCBI-disease) and 10.63% (BC5CDR-disease); (ii) drug-chem-type corpora by 4.61% (BC5CDR-Chem) and 3.89% (BC4CHEMD); (iii) gene-protein-type corpora by 12.25% (BC2GM) and 6.42% (JNLPBA); and (iv) species-type corpora by 6.19% (LINNAEUS) and 23.71% (Species-800), leading to state-of-the-art results. Conclusions: The performance of the proposed model on four different biomedical entity types shows that our model is robust and generalizable in recognizing biomedical entities in text. We trained four different variants of BioALBERT models, which are available to the research community for use in future research.


2017 ◽  
Author(s):  
Christopher R Madan ◽  
Marcia L Spetch ◽  
Fernanda Machado ◽  
Alice Mason ◽  
Elliot Andrew Ludvig

Both memory and choice are influenced by context: Memory is enhanced when encoding and retrieval contexts match, and choice is swayed by available options. Here, we assessed how context influences risky choice in an experience-based task. Within a single session, we created two separate contexts by presenting blocks of trials in distinct backgrounds. Risky choices were context-dependent; given the same choice, people chose differently depending on other outcomes experienced in that context. Choices reflected an overweighting of the most extreme outcomes within each local context, rather than the global context of all outcomes. When tested in the non-trained context, people chose according to the context at encoding and not retrieval. In subsequent memory tests, people displayed biases specific to distinct contexts: extreme outcomes from each context were more accessible and judged as more frequent. These results pose a challenge for theories of choice that rely on retrieval as guiding choice.


1989 ◽  
Vol 10 (1) ◽  
pp. 1-12 ◽  
Author(s):  
Susan E. Bryson ◽  
Janet F. Werker

ABSTRACT This experiment examined the vowel responses of severely disabled readers and normal control children in reading orthographically regular nonwords. The disabled readers were divided into three groups based on their relative Verbal and Performance IQs. Following the rationale of Fowler, Shankweiler, and Liberman (1979), vowel responses were classified as incorrect or correct. Correctness was determined according to either context-free or context-dependent criteria. The main finding was that the vowel responses of two out of three reading disabled groups paralleled those of their reading level peers. However, disabled readers with higher Performance than Verbal IQs made significantly more context-free responses and significantly fewer context-dependent responses than all other groups. Moreover, knowledge of how speech is segmented at the phonemic level predicted performance on the reading task. The findings suggest that disabled readers employ very local (context-independent) strategies in reading; these findings are discussed in terms of the idea that disabled readers suffer a basic deficit in phonological processing (Liberman, Liberman, & Mattingly, 1980) or linguistic processing (Siegel & Ryan, 1984).


2020 ◽  
Vol 295 (10) ◽  
pp. 3115-3133
Author(s):  
Xiaotian Zhong ◽  
Srinath Jagarlapudi ◽  
Yan Weng ◽  
Mellisa Ly ◽  
Jason C. Rouse ◽  
...  

The fortuitously discovered antiaging membrane protein αKlotho (Klotho) is highly expressed in the kidney, and deletion of the Klotho gene in mice causes a phenotype strikingly similar to that of chronic kidney disease (CKD). Klotho functions as a co-receptor for fibroblast growth factor 23 (FGF23) signaling, whereas its shed extracellular domain, soluble Klotho (sKlotho), carrying glycosidase activity, is a humoral factor that regulates renal health. Low sKlotho in CKD is associated with disease progression, and sKlotho supplementation has emerged as a potential therapeutic strategy for managing CKD. Here, we explored the structure-function relationship and post-translational modifications of sKlotho variants to guide the future design of sKlotho-based therapeutics. Chinese hamster ovary (CHO)- and human embryonic kidney (HEK)-derived WT sKlotho proteins had varied activities in FGF23 co-receptor and β-glucuronidase assays in vitro and distinct properties in vivo. Sialidase treatment of heavily sialylated CHO-sKlotho increased its co-receptor activity 3-fold, yet it remained less active than hyposialylated HEK-sKlotho. MS and glycopeptide-mapping analyses revealed that HEK-sKlotho is uniquely modified with an unusual N-glycan structure consisting of N,N′-di-N-acetyllactose diamine at multiple N-linked sites, one of which at Asn-126 was adjacent to a putative GalNAc transfer motif. Site-directed mutagenesis and structural modeling analyses directly implicated N-glycans in Klotho's protein folding and function. Moreover, the introduction of two catalytic glutamate residues conserved across glycosidases into sKlotho enhanced its glucuronidase activity but decreased its FGF23 co-receptor activity, suggesting that these two functions might be structurally divergent. These findings open up opportunities for rational engineering of pharmacologically enhanced sKlotho therapeutics for managing kidney disease.

