Prediction of Michaelis constants from structural features using deep learning

2020 ◽  
Author(s):  
Alexander Kroll ◽  
David Heckmann ◽  
Martin J. Lercher

ABSTRACT The Michaelis constant KM describes the affinity of an enzyme for a specific substrate, and is a central parameter in studies of enzyme kinetics and cellular physiology. As measurements of KM are often difficult and time-consuming, experimental estimates exist for only a minority of enzyme-substrate combinations even in model organisms. Here, we build and train an organism-independent model that successfully predicts KM values for natural enzyme-substrate combinations using machine and deep learning methods. Predictions are based on a task-specific molecular fingerprint of the substrate, generated using a graph neural network, and the domain structure of the enzyme. Model predictions can be used to estimate enzyme efficiencies, to relate metabolite concentrations to cellular physiology, and to fill gaps in the parameterization of kinetic models of cellular metabolism.

PLoS Biology ◽  
2021 ◽  
Vol 19 (10) ◽  
pp. e3001402
Author(s):  
Alexander Kroll ◽  
Martin K. M. Engqvist ◽  
David Heckmann ◽  
Martin J. Lercher

The Michaelis constant KM describes the affinity of an enzyme for a specific substrate and is a central parameter in studies of enzyme kinetics and cellular physiology. As measurements of KM are often difficult and time-consuming, experimental estimates exist for only a minority of enzyme–substrate combinations even in model organisms. Here, we build and train an organism-independent model that successfully predicts KM values for natural enzyme–substrate combinations using machine and deep learning methods. Predictions are based on a task-specific molecular fingerprint of the substrate, generated using a graph neural network, and on a deep numerical representation of the enzyme’s amino acid sequence. We provide genome-scale KM predictions for 47 model organisms, which can be used to approximately relate metabolite concentrations to cellular physiology and to aid in the parameterization of kinetic models of cellular metabolism.
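
To illustrate how a KM value relates a metabolite concentration to enzyme activity, the following is a minimal sketch of the standard Michaelis-Menten rate law, v = Vmax * [S] / (KM + [S]). It is not the authors' prediction model. The function names and the numeric values are hypothetical, and the rate law itself assumes simple single-substrate, irreversible kinetics.

```python
def saturation(substrate_conc: float, km: float) -> float:
    """Return v / Vmax under Michaelis-Menten kinetics.

    substrate_conc and km must be in the same concentration unit (e.g. mM).
    """
    if km <= 0:
        raise ValueError("km must be positive")
    if substrate_conc < 0:
        raise ValueError("substrate_conc must be non-negative")
    return substrate_conc / (km + substrate_conc)


def reaction_rate(vmax: float, substrate_conc: float, km: float) -> float:
    """Return the reaction rate v = Vmax * [S] / (KM + [S])."""
    return vmax * saturation(substrate_conc, km)


if __name__ == "__main__":
    # Hypothetical values: KM = 1.0 mM, Vmax = 10.0 (arbitrary rate units).
    for conc in (0.1, 1.0, 3.0):
        print(conc, reaction_rate(10.0, conc, 1.0))
```

When the substrate concentration equals KM, the enzyme runs at half its maximal rate, which is why a predicted KM can indicate whether an enzyme is close to saturation at a measured metabolite concentration.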


2012 ◽  
Vol 6 ◽  
pp. BBI.S9902 ◽  
Author(s):  
Divya P. Syamaladevi ◽  
Margaret S Sunitha ◽  
S. Kalaimathy ◽  
Chandrashekar C. Reddy ◽  
Mohammed Iftekhar ◽  
...  

Myosins are one of the largest protein superfamilies, with 24 classes. They have conserved structural features and catalytic domains yet show huge variation in different domains, resulting in a variety of functions. Myosins drive various kinds of cellular processes and motility up to the level of whole organisms. They are ATPases that use the chemical energy released by ATP hydrolysis to bring about conformational changes leading to motor function. Myosins are important as they are involved in almost all cellular activities, ranging from cell division to transcriptional regulation. They are crucial due to their involvement in many congenital diseases symptomatized by muscular malfunctions, cardiac diseases, deafness, neural and immunological dysfunction, and so on, many of which lead to death at an early age. We present Myosinome, a database of selected myosin classes (myosin II, V, and VI) from five model organisms. This knowledge base provides the sequences, phylogenetic clustering, and domain architectures of myosins, together with molecular models, structural analyses, and relevant literature of their coiled-coil domains. The current version of Myosinome presents information about 71 myosin sequences belonging to three myosin classes (myosin II, V, and VI) in five model organisms (Homo sapiens, Mus musculus, D. melanogaster, C. elegans and S. cerevisiae), identified using bioinformatics surveys; several of them are yet to be functionally characterized. As these proteins are involved in congenital diseases, such a database would be useful in short-listing candidates for gene therapy and drug development. The database can be accessed from http://caps.ncbs.res.in/myosinome .


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Rahil Taujale ◽  
Zhongliang Zhou ◽  
Wayland Yeung ◽  
Kelley W. Moremen ◽  
Sheng Li ◽  
...  

Abstract Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning model that leverages simple secondary structure representations generated from primary sequences to provide GT fold prediction with high accuracy. The model learns distinguishing secondary structure features free of primary sequence alignment constraints and is highly interpretable. It delineates sequence and structural features characteristic of individual fold types, while classifying them into distinct clusters that group evolutionarily divergent families based on shared secondary structural features. We further extend our model to classify GT families of unknown folds and variants of known folds. By identifying families that are likely to adopt novel folds such as GT91, GT96 and GT97, our studies expand the GT fold landscape and prioritize targets for future structural studies.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Dennis Segebarth ◽  
Matthias Griebel ◽  
Nikolai Stein ◽  
Cora R von Collenberg ◽  
Corinna Martin ◽  
...  

Bioimage analysis of fluorescent labels is widely used in the life sciences. Recent advances in deep learning (DL) allow automating time-consuming manual image analysis processes based on annotated training data. However, manual annotation of fluorescent features with a low signal-to-noise ratio is somewhat subjective. Training DL models on subjective annotations may be unstable or yield biased models. In turn, these models may be unable to reliably detect biological effects. An analysis pipeline integrating data annotation, ground truth estimation, and model training can mitigate this risk. To evaluate this integrated process, we compared different DL-based analysis approaches. With data from two model organisms (mice, zebrafish) and five laboratories, we show that ground truth estimation from multiple human annotators helps to establish objectivity in fluorescent feature annotations. Furthermore, ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible DL-based bioimage analyses.
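
As a rough illustration of what estimating a ground truth from several human annotators can look like, the sketch below merges binary annotation masks by per-pixel majority vote. This is an assumed stand-in chosen for simplicity; the abstract does not state which estimation method the authors used, and the function name and toy masks are hypothetical.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary mask: 1 = feature, 0 = background


def majority_vote(masks: Sequence[Mask]) -> List[List[int]]:
    """Combine binary masks of equal shape by strict per-pixel majority.

    A pixel is labeled 1 only if more than half of the annotators marked it;
    ties (possible with an even number of annotators) resolve to 0.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    n_annotators = len(masks)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    consensus = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            votes = sum(mask[r][c] for mask in masks)
            row.append(1 if 2 * votes > n_annotators else 0)
        consensus.append(row)
    return consensus


if __name__ == "__main__":
    # Three hypothetical annotators labeling the same 2x2 image patch.
    annotator_a = [[1, 0], [1, 1]]
    annotator_b = [[1, 0], [0, 1]]
    annotator_c = [[0, 0], [1, 1]]
    print(majority_vote([annotator_a, annotator_b, annotator_c]))
```

A consensus of this kind reduces the influence of any single annotator's bias on the training labels, which is the general motivation the abstract gives for ground truth estimation.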


mSystems ◽  
2019 ◽  
Vol 4 (4) ◽  
Author(s):  
Benjamin C. Creekmore ◽  
Josh H. Gray ◽  
William G. Walton ◽  
Kristen A. Biernat ◽  
Michael S. Little ◽  
...  

ABSTRACT Gut microbial β-glucuronidase (GUS) enzymes play important roles in drug efficacy and toxicity, intestinal carcinogenesis, and mammalian-microbial symbiosis. Recently, the first catalog of human gut GUS proteins was provided for the Human Microbiome Project stool sample database and revealed 279 unique GUS enzymes organized into six categories based on active-site structural features. Because mice represent a model biomedical research organism, here we provide an analogous catalog of mouse intestinal microbial GUS proteins—a mouse gut GUSome. Using metagenome analysis guided by protein structure, we examined 2.5 million unique proteins from a comprehensive mouse gut metagenome created from several mouse strains, providers, housing conditions, and diets. We identified 444 unique GUS proteins and organized them into six categories based on active-site features, similarly to the human GUSome analysis. GUS enzymes were encoded by the major gut microbial phyla, including Firmicutes (60%) and Bacteroidetes (21%), and there were nearly 20% for which taxonomy could not be assigned. No differences in gut microbial gus gene composition were observed for mice based on sex. However, mice exhibited gus differences based on active-site features associated with provider, location, strain, and diet. Furthermore, diet yielded the largest differences in gus composition. Biochemical analysis of two low-fat-associated GUS enzymes revealed that they are variable with respect to their efficacy of processing both sulfated and nonsulfated heparan nonasaccharides containing terminal glucuronides. IMPORTANCE Mice are commonly employed as model organisms of mammalian disease; as such, our understanding of the compositions of their gut microbiomes is critical to appreciating how the mouse and human gastrointestinal tracts mirror one another. GUS enzymes, with importance in normal physiology and disease, are an attractive set of proteins to use for such analyses. 
Here we show that while the specific GUS enzymes differ at the sequence level, a core GUSome functionality appears conserved between mouse and human gastrointestinal bacteria. Mouse strain, provider, housing location, and diet exhibit distinct GUSomes and gus gene compositions, but sex seems not to affect the GUSome. These data provide a basis for understanding the gut microbial GUS enzymes present in commonly used laboratory mice. Further, they demonstrate the utility of metagenome analysis guided by protein structure to provide specific sets of functionally related proteins from whole-genome metagenome sequencing data.


Author(s):  
Shinji Kawakura ◽  
Ryosuke Shibasaki

In this study, we attempt to develop a deep learning-based self-driving car system to deliver items (e.g., harvested onions, agri-tools, PET bottles) to agricultural (agri-) workers at an agri-workplace. The system is based around a car-shaped robot, JetBot, with an NVIDIA artificial intelligence (AI) oriented board. JetBot can find diverse objects and avoid them. We conducted experimental trials at a real warehouse where various items (glove, boot, sickle (falx), scissors, and hoe), called obstacles, were scattered. The assumed agri-worker was a man suspending dried onions on a beam. Specifically, we developed a system focusing on the function of precisely detecting obstacles with deep learning-based techniques (techs), self-avoidance, and automatic delivery of small items for manual agri-workers and managers. Both the car-shaped figure and the deep learning-based obstacle-avoidance function differ from existing mobile agri-machine techs and products with respect to their main aims and structural features. Their advantages are their low costs in comparison with similar mechanical systems found in the literature and similar commercial goods. The robot is extremely agile and easily identifies and learns obstacles. Additionally, the JetBot kit is a minimal product and includes a feature allowing users to arbitrarily expand and change functions and mechanical settings. This study consists of six phases: (1) designing and confirming the validity of the entire system, (2) constructing and tuning various minor system settings (e.g., programs and JetBot specifications), (3) accumulating obstacle picture data, (4) executing deep learning, (5) conducting experiments in an indoor warehouse to simulate a real agri-working situation, and (6) assessing and discussing the trial data quantitatively (presenting the success and error rates of the trials) and qualitatively.
We consider that, from the limited trials, the system can be judged as valid to some extent in certain situations. However, we were unable to perform broader or more generalizable experiments (e.g., operation on muddy farmland or running JetBot on uneven floors). We present experimental ranges for the success ratio of these trials, particularly noting crashed obstacle types and other error types. We were also able to observe features of the system’s practical operations. The novel achievements of this study lie in the fusion of recent deep learning-based agricultural informatics. In the future, agri-workers and their managers could use the proposed system in real agri-places as a common automatic delivery system. Furthermore, we believe that, by combining this application with other existing systems, future agri-fields and other workplaces could become more comfortable and secure (e.g., delivering water bottles could help avoid heat (stress) disorders).


2021 ◽  
Author(s):  
Rahil Taujale ◽  
Zhongliang Zhou ◽  
Wayland Yeung ◽  
Kelley W Moremen ◽  
Sheng Li ◽  
...  

Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning model that leverages simple secondary structure representations generated from primary sequences to provide GT fold prediction with high accuracy. The model learned distinguishing features free of primary sequence alignment constraints and, unlike other models, is highly interpretable and helped identify common secondary structural features shared by divergent families. The model delineated sequence and structural features characteristic of individual fold types, while classifying them into distinct clusters that group evolutionarily divergent families based on shared secondary structural features. We further extend our model to classify GT families of unknown folds and variants of known folds. By identifying families that are likely to adopt novel folds such as GT91, GT96 and GT97, our studies identify targets for future structural studies and expand the GT fold landscape.


2019 ◽  
Author(s):  
Bryce K Allen ◽  
Nagi G Ayad ◽  
Stephan C Schürer

Deep learning is a machine learning technique that attempts to model high-level abstractions in data by utilizing a graph composed of multiple processing layers that apply various linear and non-linear transformations. This technique has been shown to perform well for applications in drug discovery, utilizing structural features of small molecules to predict activity. However, the application of deep learning to discriminating features of kinase inhibitors has not been well explored. Small molecule kinase inhibitors are an important class of anti-cancer agents and have demonstrated impressive clinical efficacy in several different diseases. However, resistance, mediated by adaptive kinome reprogramming or subpopulation diversity, is often observed. Therefore, polypharmacology and combination therapies offer potential therapeutic strategies for patients with resistant disease. Their development would benefit from more comprehensive and dense knowledge of small-molecule inhibition across the human kinome. Because such data is not publicly available, we evaluated multiple machine learning methods to predict small molecule inhibition of 342 kinases using over 650K aggregated bioactivity annotations for over 300K small molecules curated from ChEMBL and the Kinase Knowledge Base (KKB). Our results demonstrated that multi-task deep neural networks outperform classical single-task methods, offering potential towards predicting activity profiles and filling gaps in the available data.

