Machine learning reveals sequence-function relationships in family 7 glycoside hydrolases

2020 ◽  
Author(s):  
Japheth E. Gado ◽  
Brent E. Harrison ◽  
Mats Sandgren ◽  
Jerry Ståhlberg ◽  
Gregg T. Beckham ◽  
...  

ABSTRACT Family 7 glycoside hydrolases (GH7) are among the principal enzymes for cellulose degradation in nature and industrially. These important enzymes are often bimodular, comprised of a catalytic domain attached to a carbohydrate binding module (CBM) via a flexible linker, and exhibit a long active site that binds cello-oligomers of up to ten glucosyl moieties. GH7 cellulases consist of two major subtypes: cellobiohydrolases (CBH) and endoglucanases (EG). Despite the critical biological and industrial importance of GH7 enzymes, there remain gaps in our understanding of how GH7 sequence and structure relate to function. Here, we employed machine learning to gain insights into relationships between sequence, structure, and function across the GH7 family. Machine-learning models, using the number of residues in the active-site loops as features, were able to discriminate GH7 CBHs and EGs with up to 99% accuracy. The lengths of the A4, B2, B3, and B4 loops were strongly correlated with functional subtype across the GH7 family. Position-specific classification rules were derived such that specific amino acids at 42 different sequence positions predicted the functional subtype with accuracies greater than 87%. A random forest model trained on residues at 19 positions in the catalytic domain predicted the presence of a CBM with 89.5% accuracy. We propose these positions play vital roles in the functional variation of GH7 cellulases. Taken together, our results complement numerous experimental findings and present functional relationships that can be applied when prospecting GH7 cellulases from nature, for sequence annotation, and to understand or manipulate function.
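As a rough illustration of how a position-specific classification rule of the kind mentioned in this abstract could be scored, the sketch below evaluates rules of the form "residue X at aligned position p implies CBH, otherwise EG" against labelled sequences. The toy alignment, the labels, and the function name `best_rule_per_position` are hypothetical and chosen only for illustration; they are not data or code from the study, and the procedure the authors used to derive their rules may differ.

```python
def best_rule_per_position(aligned_seqs, labels, positive="CBH"):
    """For each alignment column, find the single residue whose presence
    best separates the positive subtype from all other sequences.

    Returns one (position, residue, accuracy) tuple per column.
    """
    n = len(aligned_seqs)
    length = len(aligned_seqs[0])
    rules = []
    for pos in range(length):
        column = [seq[pos] for seq in aligned_seqs]
        best_residue, best_acc = None, -1.0
        # Try every residue observed in this column as the rule's residue.
        for residue in sorted(set(column)):
            # The rule is correct for a sequence when "has this residue"
            # agrees with "belongs to the positive subtype".
            correct = sum(
                (aa == residue) == (label == positive)
                for aa, label in zip(column, labels)
            )
            acc = correct / n
            if acc > best_acc:
                best_residue, best_acc = residue, acc
        rules.append((pos, best_residue, best_acc))
    return rules


# Hypothetical toy alignment (NOT real GH7 sequences) with made-up labels.
TOY_SEQS = ["AWYG", "AWYG", "AWFG", "ADFG", "ADFG", "ADYG"]
TOY_LABELS = ["CBH", "CBH", "CBH", "EG", "EG", "EG"]

if __name__ == "__main__":
    for pos, residue, acc in best_rule_per_position(TOY_SEQS, TOY_LABELS):
        print(f"position {pos}: residue {residue} -> accuracy {acc:.2f}")
```

In this toy case only the second column separates the two labels perfectly; a real analysis would work from a curated multiple sequence alignment and would need held-out data to estimate accuracy honestly.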

2003 ◽  
Vol 371 (3) ◽  
pp. 1027-1043 ◽  
Author(s):  
Deborah HOGG ◽  
Gavin PELL ◽  
Paul DUPREE ◽  
Florence GOUBET ◽  
Susana M. MARTÍN-ORÚE ◽  
...  

β-1,4-Mannanases (mannanases), which hydrolyse mannans and glucomannans, are located in glycoside hydrolase families (GHs) 5 and 26. To investigate whether there are fundamental differences in the molecular architecture and biochemical properties of GH5 and GH26 mannanases, four genes encoding these enzymes were isolated from Cellvibrio japonicus and the encoded glycoside hydrolases were characterized. The four genes, man5A, man5B, man5C and man26B, encode the mannanases Man5A, Man5B, Man5C and Man26B, respectively. Man26B consists of an N-terminal signal peptide linked via an extended serine-rich region to a GH26 catalytic domain. Man5A, Man5B and Man5C contain GH5 catalytic domains and non-catalytic carbohydrate-binding modules (CBMs) belonging to families 2a, 5 and 10; Man5C in addition contains a module defined as X4 of unknown function. The family 10 and 2a CBMs bound to crystalline cellulose and ivory nut crystalline mannan, displaying very similar properties to the corresponding family 10 and 2a CBMs from Cellvibrio cellulases and xylanases. CBM5 bound weakly to these crystalline polysaccharides. The catalytic domains of Man5A, Man5B and Man26B hydrolysed galactomannan and glucomannan, but displayed no activity against crystalline mannan or cellulosic substrates. Although Man5C was less active against glucomannan and galactomannan than the other mannanases, it did attack crystalline ivory nut mannan. All the enzymes exhibited classic endo-activity producing a mixture of oligosaccharides during the initial phase of the reaction, although their mode of action against manno-oligosaccharides and glucomannan indicated differences in the topology of the respective substrate-binding sites. This report points to a different role for GH5 and GH26 mannanases from C. japonicus. We propose that as the GH5 enzymes contain CBMs that bind crystalline polysaccharides, these enzymes are likely to target mannans that are integral to the plant cell wall, while GH26 mannanases, which lack CBMs and rapidly release mannose from polysaccharides and oligosaccharides, target the storage polysaccharide galactomannan and manno-oligosaccharides.


2020 ◽  
Author(s):  
Kaori Matsuyama ◽  
Naomi Kishine ◽  
Zui Fujimoto ◽  
Naoki Sunagawa ◽  
Toshihisa Kotake ◽  
...  

ABSTRACT Arabinogalactan proteins (AGPs) are functional plant proteoglycans, but their functions are largely unexplored, mainly because of the complexity of the sugar moieties, which are generally analyzed with the aid of glycoside hydrolases. In this study, we solved the apo and liganded structures of exo-β-1,3-galactanase from the basidiomycete Phanerochaete chrysosporium (Pc1,3Gal43A), which specifically cleaves AGPs. It is composed of a glycoside hydrolase family 43 subfamily 24 (GH43_sub24) catalytic domain together with a carbohydrate-binding module family (CBM) 35 binding domain. GH43_sub24 lacks the catalytic base Asp that is conserved among other GH43 subfamilies. Crystal structure and kinetic analyses indicated that the tautomerized imidic acid function of Gln263 serves instead as the catalytic base residue. Pc1,3Gal43A has three subsites that continue from the bottom of the catalytic pocket to the solvent. Subsite -1 contains a space that can accommodate the C-6 methylol of Gal, enabling the enzyme to bypass the β-1,6-linked galactan side chains of AGPs. Furthermore, the galactan-binding domain in CBM35 has a different ligand interaction mechanism from other sugar-binding CBM35s. Some of the residues involved in ligand recognition differ from those of galactomannan-binding CBM35, including substitution of Trp for Gly, which affects pyranose stacking, and substitution of Asn for Asp in the lower part of the binding pocket. Pc1,3Gal43A WT and its mutants at residues involved in substrate recognition are expected to be useful tools for structural analysis of AGPs. Our findings should also be helpful in engineering designer enzymes for efficient utilization of various types of biomass.


Author(s):  
Olga V. Moroz ◽  
Elena Blagova ◽  
Andrey A. Lebedev ◽  
Filomeno Sánchez Rodríguez ◽  
Daniel J. Rigden ◽  
...  

β-Galactosidases catalyse the hydrolysis of lactose into galactose and glucose; as an alternative reaction, some β-galactosidases also catalyse the formation of galactooligosaccharides by transglycosylation. Both reactions have industrial importance: lactose hydrolysis is used to produce lactose-free milk, while galactooligosaccharides have been shown to act as prebiotics. For some multi-domain β-galactosidases, the hydrolysis/transglycosylation ratio can be modified by the truncation of carbohydrate-binding modules. Here, an analysis of BbgIII, a multidomain β-galactosidase from Bifidobacterium bifidum, is presented. The X-ray structure has been determined of an intact protein corresponding to a gene construct of eight domains. The use of evolutionary covariance-based predictions made sequence docking in low-resolution areas of the model spectacularly easy, confirming the relevance of this rapidly developing deep-learning-based technique for model building. The structure revealed two alternative orientations of the CBM32 carbohydrate-binding module relative to the GH2 catalytic domain in the six crystallographically independent chains. In one orientation the CBM32 domain covers the entrance to the active site of the enzyme, while in the other orientation the active site is open, suggesting a possible mechanism for switching between the two activities of the enzyme, namely lactose hydrolysis and transgalactosylation. The location of the carbohydrate-binding site of the CBM32 domain on the opposite side of the module to where it comes into contact with the catalytic GH2 domain is consistent with its involvement in adherence to host cells. The role of the CBM32 domain in switching between hydrolysis and transglycosylation modes offers protein-engineering opportunities for selective β-galactosidase modification for industrial purposes in the future.


2020 ◽  
Vol 117 (47) ◽  
pp. 29595-29601
Author(s):  
Łukasz F. Sobala ◽  
Pearl Z. Fernandes ◽  
Zalihe Hakki ◽  
Andrew J. Thompson ◽  
Jonathon D. Howe ◽  
...  

Mammalian protein N-linked glycosylation is critical for glycoprotein folding, quality control, trafficking, recognition, and function. N-linked glycans are synthesized from Glc3Man9GlcNAc2 precursors that are trimmed and modified in the endoplasmic reticulum (ER) and Golgi apparatus by glycoside hydrolases and glycosyltransferases. Endo-α-1,2-mannosidase (MANEA) is the sole endo-acting glycoside hydrolase involved in N-glycan trimming and is located within the Golgi, where it allows ER-escaped glycoproteins to bypass the classical N-glycosylation trimming pathway involving ER glucosidases I and II. There is considerable interest in the use of small molecules that disrupt N-linked glycosylation as therapeutic agents for diseases such as cancer and viral infection. Here we report the structure of the catalytic domain of human MANEA and complexes with substrate-derived inhibitors, which provide insight into dynamic loop movements that occur on substrate binding. We reveal structural features of the human enzyme that explain its substrate preference and the mechanistic basis for catalysis. These structures have inspired the development of new inhibitors that disrupt host protein N-glycan processing of viral glycans and reduce the infectivity of bovine viral diarrhea and dengue viruses in cellular models. These results may contribute to efforts aimed at developing broad-spectrum antiviral agents and help provide a more in-depth understanding of the biology of mammalian glycosylation.


2003 ◽  
Vol 185 (15) ◽  
pp. 4362-4370 ◽  
Author(s):  
Jeffrey D. Palumbo ◽  
Raymond F. Sullivan ◽  
Donald Y. Kobayashi

ABSTRACT Lysobacter enzymogenes strain N4-7 produces multiple biochemically distinct extracellular β-1,3-glucanase activities. The gluA, gluB, and gluC genes, encoding enzymes with β-1,3-glucanase activity, were identified by a reverse-genetics approach following internal amino acid sequence determination of β-1,3-glucanase-active proteins partially purified from culture filtrates of strain N4-7. Analysis of gluA and gluC gene products indicates that they are members of family 16 glycoside hydrolases that have significant sequence identity to each other throughout the catalytic domain but that differ structurally by the presence of a family 6 carbohydrate-binding domain within the gluC product. Analysis of the gluB gene product indicates that it is a member of family 64 glycoside hydrolases. Expression of each gene in Escherichia coli resulted in the production of proteins with β-1,3-glucanase activity. Biochemical analyses of the recombinant enzymes indicate that GluA and GluC exhibit maximal activity at pH 4.5 and 45°C and that GluB is most active between pH 4.5 and 5.0 at 41°C. Activity of recombinant proteins against various β-1,3 glucan substrates indicates that GluA and GluC are most active against linear β-1,3 glucans, while GluB is most active against the insoluble β-1,3 glucan substrate zymosan A. These data suggest that the contribution of β-1,3-glucanases to the biocontrol activity of L. enzymogenes may be due to complementary activities of these enzymes in the hydrolysis of β-1,3 glucans from fungal cell walls.


2008 ◽  
Vol 190 (24) ◽  
pp. 8220-8222 ◽  
Author(s):  
Anat Ezer ◽  
Erez Matalon ◽  
Sadanari Jindou ◽  
Ilya Borovok ◽  
Nof Atamna ◽  
...  

ABSTRACT The rumen bacterium Ruminococcus albus binds to and degrades crystalline cellulosic substrates via a unique cellulose degradation system. A unique family of carbohydrate-binding modules (CBM37), located at the C terminus of different glycoside hydrolases, appears to be responsible both for anchoring these enzymes to the bacterial cell surface and for substrate binding.


2014 ◽  
Vol 70 (2) ◽  
pp. 421-435 ◽  
Author(s):  
Dae Gwin Jeong ◽  
Chun Hua Wei ◽  
Bonsu Ku ◽  
Tae Jin Jeon ◽  
Pham Ngoc Chien ◽  
...  

Dual-specificity protein phosphatases (DUSPs), which dephosphorylate both phosphoserine/threonine and phosphotyrosine, play vital roles in immune activation, brain function and cell-growth signalling. A family-wide structural library of human DUSPs was constructed based on experimental structure determination supplemented with homology modelling. The catalytic domain of each individual DUSP has characteristic features in the active site and in surface-charge distribution, indicating substrate-interaction specificity. The active-site loop-to-strand switch occurs in a subtype-specific manner, indicating that the switch process is necessary for characteristic substrate interactions in the corresponding DUSPs. A comprehensive analysis of the activity–inhibition profile and active-site geometry of DUSPs revealed a novel role of the active-pocket structure in the substrate specificity of DUSPs. A structure-based analysis of redox responses indicated that the additional cysteine residues are important for the protection of enzyme activity. The family-wide structures of DUSPs form a basis for the understanding of phosphorylation-mediated signal transduction and the development of therapeutics.


2017 ◽  
Vol 114 (52) ◽  
pp. 13667-13672 ◽  
Author(s):  
Antonella Amore ◽  
Brandon C. Knott ◽  
Nitin T. Supekar ◽  
Asif Shajahan ◽  
Parastoo Azadi ◽  
...  

In nature, many microbes secrete mixtures of glycoside hydrolases, oxidoreductases, and accessory enzymes to deconstruct polysaccharides and lignin in plants. These enzymes are often decorated with N- and O-glycosylation, the roles of which have been broadly attributed to protection from proteolysis, as the extracellular milieu is an aggressive environment. Glycosylation has been shown to sometimes affect activity, but these effects are not fully understood. Here, we examine N- and O-glycosylation on a model, multimodular glycoside hydrolase family 7 cellobiohydrolase (Cel7A), which exhibits an O-glycosylated carbohydrate-binding module (CBM) and an O-glycosylated linker connected to an N- and O-glycosylated catalytic domain (CD)—a domain architecture common to many biomass-degrading enzymes. We report consensus maps for Cel7A glycosylation that include glycan sites and motifs. Additionally, we examine the roles of glycans on activity, substrate binding, and thermal and proteolytic stability. N-glycan knockouts on the CD demonstrate that N-glycosylation has little impact on cellulose conversion or binding, but does have major stability impacts. O-glycans on the CBM have little impact on binding, proteolysis, or activity in the whole-enzyme context. However, linker O-glycans greatly impact cellulose conversion via their contribution to proteolysis resistance. Molecular simulations predict an additional role for linker O-glycans, namely that they are responsible for maintaining separation between ordered domains when Cel7A is engaged on cellulose, as models predict α-helix formation and decreased cellulose interaction for the nonglycosylated linker. Overall, this study reveals key roles for N- and O-glycosylation that are likely broadly applicable to other plant cell-wall–degrading enzymes.


2003 ◽  
Vol 185 (2) ◽  
pp. 391-398 ◽  
Author(s):  
Rachel Gilad ◽  
Larisa Rabinovich ◽  
Sima Yaron ◽  
Edward A. Bayer ◽  
Raphael Lamed ◽  
...  

ABSTRACT The family 9 cellulase gene celI of Clostridium thermocellum was previously cloned, expressed, and characterized (G. P. Hazlewood, K. Davidson, J. I. Laurie, N. S. Huskisson, and H. J. Gilbert, J. Gen. Microbiol. 139:307-316, 1993). We have recloned and sequenced the entire celI gene and found that the published sequence contained a 53-bp deletion that generated a frameshift mutation, resulting in a truncated and modified C-terminal segment of the protein. The enzymatic properties of the wild-type protein were characterized and found to conform to those of other family 9 glycoside hydrolases with a so-called theme B architecture, where the catalytic module is fused to a family 3c carbohydrate-binding module (CBM3c); CelI also contains a C-terminal CBM3b. The intact recombinant CelI exhibited high levels of activity on all cellulosic substrates tested, with pH and temperature optima of 5.5 and 70°C, respectively, using carboxymethylcellulose as a substrate. Native CelI was capable of solubilizing filter paper, and the distribution of reducing sugar between the soluble and insoluble fractions suggests that the enzyme acts as a processive cellulase. A truncated form of the enzyme, lacking the C-terminal CBM3b, failed to bind to crystalline cellulose and displayed reduced activity toward insoluble substrates. A truncated form of the enzyme, in which both the cellulose-binding CBM3b and the fused CBM3c were removed, failed to exhibit significant levels of activity on any of the substrates examined. This study underscores the general nature of this type of enzymatic theme, whereby the fused CBM3c plays a critical accessory role for the family 9 catalytic domain and changes its character to facilitate processive cleavage of recalcitrant cellulose substrates.

