Establishing Synthesis Pathway-Host Compatibility via Enzyme Solubility

2018 ◽  
Author(s):  
Sara A. Amin ◽  
Venkatesh Endalur Gopinarayanan ◽  
Nikhil U. Nair ◽  
Soha Hassoun

Abstract: Current pathway synthesis tools identify possible pathways that can be added to a host to produce a desired target molecule through the exploration of abstract metabolic and reaction network space. However, few of these tools explore the gene-level information required to physically realize the identified synthesis pathways, and none explore enzyme-host compatibility. Developing tools that address this disconnect between the abstract reaction/metabolic design space and the physical genetic sequence design space will enable expedited experimental efforts that avoid exploring unprofitable synthesis pathways. This work describes a workflow, termed Probabilistic Pathway Assembly with Solubility Scores (ProPASS), which links synthesis pathway construction with the exploration of the physical design space as imposed by the availability of enzymes with characterized activities within the host. Predicted protein solubility propensity scores are used as a confidence level to quantify the compatibility of each pathway enzyme with the host (E. coli). This work also presents a database, termed Protein Solubility Database (ProSol DB), which provides solubility confidence scores in E. coli for 240,016 characterized enzymes obtained from UniProtKB/Swiss-Prot. The utility of ProPASS is demonstrated by generating genetic implementations of heterologous synthesis pathways in E. coli that target several commercially useful biomolecules.

Availability: ProSol DB data and code for ProPASS are available for download from https://github.com/HassounLab/
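As a rough illustration of how per-enzyme solubility scores could act as a pathway-level confidence, the sketch below ranks candidate pathways by the product of their enzymes' scores. The product rule, the function names, and the example scores are all assumptions made for illustration; the abstract does not say how ProPASS combines scores, and this is not the authors' code.

```python
from math import prod


def pathway_confidence(enzyme_scores):
    """Combine per-enzyme solubility scores (each in [0, 1]) into one value.

    Assumption: scores are treated as independent probabilities and
    multiplied. The source abstract does not specify the combination rule.
    """
    if not enzyme_scores:
        raise ValueError("a pathway needs at least one enzyme")
    if any(not 0.0 <= s <= 1.0 for s in enzyme_scores):
        raise ValueError("scores must lie in [0, 1]")
    return prod(enzyme_scores)


def rank_pathways(pathways):
    """Return pathway names sorted from highest to lowest confidence."""
    return sorted(
        pathways,
        key=lambda name: pathway_confidence(pathways[name]),
        reverse=True,
    )


# Hypothetical pathways and scores, invented for this example only.
candidates = {
    "pathway_A": [0.9, 0.8],
    "pathway_B": [0.95, 0.4, 0.9],
}
ranking = rank_pathways(candidates)
```

Under this assumed rule, a single poorly soluble enzyme pulls down the whole pathway, which matches the idea of avoiding pathways that are unlikely to be realized in the host.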

2021 ◽  
Author(s):  
Jun Fan ◽  
Enkhtuya Bayar ◽  
Yuanyuan Ren ◽  
Yafang Hu ◽  
Yinghua Chen ◽  
...  

Abstract: Tobacco etch virus protease (TEVp) is a useful tool for removing fusion tags, but wild-type TEVp has low oxidative stability, which limits its application under the oxidized redox state used to facilitate disulfide bond formation when refolding disulfide-bonded proteins. Previously, we combined six mutations into the TEVp to generate the TEVp5M, which markedly increased protein solubility and decreased auto-cleavage. In this work, we introduced and combined C19S, C110S and C130S mutations into the TEVp5M to generate seven variants, and analyzed the protein solubility and cleavage activity of the constructs in each of three E. coli strains, BL21(DE3), BL21(DE3)pLys, and Rosetta(DE3), and those of the optimized soluble variants in the oxidative cytoplasm of Origami(DE3) under the same induction conditions. The results suggested that desirable protein solubility, cleavage activity and oxidative stability are not found together in a single variant. Unlike the C19S mutation, introduction of C110S and/or C130S had less effect on protein solubility but increased tolerance to the oxidative redox state. Using the TEVp5MC110S/C130S variant, the refolded disulfide-rich bovine enteropeptidase or maize peroxidase was released by cleaving the sequence between the target protein and the cellulose-binding module bound to regenerated amorphous cellulose.


HortScience ◽  
2009 ◽  
Vol 44 (3) ◽  
pp. 866-869 ◽  
Author(s):  
Hyesoon Kim ◽  
Yeh-Jin Ahn

DcHSP17.7, a small heat shock protein from carrot (Daucus carota L.), was expressed in Escherichia coli to examine its functional mechanism under heat stress. When transformed cells expressing DcHSP17.7 were exposed to 50 °C for 1 h, the number of viable cells was ≈4-fold higher than that of the control. The amount of soluble protein was more than twofold higher in transformed cells expressing DcHSP17.7 than in the control, suggesting that DcHSP17.7 may function as a molecular chaperone preventing heat-inducible protein degradation. Native-PAGE followed by immunoblot analysis showed that in transformed E. coli, DcHSP17.7 was present in an oligomeric complex, ≈300 kDa in molecular mass, upon isopropyl β-D-thiogalactopyranoside treatment. However, the complex rapidly disappeared when bacterial cells were exposed to heat stress. In carrot, DcHSP17.7 was found in a similar-sized complex (≈300 kDa), but only during heat stress (40 °C), suggesting that the functional structure of DcHSP17.7 in transformed E. coli may differ from that in carrot.


2019 ◽  
Author(s):  
Xiuyu Ma ◽  
Keegan Korthauer ◽  
Christina Kendziorski ◽  
Michael A. Newton

Abstract: On the problem of scoring genes for evidence of changes in the distribution of single-cell expression, we introduce an empirical Bayesian mixture approach and evaluate its operating characteristics in a range of numerical experiments. The proposed approach leverages cell-subtype structure revealed in cluster analysis in order to boost gene-level information on expression changes. Cell clustering informs gene-level analysis through a specially constructed prior distribution over pairs of multinomial probability vectors; this prior meshes with available model-based tools that score patterns of differential expression over multiple subtypes. We derive an explicit formula for the posterior probability that a gene has the same distribution in two cellular conditions, allowing for a gene-specific mixture over subtypes in each condition. Advantage is gained from the compositional structure of the model, in which a host of gene-specific mixture components are allowed, but in which the mixing proportions are constrained at the whole-cell level. This structure leads to a novel form of information sharing through which the cell-clustering results support gene-level scoring of differential distribution. The result, according to our numerical experiments, is improved sensitivity compared to several standard approaches for detecting distributional expression changes.


2007 ◽  
Vol 402 (3) ◽  
pp. 429-437 ◽  
Author(s):  
Shimin Jiang ◽  
Chunhong Li ◽  
Weiwen Zhang ◽  
Yuanheng Cai ◽  
Yunliu Yang ◽  
...  

One of the greatest bottlenecks in producing recombinant proteins in Escherichia coli is that over-expressed target proteins are mostly present in an insoluble form without any biological activity. DCase (N-carbamoyl-D-amino acid amidohydrolase) is an important enzyme involved in the industrial semi-synthesis of β-lactam antibiotics. In the present study, in order to determine the amino acid sites responsible for the solubility of DCase, error-prone PCR and DNA shuffling techniques were applied to randomly mutate its coding sequence, followed by an efficient screening based on structural complementation. Several mutants of DCase with reduced aggregation were isolated. Solubility tests of these and several other mutants generated by site-directed mutagenesis indicated that three amino acid residues of DCase (Ala18, Tyr30 and Lys34) are involved in its protein solubility. In silico structural modelling analyses further suggest that hydrophilicity and/or negative charge at these three residues may be responsible for the increased solubility of DCase proteins in E. coli. Based on this information, multiple designed mutants were constructed by site-directed mutagenesis. Among them, a triple mutant A18T/Y30N/K34E (named DCase-M3) could be overexpressed in E. coli, and up to 80% of it was soluble. DCase-M3 was purified to homogeneity, and a comparative analysis with wild-type DCase demonstrated that the DCase-M3 enzyme was similar to the native DCase in terms of its kinetic and thermodynamic properties. The present study provides new insights into recombinant protein solubility in E. coli.


2009 ◽  
Vol 388 (2) ◽  
pp. 381-389 ◽  
Author(s):  
Gian Gaetano Tartaglia ◽  
Sebastian Pechmann ◽  
Christopher M. Dobson ◽  
Michele Vendruscolo

2018 ◽  
Author(s):  
Soon Bin Kwon ◽  
Kisun Ryu ◽  
Ahyun Son ◽  
Hotcherl Jeong ◽  
Keo-Heun Lim ◽  
...  

Abstract: Protein-folding assistance and aggregation inhibition by cellular factors are largely understood in the context of molecular chaperones. As an alternative and complementary model, we previously proposed that, in general, soluble cellular macromolecules with large excluded volume and surface charges, including chaperones, exhibit intrinsic chaperone activity that prevents aggregation of their connected polypeptides, irrespective of the connection type, and thus aids productive protein folding. As a proof of concept, we here demonstrate that a model soluble protein with an inactive protease domain robustly exerted chaperone activity toward various proteins harboring a short protease-recognition tag of 7 residues in Escherichia coli. The chaperone activity of this protein was similar or even superior to that of representative E. coli chaperones in vivo. Furthermore, in vitro refolding experiments confirmed the in vivo results. Our findings reveal that a soluble protein exhibits intrinsic chaperone activity, which is manifested upon binding to aggregation-prone proteins. This study gives new insights into the ubiquitous chaperoning role of cellular macromolecules in protein-folding assistance and aggregation inhibition underlying the maintenance of protein solubility and proteostasis in vivo.


2020 ◽  
Author(s):  
Fatemeh Ashari Ghomi ◽  
Tiia Kittilä ◽  
Ditte Hededam Welner

Abstract: UDP-dependent glycosyltransferases (UGTs) are enzymes that glycosylate a wide variety of natural products, thereby modifying their physico-chemical properties, i.e. solubility, stability, reactivity, and function. To successfully leverage the UGTs in biocatalytic processes, we need to be able to screen and characterise them in vitro, which requires efficient heterologous expression in amenable hosts, preferably Escherichia coli. However, many UGTs are insoluble when expressed in E. coli under standard conditions and under attempted optimised conditions, resulting in many unproductive and costly experiments. To overcome this limitation, we have investigated the performance of 11 existing solubility predictors on a dataset of 57 UGTs expressed in E. coli. We show that SoluProt outperforms other methods in terms of both threshold-independent and threshold-dependent measures. Among the benchmarked methods, only SoluProt is significantly better than random predictors using both measures. Moreover, we show that SoluProt uses a threshold for separating soluble and insoluble proteins that is optimal for our dataset. Hence, we conclude that using SoluProt to select UGT sequences for in vitro investigation will significantly increase the success rate of soluble expression, thereby minimising cost and enabling efficient characterisation efforts for biocatalysis research.
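To make the distinction between threshold-independent and threshold-dependent measures concrete, here is a minimal sketch that computes one of each for a solubility predictor: the area under the ROC curve (no cutoff needed) and accuracy at a fixed cutoff. The choice of these two particular measures, the function names, and the toy labels and scores are assumptions for illustration; the abstract does not state which measures the study used.

```python
def roc_auc(labels, scores):
    """Threshold-independent measure: area under the ROC curve.

    Computed as the fraction of (soluble, insoluble) pairs in which the
    soluble protein gets the higher score; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one soluble and one insoluble example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy_at(labels, scores, threshold=0.5):
    """Threshold-dependent measure: accuracy when score >= threshold means soluble."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(predicted, labels)) / len(labels)


# Toy data invented for this example: 1 = soluble, 0 = insoluble.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.55, 0.2, 0.4, 0.3]

auc = roc_auc(labels, scores)           # ranking quality, no cutoff involved
acc = accuracy_at(labels, scores, 0.5)  # depends on the chosen cutoff
```

A predictor can rank proteins well (high AUC) and still classify poorly if its default cutoff suits a different dataset, which is why a benchmark would report both kinds of measure.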


2021 ◽  
Author(s):  
Vineet Thumuluri ◽  
Hannah-Marie Martiny ◽  
Jose J. Almagro Armenteros ◽  
Jesper Salomon ◽  
Henrik Nielsen ◽  
...  

Solubility and expression levels of proteins can be a limiting factor for large-scale studies and industrial production. By determining the solubility and expression directly from the protein sequence, the success rate of wet-lab experiments can be increased. In this study, we focus on predicting the solubility and usability for purification of proteins expressed in Escherichia coli directly from the sequence. Our model NetSolP is based on deep-learning protein language models called transformers and we show that it achieves state-of-the-art performance and improves extrapolation across datasets. As we find current methods are built on biased datasets, we curate existing datasets by using strict sequence-identity partitioning and ensure that there is minimal bias in the sequences.

