Improving folding properties of computationally designed proteins

2019 ◽  
Vol 32 (3) ◽  
pp. 145-151
Author(s):  
Benjamin Bjerre ◽  
Jakob Nissen ◽  
Mikkel Madsen ◽  
Jūratė Fahrig-Kamarauskaitė ◽  
Rasmus K Norrild ◽  
...  

Abstract While the field of computational protein design has witnessed amazing progression in recent years, folding properties still constitute a significant barrier towards designing new and larger proteins. In order to assess and improve folding properties of designed proteins, we have developed a genetics-based folding assay and selection system based on the essential enzyme, orotate phosphoribosyl transferase from Escherichia coli. This system allows for both screening of candidate designs with good folding properties and genetic selection of improved designs. Thus, we identified single amino acid substitutions in two failed designs that rescued poorly folding and unstable proteins. Furthermore, when these substitutions were transferred into a well-structured design featuring a complex folding profile, the resulting protein exhibited native-like cooperative folding with significantly improved stability. In protein design, a single amino acid can make the difference between folding and misfolding, and this approach provides a useful new platform to identify and improve candidate designs.

2001 ◽  
Vol 281 (4) ◽  
pp. G1034-G1043 ◽  
Author(s):  
Kousei Ito ◽  
Hiroshi Suzuki ◽  
Yuichi Sugiyama

Multidrug resistance-associated protein 3 (MRP3), unlike other MRPs, transports taurocholate (TC). The difference in TC transport activity between rat MRP2 and MRP3 was studied, focusing on the cationic amino acids in the transmembrane domains. For analysis, transport into membrane vesicles from Sf9 cells expressing wild-type and mutated MRP2 was examined. Substitution of Arg at position 586 with Leu and Ile and substitution of Arg at position 1096 with Lys, Leu, and Met resulted in the acquisition of TC transport activity, while retaining transport activity for glutathione and glucuronide conjugates. Substitution of Leu at position 1084 of rat MRP3 (which corresponds to Arg-1096 in rat MRP2) with Lys, but not with Val or Met, resulted in the loss of transport activity for TC and glucuronide conjugates. These results suggest that the presence of the cationic charge at Arg-586 and Arg-1096 in rat MRP2 prevents the transport of TC, whereas the presence of neutral amino acids at the corresponding position of rat MRP3 is required for the transport of substrates.


2013 ◽  
Author(s):  
Eleisha L. Jackson ◽  
Noah Ollikainen ◽  
Arthur W. Covert III ◽  
Tanja Kortemme ◽  
Claus O. Wilke

Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins while proteins designed with an intermediate amount of backbone flexibility result in more realistic site variability. Further, the correlation between solvent exposure and site variability in designed proteins is lower than that in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins we find that in the designed proteins hydrophobic residues are underrepresented in the core. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.
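The abstract above compares "site variability" between designed and natural alignments without stating how it is measured. As a minimal sketch, assuming variability is quantified as the Shannon entropy of each alignment column (one common choice, not confirmed by the abstract), the calculation could look like the following. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in nats) of the residues in one alignment column.

    Gap characters ("-") are ignored. Entropy as the variability measure
    is an assumption; the abstract does not name the metric.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


def alignment_variability(sequences):
    """Return the per-site entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*sequences) walks the alignment column by column.
    return [site_entropy(column) for column in zip(*sequences)]


# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKAL"]
variability = alignment_variability(toy_alignment)
```

Under this assumption, a fully conserved column scores zero and the score grows as more residue types share the column, so the designed and natural alignments can be compared site by site.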


2020 ◽  
Vol 117 (37) ◽  
pp. 23165-23173 ◽  
Author(s):  
Robert S. Allen ◽  
Christina M. Gregg ◽  
Shoko Okada ◽  
Amratha Menon ◽  
Dawar Hussain ◽  
...  

To engineer Mo-dependent nitrogenase function in plants, expression of the structural proteins NifD and NifK will be an absolute requirement. Although mitochondria have been established as a suitable eukaryotic environment for biosynthesis of oxygen-sensitive enzymes such as NifH, expression of NifD in this organelle has proven difficult due to cryptic NifD degradation. Here, we describe a solution to this problem. Using molecular and proteomic methods, we found NifD degradation to be a consequence of mitochondrial endoprotease activity at a specific motif within NifD. Focusing on this functionally sensitive region, we designed NifD variants comprising between one and three amino acid substitutions and distinguished several that were resistant to degradation when expressed in both plant and yeast mitochondria. Nitrogenase activity assays of these resistant variants in Escherichia coli identified a subset that retained function, including a single amino acid variant (Y100Q). We found that other naturally occurring NifD proteins containing alternate amino acids at the Y100 position were also less susceptible to degradation. The Y100Q variant also enabled expression of a NifD(Y100Q)–linker–NifK translational polyprotein in plant mitochondria, confirmed by identification of the polyprotein in the soluble fraction of plant extracts. The NifD(Y100Q)–linker–NifK retained function in bacterial nitrogenase assays, demonstrating that this polyprotein permits expression of NifD and NifK in a defined stoichiometry supportive of activity. Our results exemplify how protein design can overcome impediments encountered when expressing synthetic proteins in novel environments. Specifically, these findings outline our progress toward the assembly of the catalytic unit of nitrogenase within mitochondria.


2000 ◽  
Vol 182 (21) ◽  
pp. 6049-6054 ◽  
Author(s):  
Carol A. Holland-Staley ◽  
KangSeok Lee ◽  
David P. Clark ◽  
Philip R. Cunningham

ABSTRACT Expression of the alcohol dehydrogenase gene, adhE, in Escherichia coli is anaerobically regulated at both the transcriptional and the translational levels. To study the AdhE protein, the adhE+ structural gene was cloned into expression vectors under the control of the lacZ and trp c promoters. Wild-type AdhE protein produced under aerobic conditions from these constructs was inactive. Constitutive mutants (adhC) that produced high levels of AdhE under both aerobic and anaerobic conditions were previously isolated. When only the adhE structural gene from one of the adhC mutants was cloned into expression vectors, highly functional AdhE protein was isolated under both aerobic and anaerobic conditions. Sequence analysis revealed that the adhE gene from the adhC mutant contained two mutations resulting in two amino acid substitutions, Ala267Thr and Glu568Lys. Thus, adhC strains contain a promoter mutation and two mutations in the structural gene. The mutant structural gene from adhC strains was designated adhE*. Fragment exchange experiments revealed that the substitution responsible for aerobic expression in the adhE* clones is Glu568Lys. Genetic selection and site-directed mutagenesis experiments showed that virtually any amino acid substitution for Glu568 produced AdhE that was active under both aerobic and anaerobic conditions. These findings suggest that adhE expression is also regulated posttranslationally and that strict regulation of alcohol dehydrogenase activity in E. coli is physiologically significant.


2005 ◽  
Vol 79 (18) ◽  
pp. 11638-11646 ◽  
Author(s):  
Christopher E. Yi ◽  
Lei Ba ◽  
Linqi Zhang ◽  
David D. Ho ◽  
Zhiwei Chen

ABSTRACT Neutralizing antibodies (NAbs) against severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) spike (S) glycoprotein confer protection to animals experimentally infected with the pathogenic virus. We and others previously demonstrated that a major mechanism for neutralizing SARS-CoV was through blocking the interaction between the S glycoprotein and the cellular receptor angiotensin-converting enzyme 2 (ACE2). In this study, we used in vivo electroporation DNA immunization and a pseudovirus-based assay to functionally evaluate immunogenicity and viral entry. We characterized the neutralization and viral entry determinants within the ACE2-binding domain of the S glycoprotein. The deletion of a positively charged region SΔ(422-463) abolished the capacity of the S glycoprotein to induce NAbs in mice vaccinated by in vivo DNA electroporation. Moreover, the SΔ(422-463) pseudovirus was unable to infect HEK293T-ACE2 cells. To determine the specific residues that contribute to related phenotypes, we replaced eight basic amino acids with alanine. We found that a single amino acid substitution (R441A) in the full-length S DNA vaccine failed to induce NAbs and abolished viral entry when pseudoviruses were generated. However, another substitution (R453A) abolished viral entry while retaining the capacity for inducing NAbs. The difference between R441A and R453A suggests that the determinants for immunogenicity and viral entry may not be identical. Our findings provide direct evidence that these basic residues are essential for immunogenicity of the major neutralizing domain and for viral entry. Our data have implications for the rational design of vaccine and antiviral agents as well as for understanding viral tropism.


Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 634-634 ◽  
Author(s):  
Vasilis Bikos ◽  
Nikos Darzentas ◽  
Anastasia Hadzidimitriou ◽  
Zadie Davis ◽  
Sarah Hockley ◽  
...  

Abstract 634 We systematically explored the immunoglobulin (IG) gene repertoire in 337 cases with splenic marginal-zone lymphoma (SMZL), by far the largest series yet. To resolve classification uncertainties, we included in the analysis only cases with a diagnosis of SMZL based on spleen histopathological findings or cases fulfilling the 2008 SBLG criteria (Matutes et al. Leukemia 2008). We here report that the IG heavy variable (IGHV) gene repertoire in SMZL is remarkably biased, with only three genes accounting for 45.8% of cases (IGHV1-2, 24.9%; IGHV4-34, 12.8%; IGHV3-23, 8.1%), significantly extending previous similar observations. Particularly for the IGHV1-2 gene, strong biases became evident at the level of utilization of different alleles, since 79/86 rearrangements (92%) utilized allele *04 vs. only 7/86 rearrangements (8%) that utilized allele *02. This is noteworthy, taking into consideration that these two alleles differ in a single nucleotide, leading to a single amino acid change in framework region (FR)-3. The repertoire biases became more pronounced when the analysis was focused on 171 rearrangements from 163 cases classified as SMZL based on splenic histopathology, according to the 2008 WHO criteria. Within this subgroup, 56/171 cases (32.7%) utilized IGHV1-2*04. Noticeably, only 1/17 cases with a diagnosis of splenic diffuse red pulp lymphoma utilized IGHV1-2*04 (p<0.02 for comparison to SMZL). The IGHV1-2*04 rearrangements carried significantly longer heavy complementarity-determining region-3 (VH CDR3) than all other cases (median, 22 vs. 17 amino acids, respectively; p<0.001). In addition, 52/79 IGHV1-2*04 cases (65.8%) employed one of the IGHD3-3, IGHD3-9 or IGHD3-10 genes. In 28/32 IGHV1-2*04/IGHD3 rearrangements, the IGHD gene was utilized in the same reading frame, leading to VH CDR3s with common “IGHD-derived” amino acid (AA) motifs.
Using bioinformatics tools previously applied to CLL, biased associations of IGHV, IGHD and IGHJ genes with stereotyped VH CDR3s were identified in 25/345 sequences (7.2%). Noticeably, only 10/28 IGHV1-2*04/IGHD3-3 rearrangements with “IGHD-derived” VH CDR3 amino acid motifs could be assigned to “stereotyped” clusters. Despite exhibiting restricted usage of the IGHV1-2*04 and IGHD3-3 genes leading to great overall VH domain similarity, the remaining cases did not fulfill the established criteria for VH CDR3 “stereotypy”, as defined in other lymphoid malignancies, namely CLL. Based on somatic hypermutation (SHM) analysis, the sequences were divided into three groups: (i) truly unmutated (100% germline identity, GI): 46/345 sequences (13.3%); (ii) minimally/borderline mutated (97-99.9% GI): 130/345 sequences (37.7%); and (iii) significantly mutated (<97% identity): 169/345 sequences (49%). At the individual gene level, the distribution of rearrangements of IGHV genes according to SHM status varied significantly. In particular, 56/79 IGHV1-2*04 rearrangements (71%) were predominantly “borderline mutated”, whereas the majority (>67%) of rearrangements utilizing the IGHV3-23, IGHV3-30 and IGHV3-7 genes were “significantly mutated”; finally, IGHV4-34 gene rearrangements were evenly distributed to the three mutational subgroups. Shared (“stereotyped”) AA changes were identified for IGHV1-2*04 rearrangements, with certain FR2 and FR3 codons emerging as “hotspots” for recurrent, conservative AA changes. In conclusion, we demonstrate that more than 30% of cases with a histopathological diagnosis of SMZL on the spleen express IGHV1-2*04 receptors with unusually long VH CDR3s, biased usage of the IGHD3-3 gene, leading to shared “IGHD-derived” VH CDR3 motifs, and very precise molecular features of SHM. The biased expression of a distinctive germline-encoded VH specificity might be considered as evidence for heavy chain dominance in the clonogenic IG receptors in SMZL. 
These findings allude to selection by specific (super)antigenic element(s) in the pathogenesis of at least a major subset of SMZL. In addition, they raise the intriguing possibility that certain subtypes of SMZL could derive from progenitor cell populations adapted to particular antigenic challenges through cellular selection of VH domain specificities. Disclosures: No relevant conflicts of interest to declare.
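The three somatic hypermutation groups in the abstract above are defined by germline identity (GI) cut-offs, which can be written as a small classifier. This is a minimal sketch: the function name is hypothetical, and treating values between 99.9% and 100% as borderline is an assumption, since the abstract leaves that gap unspecified. The counts are the ones reported in the abstract.

```python
def classify_shm(germline_identity_pct):
    """Assign a sequence to one of the three SHM groups from the abstract.

    Cut-offs from the abstract: 100% GI is truly unmutated, 97-99.9% is
    minimally/borderline mutated, and below 97% is significantly mutated.
    Assumption: values strictly between 99.9% and 100% count as borderline.
    """
    if not 0.0 <= germline_identity_pct <= 100.0:
        raise ValueError("germline identity must be a percentage in [0, 100]")
    if germline_identity_pct == 100.0:
        return "truly unmutated"
    if germline_identity_pct >= 97.0:
        return "minimally/borderline mutated"
    return "significantly mutated"


# Group sizes reported in the abstract (345 sequences in total).
reported_counts = {
    "truly unmutated": 46,
    "minimally/borderline mutated": 130,
    "significantly mutated": 169,
}
total_sequences = sum(reported_counts.values())
reported_percentages = {
    group: round(100.0 * count / total_sequences, 1)
    for group, count in reported_counts.items()
}
```

Recomputing the shares from the reported counts gives 13.3%, 37.7% and 49.0% of 345 sequences, which matches the figures quoted in the abstract.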


2016 ◽  
Author(s):  
Eleisha L. Jackson ◽  
Stephanie J. Spielman ◽  
Claus O. Wilke

Abstract Proteins evolve through two primary mechanisms: substitution, where mutations alter a protein’s amino-acid sequence, and insertions and deletions (indels), where amino acids are either added to or removed from the sequence. Protein structure has been shown to influence the rate at which substitutions accumulate across sites in proteins, but whether structure similarly constrains the occurrence of indels has not been rigorously studied. Here, we investigate the extent to which structural properties known to covary with protein evolutionary rates might also predict protein tolerance to indels. Specifically, we analyze a publicly available dataset of single–amino-acid deletion mutations in enhanced green fluorescent protein (eGFP) to assess how well the functional effect of deletions can be predicted from protein structure. We find that weighted contact number (WCN), which measures how densely packed a residue is within the protein’s three-dimensional structure, provides the best single predictor for whether eGFP will tolerate a given deletion. We additionally find that using protein design to explicitly model deletions results in improved predictions of functional status when combined with other structural predictors. Our work suggests that structure plays a fundamental role in constraining deletions at sites in proteins, and further that similar biophysical constraints influence both substitutions and deletions. This study therefore provides a solid foundation for future work to examine how protein structure influences tolerance of more complex indel events, such as insertions or large deletions.
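The abstract above describes weighted contact number only qualitatively, as a measure of how densely packed a residue is. A minimal sketch follows, assuming the commonly used inverse-square form, in which the WCN of residue i is the sum over all other residues j of 1 / r_ij². The abstract does not give this formula, and the choice of which atom represents each residue (for example the Cα atom or the side-chain centroid) is likewise an assumption. The function name and the toy coordinates are hypothetical.

```python
def weighted_contact_numbers(coords):
    """Compute WCN_i = sum over j != i of 1 / r_ij**2 for each residue.

    `coords` holds one (x, y, z) tuple per residue, in angstroms.
    The inverse-square weighting is an assumed (though widely used)
    definition; it is not stated in the abstract.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(coords):
            if i == j:
                continue
            squared_distance = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            if squared_distance == 0.0:
                raise ValueError("two residues share the same coordinates")
            total += 1.0 / squared_distance
        wcn.append(total)
    return wcn


# Hypothetical three-residue chain with 3.8 angstrom spacing, for illustration.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_wcn = weighted_contact_numbers(toy_coords)
```

In the toy chain the middle residue has the highest value because it has two close neighbours, which is the sense in which a larger WCN indicates denser packing.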


