HISTOGRAM-BASED SCORING SCHEMES FOR PROTEIN NMR RESONANCE ASSIGNMENT

2004 ◽  
Vol 02 (04) ◽  
pp. 747-764 ◽  
Author(s):  
XIANG WAN ◽  
THEODORE TEGOS ◽  
GUOHUI LIN

In NMR protein structure determination, once the resonance peaks have been identified and the chemical shifts from peaks across multiple spectra have been grouped into spin systems, associating these spin systems with their host residues is the key to extracting structural information, and thus to the success of the structure calculation. A nearly complete and accurate assignment is a prerequisite for a sufficiently accurate structure calculation. Two pieces of information can be used in the assignment: the adjacency information among the spin systems and the signature information of the spin systems. The signature information reflects the fact that, generally speaking, for one type of amino acid residing in a specific local structural environment, the chemical shifts of the atoms inside the amino acid fall into narrow, distinct ranges. Most existing work assumes normal distributions, with means and standard deviations collected statistically from the available data. In this paper, we followed a simple yet effective histogram-based approach to estimate, for every spin system, the probability that its host is a certain type of amino acid residing in a certain type of secondary structure. We used two combinations of chemical shifts to demonstrate the effectiveness of this type of histogram-based scoring scheme.
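
The histogram idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' scoring scheme: the bin width, the pseudocount smoothing, the uniform prior over classes, the use of a single chemical shift rather than a combination of shifts, and all function names and toy values are assumptions made for illustration only.

```python
import math
from collections import defaultdict

BIN_WIDTH = 0.5  # ppm; hypothetical bin width, not taken from the paper


def build_histograms(training, bin_width=BIN_WIDTH):
    """Count observed shifts per bin for each (amino acid, secondary structure) class.

    training: iterable of (label, shift) pairs, e.g. (("ALA", "helix"), 55.1).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for label, shift in training:
        counts[label][math.floor(shift / bin_width)] += 1
    return counts


def score(counts, shift, bin_width=BIN_WIDTH, pseudocount=1.0):
    """Estimate P(class | shift) from binned frequencies.

    Assumes a uniform prior over classes and adds a pseudocount so that
    a shift falling in an empty bin does not get probability zero.
    """
    b = math.floor(shift / bin_width)
    likelihood = {}
    for label, hist in counts.items():
        total = sum(hist.values())
        n_bins = len(hist) + 1  # observed bins plus one slot for unseen bins
        likelihood[label] = (hist.get(b, 0) + pseudocount) / (total + pseudocount * n_bins)
    norm = sum(likelihood.values())
    return {label: value / norm for label, value in likelihood.items()}


# Toy, made-up C-alpha shifts (ppm); not real database statistics.
toy_training = [
    (("ALA", "helix"), 54.8), (("ALA", "helix"), 55.1),
    (("ALA", "helix"), 55.3), (("ALA", "helix"), 54.9),
    (("GLY", "coil"), 45.1), (("GLY", "coil"), 45.4),
    (("GLY", "coil"), 44.9), (("GLY", "coil"), 45.2),
]
toy_counts = build_histograms(toy_training)
toy_probs = score(toy_counts, 55.0)
print(toy_probs)
```

Unlike a fitted normal distribution, the binned frequencies make no assumption about the shape of the shift distribution, which is the contrast the abstract draws with earlier work.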

2007 ◽  
Vol 05 (02a) ◽  
pp. 313-333 ◽  
Author(s):  
XIANG WAN ◽  
GUOHUI LIN

Successful backbone resonance sequential assignment is fundamental to three-dimensional protein structure determination via Nuclear Magnetic Resonance (NMR) spectroscopy. Such a sequential assignment can roughly be partitioned into three separate steps: grouping resonance peaks in multiple spectra into spin systems, chaining the resultant spin systems into strings, and assigning these strings to non-overlapping consecutive amino acid residues in the target protein. Many existing assignment programs deal with these three steps separately. This works well on protein NMR data of close-to-ideal quality, but only moderately or even poorly on most real protein datasets, where noise and data degeneracy occur frequently. We propose in this work to partition the sequential assignment not into physical steps but only into virtual steps, and to use their outputs to cross-validate each other. The novelty is that ambiguities at the grouping step are resolved by finding the highly confident strings at the chaining step, and ambiguities at the chaining step are resolved by examining the mappings of strings at the assignment step. In this way, all ambiguities in the sequential assignment are resolved globally and optimally. The resultant assignment program is called Graph-based Approach for Sequential Assignment (GASA), and it has been compared to several recent similar developments including PACES, RANDOM, MARS, and RIBRA. The performance comparisons demonstrated that GASA is more promising for practical use.


2018 ◽  
Author(s):  
Allan J. R. Ferrari ◽  
Fabio C. Gozzo ◽  
Leandro Martinez

Chemical cross-linking/Mass Spectrometry (XLMS) is an experimental method to obtain distance constraints between amino acid residues, which can be applied to structural modeling of tertiary and quaternary biomolecular structures. These constraints provide, in principle, only upper limits to the distance between amino acid residues along the surface of the biomolecule. In practice, attempts to use XLMS constraints for tertiary protein structure determination have not been widely successful. This indicates the need for strategies specifically designed to represent these constraints within modeling algorithms. Here, a force-field designed to represent XLMS-derived constraints is proposed. The potential energy functions are obtained by computing, in the database of known protein structures, the probability of satisfaction of a topological cross-linking distance as a function of the Euclidean distance between amino acid residues. The force-field can be easily incorporated into current modeling methods and software. In this work, the force-field was implemented within the Rosetta ab initio relax protocol. We show a significant improvement in the quality of the models obtained relative to current strategies for constraint representation. This force-field contributes to the long-desired goal of obtaining the tertiary structures of proteins using XLMS data. Force-field parameters and usage instructions are freely available at http://m3g.iqm.unicamp.br/topolink/xlff
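
The statistic this abstract describes, the probability that a topological (along-the-surface) distance satisfies a cross-link limit as a function of the Euclidean distance, can be sketched as a simple binned frequency. This is only an illustration under stated assumptions: the bin width, the 20 angstrom limit, the function name, and the toy distance pairs are all hypothetical, and the step that turns such probabilities into potential energy functions is not specified in the abstract and is not attempted here.

```python
from collections import defaultdict


def satisfaction_probability(pairs, max_topological, bin_width=2.0):
    """Fraction of residue pairs, per Euclidean-distance bin, whose
    topological distance is within the cross-linker limit.

    pairs: iterable of (euclidean_distance, topological_distance), in angstroms.
    Returns a dict mapping the lower edge of each bin to the observed fraction.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for euclidean, topological in pairs:
        b = int(euclidean // bin_width)
        totals[b] += 1
        if topological <= max_topological:
            hits[b] += 1
    return {b * bin_width: hits[b] / totals[b] for b in sorted(totals)}


# Made-up distance pairs for illustration; not derived from any structure database.
toy_pairs = [(5.0, 6.0), (5.5, 7.0), (11.0, 14.0), (11.5, 25.0), (17.0, 30.0)]
toy_curve = satisfaction_probability(toy_pairs, max_topological=20.0)
print(toy_curve)
```

In the toy data the fraction falls as the Euclidean distance grows, which is the kind of dependence a distance-based restraint term would need to encode.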


2021 ◽  
Vol 75 (1) ◽  
pp. 39-70
Author(s):  
Lorna J. Smith ◽  
Wilfred F. van Gunsteren ◽  
Bartosz Stankiewicz ◽  
Niels Hansen

Values of 3J-couplings as obtained from NMR experiments on proteins cannot easily be used to determine protein structure due to the difficulty of accounting for the high sensitivity of intermediate 3J-coupling values (4–8 Hz) to the averaging period that must cover the conformational variability of the torsional angle related to the 3J-coupling, and due to the difficulty of handling the multiple-valued character of the inverse Karplus relation between torsional angle and 3J-coupling. Both problems can be solved by using 3J-coupling time-averaging local-elevation restraining MD simulation. Application to the protein hen egg white lysozyme using 213 backbone and side-chain 3J-coupling restraints shows that a conformational ensemble compatible with the experimental data can be obtained using this technique, and that accounting for averaging and the ability of the algorithm to escape from local minima for the torsional angle induced by the Karplus relation, are essential for a comprehensive use of 3J-coupling data in protein structure determination.
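
The multiple-valued inverse mentioned in this abstract can be made concrete with the standard Karplus form, 3J(θ) = A cos²θ + B cos θ + C. The sketch below is not the restraining method of the paper; it only solves the quadratic in cos θ to list every angle compatible with one coupling value. The coefficients A, B, and C are illustrative values assumed here, not parameters taken from the paper.

```python
import math

# Illustrative Karplus coefficients in Hz (assumed, not from the paper).
A, B, C = 6.4, -1.4, 1.9


def karplus(theta_deg):
    """3J coupling (Hz) predicted for a torsional angle theta in degrees."""
    c = math.cos(math.radians(theta_deg))
    return A * c * c + B * c + C


def inverse_karplus(j):
    """All angles in [-180, 180] degrees whose predicted coupling equals j.

    Solves A*x**2 + B*x + (C - j) = 0 for x = cos(theta); each root in
    [-1, 1] yields the two angles +acos(x) and -acos(x).
    """
    disc = B * B - 4.0 * A * (C - j)
    if disc < 0:
        return []
    angles = set()
    for sign in (1.0, -1.0):
        x = (-B + sign * math.sqrt(disc)) / (2.0 * A)
        if -1.0 <= x <= 1.0:
            theta = math.degrees(math.acos(x))
            angles.add(theta)
            angles.add(-theta)
    return sorted(angles)


# An intermediate coupling value inside the 4-8 Hz range discussed above.
solutions = inverse_karplus(6.0)
print(solutions)
```

With these coefficients a 6 Hz coupling is compatible with four distinct angles, so a single measured value cannot identify the torsional angle on its own.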


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10381
Author(s):  
Rohit Nandakumar ◽  
Valentin Dinu

Throughout the history of drug discovery, an enzymatic-based approach for identifying new drug molecules has been primarily utilized. Recently, protein–protein interfaces that can be disrupted to identify small molecules that could be viable targets for certain diseases, such as cancer and the human immunodeficiency virus, have been identified. Existing studies computationally identify hotspots on these interfaces, with most models attaining accuracies of ~70%. Many studies do not effectively integrate information relating to amino acid chains and other structural information relating to the complex. Herein, (1) a machine learning model has been created and (2) its ability to integrate multiple features, such as those associated with amino-acid chains, has been evaluated to enhance the ability to predict protein–protein interface hotspots. Virtual drug screening analysis of a set of hotspots determined on the EphB2-ephrinB2 complex has also been performed. The model achieves an AUROC of 0.842, a sensitivity/recall of 0.833, and a specificity of 0.850. Virtual screening of a set of hotspots identified by the machine learning model developed in this study has identified potential medications to treat diseases caused by the overexpression of the EphB2-ephrinB2 complex, including prostate, gastric, colorectal and melanoma cancers, which are linked to EphB2 mutations. The efficacy of this model has been demonstrated through its ability to predict drug-disease associations previously identified in the literature, including cimetidine, idarubicin, and pralatrexate for these conditions. In addition, nadolol, a beta blocker, has also been identified in this study to bind to the EphB2-ephrinB2 complex, and the possibility of this drug treating multiple cancers is still relatively unexplored.
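
For readers unfamiliar with the three metrics reported in this abstract, the following sketch shows how sensitivity, specificity, and AUROC are computed for a binary classifier. The labels, scores, and 0.5 threshold are invented toy values, not data from the study, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, scores):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half; this pairwise form equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = hotspot, 0 = non-hotspot) and classifier scores; made up.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # hypothetical threshold

toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_pred)
toy_auc = auroc(toy_labels, toy_scores)
print(toy_sens, toy_spec, toy_auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figure.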


Viruses ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2316
Author(s):  
Nodoka Kasajima ◽  
Keita Matsuno ◽  
Hiroko Miyamoto ◽  
Masahiro Kajihara ◽  
Manabu Igarashi ◽  
...  

Viral protein 35 (VP35) of Ebola virus (EBOV) is a multifunctional protein that mainly acts as a viral polymerase cofactor and an interferon antagonist. VP35 interacts with the viral nucleoprotein (NP) and double-stranded RNA for viral RNA transcription/replication and inhibition of type I interferon (IFN) production, respectively. The C-terminal portion of VP35, which is termed the IFN-inhibitory domain (IID), is important for both functions. To further identify critical regions in this domain, we analyzed the physical properties of the surface of VP35 IID, focusing on hydrophobic patches, which are expected to be functional sites that are involved in interactions with other molecules. Based on the known structural information of VP35 IID, three hydrophobic patches were identified on its surface and their biological importance was investigated using minigenome and IFN-β promoter-reporter assays. Site-directed mutagenesis revealed that some of the amino acid substitutions that were predicted to disrupt the hydrophobicity of the patches significantly decreased the efficiency of viral genome replication/transcription due to reduced interaction with NP, suggesting that the hydrophobic patches might be critical for the formation of a replication complex through the interaction with NP. It was also found that the hydrophobic patches were involved in the IFN-inhibitory function of VP35. These results highlight the importance of hydrophobic patches on the surface of EBOV VP35 IID and also indicate that patch analysis is useful for the identification of amino acid residues that directly contribute to protein functions.


2002 ◽  
Vol 184 (8) ◽  
pp. 2225-2234 ◽  
Author(s):  
Jason P. Folster ◽  
Terry D. Connell

ChiA, an 88-kDa endochitinase encoded by the chiA gene of the gram-negative enteropathogen Vibrio cholerae, is secreted via the eps-encoded main terminal branch of the general secretory pathway (GSP), a mechanism which also transports cholera toxin. To localize the extracellular transport signal of ChiA that initiates transport of the protein through the GSP, a chimera comprised of ChiA fused at the N terminus with the maltose-binding protein (MalE) of Escherichia coli and fused at the C terminus with a 13-amino-acid epitope tag (E-tag) was expressed in strain 569B(chiA::Kanr), a chiA-deficient but secretion-competent mutant of V. cholerae. Fractionation studies revealed that blockage of the natural N terminus and C terminus of ChiA did not prevent secretion of the MalE-ChiA-E-tag chimera. To locate the amino acid sequences which encoded the transport signal, a series of truncations of ChiA were engineered. Secretion of the mutant polypeptides was curtailed only when ChiA was deleted from the N terminus beyond amino acid position 75 or from the C terminus beyond amino acid 555. A mutant ChiA comprised of only those amino acids was secreted by wild-type V. cholerae but not by an epsD mutant, establishing that amino acids 75 to 555 independently harbored sufficient structural information to promote secretion by the GSP of V. cholerae. Cys77 and Cys537, two cysteines located just within the termini of ChiA(75-555), were not required for secretion, indicating that those residues were not essential for maintaining the functional activity of the ChiA extracellular transport signal.

