ff19SB: Amino-Acid Specific Protein Backbone Parameters Trained Against Quantum Mechanics Energy Surfaces in Solution

<p>Molecular dynamics (MD) simulations have become increasingly popular in studying the motions and functions of biomolecules. The accuracy of a simulation, however, is largely determined by the molecular mechanics (MM) force field (FF), a set of functions with adjustable parameters that compute potential energies from atomic positions. The overall quality of FFs such as our previously published ff99SB and ff14SB can be limited by assumptions that were made years ago. In the updated model presented here (ff19SB), we have significantly improved the backbone profiles for all 20 amino acids. We fit coupled ϕ/ψ parameters using 2D ϕ/ψ conformational scans for multiple amino acids, using as reference data the entire 2D quantum mechanics (QM) energy surface. We address the polarization inconsistency during dihedral parameter fitting by using both QM and MM in solution. Finally, we examine possible dependency of the backbone fitting on side chain rotamer. To extensively validate ff19SB parameters, we have performed a total of ~5 milliseconds of MD simulations in explicit solvent. Our results show that after amino-acid specific training against QM data with solvent polarization, ff19SB not only reproduces the differences in amino acid specific Protein Data Bank (PDB) Ramachandran maps better, but also shows significantly improved capability to differentiate amino acid dependent properties such as helical propensities. We also conclude that an inherent underestimation of helicity is present in ff14SB, which is (inexactly) compensated by an increase in helical content driven by the TIP3P bias toward overly compact structures. In summary, ff19SB, when combined with a more accurate water model such as OPC, should have better predictive power for modeling sequence-specific behavior, protein mutations, and also rational protein design.</p>
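As a rough illustration of what "a set of functions with adjustable parameters" means for a backbone torsion, the sketch below evaluates the Fourier-series dihedral term commonly used in AMBER-family force fields, E = Σ (V_n/2)[1 + cos(nϕ − γ_n)]. The function name and the parameter values are hypothetical and are not ff19SB parameters; the coupled ϕ/ψ treatment described in the abstract goes beyond this uncoupled one-dimensional term.

```python
from math import cos, pi

def dihedral_energy(angle, terms):
    """Sum of (V_n / 2) * (1 + cos(n * angle - gamma)) over Fourier terms.

    angle and gamma are in radians; each term is a tuple (V_n, n, gamma).
    """
    return sum(v / 2.0 * (1.0 + cos(n * angle - gamma)) for v, n, gamma in terms)

# Hypothetical parameters (kcal/mol, periodicity, phase), NOT taken from ff19SB.
example_terms = [(2.0, 1, 0.0), (0.5, 2, pi)]

# Scan the torsion at a few angles to see the energy profile these terms produce.
for degrees in (-180, -60, 0, 60, 180):
    energy = dihedral_energy(degrees * pi / 180.0, example_terms)
    print(f"{degrees:5d} deg  {energy:6.3f} kcal/mol")
```

Fitting a force field in this picture means adjusting the (V_n, γ_n) values until the MM profile matches reference energies, which for ff19SB are QM energy surfaces computed in solution.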

ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.9b00591 ◽

2019 ◽

Vol 16 (1) ◽

pp. 528-552 ◽

Cited By ~ 30

Author(s):

Chuan Tian ◽

Koushik Kasavajhala ◽

Kellon A. A. Belfon ◽

Lauren Raguette ◽

He Huang ◽

...

Keyword(s):

Quantum Mechanics ◽

Amino Acid ◽

Specific Protein ◽

Protein Backbone ◽

Energy Surfaces

Method to generate highly stable D-amino acid analogs of bioactive helical peptides using a mirror image of the entire PDB

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1711837115 ◽

2018 ◽

Vol 115 (7) ◽

pp. 1505-1510 ◽

Cited By ~ 34

Author(s):

Michael Garton ◽

Satra Nim ◽

Tracy A. Stone ◽

Kyle Ethan Wang ◽

Charles M. Deber ◽

...

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Data Bank ◽

Mirror Image ◽

Proof Of Concept ◽

Human Gut ◽

Helical Peptides ◽

Peptide Engineering ◽

Small Molecule Drugs ◽

Proteins And Peptides

Biologics are a rapidly growing class of therapeutics with many advantages over traditional small molecule drugs. A major obstacle to their development is that proteins and peptides are easily destroyed by proteases and, thus, typically have prohibitively short half-lives in human gut, plasma, and cells. One of the most effective ways to prevent degradation is to engineer analogs from dextrorotary (D)-amino acids, with up to 10⁵-fold improvements in potency reported. We here propose a general peptide-engineering platform that overcomes limitations of previous methods. By creating a mirror image of every structure in the Protein Data Bank (PDB), we generate a database of ∼2.8 million D-peptides. To obtain a D-analog of a given peptide, we search the (D)-PDB for similar configurations of its critical—“hotspot”—residues. As a proof of concept, we apply our method to two peptides that are Food and Drug Administration approved as therapeutics for diabetes and osteoporosis, respectively. We obtain D-analogs that activate the GLP1 and PTH1 receptors with the same efficacy as their natural counterparts and show greatly increased half-life.
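The mirror-image step can be pictured with a minimal sketch: reflecting every atom through a plane (here by negating x) preserves all interatomic distances but inverts handedness, which is what turns an L-configuration into a D-configuration. The function names and the four toy points are illustrative assumptions, not the authors' pipeline.

```python
from math import dist

def mirror(coords):
    """Reflect 3D points through the yz-plane by negating x."""
    return [(-x, y, z) for x, y, z in coords]

def signed_volume(a, b, c, d):
    """Scalar triple product of (b-a), (c-a), (d-a); its sign encodes handedness."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

# Four made-up points standing in for the substituents around a chiral centre.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirrored = mirror(points)

print(signed_volume(*points), signed_volume(*mirrored))       # opposite signs
print(dist(points[1], points[2]), dist(mirrored[1], mirrored[2]))  # same distance
```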

DLPacker: Deep Learning for Prediction of Amino Acid Side Chain Conformations in Proteins

10.1101/2021.05.23.445347 ◽

2021 ◽

Author(s):

Mikita Misiura ◽

Raghav Shroff ◽

Ross Thyer ◽

Anatoly Kolomeisky

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Protein Design ◽

Structure Prediction ◽

Deep Neural Networks ◽

Side Chain ◽

Amino Acid Side Chain ◽

Image Transformation ◽

Hydrophobic Amino Acids ◽

Chain Conformations

Prediction of side chain conformations of amino acids in proteins (also termed 'packing') is an important and challenging part of protein structure prediction with many interesting applications in protein design. A variety of methods for packing have been developed but more accurate ones are still needed. Machine learning (ML) methods have recently become a powerful tool for solving various problems in diverse areas of science, including structural biology. In this work we evaluate the potential of Deep Neural Networks (DNNs) for prediction of amino acid side chain conformations. We formulate the problem as image-to-image transformation and train a U-net style DNN to solve the problem. We show that our method outperforms other physics-based methods by a significant margin: reconstruction RMSDs for most amino acids are about 20% smaller compared to SCWRL4 and Rosetta Packer with RMSDs for bulky hydrophobic amino acids Phe, Tyr and Trp being up to 50% smaller.

Complex evolutionary footprints revealed in an analysis of reused protein segments of diverse lengths

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1707642114 ◽

2017 ◽

Vol 114 (44) ◽

pp. 11703-11708 ◽

Cited By ~ 27

Author(s):

Sergey Nepomnyachiy ◽

Nir Ben-Tal ◽

Rachel Kolodny

Keyword(s):

Amino Acid ◽

Protein Design ◽

De Novo ◽

Data Bank ◽

Similar Sequence ◽

Design Data ◽

Structural Domains ◽

Evolutionary Advantage ◽

Protein Alignments

Proteins share similar segments with one another. Such “reused parts”—which have been successfully incorporated into other proteins—are likely to offer an evolutionary advantage over de novo evolved segments, as most of the latter will not even have the capacity to fold. To systematically explore the evolutionary traces of segment “reuse” across proteins, we developed an automated methodology that identifies reused segments from protein alignments. We search for “themes”—segments of at least 35 residues of similar sequence and structure—reused within representative sets of 15,016 domains [Evolutionary Classification of Protein Domains (ECOD) database] or 20,398 chains [Protein Data Bank (PDB)]. We observe that theme reuse is highly prevalent and that reuse is more extensive when the length threshold for identifying a theme is lower. Structural domains, the best characterized form of reuse in proteins, are just one of many complex and intertwined evolutionary traces. Others include long themes shared among a few proteins, which encompass and overlap with shorter themes that recur in numerous proteins. The observed complexity is consistent with evolution by duplication and divergence, and some of the themes might include descendants of ancestral segments. The observed recursive footprints, where the same amino acid can simultaneously participate in several intertwined themes, could be a useful concept for protein design. Data are available at http://trachel-srv.cs.haifa.ac.il/rachel/ppi/themes/.

Fragger: a protein fragment picker for structural queries

F1000Research ◽

10.12688/f1000research.12486.2 ◽

2018 ◽

Vol 6 ◽

pp. 1722 ◽

Cited By ~ 1

Author(s):

Francois Berenger ◽

David Simoncini ◽

Arnout Voet ◽

Rojan Shrestha ◽

Kam Y.J. Zhang

Keyword(s):

Amino Acid ◽

Protein Design ◽

Data Bank ◽

Amino Acid Sequences ◽

Structural Fragment ◽

Protein Fragment ◽

Distance Threshold ◽

Protein Fragments ◽

Specific Subset ◽

Design Activities

Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
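The query logic described above (a distance threshold, ranking by distance, and a regular expression over one-letter codes) can be sketched in a few lines. This is not Fragger's interface: the function names, the tuple layout of the toy database, and the use of a plain RMSD on coordinates assumed to be already superposed are all assumptions made for illustration.

```python
import re
from math import sqrt

def rmsd(a, b):
    """RMSD between two equal-length coordinate lists, assumed already superposed."""
    assert len(a) == len(b)
    total = sum((p[i] - q[i]) ** 2 for p, q in zip(a, b) for i in range(3))
    return sqrt(total / len(a))

def query(database, query_coords, threshold, seq_pattern=".*"):
    """Return (distance, name) pairs within threshold, closest first."""
    pattern = re.compile(seq_pattern)
    hits = []
    for name, seq, coords in database:
        # Skip fragments of the wrong length or with a disallowed sequence.
        if len(coords) != len(query_coords) or not pattern.fullmatch(seq):
            continue
        d = rmsd(query_coords, coords)
        if d <= threshold:
            hits.append((d, name))
    return sorted(hits)

# Toy database of (name, one-letter sequence, coordinates); all values are made up.
database = [
    ("frag_a", "GAV", [(0, 0, 0), (1, 0, 0), (2, 0, 0)]),
    ("frag_b", "GLV", [(0, 0, 0), (1, 0, 0), (2, 1, 0)]),
    ("frag_c", "PAV", [(0, 0, 0), (1, 0, 0), (2, 0, 0)]),
    ("frag_d", "GAV", [(0, 0, 0), (1, 3, 0), (2, 0, 0)]),
]
query_coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]

for distance, name in query(database, query_coords, 1.0, "G[AL]V"):
    print(f"{name}  {distance:.3f}")
```

In this toy run, frag_c is rejected by the sequence pattern and frag_d by the distance threshold, so only frag_a and frag_b are returned, in order of increasing distance.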

Atypical Structural Tendencies Among Low-Complexity Domains in the Protein Data Bank Proteome

10.1101/807438 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sean M. Cascarina ◽

Mikaela R. Elder ◽

Eric D. Ross

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Secondary Structure ◽

Physical Properties ◽

Protein Data Bank ◽

Data Bank ◽

Low Complexity ◽

Amino Acid Sequences ◽

Single Amino Acid ◽

Intrinsically Disordered

Abstract

A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the protein data bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure preferences across the entire PDB proteome. Secondary structure preferences varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure preferences. Comparison of LCD secondary structure preferences with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure preferences as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural preferences among LCDs parsed by the nature and magnitude of single amino acid enrichment.

Author Summary

The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
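To make the idea of sorting LCDs by single amino acid enrichment concrete, the sketch below reports the most abundant residue and its fraction for the three example sequences quoted above. It is a simple composition count written for illustration, with a hypothetical function name, and is not the classification method developed by the authors.

```python
from collections import Counter

def dominant_residue(seq):
    """Return the most common residue in seq and its fraction of the sequence."""
    counts = Counter(seq)
    residue, n = counts.most_common(1)[0]
    return residue, n / len(seq)

for seq in ("AAAAAAAAAA", "EEEEEEEEEE", "EEKRKEEEKE"):
    residue, fraction = dominant_residue(seq)
    print(f"{seq}  dominant={residue}  fraction={fraction:.1f}")
```

All three sequences are low in complexity, yet they differ in which residue dominates and by how much, which is the distinction the abstract draws between the nature and the magnitude of enrichment.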

Hybrid approach to analysis of β-sheet structures based on signal processing and statistical consideration

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2010.0382 ◽

2010 ◽

Vol 467 (2128) ◽

pp. 1052-1072

Author(s):

V. Vojisavljevic ◽

E. Pirogova ◽

D. M. Davidovic ◽

I. Cosic

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Protein Design ◽

Hybrid Approach ◽

Three Dimensional ◽

Dimensional Structure ◽

Spectra Analysis ◽

Sheet Structure ◽

The Relationship ◽

β Sheet

A number of biotechnology applications are based on protein design. For this design, the relationship between a protein’s primary structure and its conformation is of vital importance. A β-sheet is a common feature of a protein’s secondary structure; therefore, elucidating the principles governing β-sheet structure and its stability is critical for understanding the protein-folding process. In the three-dimensional representation of protein molecules, Cα carbon coordinates (carbon atom immediately adjacent to the carboxylate group) have often been employed instead of the complete set of coordinates for the corresponding residues. Using the Cα carbon coordinates, we showed that particular amino acids are not randomly distributed within a β-sheet structure. A new statistical approach for analyzing the spatial distribution of amino acids in a protein, represented by their physico-chemical parameters, the electron–ion interaction potential (EIIP) and hydrophobicity, is described here. The relationship between amino acid positions inside the β-sheet and the EIIP and hydrophobicity parameters was established. The correlation between amino acid propensities related to the β-sheet was examined using multiple cross-spectra analysis. We also applied the continuous wavelet transform for the analysis of selected β-sheet structures using the EIIP and hydrophobicity parameters. The findings provide new insight into conformational propensities of amino acids for the adoption of β-sheet structures.

Topological Water Network Analysis Around Amino Acids

Molecules ◽

10.3390/molecules24142653 ◽

2019 ◽

Vol 24 (14) ◽

pp. 2653 ◽

Cited By ~ 2

Author(s):

Kwang-Eun Choi ◽

Eunkyoung Chae ◽

Anand Balupuri ◽

Hye Ree Yoon ◽

Nam Sook Kang

Keyword(s):

Amino Acids ◽

Free Energy ◽

MD Simulations ◽

Data Bank ◽

Free Energy Calculation ◽

Protein Hydration ◽

Water Molecules ◽

Energy Calculation ◽

Water Network ◽

Energy Perturbation

Water molecules play a key role in protein stability, folding, function and ligand binding. Protein hydration has been studied using free energy perturbation algorithms. However, the study of protein hydration without free energy calculation is also an active field of research. Accordingly, topological water network (TWN) analysis has been carried out instead of free energy calculation in the present work to investigate hydration of proteins. Water networks around the 20 amino acids in aqueous solution were explored through molecular dynamics (MD) simulations. These simulation results were compared with experimental observations. Water molecules from Protein Data Bank structures showed TWN patterns similar to those from MD simulations. This work revealed that TWNs are affected by the surrounding environment. TWNs could provide valuable clues about the environment around amino acid residues in proteins. The findings from this study could be exploited for TWN-based drug discovery and development.

Fragger: a protein fragment picker for structural queries

F1000Research ◽

10.12688/f1000research.12486.1 ◽

2017 ◽

Vol 6 ◽

pp. 1722

Author(s):

Francois Berenger ◽

David Simoncini ◽

Arnout Voet ◽

Rojan Shrestha ◽

Kam Y.J. Zhang

Keyword(s):

Amino Acid ◽

Protein Design ◽

Data Bank ◽

Amino Acid Sequences ◽

Structural Fragment ◽

Protein Fragment ◽

Distance Threshold ◽

Protein Fragments ◽

Specific Subset ◽

Design Activities

Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
