Artificial Intelligence Guided Conformational Mining of Intrinsically Disordered Proteins

2021 ◽  
Author(s):  
Aayush Gupta ◽  
Souvik Dey ◽  
Huan-Xiang Zhou

Artificial intelligence recently achieved the breakthrough of predicting the three-dimensional structures of proteins. The next frontier is presented by intrinsically disordered proteins (IDPs), which, representing 30% to 50% of proteomes, readily access vast conformational space. Molecular dynamics (MD) simulations are promising in sampling IDP conformations, but only at extremely high computational cost. Here, we developed generative autoencoders that learn from short MD simulations and generate full conformational ensembles. An encoder represents IDP conformations as vectors in a reduced-dimensional latent space. The mean vector and covariance matrix of the training dataset are calculated to define a multivariate Gaussian distribution, from which vectors are sampled and fed to a decoder to generate new conformations. The ensembles of generated conformations cover those sampled by long MD simulations and are validated by small-angle X-ray scattering profile and NMR chemical shifts. This work illustrates the vast potential of artificial intelligence in conformational mining of IDPs.
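The sampling step described in this abstract (fit a multivariate Gaussian to the latent vectors of the training set, draw new vectors, decode them) can be sketched as follows. This is a minimal illustration, not the authors' code: the latent vectors are random placeholders for encoder output, and `toy_decoder` with its `weights` matrix is a hypothetical linear stand-in for a trained decoder network.

```python
import numpy as np


def fit_latent_gaussian(latent_train):
    """Mean vector and covariance matrix of latent vectors (one row per conformation)."""
    mean = latent_train.mean(axis=0)
    cov = np.cov(latent_train, rowvar=False)  # columns are latent dimensions
    return mean, cov


def sample_latent(mean, cov, n_samples, seed=0):
    """Draw new latent vectors from the fitted multivariate Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_samples)


def toy_decoder(z, weights):
    """Hypothetical stand-in for a trained decoder: a fixed linear map to coordinates."""
    return z @ weights


rng = np.random.default_rng(42)
latent_train = rng.normal(size=(500, 3))  # placeholder for encoded MD conformations
weights = rng.normal(size=(3, 12))        # placeholder decoder parameters

mean, cov = fit_latent_gaussian(latent_train)
z_new = sample_latent(mean, cov, n_samples=1000)
generated = toy_decoder(z_new, weights)   # one row per generated "conformation"
print(generated.shape)
```

In a real workflow the placeholders would be replaced by the latent vectors of the short MD trajectory and by the trained decoder; only the Gaussian fit and the sampling call correspond directly to the text.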

2019 ◽  
Vol 116 (41) ◽  
pp. 20446-20452 ◽  
Author(s):  
Utsab R. Shrestha ◽  
Puneet Juneja ◽  
Qiu Zhang ◽  
Viswanathan Gurumoorthy ◽  
Jose M. Borreguero ◽  
...  

Intrinsically disordered proteins (IDPs) are abundant in eukaryotic proteomes, play a major role in cell signaling, and are associated with human diseases. To understand IDP function it is critical to determine their configurational ensemble, i.e., the collection of 3-dimensional structures they adopt, and this remains an immense challenge in structural biology. Attempts to determine this ensemble computationally have been hitherto hampered by the necessity of reweighting molecular dynamics (MD) results or biasing simulation in order to match ensemble-averaged experimental observables, operations that reduce the precision of the generated model because different structural ensembles may yield the same experimental observable. Here, by employing enhanced sampling MD we reproduce the experimental small-angle neutron and X-ray scattering profiles and the NMR chemical shifts of the disordered N terminal (SH4UD) of c-Src kinase without reweighting or constraining the simulations. The unbiased simulation results reveal a weakly funneled and rugged free energy landscape of SH4UD, which gives rise to a heterogeneous ensemble of structures that cannot be described by simple polymer theory. SH4UD adopts transient helices, which are found away from known phosphorylation sites and could play a key role in the stabilization of structural regions necessary for phosphorylation. Our findings indicate that adequately sampled molecular simulations can be performed to provide accurate physical models of flexible biosystems, thus rationalizing their biological function.


2019 ◽  
Vol 73 (12) ◽  
pp. 713-725 ◽  
Author(s):  
Ruth Hendus-Altenburger ◽  
Catarina B. Fernandes ◽  
Katrine Bugge ◽  
Micha B. A. Kunze ◽  
Wouter Boomsma ◽  
...  

Phosphorylation is one of the main regulators of cellular signaling, typically occurring in flexible parts of folded proteins and in intrinsically disordered regions. It can have distinct effects on the chemical environment as well as on the structural properties near the modification site. Secondary chemical shift analysis is the main NMR method for detecting transiently formed secondary structure in intrinsically disordered proteins (IDPs), and the reliability of the analysis depends on an appropriate choice of random coil model. Random coil chemical shifts and sequence correction factors were previously determined for an Ac-QQXQQ-NH2 peptide series with X being any of the 20 common amino acids. However, a matching dataset on the phosphorylated states has so far been determined only incompletely or only at a single pH value. Here we extend the database by adding the random coil chemical shifts of the phosphorylated states of serine, threonine and tyrosine, measured over a range of pH values covering the pKas of the phosphates and at several temperatures (www.bio.ku.dk/sbinlab/randomcoil). The combined results allow for accurate random coil chemical shift determination of phosphorylated regions at any pH and temperature, minimizing systematic biases of the secondary chemical shifts. Comparison of chemical shifts using random coil sets with and without inclusion of the phosphoryl group revealed under- and overestimations of helicity of up to 33%. The expanded set of random coil values will improve the reliability of detection and quantification of transient secondary structure in phosphorylation-modified IDPs.
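A secondary chemical shift is the observed shift minus the random coil reference for the same nucleus and residue, so the choice of reference set directly changes the result. The sketch below illustrates this with invented Cα values in ppm. None of the numbers come from the paper or its database, and `secondary_shifts` is a hypothetical helper name.

```python
def secondary_shifts(observed, random_coil):
    """Per-residue secondary chemical shifts in ppm: observed minus random coil."""
    if len(observed) != len(random_coil):
        raise ValueError("observed and random_coil must have the same length")
    return [obs - rc for obs, rc in zip(observed, random_coil)]


# Invented C-alpha shifts (ppm) for three residues; the third is assumed phosphorylated.
observed_ca = [58.9, 56.1, 62.4]
rc_unmodified = [58.3, 56.0, 61.8]  # hypothetical reference ignoring the phosphoryl group
rc_phospho = [58.3, 56.0, 62.6]     # hypothetical reference that accounts for it

delta_unmodified = secondary_shifts(observed_ca, rc_unmodified)
delta_phospho = secondary_shifts(observed_ca, rc_phospho)

for i, (d_u, d_p) in enumerate(zip(delta_unmodified, delta_phospho), start=1):
    print(f"residue {i}: {d_u:+.2f} ppm (unmodified ref), {d_p:+.2f} ppm (phospho ref)")
```

With these made-up numbers the third residue changes sign between the two references. Since positive Cα secondary shifts are usually read as helical propensity, this is the kind of bias the abstract describes.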


ChemPhysChem ◽  
2013 ◽  
Vol 14 (13) ◽  
pp. 3034-3045 ◽  
Author(s):  
Jaka Kragelj ◽  
Valéry Ozenne ◽  
Martin Blackledge ◽  
Malene Ringkjøbing Jensen

2021 ◽  
Author(s):  
Jakob Toudahl Nielsen ◽  
Frans A.A. Mulder

NMR chemical shifts (CSs) are delicate reporters of local protein structure, and recent advances in random coil CS (RCCS) prediction and interpretation now offer the compelling prospect of inferring small populations of structure from small deviations from RCCSs. Here, we present CheSPI, a simple and efficient method that provides unbiased and sensitive aggregate measures of local structure and disorder. It is demonstrated that CheSPI can predict even very small amounts of residual structure and robustly delineate subtle differences into four structural classes for intrinsically disordered proteins. For structured regions and proteins, CheSPI can assign up to eight structural classes, which coincide with the well-known DSSP classification. The program is freely available and can either be used as a web implementation at www.protein-nmr.org or run locally from the command line as a Python program. CheSPI generates comprehensive numeric and graphical output for intuitive annotation and visualization of protein structures. A number of examples are provided.


2020 ◽  
Author(s):  
Alan Hicks ◽  
Cristian A. Escobar ◽  
Timothy A. Cross ◽  
Huan-Xiang Zhou

Intrinsically disordered proteins (IDPs) account for a significant fraction of any proteome and are central to numerous cellular functions. Yet how sequences of IDPs code for their conformational dynamics is poorly understood. Here we combined NMR spectroscopy, small-angle X-ray scattering (SAXS), and molecular dynamics (MD) simulations to characterize the conformations and dynamics of ChiZ1-64. This IDP is the N-terminal fragment (residues 1-64) of the transmembrane protein ChiZ, a component of the cell division machinery in Mycobacterium tuberculosis. Its N-half contains most of the prolines and all of the anionic residues, while the C-half contains most of the glycines and cationic residues. MD simulations, first validated by SAXS and secondary chemical shift data, found scant α-helices or β-strands but considerable propensity for polyproline II (PPII) torsion angles. Importantly, several blocks of residues (e.g., 11-29) emerge as “correlated segments”, identified by frequent formation of PPII stretches, salt bridges, cation-π interactions, and sidechain-backbone hydrogen bonds. NMR relaxation experiments showed non-uniform transverse relaxation rates (R2s) and nuclear Overhauser enhancements (NOEs) along the sequence (e.g., high R2s and NOEs for residues 11-14 and 23-28). MD simulations further revealed that the extent of segmental correlation is sequence-dependent: segments where internal interactions are more prevalent manifest elevated “collective” motions on the 5-10 ns timescale and suppressed local motions on the sub-ns timescale. Amide proton exchange rates provide corroboration, with residues in the most correlated segment exhibiting the highest protection factors. We propose the correlated segment as a defining feature for the conformation and dynamics of IDPs.


2019 ◽  
Author(s):  
Joao Victor de Souza Cunha ◽  
Francesc Sabanes Zariquiey ◽  
Agnieszka K. Bronowska

Intrinsically disordered proteins (IDPs) are molecules without a fixed tertiary structure, exerting crucial roles in cellular signalling, growth and molecular recognition events. Due to their high plasticity, IDPs are very challenging in experimental and computational structural studies. To provide detailed atomic insight into the IDP dynamics governing their functional mechanisms, all-atom molecular dynamics (MD) simulations are widely employed. However, the current generalist force fields and solvent models are unable to generate satisfactory ensembles for IDPs when compared to existing experimental data. In this work, we present a new solvation model, denoted as the Charge-Augmented 3 Point Water model for Intrinsically-disordered Proteins (CAIPi3P). CAIPi3P has been generated by performing a systematic scan of atomic partial charges assigned to the widely popular molecular scaffold of the three-point TIP3P water model. We found that explicit solvent MD simulations employing CAIPi3P solvation considerably improved the SAXS profiles for three different IDPs. Not surprisingly, this improvement was further enhanced by using CAIPi3P water in combination with the protein force field parametrized for IDPs. We have also demonstrated the applicability of CAIPi3P to molecular systems containing structured as well as intrinsically disordered regions/domains. Our results highlight the crucial importance of solvent effects for generating molecular ensembles of IDPs which reproduce the experimental data available. Hence, we conclude that our newly developed CAIPi3P solvation model is a valuable tool for assisting molecular simulations of intrinsically disordered proteins and assessing their molecular dynamics.


2020 ◽  
Author(s):  
Suman Samantray ◽  
Feng Yin ◽  
Batuhan Kav ◽  
Birgit Strodel

Progress towards understanding the molecular basis of Alzheimer’s disease is strongly connected to elucidating the early aggregation events of the amyloid-β (Aβ) peptide. Molecular dynamics (MD) simulations provide a viable technique to study the aggregation of Aβ into oligomers with high spatial and temporal resolution. However, the results of an MD simulation can only be as good as the underlying force field. A recent study by our group showed that none of the force fields tested can distinguish between aggregation-prone and non-aggregating peptide sequences, producing the same and in most cases too fast aggregation kinetics for all peptides. Since then, new force fields specially designed for intrinsically disordered proteins such as Aβ have been developed. Here, we assess the applicability of these new force fields to studying peptide aggregation using the Aβ16−22 peptide and mutations of it as a test case. We investigate their performance in modeling the monomeric state, the aggregation into oligomers, and the stability of the aggregation end product, i.e., the fibrillar state. A main finding is that changing the force field has a stronger effect on the simulated aggregation pathway than changing the peptide sequence. Also, the new force fields are not able to reproduce the experimental aggregation propensity order of the peptides. Dissecting the various energy contributions shows that AMBER99SB-disp overestimates the interactions between the peptides and water, thereby inhibiting peptide aggregation. More promising results are obtained with CHARMM36m and especially its version with increased protein–water interactions. It is thus recommended to use this force field for peptide aggregation simulations and base future reparameterizations on it.


2019 ◽  
Author(s):  
Ruchi Lohia ◽  
Reza Salari ◽  
Grace Brannigan

The role of electrostatic interactions and mutations that change charge states in intrinsically disordered proteins (IDPs) is well-established, but many disease-associated mutations in IDPs are charge-neutral. The Val66Met single nucleotide polymorphism (SNP) in precursor brain-derived neurotrophic factor (BDNF) is one of the earliest SNPs to be associated with neuropsychiatric disorders, and the underlying molecular mechanism is unknown. Here we report on over 250 μs of fully atomistic, explicit-solvent, temperature replica exchange molecular dynamics (MD) simulations of the 91-residue BDNF prodomain, for both the V66 and M66 sequences. The simulations were able to correctly reproduce the location of both local and non-local secondary structure changes due to the Val66Met mutation when compared with NMR spectroscopy. We find that the change in local structure is mediated via entropic and sequence-specific effects. We developed a hierarchical sequence-based framework for analysis and conceptualization, which first identifies “blobs” of 5-15 residues representing local globular regions or linkers. We use this framework within a novel test for enrichment of higher-order (tertiary) structure in disordered proteins; the size and shape of each blob is extracted from MD simulation of the real protein (RP), and used to parameterize a self-avoiding heterogeneous polymer (SAHP). The SAHP version of the BDNF prodomain suggested a protein segmented into three regions, with a central long, highly disordered polyampholyte linker separating two globular regions. This effective segmentation was also observed in full simulations of the RP, but the Val66Met substitution significantly increased interactions across the linker, as well as the number of participating residues.
The Val66Met substitution replaces β-bridging between Val66 and Val94 (on either side of the linker) with specific side-chain interactions between Met66 and Met95. The protein backbone in the vicinity of Met95 is then free to form β-bridges with residues 31-41 near the N-terminus, which condenses the protein. A significant role for Met/Met interactions is consistent with previously observed non-local effects of the Val66Met SNP, as well as established interactions between the Met66 sequence and a Met-rich receptor that initiates neuronal growth cone retraction.


2020 ◽  
Vol 21 (17) ◽  
pp. 6166
Author(s):  
Joao V. de Souza ◽  
Francesc Sabanés Zariquiey ◽  
Agnieszka K. Bronowska

Intrinsically disordered proteins (IDPs) are molecules without a fixed tertiary structure, exerting crucial roles in cellular signalling, growth and molecular recognition events. Due to their high plasticity, IDPs are very challenging in experimental and computational structural studies. To provide detailed atomic insight into IDPs’ dynamics governing their functional mechanisms, all-atom molecular dynamics (MD) simulations are widely employed. However, the current generalist force fields and solvent models are unable to generate satisfactory ensembles for IDPs when compared to existing experimental data. In this work, we present a new solvation model, denoted as the Charge-Augmented Three-Point Water Model for Intrinsically Disordered Proteins (CAIPi3P). CAIPi3P has been generated by performing a systematic scan of atomic partial charges assigned to the widely popular molecular scaffold of the three-point TIP3P water model. We found that explicit solvent MD simulations employing CAIPi3P solvation considerably improved the small-angle X-ray scattering (SAXS) profiles for three different IDPs. Not surprisingly, this improvement was further enhanced by using CAIPi3P water in combination with the protein force field parametrized for IDPs. We also demonstrated the applicability of CAIPi3P to molecular systems containing structured as well as intrinsically disordered regions/domains. Our results highlight the crucial importance of solvent effects for generating molecular ensembles of IDPs which reproduce the experimental data available. Hence, we conclude that our newly developed CAIPi3P solvation model is a valuable tool for molecular simulations of intrinsically disordered proteins and assessing their molecular dynamics.
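A systematic scan of partial charges on a three-point water scaffold can be pictured as generating candidate charge sets in which the oxygen charge balances the two hydrogens, so that each molecule stays neutral. The sketch below is only an illustration: the scan range, the number of points and the function name `scan_water_charges` are assumptions, not values or code from the paper. The standard TIP3P hydrogen charge (0.417 e) is included only as a reference point.

```python
TIP3P_Q_H = 0.417  # standard TIP3P hydrogen partial charge, in units of e


def scan_water_charges(q_h_min, q_h_max, n_points):
    """Return (q_O, q_H) candidates on an even grid of hydrogen charges.

    Neutrality of the three-site molecule fixes q_O = -2 * q_H.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (q_h_max - q_h_min) / (n_points - 1)
    candidates = []
    for i in range(n_points):
        q_h = q_h_min + i * step
        candidates.append((-2.0 * q_h, q_h))
    return candidates


# Illustrative scan window around the TIP3P value (assumed, not from the paper).
candidates = scan_water_charges(0.40, 0.46, 7)
for q_o, q_h in candidates:
    print(f"q_O = {q_o:+.3f} e, q_H = {q_h:+.3f} e")
```

Each candidate would then need its own set of MD simulations and a comparison against experimental SAXS data. That evaluation step is not shown here.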

