Growing Glycans in Rosetta: Accurate de novo glycan modeling, density fitting, and rational sequon design

2021 ◽  
Author(s):  
Jared Adolf-Bryfogle ◽  
Jason W Labonte ◽  
John C Kraft ◽  
Maxim Shapavolov ◽  
Sebastian Raemisch ◽  
...  

Carbohydrates and glycoproteins modulate key biological functions. Computational approaches inform function to aid in carbohydrate structure prediction, structure determination, and design. However, experimental structure determination of sugar polymers is notoriously difficult as glycans can sample a wide range of low energy conformations, thus limiting the study of glycan-mediated molecular interactions. In this work, we expanded the RosettaCarbohydrate framework, developed and benchmarked effective tools for glycan modeling and design, and extended the Rosetta software suite to better aid in structural analysis and benchmarking tasks through the SimpleMetrics framework. We developed a glycan-modeling algorithm, GlycanTreeModeler, that computationally builds glycans layer-by-layer, using adaptive kernel density estimates (KDE) of common glycan conformations derived from data in the Protein Data Bank (PDB) and from quantum mechanics (QM) calculations. After a rigorous optimization of kinematic and energetic considerations to improve near-native sampling enrichment and decoy discrimination, GlycanTreeModeler was benchmarked on a test set of diverse glycan structures, or "trees". Structures predicted by GlycanTreeModeler agreed with native structures at high accuracy for both de novo modeling and experimental density-guided building. GlycanTreeModeler algorithms and associated tools were employed to design de novo glycan trees into a protein nanoparticle vaccine that are able to direct the immune response by shielding regions of the scaffold from antibody recognition. This work will inform glycoprotein model prediction, aid in both X-ray and electron microscopy density solutions and refinement, and help lead the way towards a new era of computational glycobiology.
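As a rough illustration of the sampling idea mentioned in this abstract, the sketch below draws torsion angles from a Gaussian kernel density estimate built over a handful of observed angles. It is not the GlycanTreeModeler implementation: the fixed bandwidth (the abstract describes adaptive KDEs), the example angles, and the names `wrap`, `kde_density` and `sample_from_kde` are all assumptions made for illustration.

```python
import math
import random


def wrap(angle):
    """Map an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def kde_density(x, observed, bandwidth):
    """Gaussian KDE at angle x, using the shortest angular difference.

    Assumption: a single fixed bandwidth, small compared with 360 degrees,
    so the density is only approximately normalised on the circle.
    """
    norm = 1.0 / (len(observed) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for obs in observed:
        d = wrap(x - obs)
        total += math.exp(-0.5 * (d / bandwidth) ** 2)
    return norm * total


def sample_from_kde(observed, bandwidth, rng):
    """Draw one angle: pick an observed angle, then add Gaussian noise."""
    centre = rng.choice(observed)
    return wrap(centre + rng.gauss(0.0, bandwidth))


# Hypothetical torsion angles (degrees) standing in for PDB-derived data.
observed_angles = [-60.0, -65.0, 60.0, 175.0]
rng = random.Random(0)
samples = [sample_from_kde(observed_angles, 10.0, rng) for _ in range(5)]
```

Sampling by "pick a data point, add kernel noise" is equivalent to drawing from the KDE itself, which is why no explicit density grid is needed for the sampling step.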

2014 ◽  
Vol 70 (a1) ◽  
pp. C136-C136 ◽  
Author(s):  
Cory Widdifield ◽  
Maria Baias ◽  
Jean-Nicolas Dumez ◽  
Per H. Svensson ◽  
Hugh Thompson ◽  
...  

State-of-the-art work in the field of NMR crystallography for molecular systems at natural abundance has recently focused on the accurate measurement of 1H chemical shift values. We will show how, when coupled with crystal structure prediction (CSP) methods, this protocol is well suited for solving the crystal structures of small- to medium-sized organic molecules, including cocaine and the de novo structure determination of AZD8329 [1,2]. As complementary 1D and 2D NMR experiments are needed for the 1H assignment process, other information, such as isotropic 13C chemical shift values (δiso), is also measured. Unfortunately, 13C chemical shifts are not generally useful for structure determination. Additional NMR parameters that are sensitive to structure would ensure that the structure determination procedure is robust, and would provide more accurate refinements when studying larger or more challenging systems. Here, we measure 13C chemical shift tensors for a variety of prototypical organic pharmaceuticals and use density functional theory computations under the gauge-including projector augmented-wave (GIPAW) formalism to probe whether these parameters may be discriminatory for unit cell determinations and structure determination (notably when added to the CSP + 1H chemical shifts protocol).


2021 ◽  
Vol 8 ◽  
Author(s):  
Charles Christoffer ◽  
Vijay Bharadwaj ◽  
Ryan Luu ◽  
Daisuke Kihara

Protein-protein docking is a useful tool for modeling the structures of protein complexes that have yet to be experimentally determined. Understanding the structures of protein complexes is a key component for formulating hypotheses in biophysics regarding the functional mechanisms of complexes. Protein-protein docking is an established technique for cases where the structures of the subunits have been determined. While the number of known structures deposited in the Protein Data Bank is increasing, there are still many cases where the structures of individual proteins that users want to dock are not determined yet. Here, we have integrated the AttentiveDist method for protein structure prediction into our LZerD webserver for protein-protein docking, which enables users to simply submit protein sequences and obtain full-complex atomic models, without having to supply any structure themselves. We have further extended the LZerD docking interface with a symmetrical homodimer mode. The LZerD server is available at https://lzerd.kiharalab.org/.


Author(s):  
Luciano A Abriata ◽  
Matteo Dal Peraro

Residue coevolution estimations coupled to machine learning methods are revolutionizing the ability of protein structure prediction approaches to model proteins that lack clear homologous templates in the Protein Data Bank (PDB). This was evident in the last round of the Critical Assessment of Structure Prediction (CASP), which presented several very good models for the hardest targets. Unfortunately, literature reporting on these advances often lacks digests tailored to lay end users; moreover, some of the top-ranking predictors do not provide webservers that can be used by nonexperts. How, then, can end users benefit from these advances and correctly interpret the predicted models? Here we review the web resources that biologists can use today to take advantage of these state-of-the-art methods in their research, including not only the best de novo modeling servers but also datasets of models precomputed by experts for structurally uncharacterized protein families. We highlight their features, advantages and pitfalls for predicting structures of proteins without clear templates. We present a broad range of applications, from driving forward biochemical investigations that lack experimental structures to actually assisting experimental structure determination in X-ray diffraction, cryo-EM and other forms of integrative modeling. We also discuss issues that must be considered by users yet still require further development, such as global and residue-wise model quality estimates and sources of residue coevolution other than monomeric tertiary structure.


Author(s):  
Ivan Anishchenko ◽  
Tamuka M. Chidyausiku ◽  
Sergey Ovchinnikov ◽  
Samuel J. Pellock ◽  
David Baker

There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting distance maps, which, as expected, are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all-alpha, all-beta-sheet, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network-hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
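The sampling loop described in this abstract can be sketched as a Metropolis search over sequences that maximises the KL divergence between a predicted distribution and a background. This is only a toy under stated assumptions: `toy_predict` is a hypothetical stand-in for the trRosetta network (it bins amino acid composition rather than predicting inter-residue distances), and the uniform background, temperature and step count are illustrative choices, not values from the paper.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 4  # illustrative; a real distance map has many bins per residue pair


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def toy_predict(seq):
    """Hypothetical stand-in for a structure-prediction network.

    Returns a single distribution over N_BINS derived from composition.
    """
    counts = [1.0] * N_BINS  # pseudocounts so that no bin is ever zero
    for aa in seq:
        counts[AMINO_ACIDS.index(aa) % N_BINS] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def hallucinate(length, steps, temperature, rng):
    """Metropolis sampling in sequence space, maximising KL to background."""
    background = [1.0 / N_BINS] * N_BINS  # assumed uniform background
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = kl_divergence(toy_predict(seq), background)
    for _ in range(steps):
        trial = list(seq)
        trial[rng.randrange(length)] = rng.choice(AMINO_ACIDS)  # point mutation
        trial_score = kl_divergence(toy_predict(trial), background)
        delta = trial_score - score
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if delta >= 0.0 or rng.random() < math.exp(delta / temperature):
            seq, score = trial, trial_score
    return "".join(seq), score


designed_seq, final_score = hallucinate(30, 500, 0.01, random.Random(1))
```

Starting the search from different random seeds gives different end points, which mirrors the abstract's observation that different starting sequences led to diverse designs.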


2021 ◽  
Author(s):  
Ian Kotthoff ◽  
Petras Kundrotas ◽  
Ilya Vakser

Membrane proteins play an essential role in cellular mechanisms. Despite that, and despite major progress in experimental structure determination, they are still significantly underrepresented in the Protein Data Bank. Thus, computational approaches to protein structure determination, which are important in general, are especially valuable in the case of membrane proteins and protein-protein assemblies. For a number of reasons, not the least of which is the much greater availability of structural data, the main focus of structure prediction techniques has been on soluble proteins. Structure prediction of protein-protein complexes is a well-developed field of study. However, because of the differences in the physicochemical environment in membranes and the spatial constraints that membranes impose, generic protein-protein docking approaches are not optimal for membrane proteins. Thus, specialized computational methods for docking of membrane proteins must be developed. Development and benchmarking of such methods require high-quality datasets of membrane protein-protein complexes. In this study we present a new dataset of 456 non-redundant alpha-helical binary complexes. The set is significantly larger and more representative than previously developed ones. In the future, this set will become the basis for the development of docking and scoring benchmarks, similar to the ones developed for soluble proteins in the DOCKGROUND resource (http://dockground.compbio.ku.edu).


Proteins are essential and present in all life forms, and determining their structures is cumbersome, laborious and time-consuming. Hence, for three to four decades, researchers have been using computational techniques, such as template-based and template-free methods, to predict protein structure from sequence. This research focuses on developing a conceptual basis for establishing an invariant fragment library that can be used for protein structure prediction. Based on the 20 amino acids, fragments can be classified by length, from 3 to 41 residues. Further, they can be classified by the number of identical amino acids present in the fragment. This covers the number of fragments that can exist in theory and in no way represents the fragments that actually exist in nature. Invariant fragments are those whose three-dimensional structure is rigid and does not change. A formula was derived to determine all possible permutations that can exist for lengths 3 to 41 based on the 20 amino acids. One hundred proteins were downloaded from the Protein Data Bank and broken into fragments of length 3 to 41 using Asynchronous Distributed Processing, resulting in a total of 6,102,102 fragments. Fragments identical in sequence were then superimposed and Root Mean Square Deviation (RMSD) values were obtained, resulting in roughly 3.2% of the original fragments. t-scores and z-scores were obtained, from which skewness, kurtosis and excess kurtosis were determined. For invariance, the skewness cutoff was set at ±0.1 and, using the excess kurtosis, fragments whose distributions were either leptokurtic or platykurtic and lay within ±1 standard deviation of the mean value were considered invariant, i.e., there were no outliers in the distribution and most of the t-score or z-score values were centered around the average value. Using these cutoff values, fragments were classified and deposited into an invariant fragment library.
Roughly 381,799 invariant fragments were obtained, which is roughly 6.3% of the initial number of fragments. This is far fewer than the number of fragments that one has to use in either homology or de novo modelling, thereby reducing the design space. Further work is underway to set up the entire invariant fragment library, which can then be used to predict protein structure by a template-based approach.
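The fragment enumeration and the distribution statistics named in this abstract can be sketched as follows. The abstract does not give its permutation formula, so `theoretical_count` simply assumes 20^L distinct sequences of length L. The moment-based skewness and excess kurtosis, and all function names, are likewise illustrative assumptions rather than the authors' code.

```python
def fragments(sequence, min_len=3, max_len=41):
    """Yield every contiguous fragment with length in [min_len, max_len]."""
    longest = min(max_len, len(sequence))
    for length in range(min_len, longest + 1):
        for start in range(len(sequence) - length + 1):
            yield sequence[start:start + length]


def theoretical_count(length, alphabet_size=20):
    """Distinct sequences of a given length (assumed formula: 20 ** length)."""
    return alphabet_size ** length


def skewness_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        raise ValueError("all values are identical; moments are undefined")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def passes_skew_cutoff(values, cutoff=0.1):
    """True if |skewness| is within the cutoff quoted in the abstract."""
    skew, _ = skewness_and_excess_kurtosis(values)
    return abs(skew) <= cutoff


# Hypothetical short sequence, with narrow length limits for readability.
example_fragments = list(fragments("ACDEFG", min_len=3, max_len=4))
```

A negative excess kurtosis indicates a platykurtic distribution and a positive one a leptokurtic distribution, which are the two cases the abstract refers to.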


Author(s):  
S. Likharev ◽  
A. Kramarenko ◽  
V. Vybornov

At present, interest is growing considerably in the theoretical and experimental analysis of back-scattered electron (BSE) energy spectra. It was discovered that special angle and energy filtration of the BSE flow could be used for increasing the spatial resolution of the BSE mode, for sample topography investigations and for layer-by-layer visualization of depth structure. In the last case it was shown theoretically that, in order to obtain suitable depth resolution, it is necessary to select a part of the BSE flow with velocity directions close to the inverse of the primary beam and energies within a small window in the high-energy part of the whole spectrum. A wide range of such devices has been developed earlier, but all of them have a considerable drawback: they can hardly be used with a standard SEM, because they require substantial SEM modifications such as installation of large accessories inside or outside the SEM chamber, mounting of specialized detector systems, input wires for a high-voltage supply, screening of the primary beam from the additional electromagnetic field, etc. In this report we present a new scheme of a compact BSE energy analyzer that is free of the imperfections mentioned above.


Author(s):  
А.Р. Зарипова ◽  
Л.Р. Нургалиева ◽  
А.В. Тюрин ◽  
И.Р. Минниахметов ◽  
Р.И. Хусаинова

A study was made of the interferon-induced transmembrane protein 5 gene (IFITM5) in 99 patients with osteogenesis imperfecta (OI) from 86 unrelated families, together with a search for pathogenic gene variants involved in the formation of the disease phenotype. OI is a clinically and genetically heterogeneous hereditary disease of the connective tissue, the main clinical manifestation of which is multiple fractures, starting from the neonatal period of life and often leading to disability from childhood. The main clinical signs of OI include blue sclerae, hearing loss, dentin anomalies, increased fragility of bones, and impaired growth and posture, with the development of characteristic disabling bone deformities and associated problems, including respiratory, neurological, cardiac, and renal disorders. OI occurs in both men and women. The degree of genetic heterogeneity of the disease has not yet been determined. To date, 20 genes are known to be involved in the pathogenesis of OI, and researchers from different countries continue to search for new genes.
In the last decade, it has become known that OI is caused by autosomal recessive, autosomal dominant and X-linked mutations in a wide range of genes encoding proteins that are involved in the synthesis of type I collagen, its processing, secretion and post-translational modification, as well as proteins that regulate the differentiation and activity of bone-forming cells. Mutations in the IFITM5 gene, also called BRIL (bone-restricted IFITM-like protein), which is involved in the formation of osteoblasts, lead to the development of OI type V. Up to 5% of patients have OI type V, which is characterized by the formation of a hyperplastic callus after fractures, calcification of the interosseous membrane of the forearm, and a mesh-like lamellar pattern observed on histological examination of the bone. In 2012, a heterozygous mutation (c.-14C>T) in the 5'-untranslated region (UTR) of the IFITM5 gene was identified as the main cause of OI type V. In the present work, the IFITM5 gene was analyzed and the c.-14C>T mutation, arising de novo, was identified in one patient with OI who was subsequently diagnosed with type V of the disease. Three known polymorphic variants were also identified: rs57285449, c.80G>C (p.Gly27Ala); rs2293745, c.187-45C>T; and rs755971385, c.279G>A (p.Thr93=); as well as one previously undescribed variant, c.128G>A (p.Ser43Asn), AGC>AAC (S/N). None of these variants is pathogenic. The article focuses on the features of the clinical manifestations of OI type V, and testing for the c.-14C>T mutation in the IFITM5 gene is recommended if this form of the disease is suspected.


2018 ◽  
Vol 16 (05) ◽  
pp. 362-368 ◽  
Author(s):  
Federica Sullo ◽  
Agata Polizzi ◽  
Stefano Catanzaro ◽  
Selene Mantegna ◽  
Francesco Lacarrubba ◽  
...  

Cerebellotrigeminal dermal (CTD) dysplasia is a rare neurocutaneous disorder characterized by a triad of symptoms: bilateral parieto-occipital alopecia, facial anesthesia in the trigeminal area, and rhombencephalosynapsis (RES), confirmed by cranial magnetic resonance imaging. CTD dysplasia is also known as Gómez-López-Hernández syndrome. So far, only 35 cases have been described, with varying symptomatology. The etiology remains unknown. Either spontaneous dominant mutations or de novo chromosomal rearrangements have been proposed as possible explanations. In addition to its clinical triad of RES, parietal alopecia, and trigeminal anesthesia, CTD dysplasia is associated with a wide range of phenotypic and neurodevelopmental abnormalities. Treatment is symptomatic and includes physical rehabilitation, special education, dental care, and ocular protection against self-induced corneal trauma that causes ulcers and, later, corneal opacification. The prognosis correlates with mental development, motor handicap, corneal–facial anesthesia, and visual problems. Follow-up of a large number of patients with CTD dysplasia has never been reported, and experience to date is limited to a few cases. A high degree of suspicion in a child presenting with characteristic alopecia and RES is of great importance in diagnosing this syndrome.

