Characterizing disease-associated human proteins without available protein structures or homologues

2021 ◽  
Author(s):  
Neeladri Sen ◽  
Ivan Anishchenko ◽  
Nicola Bordin ◽  
Ian Sillitoe ◽  
Sameer Velankar ◽  
...  

Mutations in human proteins can lead to disease. The structures of these proteins can help us understand the mechanisms of such diseases and develop therapeutics against them. With improved deep learning techniques such as RoseTTAFold and AlphaFold, we can predict the structure of these proteins even in the absence of structural homologues. We modeled and extracted the domains from 553 disease-associated human proteins. Model quality was higher, and the RMSD between AlphaFold and RoseTTAFold models lower, for domains that could be assigned to CATH families than for those that could be assigned only to Pfam families of unknown structure or to neither. We predicted ligand-binding sites, protein-protein interfaces, conserved residues and destabilising effects caused by residue mutations in these predicted structures. We then explored whether the disease-associated mutations were in the proximity of these predicted functional sites or whether they destabilized the protein structure based on ddG calculations. We could explain 80% of these disease-associated mutations based on proximity to functional sites or structural destabilization. Using models from the two state-of-the-art techniques provides greater confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models that could not be explained based solely on AlphaFold models.
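The final step described above, checking whether a mutation lies near a predicted functional site or is predicted to destabilize the structure, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `explain_mutation`, the 8 Å distance cutoff, the 1.0 kcal/mol ddG cutoff and the convention that a positive ddG means destabilizing are all assumptions made for the example.

```python
import math


def explain_mutation(residue_xyz, site_xyz_list, ddg,
                     dist_cutoff=8.0, ddg_cutoff=1.0):
    """Return the set of reasons that could explain a mutation.

    residue_xyz   -- (x, y, z) of the mutated residue, e.g. its C-alpha atom
    site_xyz_list -- coordinates of predicted functional-site residues
    ddg           -- predicted stability change (assumed: positive = destabilizing)
    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    reasons = set()
    # Proximity test: is any functional-site residue within the cutoff?
    if any(math.dist(residue_xyz, s) <= dist_cutoff for s in site_xyz_list):
        reasons.add("near_functional_site")
    # Stability test: does the predicted ddG reach the destabilizing threshold?
    if ddg >= ddg_cutoff:
        reasons.add("destabilizing")
    return reasons


# Hypothetical coordinates in angstroms
site = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
print(explain_mutation((5.0, 0.0, 0.0), site, ddg=0.2))
print(explain_mutation((30.0, 0.0, 0.0), site, ddg=2.5))
print(explain_mutation((30.0, 0.0, 0.0), site, ddg=0.1))
```

An empty set corresponds to a mutation that neither criterion explains. Running the same check on models from two predictors and taking the union of explained mutations would mirror the comparison reported in the abstract.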

Author(s):  
Sayoni Das ◽  
Harry M Scholes ◽  
Neeladri Sen ◽  
Christine Orengo

Abstract. Motivation: Identification of functional sites in proteins is essential for functional characterization, variant interpretation and drug design. Several methods are available for predicting either a generic functional site, or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein–protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). Results: FunSite’s prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed other publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found that FunSite’s performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyze which structural and evolutionary features are most predictive for functional sites. Availability and implementation: https://github.com/UCL/cath-funsite-predictor. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
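The abstract does not say how residue conservation is scored. As a generic illustration of the idea, the sketch below scores each column of a multiple sequence alignment with normalized Shannon entropy. The function names, the gap handling and the normalization by log2(20) are assumptions for this example and are not taken from FunSite.

```python
import math
from collections import Counter


def column_conservation(column):
    """Score one alignment column from 0 (maximally variable) to 1 (invariant).

    Gap characters ('-') are ignored. Dividing by log2(20), the entropy of a
    uniform distribution over 20 amino acids, is an illustrative choice.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


def conservation_profile(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*alignment)]


# Hypothetical three-sequence alignment
profile = conservation_profile(["ACD", "ACE", "AGF"])
print(profile)
```

Columns with high scores would be the candidates to compare against annotated functional sites when testing for enrichment.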


2011 ◽  
Vol 28 (2) ◽  
pp. 286-287 ◽  
Author(s):  
Chi-Ho Ngan ◽  
David R. Hall ◽  
Brandon Zerbe ◽  
Laurie E. Grove ◽  
Dima Kozakov ◽  
...  

2020 ◽  
Vol 76 (1) ◽  
pp. 51-62 ◽  
Author(s):  
Nigel W. Moriarty ◽  
Pawel A. Janowski ◽  
Jason M. Swails ◽  
Hai Nguyen ◽  
Jane S. Richardson ◽  
...  

The refinement of biomolecular crystallographic models relies on geometric restraints to help address the paucity of experimental data typical in these experiments. Limitations in these restraints can degrade the quality of the resulting atomic models. Here, an integration of the full all-atom Amber molecular-dynamics force field into Phenix crystallographic refinement is presented, which enables more complete modeling of biomolecular chemistry. The advantages of the force field include a carefully derived set of torsion-angle potentials, an extensive and flexible set of atom types, Lennard–Jones treatment of nonbonded interactions and a full treatment of crystalline electrostatics. The new combined method was tested against conventional geometry restraints for over 22 000 protein structures. Structures refined with the new method show substantially improved model quality. On average, Ramachandran and rotamer scores are somewhat better, clashscores and MolProbity scores are significantly improved, and the modeling of electrostatics leads to structures that exhibit more, and more correct, hydrogen bonds than those refined using traditional geometry restraints. In general it is found that model improvements are greatest at lower resolutions, prompting plans to add the Amber target function to real-space refinement for use in electron cryo-microscopy. This work opens the door to the future development of more advanced applications such as Amber-based ensemble refinement, quantum-mechanical representation of active sites and improved geometric restraints for simulated annealing.
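For readers unfamiliar with the Lennard–Jones term mentioned above, the textbook 12-6 form is E(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates it for a single atom pair. It illustrates the functional form only and does not reproduce Amber's parameter tables or the Phenix integration; the parameter values are hypothetical.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for two atoms separated by distance r (> 0).

    epsilon -- well depth (energy units)
    sigma   -- distance at which the energy crosses zero
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Hypothetical parameters, for illustration only
epsilon, sigma = 0.24, 3.4
r_min = 2 ** (1 / 6) * sigma  # separation at the energy minimum

print(lennard_jones(3.0, epsilon, sigma))    # shorter than sigma: repulsive
print(lennard_jones(r_min, epsilon, sigma))  # bottom of the well
print(lennard_jones(6.0, epsilon, sigma))    # longer range: weakly attractive
```

The steep repulsive wall at short distances is what penalizes atomic overlaps, which is consistent with the improved clashscores reported in the abstract.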


2021 ◽  
Vol 8 ◽  
Author(s):  
Sundeep Chaitanya Vedithi ◽  
Sony Malhotra ◽  
Marta Acebrón-García-de-Eulate ◽  
Modestas Matusevicius ◽  
Pedro Henrique Monteiro Torres ◽  
...  

Leprosy, caused by Mycobacterium leprae (M. leprae), is treated with a multidrug regimen comprising Dapsone, Rifampicin, and Clofazimine. These drugs exhibit bacteriostatic, bactericidal and anti-inflammatory properties, respectively, and control the dissemination of infection in the host. However, the current treatment is not cost-effective, does not favor patient compliance due to its long duration (12 months) and does not protect against the incumbent nerve damage, which is a severe leprosy complication. The chronic infectious peripheral neuropathy associated with the disease is primarily due to the bacterial components infiltrating the Schwann cells that protect neuronal axons, thereby inducing a demyelinating phenotype. There is a need to discover novel or repurposed drugs that can act as short-duration, effective alternatives to the existing treatment regimens, preventing nerve damage and the consequent disability associated with the disease. Mycobacterium leprae is an obligate pathogen, which makes the bacillus experimentally intractable to cultivate in vitro and limits drug discovery efforts to repositioning screens in mouse footpad models. The dearth of knowledge related to the structural proteomics of M. leprae, coupled with emerging antimicrobial resistance to all three drugs in the multidrug therapy, creates a need for concerted novel drug discovery efforts. A comprehensive understanding of the proteomic landscape of M. leprae is indispensable to unravel druggable targets that are essential for bacterial survival and for its predilection for human neuronal Schwann cells. Of the 1,614 protein-coding genes in the genome of M. leprae, only 17 protein structures are available in the Protein Data Bank. In this review, we discuss efforts made to model the proteome of M. leprae using a suite of software for protein modeling that has been developed in the Blundell laboratory. Precise template selection by employing sequence-structure homology recognition software, multi-template modeling of the monomeric models and accurate quality assessment are the hallmarks of the modeling process. Tools that map interfaces and enable building of homo-oligomers are discussed in the context of interface stability. Other software is described to determine the druggable proteome by using information related to the chokepoint analysis of the metabolic pathways, gene essentiality, homology to human proteins, functional sites, druggable pockets and fragment hotspot maps.


2015 ◽  
Vol 112 (34) ◽  
pp. 10714-10719 ◽  
Author(s):  
Yun Mou ◽  
Po-Ssu Huang ◽  
Fang-Ciao Hsu ◽  
Shing-Jong Huang ◽  
Stephen L. Mayo

Homodimers are the most common type of protein assembly in nature and have distinct features compared with heterodimers and higher order oligomers. Understanding homodimer interactions at the atomic level is critical both for elucidating their biological mechanisms of action and for accurate modeling of complexes of unknown structure. Computation-based design of novel protein–protein interfaces can serve as a bottom-up method to further our understanding of protein interactions. Previous studies have demonstrated that the de novo design of homodimers can be achieved to atomic-level accuracy by β-strand assembly or through metal-mediated interactions. Here, we report the design and experimental characterization of an α-helix–mediated homodimer with C2 symmetry based on a monomeric Drosophila engrailed homeodomain scaffold. A solution NMR structure shows that the homodimer exhibits parallel helical packing similar to the design model. Because the mutations leading to dimer formation resulted in poor thermostability of the system, design success was facilitated by the introduction of independent thermostabilizing mutations into the scaffold. This two-step design approach, function and stabilization, is likely to be generally applicable, especially if the desired scaffold is of low thermostability.


2020 ◽  
Vol 36 (10) ◽  
pp. 3077-3083
Author(s):  
Wentao Shi ◽  
Jeffrey M Lemoine ◽  
Abd-El-Monsif A Shawky ◽  
Manali Singha ◽  
Limeng Pu ◽  
...  

Abstract. Motivation: Fast and accurate classification of ligand-binding sites in proteins with respect to the class of binding molecules is invaluable not only to the automatic functional annotation of large datasets of protein structures but also to projects in protein evolution, protein engineering and drug development. Deep learning techniques, which have already been successfully applied to address challenging problems across various fields, are inherently suitable to classify ligand-binding pockets. Our goal is to demonstrate that off-the-shelf deep learning models can be employed with minimum development effort to recognize nucleotide- and heme-binding sites with accuracy comparable to that of highly specialized, voxel-based methods. Results: We developed BionoiNet, a new deep learning-based framework implementing a popular ResNet model for image classification. BionoiNet first transforms the molecular structures of ligand-binding sites to 2D Voronoi diagrams, which are then used as the input to a pretrained convolutional neural network classifier. The ResNet model generalizes well to unseen data, achieving an accuracy of 85.6% for nucleotide- and 91.3% for heme-binding pockets. BionoiNet also computes significance scores of pocket atoms, called BionoiScores, to provide meaningful insights into their interactions with ligand molecules. BionoiNet is a lightweight alternative to computationally expensive 3D architectures. Availability and implementation: BionoiNet is implemented in Python with the source code freely available at: https://github.com/CSBG-LSU/BionoiNet. Supplementary information: Supplementary data are available at Bioinformatics online.
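The structure-to-image step described above can be illustrated with a brute-force rasterized Voronoi diagram, in which every pixel is labeled with the index of its nearest projected atom. This is only a sketch of the general idea: the function names are hypothetical, projecting by dropping the z coordinate is an assumption, and BionoiNet's actual projection, cell coloring and ResNet classifier are not reproduced here.

```python
def project_to_plane(atoms_xyz):
    """Project 3D atom coordinates to 2D by dropping z (an assumed projection)."""
    return [(x, y) for x, y, _z in atoms_xyz]


def rasterize_voronoi(points, width, height):
    """Label each pixel with the index of the nearest 2D point.

    Returns a list of rows (indexed grid[y][x]). The pixels sharing a label
    form the Voronoi cell of the corresponding atom.
    """
    grid = []
    for py in range(height):
        row = []
        for px in range(width):
            nearest = min(
                range(len(points)),
                key=lambda i: (points[i][0] - px) ** 2 + (points[i][1] - py) ** 2,
            )
            row.append(nearest)
        grid.append(row)
    return grid


# Hypothetical pocket atoms, already scaled to pixel units
atoms = [(1.0, 1.0, 0.5), (6.0, 1.0, -2.0), (3.0, 6.0, 1.0)]
grid = rasterize_voronoi(project_to_plane(atoms), width=8, height=8)
for row in grid:
    print("".join(str(i) for i in row))
```

In a full pipeline, each cell would be colored by a property of its atom and the resulting image passed to an image classifier. The brute-force search is quadratic in practice and is chosen here for clarity rather than speed.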

