AlphaFill: enriching the AlphaFold models with ligands and co-factors

2021 ◽  
Author(s):  
Maarten L Hekkelman ◽  
Ida de Vries ◽  
Robbie P Joosten ◽  
Anastassis Perrakis

Artificial intelligence (AI) methods for constructing structural models of proteins on the basis of their sequence are having a transformative effect in biomolecular sciences. The AlphaFold protein structure database makes available hundreds of thousands of protein structures. However, all these structures lack cofactors essential for their structural integrity and molecular function (e.g. hemoglobin lacks a bound heme), key ions essential for structural integrity (e.g. zinc-finger motifs) or catalysis (e.g. Ca2+ or Zn2+ in metalloproteases), and ligands that are important for biological function (e.g. kinase structures lack ADP or ATP). Here, we present AlphaFill, an algorithm based on sequence and structure similarity, to "transplant" such "missing" small molecules and ions from experimentally determined structures to predicted protein models. These publicly available structural annotations are mapped to predicted protein models, to help scientists interpret biological function and design experiments.

Molecules ◽  
2021 ◽  
Vol 26 (14) ◽  
pp. 4250
Author(s):  
Xiao-Jing Pang ◽  
Xiu-Juan Liu ◽  
Yuan Liu ◽  
Wen-Bo Liu ◽  
Yin-Ru Li ◽  
...  

FAK is a nonreceptor intracellular tyrosine kinase with an important biological function. Many studies have found that FAK is overexpressed in many human cancer cell lines, where it promotes tumor cell growth by controlling cell adhesion, migration, proliferation, and survival. Therefore, targeting FAK with small molecules is considered a promising cancer therapy. Many FAK inhibitors have been reported as anticancer agents with various mechanisms. Currently, six FAK inhibitors, including GSK-2256098 (Phase I), VS-6063 (Phase II), CEP-37440 (Phase I), VS-6062 (Phase I), VS-4718 (Phase I), and BI-853520 (Phase I), are undergoing clinical trials. In addition, many novel FAK inhibitors with anticancer activity have been reported by different research groups, and FAK degraders have been successfully developed through “proteolysis targeting chimera” (PROTAC) technology, opening up a new way for FAK-targeted therapy. In this paper, the structure and biological function of FAK are reviewed, and we summarize the design, chemical types, and activity of FAK inhibitors according to the development of FAK drugs, providing a reference for the discovery of new anticancer agents.


2018 ◽  
Vol 62 (4) ◽  
pp. 575-582
Author(s):  
Francesco Raimondi ◽  
Robert B. Russell

Genetic variants are currently a major component of system-wide investigations into biological function or disease. Approaches to select variants (often out of thousands of candidates) that are responsible for a particular phenomenon have many clinical applications and can help illuminate differences between individuals. Selecting meaningful variants is greatly aided by integration with information about molecular mechanism, whether known from protein structures or interactions or biological pathways. In this review we discuss the nature of genetic variants, and recent studies highlighting what is currently known about the relationship between genetic variation, biomolecular function, and disease.


Medicines ◽  
2019 ◽  
Vol 6 (3) ◽  
pp. 80 ◽  
Author(s):  
Giancarlo Ghiselli

The polyanionic nature and the ability to interact with proteins with different affinities are properties of sulfated glycosaminoglycans (GAGs) that determine their biological function. In designing drugs affecting the interaction of proteins with GAGs, the challenge has been to generate agents with high binding specificity. The example to emulate has been a heparin-derived pentasaccharide that binds to antithrombin-III with high affinity. However, the portability of this model to other biological situations is questioned on several accounts. Because of their structural flexibility, oligosaccharides with different sulfation and uronic acid conformation can display the same binding proficiency to different proteins and produce comparable biological effects. This circumstance represents a formidable obstacle to the design of drugs based on the heparin scaffold. The conceptual framework discussed in this article is that, through a direct intervention on the heparin-binding functionality of proteins, it is possible to achieve a high degree of action specificity. This objective is currently pursued through two strategies. The first makes use of small molecules, for which we provide examples from past and present literature concerning angiogenic factors and enzymes. The second approach entails the mutagenesis of the GAG-binding site of proteins as a means to generate a new class of biologics of therapeutic interest.


2016 ◽  
Vol 62 ◽  
pp. 541-570 ◽  
Author(s):  
H. A. O. Hill ◽  
A. J. Thomson

Robert J. P. Williams was a pioneer in advancing our understanding of the roles of chemical elements, especially the metals, in biology and in biological evolution. During the first half of his career of more than 60 years at Oxford University he studied the thermodynamic stabilities of transition-metal complexes with organic ligands, their redox properties, magnetism and colour, to understand their biological function. In parallel he collaborated with biologists and biophysicists, for example with Bert Vallee, studying zinc in proteins. Williams was the first to describe how proton gradients could be used to drive the formation of the universal biological fuel, ATP (adenosine triphosphate), a fundamental step in biological energetics. From the late 1960s he studied many proteins that use metal ions for catalysis, for electron transfer and cellular regulation. A leading figure in the establishment of the Oxford Enzyme Group, Williams developed high-field nuclear magnetic resonance (NMR) to study the mobility and dynamics of many protein structures, leading to a deeper understanding of protein function. He held the Royal Society Napier Research Professorship from 1974 until his retirement in 1991. Subsequently he published several books setting out his understanding of the roles of metal ions in biology, and their wider significance in evolution. Bob Williams's deep insights across many disciplines made him a charismatic teacher. His lateral style of thinking never failed to inspire. His legacy lies in the successful careers of his many students and collaborators worldwide and the vigour of the new discipline of bioinorganic chemistry that he helped to establish.


2012 ◽  
Vol 7 (1) ◽  
Author(s):  
Federico Fogolari ◽  
Alessandra Corazza ◽  
Paolo Viglino ◽  
Gennaro Esposito

2018 ◽  
Author(s):  
Rhiju Das

Summary: Biomolecules shift their structures as a function of temperature and concentrations of protons, ions, small molecules, proteins, and nucleic acids. These transitions impact or underlie biological function and are being monitored at increasingly high throughput. For example, folding transitions for large collections of RNAs can now be monitored at single-residue resolution by chemical mapping techniques. LIkelihood-based Fits of Folding Transitions (LIFFT) quantifies these data through well-defined thermodynamic models. LIFFT implements a Bayesian framework that takes into account data at all measured residues and enables visual assessment of modeling uncertainties that can be overlooked in least-squares fits. The framework is appropriate for multimodal techniques ranging from chemical mapping to multi-wavelength spectroscopy. Availability: Freely available MATLAB package at https://ribokit.stanford.edu/LIFFT/ Supplementary information: Supplementary data are available at Bioinformatics online.
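
As a rough illustration of what a likelihood-based fit over all residues can look like, the sketch below evaluates a posterior over a single dissociation constant on a grid. It is not the LIFFT MATLAB package: the two-state binding model, the flat prior, the Gaussian error with a fixed `sigma`, and the function names `fraction_bound` and `posterior_over_kd` are all assumptions made for this example.

```python
import math

def fraction_bound(conc, k_d):
    """Two-state binding isotherm (assumed model): fraction in the bound state."""
    return conc / (conc + k_d)

def posterior_over_kd(concs, signals, kd_grid, sigma=0.05):
    """Posterior over K_d on a grid, pooling every residue's titration curve.

    signals[r][i] is the normalized (0..1) signal of residue r at concs[i].
    Assumes a flat prior over the grid and Gaussian errors with width sigma.
    """
    log_like = []
    for k_d in kd_grid:
        total = 0.0
        for residue_signal in signals:
            for conc, observed in zip(concs, residue_signal):
                residual = observed - fraction_bound(conc, k_d)
                total += -0.5 * (residual / sigma) ** 2
        log_like.append(total)
    peak = max(log_like)  # subtract the peak before exponentiating for stability
    weights = [math.exp(value - peak) for value in log_like]
    norm = sum(weights)
    return [w / norm for w in weights]

# Synthetic, noise-free data for two residues generated with K_d = 10.
concs = [1.0, 3.0, 10.0, 30.0, 100.0]
signals = [[fraction_bound(c, 10.0) for c in concs] for _ in range(2)]
kd_grid = [1.0, 3.0, 10.0, 30.0, 100.0]
posterior = posterior_over_kd(concs, signals, kd_grid)
best_kd = kd_grid[posterior.index(max(posterior))]
```

Keeping the whole posterior rather than only `best_kd` is what makes the spread of plausible values visible, which a single least-squares estimate would hide.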


2020 ◽  
Author(s):  
Janani Durairaj ◽  
Mehmet Akdel ◽  
Dick de Ridder ◽  
Aalt DJ van Dijk

Motivation: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Due to the large variation in protein sizes, folds, and topologies, an attractive approach is to embed protein structures into fixed-length vectors, which can be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment-based, which is both time-consuming and ineffective for distantly related proteins. On the other hand, library- or model-based approaches depend on a small library of fragments or require the use of a trained model, both of which may not generalize well. Results: We present Geometricus, a novel and universally applicable approach to embedding proteins in a fixed-dimensional space. The approach is fast, accurate, and interpretable. Geometricus uses a set of 3D moment invariants to discretize fragments of protein structures into shape-mers, which are then counted to describe the full structure as a vector of counts. We demonstrate the applicability of this approach in various tasks, including fast structure similarity search, unsupervised clustering, and structure classification across proteins from different superfamilies as well as within the same family. Availability: Python code available at https://git.wur.nl/durai001/[email protected], [email protected]
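
The counting step described in this abstract can be sketched in a few lines. This is a simplified illustration, not the Geometricus API: it assumes the moment invariants of each fragment have already been computed, and the `resolution` parameter, the floor-based binning, and the names `to_shapemer` and `count_vector` are hypothetical choices for the example.

```python
from collections import Counter
from math import floor

def to_shapemer(invariants, resolution=1.0):
    """Discretize one fragment's invariants into a hashable shape-mer."""
    return tuple(floor(value * resolution) for value in invariants)

def count_vector(fragment_invariants, vocabulary, resolution=1.0):
    """Count a protein's shape-mers and lay the counts out in vocabulary order."""
    counts = Counter(to_shapemer(f, resolution) for f in fragment_invariants)
    return [counts[shapemer] for shapemer in vocabulary]

# Toy invariants (two per fragment) for two proteins with different fragment counts.
protein_a = [(0.2, 1.7), (0.4, 1.1), (2.3, 0.9)]
protein_b = [(2.9, 0.1), (2.5, 0.5)]

# A shared vocabulary gives both proteins vectors of the same length.
vocabulary = sorted({to_shapemer(f) for p in (protein_a, protein_b) for f in p})
vec_a = count_vector(protein_a, vocabulary)
vec_b = count_vector(protein_b, vocabulary)
```

Because both vectors are indexed by the same vocabulary, proteins of different sizes can be compared directly or passed to a standard machine learning model.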


2022 ◽  
Author(s):  
Adam Zemla ◽  
Jonathan E. Allen ◽  
Dan Kirshner ◽  
Felice C. Lightstone

We present a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions (spheres) adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library. Currently, the PDBspheres library contains more than 2 million spheres, organized to facilitate searches by sequence and/or structure similarity of protein-ligand binding sites or interfaces between interacting molecules. PDBspheres uses the LGA structure alignment algorithm as the main engine for detecting structure similarities between the protein of interest and library spheres. An all-atom structure similarity metric ensures that sidechain placement is taken into account in the PDBspheres primary assessment of confidence in structural matches. In this paper, we (1) describe the PDBspheres method, (2) demonstrate how PDBspheres can be used to detect and characterize binding sites in protein structures, (3) compare the use of PDBspheres for binding site prediction with seven other binding site prediction methods using a curated dataset of 2,528 ligand-bound and ligand-free crystal structures, and (4) use PDBspheres to cluster pockets and assess structural similarities among protein binding sites of the 4,876 structures in the refined set of the PDBbind 2019 dataset. The PDBspheres library is made publicly available for download at https://proteinmodel.org/AS2TS/PDBspheres


2021 ◽  
Author(s):  
Chunxiang Peng ◽  
Xiaogen Zhou ◽  
Yuhao Xia ◽  
Yang Zhang ◽  
Guijun Zhang

With the development of protein structure prediction methods and experimental structure determination techniques, the structures of single-domain proteins have become relatively easy to model or solve experimentally. However, more than 80% of eukaryotic proteins and 67% of prokaryotic proteins contain multiple domains. Constructing a unified multi-domain protein structure database will promote the research of multi-domain proteins, especially the modeling of multi-domain protein structures. In this work, we develop a unified multi-domain protein structure database (MPDB). Based on MPDB, we also develop a server with two functional modules: (1) the culling module, which filters the whole MPDB according to input criteria; (2) the detection module, which identifies structural analogues of the full chain according to the structural similarity between input domain models and the proteins in MPDB. The detection module can discover potential analogue structures, which will contribute to high-quality multi-domain protein structure modeling.


2018 ◽  
Vol 47 (2) ◽  
pp. 582-593 ◽  
Author(s):  
Shilpa Nadimpalli Kobren ◽  
Mona Singh

Domains are fundamental subunits of proteins, and while they play major roles in facilitating protein–DNA, protein–RNA and other protein–ligand interactions, a systematic assessment of their various interaction modes is still lacking. A comprehensive resource identifying positions within domains that tend to interact with nucleic acids, small molecules and other ligands would expand our knowledge of domain functionality as well as aid in detecting ligand-binding sites within structurally uncharacterized proteins. Here, we introduce an approach to identify per-domain-position interaction ‘frequencies’ by aggregating protein co-complex structures by domain and ascertaining how often residues mapping to each domain position interact with ligands. We perform this domain-based analysis on ∼91000 co-complex structures, and infer positions involved in binding DNA, RNA, peptides, ions or small molecules across 4128 domains, which we refer to collectively as the InteracDome. Cross-validation testing reveals that ligand-binding positions for 2152 domains are highly consistent and can be used to identify residues facilitating interactions in ∼63–69% of human genes. Our resource of domain-inferred ligand-binding sites should be a great aid in understanding disease etiology: whereas these sites are enriched in Mendelian-associated and cancer somatic mutations, they are depleted in polymorphisms observed across healthy populations. The InteracDome is available at http://interacdome.princeton.edu.
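
The aggregation idea in this abstract, counting how often each domain position contacts a ligand across many structures, can be illustrated with a minimal sketch. It is a deliberately simplified assumption, not the InteracDome pipeline: every domain instance gets equal weight, contact is a plain yes/no per position, and the function name `position_frequencies` and the input layout are hypothetical.

```python
from collections import defaultdict

def position_frequencies(instances):
    """Fraction of domain instances in which each position contacts a ligand.

    instances: iterable of (covered_positions, contacting_positions) pairs,
    one pair per domain instance found in a co-complex structure.
    """
    seen = defaultdict(int)  # instances in which the position is present
    hits = defaultdict(int)  # instances in which it contacts the ligand
    for covered, contacting in instances:
        for pos in covered:
            seen[pos] += 1
            if pos in contacting:
                hits[pos] += 1
    return {pos: hits[pos] / seen[pos] for pos in sorted(seen)}

# Four toy instances of one domain; the third lacks position 1 entirely.
instances = [
    ({1, 2, 3}, {2}),
    ({1, 2, 3}, {2, 3}),
    ({2, 3}, set()),
    ({1, 2, 3}, {2}),
]
freqs = position_frequencies(instances)
```

Dividing by the number of instances that actually cover a position, rather than by the total number of instances, keeps partially aligned domains from deflating the frequencies.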

