structural coverage
Recently Published Documents


Total documents: 65 (five years: 16)
H-index: 11 (five years: 3)

2022 ◽  
Vol 79 (1) ◽  
Author(s):  
Tamás Hegedűs ◽  
Markus Geisler ◽  
Gergely László Lukács ◽  
Bianka Farkas

Abstract Transmembrane (TM) proteins are major drug targets, but their structure determination, a prerequisite for rational drug design, remains challenging. Recently, DeepMind’s AlphaFold2 machine learning method greatly expanded the structural coverage of sequences with high accuracy. Since the employed algorithm did not take specific properties of TM proteins into account, the reliability of the generated TM structures should be assessed. Therefore, we quantitatively investigated the quality of structures at genome scale, at the level of ABC protein superfamily folds, and for specific membrane proteins (e.g. dimer modeling and stability in molecular dynamics simulations). We tested template-free structure prediction with a challenging TM CASP14 target and several TM protein structures published after AlphaFold2 training. Our results suggest that AlphaFold2 performs well in the case of TM proteins and that its neural network is not overfitted. We conclude that cautious application of AlphaFold2 structural models will advance TM protein-associated studies to an unexpected level.


2021 ◽  
Vol 17 (9) ◽  
Author(s):  
Seán I O’Donoghue ◽  
Andrea Schafferhans ◽  
Neblina Sikta ◽  
Christian Stolte ◽  
Sandeep Kaur ◽  
...  

Author(s):  
Pedro Serrano ◽  
Samit K. Dutta ◽  
Andrew Proudfoot ◽  
Biswaranjan Mohanty ◽  
Lukas Susac ◽  
...  

2021 ◽  
Author(s):  
Eduard Porta-Pardo ◽  
Victoria Ruiz-Serra ◽  
Alfonso Valencia

The protein structure field is experiencing a revolution, driven by the increased throughput of techniques to determine experimental structures, by developments such as cryo-EM that allow us to find the structures of large protein complexes and, more recently, by artificial intelligence tools such as AlphaFold that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 47%, considering experimentally derived structures or template-based homology models, rises to 75% when including AlphaFold predictions, reducing the fraction of dark proteome from 22% to just 7% and the number of proteins without structural information from 4,832 to just 29. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (70% of ClinVar pathogenic mutations and 74% of oncogenic mutations), AlphaFold models still provide an additional coverage of 2% to 14% of these critically important sets of biomedical genes and mutations. We also provide several examples of disease-associated proteins where AlphaFold provides critical new insights. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and for the derived medical applications.
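To make the notion of structural coverage concrete, here is a minimal sketch that computes it as the fraction of residues covered by at least one structural model. The abstract does not say how its percentages were computed, so the residue-level definition, the function names, and the toy proteome below are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A segment is a 1-based, inclusive residue range covered by some model.
Segment = Tuple[int, int]


def covered_residues(length: int, segments: List[Segment]) -> int:
    """Count residues covered by at least one segment (overlaps counted once)."""
    covered = [False] * length
    for start, end in segments:
        # Clip each segment to the sequence boundaries before marking it.
        for pos in range(max(start, 1), min(end, length) + 1):
            covered[pos - 1] = True
    return sum(covered)


def structural_coverage(proteome: Dict[str, Tuple[int, List[Segment]]]) -> float:
    """Fraction of all residues in the proteome that have structural coverage."""
    total = sum(length for length, _ in proteome.values())
    if total == 0:
        return 0.0
    covered = sum(covered_residues(length, segs) for length, segs in proteome.values())
    return covered / total


# Hypothetical toy proteome: identifier -> (sequence length, covered segments).
toy_proteome = {
    "P1": (100, [(1, 50), (40, 60)]),  # overlapping models cover residues 1-60
    "P2": (100, []),                   # no structural information at all
}
coverage = structural_coverage(toy_proteome)
```

Under this definition, adding predicted models simply means adding more segments per protein, so a baseline and an extended coverage value can be compared by running the same function on both sets of segments.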


2020 ◽  
Vol 49 (D1) ◽  
pp. D266-D273
Author(s):  
Ian Sillitoe ◽  
Nicola Bordin ◽  
Natalie Dawson ◽  
Vaishali P Waman ◽  
Paul Ashford ◽  
...  

Abstract CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.


Author(s):  
Seán I. O’Donoghue ◽  
Andrea Schafferhans ◽  
Neblina Sikta ◽  
Christian Stolte ◽  
Sandeep Kaur ◽  
...  

Abstract In response to the COVID-19 pandemic, many life scientists are focused on SARS-CoV-2. To help them use available structural data, we systematically modeled all viral proteins using all related 3D structures, generating 872 models that provide detail not available elsewhere. To organise these models, we created a structural coverage map: a novel, one-stop visualization summarizing what is (and is not) known about the 3D structure of the viral proteome. The map highlights structural evidence for viral protein interactions, mimicry, and hijacking; it also helps researchers find 3D models of interest, which can then be mapped with UniProt, PredictProtein, or CATH features. The resulting Aquaria-COVID resource (https://aquaria.ws/covid) helps scientists understand molecular mechanisms underlying coronavirus infection. Based on insights gained using our resource, we propose mechanisms by which the virus may enter immune cells, sense the cell type, then switch focus from viral reproduction to disrupting host immune responses.

Significance: Currently, much of the COVID-19 viral proteome has unknown molecular structure. To improve this, we generated ∼1,000 structural models, designed to capture multiple states for each viral protein. To organise these models, we created a structure coverage map: a novel, one-stop visualization summarizing what is (and is not) known about viral protein structure. We used these data to create an online resource, designed to help COVID-19 researchers gain insight into the key molecular processes that drive infection. Based on insights gained using our resource, we speculate that the virus may sense the type of cells it infects and, within certain cells, it may switch from reproduction to disruption of the immune system.


2020 ◽  
Vol 10 (12) ◽  
pp. 4248 ◽  
Author(s):  
Eduardo M. Bruch ◽  
Stéphanie Petrella ◽  
Marco Bellinzoni

Structure-based and computer-aided drug design approaches are commonly considered to have been successful in the fields of cancer and antiviral drug discovery but less so for antibacterial drug development. The search for novel anti-tuberculosis agents is indeed an emblematic example of this trend. Although huge efforts by consortia and groups worldwide have dramatically increased the structural coverage of the Mycobacterium tuberculosis proteome, the vast majority of candidate drugs included in clinical trials during the last decade originated from phenotypic screens on whole mycobacterial cells. Here, we developed three selected case studies: the serine/threonine (Ser/Thr) protein kinases PknB and PknG, long considered very promising targets, and the DNA gyrase of M. tuberculosis, a well-known, pharmacologically validated target. We illustrated some of the challenges that rational, target-based drug discovery programs in tuberculosis (TB) still have to face and, finally, discussed the perspectives opened by recent methodological developments in structural biology and integrative techniques.


2020 ◽  
Author(s):  
Krishna Praneeth Kilambi ◽  
Qifang Xu ◽  
Guruharsha Kuthethur Gururaj ◽  
Kejie Li ◽  
Spyros Artavanis-Tsakonas ◽  
...  

Abstract A high-quality map of the human protein–protein interaction (PPI) network can help us better understand complex genotype–phenotype relationships. Each edge between two interacting proteins supported through an interface in a three-dimensional (3D) structure of a protein complex adds credibility to the biological relevance of the interaction. Such structure-supported interactions would augment an interaction map primarily built using high-throughput cell-based biophysical methods. Here, we integrate structural information with the human PPI network to build the structure-supported human interactome, a subnetwork of PPI between proteins that contain domains or regions known to form interfaces in the 3D structures of protein complexes. We expand the coverage of our structure-supported human interactome by using Pfam-based domain definitions, whereby we include homologous interactions if a human complex structure is unavailable. The structure-supported interactome predicts one-eighth of the total network PPI to interact through domain–domain interfaces. It identifies with higher resolution the interacting subunits in multi-protein complexes and enables us to characterize functional and disease-relevant neighborhoods in the network map with higher accuracy, allowing for structural insights into disease-associated genes and pathways. We expand the structural coverage beyond domain–domain interfaces by identifying the most common non-enzymatic peptide-binding domains with structural support. Adding these interactions between protein domains on one side and peptide regions on the other approximately doubles the number of structure-supported PPI. The human structure-supported interactome is a resource to prioritize investigations of smaller-scale context-specific experimental PPI neighborhoods of biological or clinical significance.

Short abstract: A high-quality map of the human protein–protein interaction (PPI) network can help us better understand genotype–phenotype relationships. Each edge between two interacting proteins supported through an interface in a three-dimensional structure of a protein complex adds credibility to the biological relevance of the interaction, aiding experimental prioritization. Here, we integrate structural information with the human interactome to build the structure-supported human interactome, a subnetwork of PPI between proteins that contain domains or regions known to form interfaces in the structures of protein complexes. The structure-supported interactome predicts one-eighth of the total PPI to interact through domain–domain interfaces. It identifies with higher resolution the interacting subunits in multi-protein complexes and enables us to structurally characterize functional, disease-relevant network neighborhoods. We also expand the structural coverage by identifying PPI between non-enzymatic peptide-binding domains on one side and peptide regions on the other, thereby doubling the number of structure-supported PPI.
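The idea of a structure-supported subnetwork can be sketched as a simple edge filter: keep a PPI edge only if the two proteins carry a pair of domains known to form an interface. This is an illustrative sketch, not the authors' pipeline; the function name, the data layout, and the domain labels (`D1`, `D2`, ...) are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]


def structure_supported_edges(
    edges: List[Edge],
    domains: Dict[str, Set[str]],
    interface_pairs: Set[FrozenSet[str]],
) -> List[Edge]:
    """Keep edges whose endpoints share at least one known interface domain pair."""
    supported = []
    for a, b in edges:
        # A frozenset makes the domain pair order-independent; a homotypic
        # pair (same domain on both sides) collapses to a one-element set.
        if any(
            frozenset((da, db)) in interface_pairs
            for da in domains.get(a, set())
            for db in domains.get(b, set())
        ):
            supported.append((a, b))
    return supported


# Hypothetical toy network: protein -> domains, plus known interface pairs.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C")]
toy_domains = {"A": {"D1"}, "B": {"D2"}, "C": {"D3"}}
toy_interfaces = {frozenset(("D1", "D2"))}

supported = structure_supported_edges(toy_edges, toy_domains, toy_interfaces)
```

Extending the set of interface pairs, for example with domain and peptide-region pairs, would enlarge the supported subnetwork without changing the filter itself.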


2019 ◽  
Vol 48 (D1) ◽  
pp. D383-D388 ◽  
Author(s):  
Matthew I J Raybould ◽  
Claire Marks ◽  
Alan P Lewis ◽  
Jiye Shi ◽  
Alexander Bujotzek ◽  
...  

Abstract The Therapeutic Structural Antibody Database (Thera-SAbDab; http://opig.stats.ox.ac.uk/webapps/therasabdab) tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organization (WHO), and identifies any corresponding structures in the Structural Antibody Database (SAbDab) with near-exact or exact variable domain sequence matches. Thera-SAbDab is synchronized with SAbDab to update weekly, reflecting new Protein Data Bank entries and the availability of new sequence data published by the WHO. Each therapeutic summary page lists structural coverage (with links to the appropriate SAbDab entries), alignments showing where any near-matches deviate in sequence, and accompanying metadata, such as intended target and investigated conditions. Thera-SAbDab can be queried by therapeutic name, by a combination of metadata, or by variable domain sequence, returning all therapeutics that are within a specified sequence identity over a specified region of the query. The sequences of all therapeutics listed in Thera-SAbDab (461 unique molecules, as of 5 August 2019) are downloadable as a single file with accompanying metadata.
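A query by sequence identity over a region can be illustrated with a small sketch. It assumes the sequences are already aligned and of equal length, which sidesteps the alignment and antibody numbering a real service would need; it is not Thera-SAbDab's implementation, and the function names and toy sequences are hypothetical.

```python
from typing import Dict, List


def region_identity(a: str, b: str, start: int, end: int) -> float:
    """Fraction of identical positions in the half-open region [start, end)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    region_a, region_b = a[start:end], b[start:end]
    if not region_a:
        raise ValueError("region is empty")
    matches = sum(x == y for x, y in zip(region_a, region_b))
    return matches / len(region_a)


def search_by_identity(
    query: str, database: Dict[str, str], start: int, end: int, min_identity: float
) -> List[str]:
    """Names of database entries meeting the identity threshold over the region."""
    return [
        name
        for name, seq in database.items()
        if len(seq) == len(query)
        and region_identity(query, seq, start, end) >= min_identity
    ]


# Hypothetical toy sequences, not real therapeutic variable domains.
toy_db = {
    "toy1": "ACDEFGHIKL",  # identical to the query
    "toy2": "ACDEYGHIKL",  # one mismatch inside the first five positions
    "toy3": "WWWWWGHIKL",  # no match inside the first five positions
}
hits = search_by_identity("ACDEFGHIKL", toy_db, 0, 5, 0.8)
```

Restricting the comparison to a region lets a near-exact match in one part of the sequence be found even when the rest of the sequence differs.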

