Open Natural Products Research: Curation and Dissemination of Biological Occurrences of Chemical Structures through Wikidata

2021 ◽  
Author(s):  
Adriano Rutz ◽  
Maria Sorokina ◽  
Jakub Galgonek ◽  
Daniel Mietchen ◽  
Egon Willighagen ◽  
...  

As contemporary bioinformatic and chemoinformatic capabilities are reshaping natural products research, major benefits could result from an open database of referenced structure-organism pairs. Those pairs allow the identification of distinct molecular structures found as components of heterogeneous chemical matrices originating from living organisms. Current databases with such information suffer from paywall restrictions, limited taxonomic scope, poorly standardized fields, and lack of interoperability. To ensure data quality, references to the work that describes the structure-organism relationship are mandatory. To fill this void, we collected and curated a set of structure-organism pairs from publicly available natural products databases to yield LOTUS (naturaL prOducTs occUrrences databaSe), which contains over 500,000 curated and referenced structure-organism pairs. All the programs developed for data collection, curation, and dissemination are publicly available. To provide unlimited access as well as standardized linking to other resources, LOTUS data is both hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. The diffusion of these referenced structure-organism pairs within the Wikidata framework addresses many of the limitations of currently available databases and facilitates linkage to existing biological and chemical data resources. This resource represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.
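
As a rough illustration of what a referenced structure-organism pair could look like as a record, the following minimal sketch models a pair as a triple of structure, organism, and reference, and rejects any pair that lacks a reference. The class name, field names, and placeholder values are assumptions made for illustration only; they are not the LOTUS schema or the Wikidata data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencedPair:
    """Hypothetical record for one structure-organism pair (not the LOTUS schema)."""
    structure: str   # a structure identifier (placeholder strings are used below)
    organism: str    # a taxon name
    reference: str   # an identifier of the work documenting the occurrence

    def __post_init__(self):
        # The abstract states that a reference is mandatory for each pair.
        if not self.reference.strip():
            raise ValueError("a structure-organism pair must cite a reference")


def deduplicate(pairs):
    """Keep one copy of each (structure, organism, reference) triple, preserving order."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Placeholder values only; these are not real LOTUS entries.
records = [
    ReferencedPair("STRUCTURE-A", "Organism one", "doi:placeholder-1"),
    ReferencedPair("STRUCTURE-A", "Organism one", "doi:placeholder-1"),
    ReferencedPair("STRUCTURE-A", "Organism two", "doi:placeholder-2"),
]
unique_records = deduplicate(records)
```

Making the record immutable and hashable means that exact duplicates collapse naturally, while the same structure reported from a different organism or in a different reference remains a distinct pair.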

Molecules ◽  
2021 ◽  
Vol 26 (21) ◽  
pp. 6360
Author(s):  
Iglika Lessigiarska ◽  
Yunhui Peng ◽  
Ivanka Tsakovska ◽  
Petko Alov ◽  
Nathalie Lagarde ◽  
...  

The aim of this study was to investigate the chemical space and interactions of natural compounds with sulfotransferases (SULTs) using ligand- and structure-based in silico methods. An in-house library of natural ligands (hormones, neurotransmitters, plant-derived compounds and their metabolites) reported to interact with SULTs was created. Their chemical structures and properties were compared to those of compounds of non-natural (synthetic) origin, known to interact with SULTs. The natural ligands interacting with SULTs were further compared to other natural products for which interactions with SULTs were not known. Various descriptors of the molecular structures were calculated and analyzed. Statistical methods (ANOVA, PCA, and clustering) were used to explore the chemical space of the studied compounds. Similarity search between the compounds in the different groups was performed with the ROCS software. The interactions with SULTs were additionally analyzed by docking into different experimental and modeled conformations of SULT1A1. Natural products with potentially strong interactions with SULTs were outlined. Our results contribute to a better understanding of chemical space and interactions of natural compounds with SULT enzymes and help to outline new potential ligands of these enzymes.


2018 ◽  
Author(s):  
William A. Shirley ◽  
Brian P. Kelley ◽  
Yohann Potier ◽  
John H. Koschwanez ◽  
Robert Bruccoleri ◽  
...  

This pre-print explores ensemble modeling of natural product targets to match chemical structures to precursors found in the large open-source gene cluster repository antiSMASH. Commentary on the method, its effectiveness, and its limitations is included. All structures are public domain molecules and have been reviewed for release.


2020 ◽  
Vol 26 ◽  
Author(s):  
Shaik Ibrahim Khalivulla ◽  
Arifullah Mohammed ◽  
Kokkanti Mallikarjuna

Background: Diabetes is a chronic disease that affects a large population worldwide and stands as one of the major global health challenges. According to the World Health Organization, about 400 million people worldwide have diabetes, and it was the seventh leading cause of death in 2016. Plant-based natural products have been used as ethnomedicine since ancient times for the treatment of several diseases, including diabetes. As a result, there are several reports of plant-based natural products displaying antidiabetic activity. In the current review, compounds with antidiabetic potential reported from all plant sources are collected, presented, and discussed along with their chemical structures. Reports of this kind are essential for pooling the available information into one source, followed by statistical analysis and screening to compare the efficacy of all known compounds. Such analysis can narrow hundreds of compounds down to a few promising candidates, which can be screened further through in vitro and in vivo studies and human trials, leading to drug development. Methods: Phytochemicals with potential antidiabetic properties were classified according to their basic chemical skeleton. The chemical structures of all the compounds with antidiabetic activity were elucidated in the present review. In addition, the distribution of each species and its other notable pharmacological activities are included. Results: The scrutiny of the literature led to the identification of 44 plants with antidiabetic compounds (70) and other pharmacological activities. For information, the worldwide distribution of each species is given. Many plant derivatives may exert antidiabetic properties by improving or mimicking insulin production or action. Different classes of compounds, including sulfur compounds (1-4), alkaloids (5-11), phenolic compounds (12-17), tannins (18-23), phenylpropanoids (24-27), xanthanoids (28-31), amino acid (32), stilbenoid (33), benzofuran (34), coumarin (35), flavonoids (36-49) and terpenoids (50-70), were found to be potentially active antidiabetic compounds. Of the 70 listed compounds, 17 are triterpenoids, 13 are flavonoids, and 7 are alkaloids. Among the 44 plant species, the largest number of compounds (7) is reported from Lagerstroemia speciosa, followed by Momordica charantia (6) and S. oblonga (5). Conclusion: This is the first paper to summarize the established chemical structures of phytochemicals that have been successfully screened for antidiabetic potential, together with their mechanisms of inhibition. The reported compounds could be considered potential lead molecules for the treatment of type 2 diabetes. Further molecular and clinical trials are required to select and establish therapeutic drug candidates.
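
The compound numbering in the abstract above can be tallied mechanically. The short sketch below counts how many compound numbers fall in each listed range and sums them; the dictionary simply transcribes the ranges from the abstract, and the variable names are illustrative. It counts by numbering only, so the flavonoid range 36-49 yields 14 entries even though the text gives 13 flavonoids; the sketch does not attempt to resolve that difference.

```python
# Compound-number ranges as listed in the abstract (inclusive on both ends).
class_ranges = {
    "sulfur compounds": (1, 4),
    "alkaloids": (5, 11),
    "phenolic compounds": (12, 17),
    "tannins": (18, 23),
    "phenylpropanoids": (24, 27),
    "xanthanoids": (28, 31),
    "amino acid": (32, 32),
    "stilbenoid": (33, 33),
    "benzofuran": (34, 34),
    "coumarin": (35, 35),
    "flavonoids": (36, 49),
    "terpenoids": (50, 70),
}

# Number of compounds per class, derived purely from the numbering.
counts = {name: last - first + 1 for name, (first, last) in class_ranges.items()}

# The ranges are contiguous, so the total should equal the 70 compounds reported.
total = sum(counts.values())
```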


2019 ◽  
Vol 16 (10) ◽  
pp. 1130-1137
Author(s):  
Hayrettin Ozan Gulcan ◽  
Serkan Yigitkan ◽  
Ilkay Erdogan Orhan

High cholesterol and triglyceride levels are closely associated with the development of life-threatening metabolic disorders, including cardiovascular diseases. Therefore, hypercholesterolemia (also referred to as hyperlipoproteinemia) is a serious disease state that must be controlled. Currently, hypercholesterolemia is treated in the clinic mainly with statins, although there are alternative drugs (e.g., ezetimibe, cholestyramine). In fact, the original statins are natural products obtained directly from fungi such as molds and mushrooms, and they are potent inhibitors of hydroxymethylglutaryl-CoA reductase, the key enzyme in the biosynthesis of cholesterol. This review focuses on the first identification of natural statins, their synthetic and semi-synthetic analogues, and the validation of hydroxymethylglutaryl-CoA reductase as a target in the treatment of hypercholesterolemia. Furthermore, other natural products that have been shown to possess the potential to inhibit hydroxymethylglutaryl-CoA reductase are also reviewed with respect to their chemical structures.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Janna Hastings ◽  
Martin Glauer ◽  
Adel Memariani ◽  
Fabian Neuhaus ◽  
Till Mossakowski

Chemical data is increasingly openly available in databases such as PubChem, which contains approximately 110 million compound entries as of February 2021. With the availability of data at such scale, the burden has shifted to organisation, analysis and interpretation. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of the large chemical space. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts. However, ChEBI is manually maintained and as such cannot easily scale to the full scope of public chemical data. There is a need for tools that are able to automatically classify chemical data into chemical ontologies, which can be framed as a hierarchical multi-class classification problem. In this paper we evaluate machine learning approaches for this task, comparing different learning frameworks including logistic regression, decision trees and long short-term memory artificial neural networks, and different encoding approaches for the chemical structures, including cheminformatics fingerprints and character-based encoding from chemical line notation representations. We find that classical learning approaches such as logistic regression perform well with sets of relatively specific, disjoint chemical classes, while the neural network is able to handle larger sets of overlapping classes but needs more examples per class to learn from, and is not able to make a class prediction for every molecule. Future work will explore hybrid and ensemble approaches, as well as alternative network architectures including neuro-symbolic approaches.
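
To make the idea of a character-based encoding from a chemical line notation more concrete, here is a minimal sketch that turns SMILES strings into fixed-length integer sequences of the kind a sequence model could consume. It is not the encoding used in the paper: the function names, the padding scheme, and the choice to treat every character as its own token (which splits two-letter atoms such as Cl) are simplifying assumptions.

```python
def build_vocabulary(smiles_list):
    """Map each distinct character to an integer index; 0 is reserved for padding."""
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, length):
    """Encode a SMILES string as a fixed-length list of integer indices."""
    # Characters missing from the vocabulary raise KeyError in this sketch.
    indices = [vocab[ch] for ch in smiles[:length]]
    # Right-pad with zeros so that every sequence has the same length.
    return indices + [0] * (length - len(indices))


# Ethanol, acetic acid, and benzene written as SMILES strings.
examples = ["CCO", "CC(=O)O", "c1ccccc1"]
vocab = build_vocabulary(examples)
max_len = max(len(s) for s in examples)
encoded = [encode(s, vocab, max_len) for s in examples]
```

A fingerprint-based encoding, the other family mentioned in the abstract, would instead require a cheminformatics toolkit and is not sketched here.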


Author(s):  
Christopher D O’Connor ◽  
John Ng ◽  
Dallas Hill ◽  
Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that as we move towards an era of big data policing it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Michelle Amri ◽  
Christina Angelakis ◽  
Dilani Logan

Objective: Through collating observations from various studies and complementing these findings with one author’s study, a detailed overview of the benefits and drawbacks of asynchronous email interviewing is provided. Through this overview, it is evident there is great potential for asynchronous email interviews in the broad field of health, particularly for studies drawing on expertise from participants in academia or professional settings, those across varied geographical settings (i.e. potential for global public health research), and/or in circumstances when face-to-face interactions are not possible (e.g. COVID-19). Results: Benefits of asynchronous email interviewing and additional considerations for researchers are discussed around: (i) access transcending geographic location and during restricted face-to-face communications; (ii) feasibility and cost; (iii) sampling and inclusion of diverse participants; (iv) facilitating snowball sampling and increased transparency; (v) data collection with working professionals; (vi) anonymity; (vii) verification of participants; (viii) data quality and enhanced data accuracy; and (ix) overcoming language barriers. Similarly, potential drawbacks of asynchronous email interviews are also discussed with suggested remedies, which centre around: (i) time; (ii) participant verification and confidentiality; (iii) technology and sampling concerns; (iv) data quality and availability; and (v) need for enhanced clarity and precision.


2021 ◽  
Vol 13 (6) ◽  
pp. 3320
Author(s):  
Amy R. Villarosa ◽  
Lucie M. Ramjan ◽  
Della Maneze ◽  
Ajesh George

The COVID-19 pandemic has resulted in many changes, including restrictions on indoor gatherings and visitation to residential aged care facilities, hospitals and certain communities. Coupled with potential restrictions imposed by health services and academic institutions, these changes may significantly impact the conduct of population health research. However, the continuance of population health research is beneficial for the provision of health services and sometimes imperative. This paper discusses the impact of COVID-19 restrictions on the conduct of population health research. This discussion unveils important ethical considerations, as well as potential impacts on recruitment methods, face-to-face data collection, data quality and validity. In addition, this paper explores potential recruitment and data collection methods that could replace face-to-face methods. The discussion is accompanied by reflections on the challenges experienced by the authors in their own research at an oral health service during the COVID-19 pandemic and alternative methods that were utilised in place of face-to-face methods. This paper concludes that, although COVID-19 presents challenges to the conduct of population health research, there is a range of alternative methods to face-to-face recruitment and data collection. These alternative methods should be considered in light of project aims to ensure data quality is not compromised.


Metabolites ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 48
Author(s):  
Marc Feuermann ◽  
Emmanuel Boutet ◽  
Anne Morgat ◽  
Kristian Axelsen ◽  
Parit Bansal ◽  
...  

The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support both for the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures, and we show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source, to enrich other natural product datasets and databases.


2020 ◽  
Vol 5 (8) ◽  
Author(s):  
Fidele Ntie-Kang ◽  
Daniel Svozil

The discovery of a new drug is a multidisciplinary and very costly task. One of the major steps is the identification of a lead compound, i.e. a compound that has a certain degree of potency and can be chemically modified to improve its activity, metabolic properties, and pharmacokinetic profile. Terrestrial sources (plants and fungi), microbes and marine organisms are abundant resources for the discovery of new structurally diverse and biologically active compounds. In this chapter, an attempt has been made to quantify the number of known published chemical structures (available in chemical databases) from natural sources. Emphasis has been laid on the number of unique compounds, the most abundant compound classes and the distribution of compounds in terrestrial and marine habitats. Recent investigations indicate that ~500,000 known natural products (NPs) exist in the literature. About 70% of all NPs come from plants, terpenoids being the most represented compound class (except in bacteria, where amino acids, peptides, and polyketides are the most abundant compound classes). About 2,000 NPs have been co-crystallized in PDB structures.

