PubChem and ChEMBL Beyond Lipinski

Author(s):  
Jean-Louis Reymond ◽  
Mahendra Awale ◽  
Daniel Probst ◽  
Alice Capecchi

Seven million of the currently 94 million entries in the PubChem database break at least one of the four Lipinski constraints for oral bioavailability, 183,185 of which are also found in the ChEMBL database. These non-Lipinski PubChem (NLP) and ChEMBL (NLC) subsets are interesting because they contain new modalities that can display biological properties not accessible to small-molecule drugs. Unfortunately, the current search tools in PubChem and ChEMBL are designed for small molecules and are not well suited to exploring these subsets, which therefore remain poorly appreciated. Herein we report MXFP (macromolecule extended atom-pair fingerprint), a 217-D fingerprint tailored to analyze large molecules in terms of molecular shape and pharmacophores. We implement MXFP in two web-based applications, the first to visualize NLP and NLC interactively using Faerun (http://faerun.gdb.tools/), the second to perform MXFP nearest neighbor searches in NLP (http://similaritysearch.gdb.tools/). We show that these tools provide meaningful insight into the diversity of large molecules in NLP and NLC. The interactive tools presented here are publicly available at http://gdb.unibe.ch and can be used freely to explore and better understand the diversity of non-Lipinski molecules in PubChem and ChEMBL.
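The selection criterion in this abstract (breaking at least one of the four Lipinski constraints) can be illustrated with a minimal Python sketch. The thresholds are the standard rule-of-five limits (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Descriptors` class, the function names, and the two example molecules with their approximate descriptor values are assumptions made for illustration; this is not the authors' pipeline, and in practice the descriptors would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donor count
    h_acceptors: int    # hydrogen-bond acceptor count


# Standard rule-of-five upper limits; a value above the limit is a violation.
LIMITS = {"mol_weight": 500.0, "logp": 5.0, "h_donors": 5, "h_acceptors": 10}


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the names of the Lipinski constraints that the molecule breaks."""
    return [name for name, limit in LIMITS.items() if getattr(d, name) > limit]


def is_non_lipinski(d: Descriptors) -> bool:
    """True if at least one of the four constraints is broken."""
    return len(lipinski_violations(d)) >= 1


# Illustrative, approximate descriptor values (assumptions, not measured data).
small_molecule = Descriptors(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
macrocycle = Descriptors(mol_weight=1202.6, logp=3.0, h_donors=5, h_acceptors=12)

for name, mol in [("small_molecule", small_molecule), ("macrocycle", macrocycle)]:
    print(name, is_non_lipinski(mol), lipinski_violations(mol))
```

Under this definition, a molecule belongs to the non-Lipinski subset as soon as a single limit is exceeded, which matches the "at least one of the four" wording above.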

2019 ◽  
Author(s):  
Alice Capecchi ◽  
Mahendra Awale ◽  
Daniel Probst ◽  
Jean-Louis Reymond





2011 ◽  
Vol 44 (4) ◽  
pp. 878-881 ◽  
Author(s):  
Hwanho Choi ◽  
Hongsuk Kang ◽  
Hwangseo Park

MetLigDB (http://silver.sejong.ac.kr/MetLigDB) is a publicly accessible web-based database through which the interactions between a variety of chelating groups and various central metal ions in the active site of metalloproteins can be explored in detail. Additional information can also be retrieved, including protein and inhibitor names, the amino acid residues coordinated to the central metal ion, and the binding affinity of the inhibitor for the target metalloprotein. Although many metalloproteins are considered promising targets for drug discovery, new inhibitors are hard to find because it is difficult to design a suitable chelating moiety that impairs the catalytic activity of the central metal ion. Because both common and specific chelating groups can be identified for varying metal ions and the associated coordination environments, MetLigDB is expected to give users insight into designing new inhibitors of metalloproteins for drug discovery.


2021 ◽  
Vol 22 (9) ◽  
pp. 4308 ◽  
Author(s):  
Chayanaphat Chokradjaroen ◽  
Jiangqi Niu ◽  
Gasidit Panomsuwan ◽  
Nagahiro Saito

Sustainability and environmental concerns have prompted researchers to explore renewable materials, such as nature-derived polysaccharides, and to add value by modifying their chemical structures to obtain specific properties, such as biological properties. Meanwhile, finding methods and strategies that reduce the use of hazardous chemicals, simplify production steps, save time, and yield high-purity products is an important task that requires attention. To address these issues, electrical discharge in aqueous solutions at atmospheric pressure and room temperature, referred to as the “solution plasma process”, has been introduced as a novel process for modifying nature-derived polysaccharides such as chitin and chitosan. This review provides insight into electrical discharge in aqueous solutions and the scientific progress on its application to the modification of chitin and chitosan, including degradation and deacetylation. The parameters that influence the plasma process are explained in detail to provide a guideline for modifying not only chitin and chitosan but also other nature-derived polysaccharides, with the aim of addressing economic aspects and environmental concerns.


Sensors ◽  
2018 ◽  
Vol 18 (9) ◽  
pp. 2936 ◽  
Author(s):  
Xianghao Zhan ◽  
Xiaoqing Guan ◽  
Rumeng Wu ◽  
Zhan Wang ◽  
You Wang ◽  
...  

As alternative herbal medicine soars in popularity around the world, a fast and convenient means of classifying and evaluating herbal medicines is needed. In this work, an electronic nose system with seven classification algorithms is used to discriminate between 12 categories of herbal medicines. The results show that these herbal medicines can be successfully classified, with support vector machine (SVM) and linear discriminant analysis (LDA) outperforming the other algorithms in terms of accuracy. When principal component analysis (PCA) is used to lower the number of dimensions, the time cost of classification is reduced and the data can be visualized. Afterwards, conformal predictions based on 1NN (1-Nearest Neighbor) and 3NN (3-Nearest Neighbor) (CP-1NN and CP-3NN) are introduced. CP-1NN and CP-3NN provide additional, significant, and reliable information by giving the confidence and credibility associated with each prediction without sacrificing accuracy. This research provides insight into the construction of a herbal medicine flavor library and offers methods and a reference for future work.
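How a 1NN-based conformal predictor attaches confidence and credibility to a prediction can be sketched in a few lines of Python. The sketch below uses a common textbook formulation, in which the nonconformity score is the distance to the nearest example of the same label divided by the distance to the nearest example of a different label. The abstract does not state that the authors used exactly this score, and the function names and toy two-dimensional data are assumptions made for illustration.

```python
import math


def nonconformity(x, y, examples):
    """1NN score: nearest same-label distance over nearest other-label distance."""
    same = [math.dist(x, xi) for xi, yi in examples if yi == y]
    other = [math.dist(x, xi) for xi, yi in examples if yi != y]
    d_same = min(same) if same else math.inf
    d_other = min(other) if other else math.inf
    if math.isinf(d_other):
        return 0.0          # no competing label present: nothing is strange
    if d_other == 0.0:
        return math.inf     # coincides with a point of another label
    return d_same / d_other


def p_values(x_new, train, labels):
    """Transductive p-value of each candidate label for x_new."""
    result = {}
    for y in labels:
        bag = train + [(x_new, y)]  # tentatively assign label y
        alphas = [
            nonconformity(xi, yi, bag[:i] + bag[i + 1:])
            for i, (xi, yi) in enumerate(bag)
        ]
        # Fraction of examples at least as nonconforming as the new one.
        result[y] = sum(a >= alphas[-1] for a in alphas) / len(bag)
    return result


def predict(x_new, train):
    """Return (label, confidence, credibility); needs at least two labels."""
    labels = sorted({y for _, y in train})
    p = p_values(x_new, train, labels)
    ranked = sorted(p.values(), reverse=True)
    best = max(p, key=p.get)
    # credibility = largest p-value; confidence = 1 - second-largest p-value
    return best, 1.0 - ranked[1], ranked[0]


# Toy data: two well-separated clusters (illustrative only).
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
label, confidence, credibility = predict((0.5, 0.5), train)
print(label, confidence, credibility)
```

In this formulation, high confidence means that the alternative labels look implausible, while low credibility signals that even the best label fits the training data poorly.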


2021 ◽  
Author(s):  
Taitusi Taufa

Over the course of this study, various species of Tongan marine sponges were investigated using an NMR-based screening method, which resulted in the discovery of three new sesterterpenes and 11 known compounds. Examination of the sponge Fascaplysinopsis sp. resulted in the isolation of two novel sesterterpenes, isoluffariellolide (46) and 1-O-methylisoluffariellolide (47). Compounds 46 and 47 share the same backbone pattern as the known luffariellolide (45) and 25-O-methylluffariellolide (107), respectively, and differ only in the substitution pattern of the butenolide rings. Isoluffariellolide (46) was found to be approximately six times less cytotoxic than 1-O-methylisoluffariellolide (47). These results suggest that the 1-O-methyl group in compound 47 plays an important role in the cytotoxicity of the compound. Secothorectolide (49), a new ring-opened and geometric isomer of the known compound thorectolide (48), was obtained from a sponge of the order Dictyoceratida. This ring closure and opening relationship was also observed between manoalide (109) and secomanoalide (110), as well as luffariellins A (141) and B (142). Despite the different carbon skeleton, the functional groups in 141 and 142 are similar to those in 109 and 110, respectively, and not surprisingly the biological properties are almost identical. The biological activities of compounds 48 and 49 were almost the same, which gives insight into the structure-activity relationship (SAR) between these types of compounds.




2009 ◽  
Vol 36 (3) ◽  
pp. 7280-7287 ◽  
Author(s):  
Wu He ◽  
Feng-Kwei Wang ◽  
Tawnya Means ◽  
Li Da Xu
