chemical space Latest Research Papers

Micellar photocatalysis enables divergent C-H arylation and N-dealkylation of benzamides via N-acyliminium cations

10.33774/chemrxiv-2021-fm09s-v2 ◽

2022 ◽

Author(s):

Martyna Cybularczyk-Cecotka ◽

Jędrzej Predygier ◽

Stefano Crespi ◽

Joanna Szczepanik ◽

Maciej Giedyk

Keyword(s):

Methylene Blue ◽

Light Source ◽

Catalytic System ◽

Chemical Space ◽

Mechanistic Studies ◽

Mild Conditions ◽

Reducing Conditions ◽

Benzamide Derivatives

Micellar photocatalysis has recently opened new avenues to activate strong carbon-halide bonds. So far, however, it has mainly explored strongly reducing conditions, restricting the available chemical space to radical or anionic reactivity. Here, we demonstrate a radical-polar crossover process involving cationic intermediates, which enables chemodivergent modification of chlorinated benzamide derivatives via either C-H arylation or N-dealkylation. The catalytic system operates under mild conditions, employing methylene blue as a photocatalyst and blue LEDs as the light source. Factors determining the reactivity of substrates and preliminary mechanistic studies are presented.

LEADD: Lamarckian evolutionary algorithm for de novo drug design

Journal of Cheminformatics ◽

10.1186/s13321-022-00582-y ◽

2022 ◽

Vol 14 (1) ◽

Author(s):

Alan Kerstjens ◽

Hans De Winter

Keyword(s):

Drug Design ◽

Objective Function ◽

Evolutionary Algorithm ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

Genetic Operators ◽

Computationally Efficient ◽

De Novo Drug Design ◽

Efficient Manner

Given an objective function that predicts key properties of a molecule, goal-directed de novo molecular design is a useful tool to identify molecules that maximize or minimize said objective function. Nonetheless, a common drawback of these methods is that they tend to design synthetically unfeasible molecules. In this paper we describe a Lamarckian evolutionary algorithm for de novo drug design (LEADD). LEADD attempts to strike a balance between optimization power, synthetic accessibility of designed molecules and computational efficiency. To increase the likelihood of designing synthetically accessible molecules, LEADD represents molecules as graphs of molecular fragments, and limits the bonds that can be formed between them through knowledge-based pairwise atom type compatibility rules. A reference library of drug-like molecules is used to extract fragments, fragment preferences and compatibility rules. A novel set of genetic operators that enforce these rules in a computationally efficient manner is presented. To sample chemical space more efficiently we also explore a Lamarckian evolutionary mechanism that adapts the reproductive behavior of molecules. LEADD has been compared to both standard virtual screening and a comparable evolutionary algorithm using a standardized benchmark suite and was shown to be able to identify fitter molecules more efficiently. Moreover, the designed molecules are predicted to be easier to synthesize than those designed by other evolutionary algorithms.
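The idea of pairwise atom type compatibility rules extracted from a reference library can be illustrated with a toy sketch. This is not LEADD's implementation: the function names, the atom type labels, and the three reference bonds are all hypothetical placeholders, and real rules would be derived from a library of drug-like molecules.

```python
def extract_compatibility(reference_bonds):
    """Collect every unordered atom-type pair seen bonded in the reference data."""
    rules = set()
    for type_a, type_b in reference_bonds:
        # frozenset makes the pair order-independent: (a, b) == (b, a)
        rules.add(frozenset((type_a, type_b)))
    return rules


def can_bond(rules, type_a, type_b):
    """Allow a new bond only if the same atom-type pair occurred in the reference."""
    return frozenset((type_a, type_b)) in rules


# Hypothetical atom-type labels and bonds, standing in for a real reference library.
reference_bonds = [("C.ar", "N.am"), ("C.3", "O.3"), ("C.ar", "C.3")]
rules = extract_compatibility(reference_bonds)

allowed = can_bond(rules, "N.am", "C.ar")    # pair was observed, in either order
forbidden = can_bond(rules, "O.3", "N.am")   # pair was never observed
```

A genetic operator that joins two fragments would call a check like `can_bond` before forming the connection, so that only bonds with precedent in the reference data are created.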

Chemical Space Expansion of Flavonoids: Induction of Mitotic Inhibition by Replacing Ring B with a 10π-Electron System, Benzo[b]thiophene

Journal of Natural Products ◽

10.1021/acs.jnatprod.1c00867 ◽

2022 ◽

Author(s):

Sachika Hirazawa ◽

Yohei Saito ◽

Momoko Sagano ◽

Masuo Goto ◽

Kyoko Nakagawa-Goto

Keyword(s):

Chemical Space ◽

Electron System ◽

Mitotic Inhibition ◽

Space Expansion

Non-Covalent Interactions Atlas Benchmark Data Sets 5: London Dispersion in an Extended Chemical Space

10.26434/chemrxiv-2022-pl3r8 ◽

2022 ◽

Author(s):

Jan Řezáč

Keyword(s):

Chemical Space ◽

Data Sets ◽

DFT Methods ◽

Data Set ◽

Data Points ◽

Comprehensive Test ◽

Non Covalent Interactions ◽

London Dispersion ◽

Dissociation Curves ◽

Covalent Interactions

The Non-Covalent Interactions Atlas (www.nciatlas.org) has been extended with two data sets of benchmark interaction energies in complexes dominated by London dispersion. The D1200 data set of equilibrium geometries provides a thorough sampling of an extended chemical space, while the D442×10 set features dissociation curves for selected complexes. In total, they provide 5,178 new CCSD(T)/CBS data points of the highest quality. The new data have been combined with previous NCIA data sets in a comprehensive test of dispersion-corrected DFT methods, identifying the ones that achieve high accuracy in all types of non-covalent interactions in a broad chemical space. Additional tests of dispersion-corrected MP2 and semiempirical QM methods are also reported.

Machine learned calibrations to high-throughput molecular excited state calculations

10.26434/chemrxiv-2022-08jm9 ◽

2022 ◽

Author(s):

Shomik Verma ◽

Miguel Rivera ◽

David O. Scanlon ◽

Aron Walsh

Keyword(s):

Machine Learning ◽

Excited State ◽

High Throughput ◽

High Speed ◽

Large Scale ◽

Chemical Space ◽

False Negative ◽

Screening Method ◽

Calibration Model ◽

Linear Calibration

Understanding the excited state properties of molecules provides insights into how they interact with light. These interactions can be exploited to design compounds for photochemical applications, including enhanced spectral conversion of light to increase the efficiency of photovoltaic cells. While chemical discovery is time- and resource-intensive experimentally, computational chemistry can be used to screen large-scale databases for molecules of interest in a procedure known as high-throughput virtual screening. The first step usually involves a high-speed but low-accuracy method to screen large numbers of molecules (potentially millions) so only the best candidates are evaluated with expensive methods. However, use of a coarse first-pass screening method can potentially result in high false positive or false negative rates. Therefore, this study uses machine learning to calibrate a high-throughput technique (xTB-sTDA) against a higher accuracy one (TD-DFT). Testing the calibration model shows a ~5-fold decrease in error in-domain and a ~3-fold decrease out-of-domain. The resulting mean absolute error of ~0.14 eV is in line with previous work in machine learning calibrations and outperforms previous work in linear calibration of xTB-sTDA. We then apply the calibration model to screen a 250k-molecule database and map inaccuracies of xTB-sTDA in chemical space. We also show generalizability of the workflow by calibrating against a higher-level technique (CC2), yielding a similarly low error. Overall, this work demonstrates that machine learning can be used to develop a method that is both cheap and accurate for large-scale excited state screening, enabling accelerated molecular discovery across a variety of disciplines.
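Calibrating a cheap method against an expensive one can be sketched with the simplest possible model, a least-squares line. This is the linear baseline the abstract mentions, not the machine learning model the study uses. The energies below are made-up illustrative numbers (constructed to lie exactly on a line), not xTB-sTDA or TD-DFT results, and the function names are hypothetical.

```python
from statistics import mean


def fit_linear(x, y):
    """Ordinary least-squares fit of y ~ a*x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def mae(pred, ref):
    """Mean absolute error between predictions and reference values."""
    return mean(abs(p - r) for p, r in zip(pred, ref))


# Synthetic excitation energies in eV (illustrative only, not real data).
cheap = [2.0, 2.5, 3.0, 3.5, 4.0]       # stand-in for the fast method
reference = [2.5, 2.9, 3.3, 3.7, 4.1]   # stand-in for the accurate method

a, b = fit_linear(cheap, reference)
calibrated = [a * e + b for e in cheap]

raw_error = mae(cheap, reference)             # error before calibration
calibrated_error = mae(calibrated, reference) # error after calibration
```

In a real workflow the fit would be made on a training subset and the error reported on held-out molecules, which is what separates the in-domain and out-of-domain figures quoted above.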

Improving machine learning performance on small chemical reaction data with unsupervised contrastive pretraining

10.26434/chemrxiv-2021-xr8tf-v2 ◽

2022 ◽

Author(s):

Mingjian Wen ◽

Samuel M. Blau ◽

Xiaowei Xie ◽

Shyam Dwaraknath ◽

Kristin A. Persson

Keyword(s):

Machine Learning ◽

Chemical Reaction ◽

Chemical Space ◽

Relevant Information ◽

Learning Performance ◽

Reaction Data ◽

Unlabelled Data ◽

Graph Neural Networks ◽

Small Chemical ◽

Fine Tune

Machine learning (ML) methods have great potential to transform chemical discovery by accelerating the exploration of chemical space and drawing scientific insights from data. However, modern chemical reaction ML models, such as those based on graph neural networks (GNNs), must be trained on a large amount of labelled data in order to avoid overfitting the data and thus possessing low accuracy and transferability. In this work, we propose a strategy to leverage unlabelled data to learn accurate ML models for small labelled chemical reaction data. We focus on an old and prominent problem—classifying reactions into distinct families—and build a GNN model for this task. We first pretrain the model on unlabelled reaction data using unsupervised contrastive learning and then fine-tune it on a small number of labelled reactions. The contrastive pretraining learns by making the representations of two augmented versions of a reaction similar to each other but distinct from other reactions. We propose chemically consistent reaction augmentation methods that protect the reaction center and find they are the key for the model to extract relevant information from unlabelled data to aid the reaction classification task. The transfer learned model outperforms a supervised model trained from scratch by a large margin. Further, it consistently performs better than models based on traditional rule-driven reaction fingerprints, which have long been the default choice for small datasets. In addition to reaction classification, the effectiveness of the strategy is tested on regression datasets; the learned GNN-based reaction fingerprints can also be used to navigate the chemical reaction space, which we demonstrate by querying for similar reactions. The strategy can be readily applied to other predictive reaction problems to uncover the power of unlabelled data for learning better models with a limited supply of labels.
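The contrastive objective described above, pulling two augmented views of the same reaction together while pushing other reactions away, can be sketched with a generic InfoNCE-style loss. The abstract does not state the exact loss used, so this form, the temperature value, and the toy two-dimensional embeddings are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss: view_a[i] should match view_b[i], not view_b[j != i]."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability assigned to the true positive pair.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: two reactions, each with two augmented views.
view_a = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]   # views of the same reaction agree
swapped = [[0.0, 1.0], [1.0, 0.0]]   # views paired with the wrong reaction

loss_matched = info_nce(view_a, matched)
loss_swapped = info_nce(view_a, swapped)
```

The loss is lower when each reaction's two views have similar embeddings, which is the signal the pretraining step optimizes before fine-tuning on labelled reactions.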

Entering Chemical Space with Theoretical Underpinning of the Mechanistic Pathways in the Chan–Lam Amination

ACS Catalysis ◽

10.1021/acscatal.1c04479 ◽

2022 ◽

pp. 1461-1474

Author(s):

Sanjoy Bose ◽

Sayan Dutta ◽

Debasis Koley

Keyword(s):

Chemical Space ◽

Theoretical Underpinning

Bioactive Marine Xanthones: A Review

Marine Drugs ◽

10.3390/md20010058 ◽

2022 ◽

Vol 20 (1) ◽

pp. 58

Author(s):

José X. Soares ◽

Daniela R. P. Loureiro ◽

Ana Laura Dias ◽

Salete Reis ◽

Madalena M. M. Pinto ◽

...

Keyword(s):

Molecular Descriptors ◽

Chemical Space ◽

Biological Activities ◽

Bioactive Molecules ◽

Specialized Metabolites ◽

Structural Variety ◽

Xanthone Derivatives ◽

Comprehensive Literature Review ◽

Starting Point ◽

Chemical Class

The marine environment is an important source of specialized metabolites with valuable biological activities. Xanthones are a relevant chemical class of specialized metabolites found in this environment due to their structural variety and their biological activities. In this work, a comprehensive literature review of marine xanthones reported up to now was performed. A large number of bioactive xanthone derivatives (169) were identified, and their structures, biological activities, and natural sources were described. To characterize the chemical space occupied by marine-derived xanthones, molecular descriptors were calculated. For the analysis of the molecular descriptors, the xanthone derivatives were grouped into five structural categories (simple, prenylated, O-heterocyclic, complex, and hydroxanthones) and six biological activities (antitumor, antibacterial, antidiabetic, antifungal, antiviral, and miscellaneous). Moreover, the natural product-likeness and the drug-likeness of marine xanthones were also assessed. Marine xanthone derivatives are rewarding bioactive compounds and constitute a promising starting point for the design of other novel bioactive molecules.

Bio-inspired Chemical Space Exploration of Terpenoids

10.26434/chemrxiv-2022-0l482 ◽

2022 ◽

Author(s):

Tao Zeng ◽

B. Andes Hess ◽

Fan Zhang ◽

Ruibo Wu

Keyword(s):

Natural Products ◽

Chemical Space ◽

Reaction Pathway ◽

Reaction Network ◽

Structural Diversity ◽

Reaction Heat ◽

Heterologous Biosynthesis ◽

Synthetic Accessibility ◽

Dynamics Simulations ◽

Intrinsic Feature

Many computational methods are used to expand the open-ended border of chemical spaces. Natural products and their derivatives are an important source for drug discovery, and some algorithms are devoted to rapidly generating pseudo-natural products, but their accessibility and chemical interpretation have often been ignored or underestimated, hampering experimental synthesis in practice. Herein, a bio-inspired strategy (named TeroGen) is proposed, in which the cyclization and decoration stages of terpenoid biosynthesis are mimicked by meta-dynamics simulations and deep learning models, respectively, to explore their chemical space. In the TeroGen protocol, synthetic accessibility is validated by reaction energetics (reaction barrier and reaction heat) based on the GFN2-xTB method. Chemical interpretation is an intrinsic feature, as the reaction pathway is bio-inspired and triggered by the RMSD-PP method in conjunction with an encoder-decoder architecture. This is quite distinct from conventional library/fragment-based or rule-based strategies: with TeroGen, new reaction routes can be feasibly explored to increase structural diversity. For example, although only a rather limited number of sesterterpenoids is included in our training set, TeroGen predicts more than 30000 sesterterpenoids, ten times as many as the known sesterterpenoids (fewer than 2500), and maps out the reaction network efficiently. In sum, TeroGen not only greatly expands the chemical space of terpenoids but also provides various plausible biosynthetic pathways, which are crucial clues for heterologous biosynthesis and for biomimetic and chemical synthesis of complicated terpenoids.

Intelligent pharmaceutical patent search on a near-term gate-based quantum computer

Scientific Reports ◽

10.1038/s41598-021-04031-y ◽

2022 ◽

Vol 12 (1) ◽

Author(s):

Pei-Hua Wang ◽

Jen-Hao Chen ◽

Yufeng Jane Tseng

Keyword(s):

Quantum Computer ◽

Chemical Space ◽

Quantum Circuit ◽

Patent Analysis ◽

Pharmaceutical Companies ◽

Pharmaceutical Patent ◽

Product Protection ◽

Quantum Simulator ◽

Near Term ◽

Markush Structures

Pharmaceutical patent analysis is the key to product protection for pharmaceutical companies. In patent claims, a Markush structure is a standard chemical structure drawing with variable substituents. Overlaps between apparently dissimilar Markush structures are nearly unrecognizable when the structures span a broad chemical space. We propose a quantum search-based method which performs an exact comparison between two non-enumerated Markush structures with a constraint satisfaction oracle. The quantum circuit is verified with a quantum simulator and the real effect of noise is estimated using a five-qubit superconductivity-based IBM quantum computer. The probability of measuring the correct states can be increased by improving the connectivity of the most computation-intensive qubits. Depolarizing error is the most influential error. The quantum method to exactly compare two patents is hard to simulate classically and thus creates a quantum advantage in patent analysis.
