RetCL: A Selection-based Approach for Retrosynthesis via Contrastive Learning

Author(s):  
Hankook Lee ◽  
Sungsoo Ahn ◽  
Seung-Woo Seo ◽  
You Young Song ◽  
Eunho Yang ◽  
...  

Retrosynthesis, the goal of which is to find a set of reactants for synthesizing a target product, is an emerging research area of deep learning. While existing approaches have shown promising results, they currently lack the ability to consider the availability (e.g., stability or purchasability) of the reactants or to generalize to unseen reaction templates (i.e., chemical reaction rules). In this paper, we propose a new approach that mitigates these issues by reformulating retrosynthesis as a problem of selecting reactants from a candidate set of commercially available molecules. To this end, we design an efficient reactant selection framework, named RetCL (retrosynthesis via contrastive learning), for selecting among all of the candidate molecules based on selection scores computed by graph neural networks. For learning the score functions, we also propose a novel contrastive training scheme with hard negative mining. Extensive experiments demonstrate the benefits of the proposed selection-based approach. For example, when all 671k reactants in the USPTO database are given as candidates, RetCL achieves a top-1 exact match accuracy of 71.3% on the USPTO-50k benchmark, while a recent transformer-based approach achieves 59.6%. We also demonstrate that RetCL generalizes well to unseen templates in various settings, in contrast to template-based approaches.
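
To make the selection mechanism concrete, below is a minimal, hypothetical sketch of contrastive candidate scoring with hard negative mining. It is not the authors' actual RetCL implementation: the encoders, loss form, and mining heuristic in the paper may differ, and the embeddings here are assumed to come from graph neural network encoders of the product and candidate molecules.

```python
# Hypothetical sketch of contrastive reactant selection; embeddings are
# assumed to come from GNN encoders. Illustrative, not the RetCL codebase.
import torch
import torch.nn.functional as F

def selection_scores(product_emb, candidate_embs):
    """Cosine-similarity selection scores between one product and all candidates."""
    p = F.normalize(product_emb, dim=-1)      # (d,)
    c = F.normalize(candidate_embs, dim=-1)   # (N, d)
    return c @ p                              # (N,)

def mine_hard_negatives(product_emb, candidate_embs, true_idx, k=16):
    """Pick the k highest-scoring candidates that are not true reactants."""
    scores = selection_scores(product_emb, candidate_embs).clone()
    scores[true_idx] = float("-inf")          # never select the positives
    return candidate_embs[scores.topk(k).indices]

def contrastive_loss(product_emb, reactant_embs, negative_embs, tau=0.1):
    """InfoNCE-style objective: each true reactant is contrasted against
    the mined hard negatives for the same product."""
    pos = selection_scores(product_emb, reactant_embs) / tau   # (P,)
    neg = selection_scores(product_emb, negative_embs) / tau   # (K,)
    logits = torch.cat([pos.unsqueeze(1),
                        neg.unsqueeze(0).expand(len(pos), -1)], dim=1)
    labels = torch.zeros(len(pos), dtype=torch.long)  # index 0 = the positive
    return F.cross_entropy(logits, labels)
```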

Author(s):  
J. Joshua Thomas ◽  
Tran Huu Ngoc Tran ◽  
Gilberto Pérez Lechuga ◽  
Bahari Belaton

Applying deep learning to pervasive graph data is significant because of the unique characteristics of graphs. Recently, substantial research effort has been devoted to this area, greatly advancing graph-analysis techniques. In this study, the authors comprehensively review the different kinds of deep learning methods applied to graphs. They organize the existing literature into sub-components: graph convolutional networks, graph autoencoders, and recent trends in chemoinformatics, including molecular fingerprints and drug discovery. They further experiment with a variational autoencoder (VAE), analyze how these methods apply to drug-target interaction (DTI) applications, briefly outline how they assist the drug discovery pipeline, and discuss potential research directions.


2021 ◽  
Vol 11 (24) ◽  
pp. 12116
Author(s):  
Shanza Abbas ◽  
Muhammad Umair Khan ◽  
Scott Uk-Jin Lee ◽  
Asad Abbas

Natural language interfaces to databases (NLIDB) have been a research topic for a decade. Significant data collections are available in the form of databases, and to utilize them for research purposes, a system that can translate a natural language query into a structured one can make a huge difference. Efforts toward such systems have been made with pipelining methods for more than a decade, in which natural language processing techniques are integrated with data science methods. With significant advancements in machine learning and natural language processing, NLIDB with deep learning has emerged as a new research trend in this area, and deep learning has shown potential for rapid growth and improvement in text-to-SQL tasks. In deep learning NLIDB, closing the semantic gap in predicting users' intended columns has arisen as one of the critical and fundamental problems in this research field. Contributions toward this issue have consisted of preprocessing feature inputs and encoding schema elements before they reach the targeted model, to make them more impactful. Notwithstanding the significant work contributed towards this problem, it remains one of the critical issues in developing NLIDB. Working towards closing the semantic gap between user intention and predicted columns, we present an approach for deep learning text-to-SQL tasks that includes previous columns' occurrence scores as an additional input feature. Since overall exact match accuracy depends significantly on column prediction, improving column prediction accuracy also improves overall accuracy. For this purpose, we extract query fragments from previous queries and compute column occurrence and co-occurrence scores, which are then processed as input features for an encoder-decoder-based text-to-SQL model. These scores reflect the probability that columns and tables have already been used together in the query history. We evaluated our approach on the currently popular text-to-SQL dataset Spider, a complex dataset containing multiple databases, which includes query-question pairs along with schema information. We compared our exact match accuracy against a base model using the same training and test data splits: our approach outperformed the base model's accuracy, and accuracy was further boosted in experiments with the pretrained language model BERT.
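
As an illustration of the occurrence/co-occurrence feature described above, here is a minimal sketch under simplifying assumptions: past queries are assumed to be available already parsed into lists of referenced columns (real SQL would need a proper parser), and the normalization used is a hypothetical choice, not necessarily the paper's.

```python
# Minimal sketch of column occurrence/co-occurrence scores from query
# history. Column extraction from raw SQL is elided; the scoring and
# normalization below are hypothetical illustrative choices.
from collections import Counter
from itertools import combinations

def column_statistics(queries):
    """queries: list of lists of column names referenced by each past query."""
    occ, co_occ = Counter(), Counter()
    for cols in queries:
        unique = sorted(set(cols))
        occ.update(unique)
        co_occ.update(combinations(unique, 2))   # sorted pairs as keys
    return occ, co_occ

def occurrence_score(column, occ, total_queries):
    """Fraction of past queries that referenced this column."""
    return occ[column] / total_queries

def co_occurrence_score(col_a, col_b, co_occ, occ):
    """How often two columns appear together, relative to col_a's usage."""
    pair = tuple(sorted((col_a, col_b)))
    return co_occ[pair] / occ[col_a] if occ[col_a] else 0.0

history = [["name", "age"], ["name", "salary"], ["age"]]
occ, co = column_statistics(history)
print(occurrence_score("name", occ, len(history)))   # 0.666...
print(co_occurrence_score("name", "age", co, occ))   # 0.5
```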


2020 ◽  
Vol 28 (2) ◽  
pp. 707-715 ◽  
Author(s):  
Leszek Koziol ◽  
Michal Koziol

The paper presents key issues related to motivation in the workplace and its methodological aspects, giving special attention to an analysis of the classification of motivation factors, and describes the characteristics of major selected factors. The authors propose a new approach to these factors based on the concept of a trichotomy of motivation factors in the workplace (the work environment and work situations), which extends Herzberg's two-factor theory. The concept identifies three groups of factors: "motivators", which, when they occur, lead to satisfaction; "hygiene factors", which, when they do not occur, lead to dissatisfaction; and "demotivators", which, when they occur, lead to dissatisfaction. Their vectors of impact are entirely different, although they occur simultaneously in the workplace. Therefore, the presented concept constitutes a methodological directive which suggests extending the research area to include an analysis of factors which reduce motivation to work.


2011 ◽  
pp. 65-82
Author(s):  
Siti Khadijah Ali ◽  
Rahmita Wirza O. K. Rahmat ◽  
Fatimah Khalid ◽  
Seng Beng Ng

Mesh simplification has become an interesting research area over the past few decades. Some researchers have introduced new methods, while others have made modifications or improvements to existing methods. However, simplified models tend to lose some of their original shape. Hence, to solve these problems, a new decimation algorithm is introduced, in which a rational equation is used to select elements as candidates for removal. The proposed algorithm uses triangles as its elements. The results show that the proposed algorithm can preserve the surface shape of the original model. Keywords: mesh simplification; rational equation; manifold; triangle contraction; mesh decimation
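
Since the abstract does not reproduce the rational equation itself, the following sketch uses a hypothetical area-to-perimeter ratio purely as a stand-in ranking criterion for selecting candidate triangles to contract; the actual equation and selection rule are those of the paper.

```python
# Sketch of ratio-based candidate selection for triangle-mesh decimation.
# The paper's rational equation is not given in the abstract, so the
# scale-free area/perimeter ratio below is a hypothetical stand-in.
import numpy as np

def triangle_ratio(v0, v1, v2):
    """Score a triangle: small, thin triangles get low scores and are
    contracted first, which tends to preserve the overall shape."""
    a = np.linalg.norm(v1 - v0)
    b = np.linalg.norm(v2 - v1)
    c = np.linalg.norm(v0 - v2)
    s = 0.5 * (a + b + c)                                    # semi-perimeter
    area = max(s * (s - a) * (s - b) * (s - c), 0.0) ** 0.5  # Heron's formula
    return area / (s * s) if s > 0 else 0.0                  # scale-free ratio

def contraction_candidates(vertices, faces, budget):
    """Return the `budget` lowest-ratio triangles as removal candidates."""
    scores = [triangle_ratio(*vertices[list(f)]) for f in faces]
    order = np.argsort(scores)
    return [faces[i] for i in order[:budget]]
```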


2019 ◽  
Vol 487 (1) ◽  
pp. 45-48
Author(s):  
M. I. Alymov ◽  
B. S. Seplyarskii ◽  
R. A. Kochetkov ◽  
T. G. Lisina

For the first time, thermally coupled SHS processes in granulated mixtures were experimentally investigated. The Ni+Al and Ti+C granule mixtures differ in combustion rate and temperature. It has been established that ignition of the Ni+Al acceptor granules occurs in the combustion front. It is shown that the use of granular mixtures makes it possible to obtain combustion products in the form of an easily destructible sample. Release of the target product proved possible, even though the melting point of the acceptor mixture's interaction product is lower than the adiabatic combustion temperature.


Author(s):  
D. Selvathi ◽  
S. Thamarai Selvi ◽  
C. Loorthu Sahaya Malar

The SURE-LET approach is used for reducing or removing noise in brain magnetic resonance images (MRI). Removing or reducing noise is an active research area in image processing. Rician noise is the dominant noise in MRIs. Due to this type of noise, abnormal (cancerous) tissue may be misclassified as normal tissue, and it introduces bias into MRI measurements that can have a significant impact on the shapes and orientations of tensors in diffusion tensor MRIs. SURE is a new approach to orthonormal wavelet image denoising: an image-domain minimization of an estimate of the mean squared error, Stein's unbiased risk estimate (SURE). Here, the denoising process is expressed as a linear combination of elementary denoising processes, a linear expansion of thresholds (LET). Different shrinkage functions (soft and hard) and shrinkage rules (universal and BayesShrink) are used to remove noise, and their performance is compared. The algorithm is applied to brain MRIs under different noise conditions by varying the standard deviation of the noise. The performance of this approach is compared with that of the curvelet transform.
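
For orientation, here is a minimal sketch of the wavelet-shrinkage building blocks mentioned above (soft/hard thresholding with the universal threshold rule), using PyWavelets. It does not implement the full SURE-LET optimization, which learns an optimal linear combination of elementary thresholded estimates, and it ignores the Rician (non-Gaussian) nature of MRI noise.

```python
# Minimal wavelet-shrinkage denoiser: soft/hard thresholding of detail
# bands with the universal threshold. Illustrative building blocks only,
# not the SURE-LET estimator itself.
import numpy as np
import pywt

def wavelet_denoise(image, sigma, mode="soft", wavelet="db4", level=3):
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Universal threshold: sigma * sqrt(2 * log(n)), n = number of pixels.
    t = sigma * np.sqrt(2 * np.log(image.size))
    denoised = [coeffs[0]]                     # keep the approximation band
    for detail_bands in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(band, t, mode=mode)
                              for band in detail_bands))
    return pywt.waverec2(denoised, wavelet)
```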


Energies ◽  
2021 ◽  
Vol 14 (3) ◽  
pp. 624
Author(s):  
Jarosław Stryczek ◽  
Piotr Stryczek

Gerotor technology is an important research area in the field of hydraulics which attracts the attention of both academic scientists and industry. Despite the numerous publications by academics, as well as a considerable number of projects carried out by industry, the subject has not been exhausted. This paper presents a new approach to gerotor technology, formed by gathering the authors' knowledge of gerotors in a synthetic form. The following scientific and technical results have been obtained: (1) a uniform system of parameters and basic concepts regarding toothing and cycloidal gearing (z, m, λ, v, g), which is used consistently to describe the geometry, kinematics, hydraulics and manufacture of those elements; (2) a description of the geometry and kinematics of the epicycloidal and hypocycloidal gears using the adopted system of parameters; additionally, the epicycloidal/hypocycloidal double gearing, an original idea of the authors, is presented; (3) a description of the hydraulics of the gerotor and orbital machines, in particular: (i) determination of equations for delivery (capacity) q and irregularity of delivery (capacity) δ using the above-mentioned system of basic parameters; (ii) formulation of the principles of designing internal channels and clearances in gerotor machines and presentation of the original disc distributor in the epicycloidal/hypocycloidal orbital motor; (iii) presentation of the methods of manufacturing the epicycloidal and hypocycloidal gearings, with 12 examples of systems implemented in practice; (4) presentation of the research methods applied for the examination of gerotor machines, combining computer simulation and experimental research into a coherent and cohesive whole, resulting in research synergy. Such a synthesis of knowledge may serve the improvement, creation and investigation of gerotor and orbital machines by academics and industry.
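
The abstract does not reproduce the delivery equations; as a generic point of reference only (not the authors' expressions in terms of z, m, λ, v, g), the delivery per revolution and the irregularity of delivery for a positive-displacement machine can be written as:

```latex
% Generic positive-displacement relations, not the paper's derived equations:
% q - delivery per revolution, z - number of chambers, b - rotor width,
% A_max/A_min - extreme chamber areas, delta - irregularity of delivery.
\[
  q = z \, b \, (A_{\max} - A_{\min}), \qquad
  \delta = \frac{Q_{\max} - Q_{\min}}{Q_{\text{mean}}}
\]
```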


2013 ◽  
Vol 634-638 ◽  
pp. 1211-1214
Author(s):  
Jin Hua Zhou ◽  
Hai Li Gao ◽  
Han Zhou Sun ◽  
Yu Xiong Wu

A new practical and efficient route was developed for the synthesis of 6,7-dihydro-5H-cyclopenta[b]pyridine, a key intermediate of cefpirome. The target product was formed via nucleophilic addition, acetylation, Vilsmeier cyclization and dechlorination under mild reaction conditions, using commercially available cyclopentanone and benzylamine as raw materials. The total yield of this newly developed synthetic route for the target product was 43.15%, with 99.7% purity (HPLC). The structure of the target molecule was confirmed by LC-MS and 1H NMR spectroscopy.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3894 ◽  
Author(s):  
Qijie Wang ◽  
Wenyan Yu ◽  
Bing Xu ◽  
Guoguang Wei

The Generic Atmospheric Correction Online Service (GACOS) products for interferometric synthetic aperture radar (InSAR) are widely used, near-real-time, global-coverage atmospheric delay products which provide a new approach to the atmospheric correction of repeat-pass InSAR. However, it had not been determined whether these products can improve the accuracy of InSAR deformation monitoring. In this paper, GACOS products were used to correct atmospheric errors in short baseline subset (SBAS)-InSAR. Southern California in the U.S. was selected as the research area, and the effect of GACOS-based SBAS-InSAR was analyzed by comparison with classical SBAS-InSAR results and external global positioning system (GPS) data. The results showed that the accuracy of deformation monitoring improved over the whole study area after GACOS correction, with the mean square error decreasing from 0.34 cm/a to 0.31 cm/a. Points at mid-altitudes (15–140 m) showed the most obvious improvement after GACOS correction, with accuracy increased by about 23%. The accuracy for low- and high-altitude areas was roughly equal, with no significant improvement. Additionally, GACOS correction may increase the error for some points, which may be related to the low accuracy of the GACOS turbulence data.
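
For reference, a minimal sketch of how a GACOS-style correction is typically applied to a single interferogram is shown below; sign conventions and geometry handling vary between processors, so this is illustrative rather than a drop-in implementation.

```python
# Sketch of GACOS-style tropospheric correction for one interferogram:
# difference the zenith total delay (ZTD) maps of the two acquisition
# dates, project to the radar line of sight, convert to phase, subtract.
import numpy as np

def gacos_correct(ifg_phase, ztd_date1, ztd_date2, incidence_rad,
                  wavelength=0.0556):              # C-band wavelength in metres
    d_ztd = ztd_date1 - ztd_date2                  # differential zenith delay (m)
    los_delay = d_ztd / np.cos(incidence_rad)      # map zenith delay to line of sight
    tropo_phase = (4.0 * np.pi / wavelength) * los_delay
    return ifg_phase - tropo_phase
```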


2021 ◽  
Author(s):  
XIAOYAN Zhang ◽  
Alvaro E. Ulloa Cerna ◽  
Joshua V. Stough ◽  
Yida Chen ◽  
Brendan J. Carry ◽  
...  

Use of machine learning for automated annotation of heart structures from echocardiographic videos is an active research area, but understanding of comparative, generalizable performance among models is lacking. This study aimed to 1) assess the generalizability of five state-of-the-art machine-learning-based echocardiography segmentation models on a large clinical dataset, and 2) test the hypothesis that a quality control (QC) method based on segmentation uncertainty can further improve segmentation results. Five models were applied to 47,431 echocardiography studies that were independent from any training samples. Chamber volume and mass derived from the model segmentations were compared to clinically reported values. The median absolute errors (MAE) in left ventricular (LV) volumes and ejection fraction exhibited by all five models were comparable to reported inter-observer errors (IOE). MAE for left atrial volume and LV mass compared similarly favorably to the respective IOE for models trained for those tasks. A single model consistently exhibited the lowest MAE in all five clinically reported measures. We leveraged the 10-fold cross-validation training scheme of this best-performing model to quantify segmentation uncertainty for potential application as QC. We observed that filtering out segmentations with high uncertainty improved segmentation results, leading to decreased volume/mass estimation errors. The addition of contour-convexity filters further improved QC efficiency. In conclusion, five previously published echocardiography segmentation models generalized to a large, independent clinical dataset, segmenting one or multiple cardiac structures with overall accuracy comparable to manual analyses, albeit with variable performance. Convexity-reinforced uncertainty QC efficiently improved segmentation performance and may further facilitate the translation of such models.
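
A minimal sketch of the two QC filters described above, under assumptions: uncertainty is taken here as pixel-wise disagreement among the cross-validation fold models, convexity as mask area over convex-hull area, and the thresholds are hypothetical placeholders rather than the paper's values.

```python
# Sketch of uncertainty + convexity quality control for segmentation masks.
# Thresholds and the exact uncertainty definition are illustrative.
import numpy as np
from skimage.morphology import convex_hull_image

def uncertainty(fold_masks):
    """fold_masks: (n_folds, H, W) binary masks from the CV fold models.
    Mean per-pixel variance is high where the models disagree."""
    return np.mean(np.var(fold_masks.astype(float), axis=0))

def convexity(mask):
    """Mask area divided by convex-hull area; close to 1 for the
    roughly convex cardiac chambers."""
    hull = convex_hull_image(mask)
    return mask.sum() / max(hull.sum(), 1)

def passes_qc(fold_masks, u_max=0.05, c_min=0.85):
    """Keep a study only if the folds agree and the consensus is convex."""
    consensus = fold_masks.mean(axis=0) >= 0.5
    return uncertainty(fold_masks) <= u_max and convexity(consensus) >= c_min
```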

