Molecular generation by Fast Assembly of (Deep)SMILES fragments

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Francois Berenger ◽  
Koji Tsuda

Abstract Background: In recent years, in silico molecular design has been regaining interest. To generate molecules with optimized properties on a computer, scoring functions can be coupled with a molecular generator to design novel molecules with a desired property profile. Results: In this article, a simple method is described to generate only valid molecules at high frequency (>300,000 molecules/s using a single CPU core), given a molecular training set. The proposed method generates diverse SMILES (or DeepSMILES) encoded molecules while also showing some propensity for training set distribution matching. When working with DeepSMILES, the method reaches peak performance (>340,000 molecules/s) because it relies almost exclusively on string operations. The “Fast Assembly of SMILES Fragments” software is released as open-source at https://github.com/UnixJunkie/FASMIFRA. Experiments regarding speed, training set distribution matching, molecular diversity and benchmarks against several other methods are also shown.
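
To make the idea of assembling molecules "almost exclusively" through string operations concrete, here is a toy sketch. It is not the FASMIFRA algorithm: the fragment pool, the `assemble` function and the `is_well_formed` check are all hypothetical names chosen for illustration, and the check covers only SMILES bookkeeping (balanced branches, paired ring-closure digits), not chemical validity.

```python
import random

# Hypothetical fragment pool (an assumption, not taken from the paper).
# Each piece has balanced branches and ring closures, so any
# concatenation of pieces stays syntactically well-formed.
FRAGMENTS = ["CC", "c1ccccc1", "C(=O)O", "N", "C1CCCCC1"]


def assemble(rng, n_fragments=3):
    """Join randomly drawn fragments into one linear SMILES string."""
    return "".join(rng.choice(FRAGMENTS) for _ in range(n_fragments))


def is_well_formed(smiles):
    """Syntactic check only: balanced parentheses and paired ring digits."""
    depth = 0
    open_rings = set()
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            # A ring digit opens a ring on first use and closes it on the next.
            open_rings ^= {ch}
    return depth == 0 and not open_rings


rng = random.Random(0)  # fixed seed so the sketch is reproducible
molecules = [assemble(rng) for _ in range(5)]
```

Because every step is a list lookup and a string join, throughput in a scheme like this is bounded mainly by string handling, which is consistent with the abstract's remark about string operations. A real generator would also have to track attachment points and valences.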

2018 ◽  
Vol 8 (1) ◽  
pp. 16
Author(s):  
Ilaria Lucrezia Amerise ◽  
Agostino Tarsitano

The objective of this research is to develop a fast, simple method for detecting and replacing extreme spikes in high-frequency time series data. The method primarily consists of a nonparametric procedure that pursues a balance between fidelity to observed data and smoothness. Furthermore, through examination of the absolute difference between original and smoothed values, the technique is also able to detect and, where necessary, replace outliers with less extreme data. Unlike other filtering procedures found in the literature, our method does not require a model to be specified for the data. Additionally, the filter makes only a single pass through the time series. Experiments show that the new method can be validly used as a data preparation tool to ensure that time series modeling is supported by clean data, particularly in a complex context such as one with high-frequency data.
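
The detect-and-replace step described above (compare each observation with its smoothed value and replace it when the absolute difference is too large) can be sketched as follows. The abstract does not specify the smoother or the threshold rule, so this sketch assumes a running median and a cutoff of `k` times the median absolute residual; the function name `despike` and both parameters are hypothetical. Unlike the authors' filter, this version makes more than one pass over the data.

```python
from statistics import median


def despike(series, window=5, k=5.0):
    """Replace points that lie far from a running median.

    Assumptions (not from the paper): the smoother is a centered running
    median of width `window`, and a point is an outlier when its absolute
    residual exceeds `k` times the median absolute residual.
    """
    if not series:
        return [], []
    half = window // 2
    n = len(series)
    # Smoothed value at i: median of the window, truncated at the edges.
    smoothed = [
        median(series[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]
    residuals = [abs(x - s) for x, s in zip(series, smoothed)]
    threshold = k * median(residuals)
    cleaned, flagged = list(series), []
    for i, r in enumerate(residuals):
        if r > threshold:
            cleaned[i] = smoothed[i]  # swap in the less extreme value
            flagged.append(i)
    return cleaned, flagged


# Hypothetical series with one obvious spike at index 4.
raw = [10.0, 10.2, 9.9, 10.1, 50.0, 10.0, 9.8, 10.3, 10.1, 9.9]
cleaned, flagged = despike(raw)
```

On this made-up series only the value 50.0 is flagged and replaced by its local median; all other points pass through unchanged.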


1969 ◽  
Vol 44 (12) ◽  
pp. 1738-1741 ◽  
Author(s):  
Silvano Bonotto ◽  
Eliane Bonnijns-Van Gelder

Author(s):  
Oleksii Prykhodko ◽  
Simon Viet Johansson ◽  
Panagiotis-Christos Kotsias ◽  
Josep Arús-Pous ◽  
Esben Jannik Bjerrum ◽  
...  

Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases: sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.


2011 ◽  
Vol 2011 ◽  
pp. 1-6 ◽  
Author(s):  
C. W. Mason ◽  
A. M. Kannan

A simple method to prepare a durable, low platinum-loading catalyst layer for the cathode in a proton exchange membrane fuel cell is tested and described. Multiwalled carbon nanotubes (MWCNTs) are functionalized with citric acid and then suspended in ethylene glycol. Here, platinum nanoparticles (~4 nm) are loaded onto the surface of the MWCNTs after hexachloroplatinic acid is reduced by aqueous sodium formate. A peak performance of 813 mW⋅cm−2 was achieved with a total membrane electrode assembly (MEA) platinum catalyst loading of 0.2 mg⋅cm−2 (0.1 mg⋅cm−2 anode/0.1 mg⋅cm−2 cathode), in H2/O2 (ambient pressure), at 80°C, with a Nafion 212 membrane. Peak power density only decreased by 23% after 1500 potentials cycles (ranged from 0.1 to 1.2 V, and vice versa, with a 50 mV/s scan rate, flowing H2/N2 at 80°C). Transmission electron microscopy (TEM) images show the morphology and distribution of the platinum nanoparticles loaded onto the surface of the MWCNTs.


2021 ◽  
Author(s):  
Yixin He ◽  
Zhichun Shangguan ◽  
Zhao-Yang Zhang ◽  
Mingchen Xie ◽  
Chunyang Yu ◽  
...  

Azobenzenes are one of the most attractive classes of molecular photoswitches. In recent endeavors of molecular design, replacing one or both phenyl rings by heteroaromatic ones is emerging as a strategy to expand the molecular diversity and to access improved photoswitch properties. However, the currently available heteroaryl azo switches generally show limitations in E ⇆ Z photoisomerization yields and/or Z-isomer stability. Here we report a family of azobispyrazoles as new photoswitches, which combine (near-)quantitative bidirectional photoconversions and widely tunable Z-isomer thermal half-lives (t₁/₂) from hours to years. A visible-light-activated photoswitch is also obtained. Systematic experimental and theoretical investigations reveal the different geometric and electronic structures of azobispyrazoles from those of phenylazopyrazoles, overcoming the conflict existing in the latter between effective photoconversion and Z-isomer stability. Our work shows the great potential of azobispyrazoles in developing photoresponsive systems and can inspire the rational design of new photoswitches making use of the bis-heteroaryl azo architecture.


2020 ◽  
Author(s):  
Hesan Luo ◽  
Shao-Fu Huang ◽  
Hong-Yao Xu ◽  
Xu-Yuan Li ◽  
Sheng-Xi Wu ◽  
...  

Abstract Purpose: To develop and validate a nomogram model to predict complete response (CR) after concurrent chemoradiotherapy (CCRT) in esophageal squamous cell carcinoma (ESCC) patients using pretreatment CT radiomic features. Methods: Data of patients diagnosed with ESCC and treated with CCRT in Shantou Central Hospital during the period from January 2013 to December 2015 were retrospectively collected. After successive screening, eligible patients were included in this study and randomly divided into a training set and a validation set. The least absolute shrinkage and selection operator (LASSO) with logistic regression was used to select radiomics features for calculating the Rad-score in the training set. Logistic regression analysis was performed to identify the predictive clinical factors for developing a nomogram model. The area under the receiver operating characteristic curve (AUC) was used to assess the performance of the predictive nomogram model, and a decision curve was used to analyze the impact of the nomogram model on clinical treatment decisions. Results: A total of 226 patients were included and randomly divided into two groups, 160 patients in the training set and 66 patients in the validation set. After LASSO analysis, seven radiomics features were screened out to develop a radiomics signature, the Rad-score. The AUC of the Rad-score was 0.812 (95%CI: 0.742-0.869, p<0.001) in the training set and 0.744 (95%CI: 0.632-0.851, p=0.003) in the validation set. Multivariate analysis showed that Rad-score and clinical staging were independent predictors of CR status, with P values of 0.035 and 0.023, respectively. A nomogram model incorporating Rad-score and clinical staging was developed and validated, with an AUC of 0.844 (95%CI: 0.779-0.897) in the training set and 0.807 (95%CI: 0.691-0.894) in the validation set. The DeLong test showed that the nomogram model was significantly superior to clinical staging, with P<0.001 in the training set and P=0.026 in the validation set. The decision curve showed that the nomogram model was superior to clinical staging when the risk threshold was greater than 25%. Conclusion: We developed and validated a nomogram model for predicting CR status of ESCC patients after CCRT. The nomogram model combined the radiomics signature Rad-score and clinical staging. This model provided us with an economical and simple method for evaluating the response to chemoradiotherapy in patients with ESCC.
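
The AUC values reported above can be read as the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder. The sketch below computes AUC in exactly that pairwise way. The function name `roc_auc` and the six score/label pairs are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up Rad-score-like values; label 1 = complete response, 0 = no CR.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

In this toy example 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9, or about 0.889. The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred.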


Author(s):  
Chao Shen ◽  
Ye Hu ◽  
Zhe Wang ◽  
Xujun Zhang ◽  
Haiyang Zhong ◽  
...  

Abstract How to accurately estimate protein–ligand binding affinity remains a key challenge in computer-aided drug design (CADD). In many cases, it has been shown that the binding affinities predicted by classical scoring functions (SFs) do not correlate well with experimentally measured biological activities. In the past few years, machine learning (ML)-based SFs have gradually emerged as potential alternatives and outperformed classical SFs in a series of studies. In this study, to better recognize the potential of classical SFs, we have conducted a comparative assessment of 25 commonly used SFs. Accordingly, the scoring power was systematically estimated by using state-of-the-art ML methods that replaced the original multiple linear regression method to refit individual energy terms. The results show that the newly developed ML-based SFs consistently performed better than classical ones. In particular, gradient boosting decision tree (GBDT) and random forest (RF) achieved the best predictions in most cases. The newly developed ML-based SFs were also tested on another benchmark modified from PDBbind v2007, and the impacts of structural and sequence similarities were evaluated. The results indicated that the superiority of the ML-based SFs could be fully guaranteed when sufficient similar targets were contained in the training set. Moreover, the effect of the combinations of features from multiple SFs was explored, and the results indicated that combining NNscore2.0 with one to four other classical SFs could yield the best scoring power. However, it was not feasible to derive a generic target-specific SF or SF combination.
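
The abstract does not say which statistic was used for scoring power. A common choice in scoring-function benchmarks is the Pearson correlation between predicted and experimental affinities, and the sketch below assumes that metric. The function name `pearson_r` and the five affinity pairs are hypothetical, not values from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Made-up predicted vs. experimental affinities (e.g. pKd units).
predicted = [6.1, 7.4, 5.2, 8.0, 6.8]
experimental = [5.9, 7.9, 4.8, 8.3, 6.5]
scoring_power = pearson_r(predicted, experimental)
```

Under this assumption, comparing a refitted ML-based SF with its classical counterpart amounts to computing this coefficient for each model on the same test complexes and comparing the two values.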


Geophysics ◽  
1986 ◽  
Vol 51 (2) ◽  
pp. 424-426 ◽  
Author(s):  
M. H. Safar

The water gun, which is becoming a popular seismic source, has proven to be an important development in marine oil prospecting. The principal reason is that, unlike the air gun, the pressure signature radiated by the water gun consists of a single bubble pulse and contains a high level of high‐frequency signal. These important features make the water gun a suitable seismic source for high‐resolution surveys. Water guns currently used are the S80, which has been used by Horizon since 1977, and the P400, introduced in 1983. The S80 and P400 water guns were developed by Sodera.™

