Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Yuanyuan Jiang ◽  
Zongwei Yang ◽  
Jiali Guo ◽  
Hongzhen Li ◽  
Yijing Liu ◽  
...  

Abstract Cocrystal engineering has been widely applied in the pharmaceutical, chemistry and materials fields. However, choosing a coformer effectively remains a challenging experimental task. Here we develop a graph neural network (GNN) based deep learning framework to quickly predict cocrystal formation. To capture the main driving forces of crystallization from 6819 positive and 1052 negative samples reported by experiments, a feasible GNN framework is explored that integrates important prior knowledge into end-to-end learning on the molecular graph. The model is validated against seven competitive models and on three challenging independent test sets involving pharmaceutical cocrystals, π–π cocrystals and energetic cocrystals, exhibiting superior performance with accuracy higher than 96%, which confirms its robustness and generalization. Furthermore, one newly predicted energetic cocrystal is successfully synthesized, showcasing the high potential of the model in practice. All the data and source code are available at https://github.com/Saoge123/ccgnet to aid the cocrystal community.

2018 ◽  
Vol 35 (13) ◽  
pp. 2216-2225 ◽  
Author(s):  
Abdurrahman Elbasir ◽  
Balasubramanian Moovarkumudalvan ◽  
Khalid Kunji ◽  
Prasanna R Kolatkar ◽  
Raghvendra Mall ◽  
...  

Abstract Motivation Protein structure determination has primarily been performed using X-ray crystallography. To overcome the high cost, high attrition rate and series of trial-and-error settings, many in-silico methods have been developed to predict the crystallization propensity of proteins from their sequences. However, the majority of these methods build their predictors by extracting features from protein sequences, which is computationally expensive and can explode the feature space. We propose DeepCrystal, a deep learning framework for sequence-based protein crystallization prediction. It uses deep learning to identify proteins that can produce diffraction-quality crystals without the need to manually engineer additional biochemical and structural features from the sequence. Our model is based on convolutional neural networks, which can exploit frequently occurring k-mers and sets of k-mers in the protein sequences to distinguish proteins that will yield diffraction-quality crystals from those that will not. Results Our model surpasses previous sequence-based protein crystallization predictors in terms of recall, F-score, accuracy and Matthews correlation coefficient (MCC) on three independent test sets. DeepCrystal achieves an average improvement in recall of 1.4% and 12.1% over its closest competitors, Crysalis II and Crysf, respectively. In addition, DeepCrystal attains an average improvement of 2.1% and 6.0% in F-score, 1.9% and 3.9% in accuracy, and 3.8% and 7.0% in MCC with respect to Crysalis II and Crysf on the independent test sets. Availability and implementation The standalone source code and models are available at https://github.com/elbasir/DeepCrystal and a web server is also available at https://deeplearning-protein.qcri.org. Supplementary information Supplementary data are available at Bioinformatics online.
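The four evaluation measures named in the abstract above (recall, F-score, accuracy and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch, not code from DeepCrystal: the function name `binary_metrics` and the example counts are hypothetical, and the F-score is assumed to be the balanced F1 variant.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision, F1 and MCC from confusion counts.

    tp/fp/fn/tn are counts of true positives, false positives, false
    negatives and true negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f_den = 2 * tp + fp + fn
    f_score = 2 * tp / f_den if f_den else 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    for name, value in binary_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

MCC is often reported alongside accuracy because it only approaches 1 when both classes are predicted well, which matters when crystallizable and non-crystallizable proteins are unevenly represented.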


2021 ◽  
Author(s):  
Yuanyuan Jiang ◽  
Jiali Guo ◽  
Yijing Liu ◽  
Yanzhi Guo ◽  
Menglong Li ◽  
...  

Cocrystals play an important role in various fields. However, choosing a coformer remains a challenge in experiments. In this work, we develop a novel graph neural network (GNN) based deep learning framework to rapidly predict cocrystal formation. A large and reliable data set containing 7871 samples is first constructed. A complementary feature representation is proposed by combining the molecular graph with molecular descriptors derived from prior knowledge. A new GNN learning architecture is then explored to effectively embed the prior knowledge into "end-to-end" learning on the molecular graph, in which a multi-head attention mechanism is introduced to further optimize the feature space. Consequently, our model achieves 98.86% accuracy, greatly surpassing some traditional machine learning models and classic GNN models. Furthermore, the out-of-distribution prediction on energetic cocrystals reaches 97.11% accuracy, showing strong generalization.


Author(s):  
Tai D. Nguyen ◽  
Ronald Gronsky ◽  
Jeffrey B. Kortright

Nanometer-period Ru/C multilayers are one of the prime candidates for normal-incidence reflecting mirrors at wavelengths < 10 nm. Superior performance, which requires uniform layers and smooth interfaces, and high stability of the layered structure under thermal loading are some of the demands in practical applications. Previous studies, however, show that the Ru layers in the 2 nm period Ru/C multilayer agglomerate upon moderate annealing, and the layered structure is no longer retained. This agglomeration and crystallization of the Ru layers upon annealing to form almost spherical crystallites is a result of the reduction of surface or interfacial energy from the amorphous, high-energy, non-equilibrium state of the as-prepared sample through diffusive rearrangement of the atoms. Proposed models for the mechanism of thin film agglomeration include one analogous to Rayleigh instability, and grain boundary grooving in polycrystalline films. These models, however, are not necessarily appropriate to explain the agglomeration in the sub-nanometer amorphous Ru layers in Ru/C multilayers. The Ru-C phase diagram shows a wide miscibility gap, which indicates a preference for phase separation between these two materials and provides an additional driving force for agglomeration. In this paper, we study the evolution of the microstructures and layered structure via in-situ Transmission Electron Microscopy (TEM), and attempt to determine the order of occurrence of agglomeration and crystallization in the Ru layers by observing the diffraction patterns.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Narjes Rohani ◽  
Changiz Eslahchi

Abstract Drug-Drug Interaction (DDI) prediction is one of the most critical issues in drug development and health. Proposing appropriate computational methods for predicting unknown DDIs with high precision is challenging. We propose "NDD: Neural network-based method for drug-drug interaction prediction" for predicting unknown DDIs using various information about drugs. Multiple drug similarities based on drug substructure, target, side effect, off-label side effect, pathway, transporter, and indication data are calculated. NDD first uses a heuristic similarity selection process and then integrates the selected similarities with a nonlinear similarity fusion method to obtain high-level features. Afterward, it uses a neural network for interaction prediction. The similarity selection and similarity integration parts of NDD have been proposed in previous studies of other problems. Our novelty is to combine these parts with a new neural network architecture and apply these approaches in the context of DDI prediction. We compared NDD with six machine learning classifiers and six state-of-the-art graph-based methods on three benchmark datasets. NDD achieved superior performance in cross-validation, with AUPR ranging from 0.830 to 0.947, AUC from 0.954 to 0.994 and F-measure from 0.772 to 0.902. Moreover, cumulative evidence from case studies on numerous drug pairs further confirms the ability of NDD to predict unknown DDIs. The evaluations corroborate that NDD is an efficient method for predicting unknown DDIs. The data and implementation of NDD are available at https://github.com/nrohani/NDD.


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. We can enhance the quality of medical images by intensifying the radiation dose. However, a high radiation dose provokes cancer, which forces experts to use limited radiation. Using abrupt radiation generates noise in CT scans. We propose an optimal Convolutional Neural Network model in which Gaussian noise is removed for better classification and increased training accuracy. Experiments on the 160 GB LUNA16 dataset show that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measurement, and area under the ROC curve (AUC) are taken as evaluation metrics. We compared the performance of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2117
Author(s):  
Hui Han ◽  
Zhiyuan Ren ◽  
Lin Li ◽  
Zhigang Zhu

Automatic modulation classification (AMC) is playing an increasingly important role in spectrum monitoring and cognitive radio. As communication and electronic technologies develop, the electromagnetic environment becomes increasingly complex. The high background noise level and large dynamic input have become the key problems for AMC. This paper proposes a feature fusion scheme based on deep learning, which attempts to fuse features from different domains of the input signal to obtain a more stable and efficient representation of the signal modulation types. We consider the complementarity among features that can be used to suppress the influence of the background noise interference and large dynamic range of the received (intercepted) signals. Specifically, the time-series signals are transformed into the frequency domain by Fast Fourier transform (FFT) and Welch power spectrum analysis, followed by the convolutional neural network (CNN) and stacked auto-encoder (SAE), respectively, for detailed and stable frequency-domain feature representations. Considering the complementary information in the time domain, the instantaneous amplitude (phase) statistics and higher-order cumulants (HOC) are extracted as the statistical features for fusion. Based on the fused features, a probabilistic neural network (PNN) is designed for automatic modulation classification. The simulation results demonstrate the superior performance of the proposed method. It is worth noting that the classification accuracy can reach 99.8% in the case when signal-to-noise ratio (SNR) is 0 dB.
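To make the time-domain statistical features mentioned in the abstract above more concrete, here is a minimal sketch of second- and fourth-order cumulants computed from complex baseband samples. It is an illustration under stated assumptions, not the authors' feature extractor: the function name `hoc_features`, the choice of the cumulants C20, C21, C40 and C42, and the noise-free test constellations are all hypothetical, and the abstract does not say which cumulant orders were used.

```python
import math


def _mean(values):
    """Arithmetic mean of a non-empty sequence of (possibly complex) numbers."""
    return sum(values) / len(values)


def hoc_features(samples):
    """Return selected second- and fourth-order cumulants of a signal.

    The samples are mean-removed first, so the zero-mean cumulant
    formulas apply:
        C20 = E[x^2]
        C21 = E[|x|^2]
        C40 = E[x^4] - 3 * C20^2
        C42 = E[|x|^4] - |C20|^2 - 2 * C21^2
    """
    mu = _mean(samples)
    x = [s - mu for s in samples]
    c20 = _mean([v * v for v in x])
    c21 = _mean([abs(v) ** 2 for v in x])
    m40 = _mean([v ** 4 for v in x])
    m42 = _mean([abs(v) ** 4 for v in x])
    c40 = m40 - 3 * c20 ** 2
    c42 = m42 - abs(c20) ** 2 - 2 * c21 ** 2
    return {"C20": c20, "C21": c21, "C40": c40, "C42": c42}


if __name__ == "__main__":
    # Unit-power, noise-free constellations used as toy inputs.
    bpsk = [1, -1] * 8
    a = 1 / math.sqrt(2)
    qpsk = [complex(a, a), complex(a, -a), complex(-a, a), complex(-a, -a)] * 4
    print("BPSK:", hoc_features(bpsk))
    print("QPSK:", hoc_features(qpsk))
```

Cumulants of this kind take different values for different constellations, which is why they are a common choice of statistical feature to fuse with learned frequency-domain representations.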


Author(s):  
Chen Qi ◽  
Shibo Shen ◽  
Rongpeng Li ◽  
Zhifeng Zhao ◽  
Qing Liu ◽  
...  

Abstract Nowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities such as sensing, imaging, classification and recognition. However, the computation-intensive nature of DNNs makes them difficult to apply on resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs by uncovering a more compact structure and learning the effective weights therein, without compromising the expressive capability of DNNs. In particular, our algorithm achieves efficient end-to-end training that directly transforms a redundant neural network into a compact one with a specifically targeted compression rate. We comprehensively evaluate our approach on various representative benchmark datasets and compare it with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme reduces its FLOPs (floating-point operations) and number of parameters by 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.
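The idea of pruning a network to a targeted compression rate can be illustrated with plain magnitude pruning. This sketch deliberately swaps in a simpler, well-known technique and is not the end-to-end algorithm proposed in the abstract above: the function name `magnitude_prune` and the toy weight vector are hypothetical, and the weights are a flat Python list rather than real network tensors.

```python
def magnitude_prune(weights, compression_rate):
    """Zero out the smallest-magnitude weights.

    compression_rate is the fraction of weights to remove (0.0 to 1.0).
    Returns a new list; the input list is left unchanged.
    """
    if not 0.0 <= compression_rate <= 1.0:
        raise ValueError("compression_rate must be between 0 and 1")
    n_prune = int(len(weights) * compression_rate)
    # Rank positions by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_positions = set(order[:n_prune])
    return [0.0 if i in pruned_positions else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    # Hypothetical weights, chosen only for illustration.
    toy_weights = [0.5, -0.1, 0.03, -0.8, 0.2, 0.01, -0.4, 0.07, 0.9, -0.02]
    pruned = magnitude_prune(toy_weights, compression_rate=0.5)
    kept = sum(1 for w in pruned if w != 0.0)
    print(pruned)
    print(f"kept {kept} of {len(pruned)} weights")
```

In practice a pruned network is usually fine-tuned afterwards to recover accuracy; the abstract's contribution is to reach the target rate within a single end-to-end training run instead.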


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yu Li ◽  
Zeling Xu ◽  
Wenkai Han ◽  
Huiluo Cao ◽  
Ramzan Umarov ◽  
...  

Abstract Background The spread of antibiotic resistance has become one of the most urgent threats to global health and is estimated to cause 700,000 deaths each year globally. Its surrogates, antibiotic resistance genes (ARGs), are highly transmittable between food, water, animals, and humans, and mitigate the efficacy of antibiotics. Accurately identifying ARGs is thus an indispensable step toward understanding the ecology and transmission of ARGs between environmental and human-associated reservoirs. Unfortunately, previous computational methods for identifying ARGs are mostly based on sequence alignment, which cannot identify novel ARGs, and their applications are limited by currently incomplete knowledge about ARGs. Results Here, we propose an end-to-end Hierarchical Multi-task Deep learning framework for ARG annotation (HMD-ARG). Taking raw sequence encoding as input, HMD-ARG can identify, without querying against existing sequence databases, multiple ARG properties simultaneously, including whether the input protein sequence is an ARG and, if so, what antibiotic family it is resistant to, what resistance mechanism the ARG takes, and whether the ARG is intrinsic or acquired. In addition, if the predicted antibiotic family is beta-lactamase, HMD-ARG further predicts the subclass of beta-lactamase that the ARG is resistant to. Comprehensive experiments, including cross-fold validation, third-party dataset validation in human gut microbiota, wet-experimental functional validation, and structural investigation of predicted conserved sites, demonstrate not only the superior performance of our method over state-of-the-art methods, but also the effectiveness and robustness of the proposed method. Conclusions We propose a hierarchical multi-task method, HMD-ARG, which is based on deep learning and can provide detailed annotations of ARGs from three important aspects: resistant antibiotic class, resistance mechanism, and gene mobility. We believe that HMD-ARG can serve as a powerful tool to identify antibiotic resistance genes and therefore mitigate their global threat. Our method and the constructed database are available at http://www.cbrc.kaust.edu.sa/HMDARG/.


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Gao ◽  
D Stojanovski ◽  
A Parker ◽  
P Marques ◽  
S Heitner ◽  
...  

Abstract Background Correctly identifying views acquired in a 2D echocardiographic examination is paramount to the post-processing and quantification steps often performed as part of most clinical workflows. In many exams, particularly in stress echocardiography, microbubble contrast is used, which greatly affects the appearance of the cardiac views. Here we present a bespoke, fully automated convolutional neural network (CNN) which identifies apical 2, 3, and 4 chamber, and short axis (SAX) views acquired with and without contrast. The CNN was tested on a completely independent, external dataset acquired in a different country from that used to train the neural network. Methods Training data comprising 2D echocardiograms were taken from 1014 subjects from a prospective multisite, multi-vendor, UK trial, with the number of frames in each view greater than 17,500. Prior to view classification model training, images were processed using standard techniques to ensure homogeneous and normalised image inputs to the training pipeline. A bespoke CNN was built using the minimum number of convolutional layers required, with batch normalisation and with dropout to reduce overfitting. Before processing, the data were split into 90% for model training (211,958 frames) and 10% for validation (23,946 frames). Image frames from different subjects were separated out entirely between the training and validation datasets. Further, a separate trial dataset of 240 studies acquired in the USA was used as an independent test dataset (39,401 frames). Results Figure 1 shows the confusion matrices for both validation data (left) and independent test data (right), with an overall accuracy of 96% and 95% for the validation and test datasets respectively. The accuracy for the non-contrast cardiac views of >99% exceeds that seen in other works. The combined datasets included images acquired across ultrasound manufacturers and models from 12 clinical sites. Conclusion We have developed a CNN capable of automatically and accurately identifying all relevant cardiac views used in "real world" echo exams, including views acquired with contrast. Use of the CNN in a routine clinical workflow could improve the efficiency of quantification steps performed after image acquisition. The CNN was tested on an independent dataset acquired in a different country from that used to train the model and was found to perform similarly, indicating the generalisability of the model. Figure 1. Confusion matrices. Funding Acknowledgement: Type of funding source: Private company. Main funding source(s): Ultromics Ltd.
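The Methods above note that frames from different subjects were kept entirely separate between training and validation. A minimal sketch of such a subject-level split follows. It is an assumption-laden illustration rather than the study's pipeline: the function name `split_by_subject`, the `(subject_id, frame_id)` tuple layout and the toy data are hypothetical, and the 10% fraction is applied to subjects here, whereas the abstract reports the split in terms of frames.

```python
import random


def split_by_subject(frames, val_fraction=0.1, seed=0):
    """Split frames so that no subject appears in both sets.

    frames is a list of (subject_id, frame_id) tuples. Subjects, not
    individual frames, are shuffled and assigned to the validation set,
    which prevents near-identical frames from one exam leaking across.
    """
    subjects = sorted({subject for subject, _ in frames})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_val = max(1, round(len(subjects) * val_fraction))
    val_subjects = set(subjects[:n_val])
    train = [f for f in frames if f[0] not in val_subjects]
    val = [f for f in frames if f[0] in val_subjects]
    return train, val


if __name__ == "__main__":
    # Toy data: 20 hypothetical subjects with 5 frames each.
    toy_frames = [(f"subject_{i:02d}", j) for i in range(20) for j in range(5)]
    train_set, val_set = split_by_subject(toy_frames)
    overlap = {s for s, _ in train_set} & {s for s, _ in val_set}
    print(len(train_set), len(val_set), "shared subjects:", len(overlap))
```

Splitting at the subject level tends to give a more honest validation accuracy than a random frame-level split, because consecutive frames of the same cine loop are highly correlated.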

