A merged molecular representation learning for molecular properties prediction with a web-based service

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hyunseob Kim ◽  
Jeongcheol Lee ◽  
Sunil Ahn ◽  
Jongsuk Ruth Lee

Abstract Deep learning has brought dramatic developments in molecular property prediction, which is crucial in drug discovery, using various representations such as fingerprints, SMILES, and graphs. In particular, SMILES is used in various deep learning models via character-based approaches. However, SMILES has the limitation that it does not readily reflect chemical properties. In this paper, we propose a new self-supervised method to learn SMILES and the chemical context of molecules simultaneously while pre-training a Transformer. The key ideas of our model are learning molecular structure through adjacency-matrix embedding and learning logic that can infer descriptors via Quantitative Estimation of Drug-likeness (QED) prediction during pre-training. As a result, our method improves generalization and achieves the best average performance across benchmark downstream tasks. Moreover, we develop a web-based fine-tuning service so that our model can be applied to various tasks.
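Below is a minimal, illustrative PyTorch sketch of the pre-training idea described above: a Transformer encoder whose token embeddings are merged with an adjacency-matrix embedding and trained with an auxiliary QED-regression head. The tokenisation, layer sizes, token-to-atom alignment, and loss weighting are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MergedMoleculeEncoder(nn.Module):
    """Transformer encoder over SMILES tokens plus an adjacency-matrix embedding."""
    def __init__(self, vocab_size=100, d_model=256, max_atoms=64, n_layers=6):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # project each position's adjacency row into the model dimension
        self.adj_proj = nn.Linear(max_atoms, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mlm_head = nn.Linear(d_model, vocab_size)  # masked-token recovery
        self.qed_head = nn.Linear(d_model, 1)           # auxiliary QED regression

    def forward(self, token_ids, adjacency):
        # token_ids: (B, L) SMILES tokens; adjacency: (B, L, max_atoms),
        # assuming tokens have been aligned to atoms (a simplification)
        x = self.token_emb(token_ids) + self.adj_proj(adjacency)
        h = self.encoder(x)
        token_logits = self.mlm_head(h)                          # structure objective
        qed_pred = torch.sigmoid(self.qed_head(h.mean(dim=1)))   # descriptor objective
        return token_logits, qed_pred.squeeze(-1)

# Pre-training would combine a masked-token cross-entropy loss with an MSE loss
# between qed_pred and externally computed QED values.
```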

2020 ◽  
Author(s):  
Hyunseob Kim ◽  
Jeongcheol Lee ◽  
Sunil Ahn ◽  
Jongsuk Lee

Abstract Deep learning has brought dramatic developments in molecular property prediction, which is crucial in the field of drug discovery. Various methods, such as fingerprints, SMILES, and graphs, have been proposed for representing molecules. Recently, unlabeled molecular data has been used in various pre-training methods to improve performance. The main challenge of molecular property prediction is designing a data representation and model that perform well across diverse datasets. However, data scarcity causes performance to vary between datasets when constructing such models. We propose a new self-supervised method that learns the characteristics and structures of molecules by integrating existing methods. The key ideas of our model are learning structures with matrix embedding and learning logic that can infer descriptors via QED prediction. As a result, our method improves generalization and achieves the best average performance across benchmark downstream tasks. Moreover, we develop a web-based fine-tuning service so that our model can be applied to various tasks.
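As a concrete illustration of the QED-based self-supervision mentioned above, the sketch below computes QED values from SMILES strings with RDKit so they can serve as regression targets during pre-training; the function name and usage are illustrative assumptions.

```python
from rdkit import Chem
from rdkit.Chem import QED

def qed_labels(smiles_list):
    """Compute QED drug-likeness scores to use as self-supervised targets."""
    labels = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        # skip unparsable SMILES rather than fabricating a label
        labels.append(QED.qed(mol) if mol is not None else None)
    return labels

print(qed_labels(["CCO", "c1ccccc1O"]))  # e.g. ethanol and phenol
```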


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention for the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires relatively few parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
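The abstract does not specify CovidNet's exact architecture, so the sketch below is only an illustration of the general idea: a compact CNN that takes single-channel (grayscale) inputs directly, avoiding the channel replication needed by RGB-pretrained networks. All layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class SmallGrayscaleCNN(nn.Module):
    """Compact CNN with a single-channel input, suitable for small grayscale datasets."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):  # x: (B, 1, H, W) grayscale images
        return self.classifier(self.features(x).flatten(1))

model = SmallGrayscaleCNN()
print(sum(p.numel() for p in model.parameters()))  # modest parameter count
```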


2021 ◽  
Author(s):  
Benjamin Kellenberger ◽  
Devis Tuia ◽  
Dan Morris

Ecological research like wildlife censuses increasingly relies on data at the scale of terabytes. For example, modern camera trap datasets contain millions of images that require prohibitive amounts of manual labour to be annotated with species, bounding boxes, and the like. Machine learning, especially deep learning [3], could greatly accelerate this task through automated predictions, but involves extensive coding and expert knowledge.

In this abstract we present AIDE, the Annotation Interface for Data-driven Ecology [2]. First, AIDE is a web-based annotation suite for image labelling with support for concurrent access and scalability, up to the cloud. Second, it tightly integrates deep learning models into the annotation process through active learning [7], where models learn from user-provided labels and in turn select the most relevant images for review from the large pool of unlabelled ones (Fig. 1). The result is a system where users only need to label what is required, which saves time and decreases errors due to fatigue.

[Fig. 1: AIDE offers concurrent web image labelling support and uses annotations and deep learning models in an active learning loop.]

AIDE includes a comprehensive set of built-in models, such as ResNet [1] for image classification, Faster R-CNN [5] and RetinaNet [4] for object detection, and U-Net [6] for semantic segmentation. All models can be customised and used without having to write a single line of code. Furthermore, AIDE accepts any third-party model with minimal implementation requirements. To complete the package, AIDE offers evaluation of both user annotations and model predictions, access control, customisable model training, and more, all through the web browser.

AIDE is fully open source and available at https://github.com/microsoft/aerial_wildlife_detection.
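For readers unfamiliar with the kind of active-learning loop AIDE automates, the sketch below shows one common selection strategy (least-confidence sampling). It is a generic illustration, not AIDE's actual API; the data loader format and scoring rule are assumptions.

```python
import torch

def select_for_review(model, unlabelled_loader, k=100, device="cpu"):
    """Return indices of the k unlabelled images the model is least confident about."""
    model.eval()
    scores, indices = [], []
    with torch.no_grad():
        for idx, images in unlabelled_loader:  # assumed (index, image) batches
            probs = torch.softmax(model(images.to(device)), dim=1)
            scores.append(probs.max(dim=1).values.cpu())  # top-class confidence
            indices.append(idx)
    scores, indices = torch.cat(scores), torch.cat(indices)
    order = torch.argsort(scores)[:k]          # lowest confidence first
    return indices[order].tolist()             # send these images to annotators next
```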


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Yannan Yu ◽  
Soren Christensen ◽  
Yuan Xie ◽  
Enhao Gong ◽  
Maarten G Lansberg ◽  
...  

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with the gold standard, diffusion-weighted imaging (DWI). We evaluated whether a deep learning model trained to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Method: Using the multi-center CRISP cohort of acute ischemic stroke patients with CTP before thrombectomy, we included patients with major reperfusion (TICI score ≥2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters including Tmax, mean transit time, cerebral blood flow (CBF), and cerebral blood volume were reconstructed with RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. An MRI model previously trained on a separate group of patients, which used MRP parameters as input and the RAPID ischemic core on DWI as ground truth, served as the starting point. We fine-tuned this model using CTP parameters as input and the follow-up MRI as ground truth. Another model was trained from scratch with only CTP data. Five-fold cross-validation was used. The models were compared with the ischemic core (rCBF ≤30%) from RAPID software for identifying the presence of a large infarct (volume >70 ml or >100 ml). Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67±15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% for infarcts >70 ml and 69% for >100 ml; the MRI model fine-tuned on CT improved the agreement to 77% and 73%; the CT model trained from scratch had agreements of 73% and 71%. All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and Figure. Conclusions: It is feasible to apply an MRP-based deep learning model to CTP. Fine-tuning with CTP data further improves the predictions. All deep learning models predicted the stroke lesion after major recanalization better than thresholding approaches based on rCBF.
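A conceptual sketch of the fine-tuning step described above is given below: weights from a model pre-trained on MR perfusion maps are re-used and updated on CT perfusion inputs with a small learning rate. The network, loss, loaders, and hyper-parameters are placeholders, not the study's code.

```python
import torch
import torch.nn as nn

def fine_tune_on_ctp(pretrained_model, ctp_loader, epochs=20, lr=1e-4, device="cpu"):
    """pretrained_model: network trained on MRP maps; ctp_loader yields (ctp_maps, core_mask)."""
    model = pretrained_model.to(device).train()
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()  # voxel-wise core vs. non-core
    for _ in range(epochs):
        for ctp_maps, core_mask in ctp_loader:
            optimiser.zero_grad()
            loss = criterion(model(ctp_maps.to(device)), core_mask.to(device))
            loss.backward()
            optimiser.step()
    return model
```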


2021 ◽  
pp. 1-38
Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

Abstract Nowadays, deep learning models have been widely adopted and have achieved promising results in various application domains. Despite their strong performance, most deep learning models function as black boxes, lacking explicit reasoning capabilities and explanations, which are usually essential for complex problems. Take joint inference in information extraction as an example. This task requires identifying multiple, inter-correlated types of structured knowledge from text, including entities, events, and the relationships between them. Various deep neural networks have been proposed to jointly perform entity extraction and relation prediction, but they only propagate information implicitly via representation learning and fail to encode the strong correlations between entity types and relations that enforce their co-existence. On the other hand, some approaches adopt rules to explicitly constrain certain relational facts. However, separating the rules from representation learning usually leaves such approaches prone to error propagation. Moreover, pre-defined rules are inflexible and can have negative effects when the data are noisy. To address these limitations, we propose a variational deep logic network that combines representation learning and relational reasoning via the variational EM algorithm. The model consists of a deep neural network that learns high-level features with implicit interactions via the self-attention mechanism and a relational logic network that explicitly exploits target interactions. These two components are trained alternately to bring together the best of both worlds. We conduct extensive experiments ranging from fine-grained sentiment term extraction and end-to-end relation prediction to end-to-end event extraction to demonstrate the effectiveness of the proposed method.
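The sketch below gives a high-level picture of the alternating training scheme described above: one step updates the relational logic component given the neural network's current beliefs, and the other updates the neural network against targets refined by the logic component. The module interfaces (negative_elbo, refine) and the loaders are placeholder assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def variational_em_train(neural_net, logic_net, loader, epochs=10, lr=1e-4):
    opt_nn = torch.optim.Adam(neural_net.parameters(), lr=lr)
    opt_logic = torch.optim.Adam(logic_net.parameters(), lr=lr)
    for _ in range(epochs):
        for tokens, labels in loader:
            # E-like step: fit the logic network to the neural network's current beliefs
            with torch.no_grad():
                beliefs = torch.softmax(neural_net(tokens), dim=-1)
            logic_loss = logic_net.negative_elbo(beliefs, labels)   # assumed method
            opt_logic.zero_grad(); logic_loss.backward(); opt_logic.step()

            # M-like step: train the neural network on targets refined by the logic network
            with torch.no_grad():
                refined = logic_net.refine(beliefs)                 # assumed method
            nn_loss = F.cross_entropy(neural_net(tokens), refined.argmax(dim=-1))
            opt_nn.zero_grad(); nn_loss.backward(); opt_nn.step()
    return neural_net, logic_net
```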


Author(s):  
Yi-Ning Juan ◽  
Yi-Shyuan Chiang ◽  
Shang-Chuan Liu ◽  
Ming-Feng Tsai ◽  
Chuan-Ju Wang

In this demonstration, we develop an interactive tool, HIVE, to demonstrate the ability and versatility of an explainable risk ranking model, with a special focus on financial use cases. HIVE is a web-based tool that provides users with automatically highlighted financial statements and is designed to make comparing statements more efficient. Moreover, with the proposed tool, users can find related reports with ease, and we believe that HIVE can benefit both academics and practitioners in finance, as they can explore deep learning models and gain new insights.


Author(s):  
Sukkrit Sharma ◽  
Vineet Batta ◽  
Malathy Chidambaranathan ◽  
Prabhakaran Mathialagan ◽  
Gayathri Mani ◽  
...  

2020 ◽  
Author(s):  
Cuong Q. Nguyen ◽  
Constantine Kreatsoulas ◽  
Kim M. Branson

Building in silico models to predict chemical properties and activities is a crucial step in drug discovery. However, drug discovery projects are often characterized by limited labeled data, hindering the application of deep learning in this setting. Meanwhile, advances in meta-learning have enabled state-of-the-art performance on few-shot learning benchmarks, naturally prompting the question: can meta-learning improve deep learning performance in low-resource drug discovery projects? In this work, we assess the efficiency of the Model-Agnostic Meta-Learning (MAML) algorithm, along with its variants FO-MAML and ANIL, at learning to predict chemical properties and activities. Using the ChEMBL20 dataset to emulate low-resource settings, our benchmark shows that meta-initializations perform comparably to or outperform multi-task pre-training baselines on 16 out of 20 in-distribution tasks and on all out-of-distribution tasks, providing average improvements in AUPRC of 7.2% and 14.9%, respectively. Finally, we observe that meta-initializations consistently result in the best-performing models across fine-tuning sets with k ∈ {16, 32, 64, 128, 256} instances.
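For reference, the sketch below shows the general shape of a first-order MAML update (closest to FO-MAML, since second-order gradients are omitted): adapt a copy of the model on each task's support set, evaluate on the query set, and accumulate the outer-loop gradient. The task format, model, and loss are assumptions, not the paper's code.

```python
import copy
import torch
import torch.nn.functional as F

def fo_maml_step(model, meta_optimiser, tasks, inner_lr=0.01, inner_steps=1):
    """One meta-update over a batch of tasks, each given as (support_x, support_y, query_x, query_y)."""
    meta_optimiser.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        learner = copy.deepcopy(model)                 # task-specific copy of the meta-model
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                   # inner-loop adaptation on the support set
            loss = F.binary_cross_entropy_with_logits(learner(support_x), support_y)
            inner_opt.zero_grad(); loss.backward(); inner_opt.step()
        query_loss = F.binary_cross_entropy_with_logits(learner(query_x), query_y)
        query_loss.backward()                          # gradients w.r.t. the adapted weights
        # first-order approximation: accumulate adapted-model gradients onto the meta-model
        for meta_p, task_p in zip(model.parameters(), learner.parameters()):
            g = task_p.grad.detach().clone()
            meta_p.grad = g if meta_p.grad is None else meta_p.grad + g
    meta_optimiser.step()                              # outer-loop meta-update
```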

