COVID-19 Multi-Targeted Drug Repurposing Using Few-Shot Learning

2021 ◽  
Vol 1 ◽  
Author(s):  
Yang Liu ◽  
You Wu ◽  
Xiaoke Shen ◽  
Lei Xie

The life-threatening disease COVID-19 has inspired significant efforts to discover novel therapeutic agents through repurposing of existing drugs. Although multi-targeted (polypharmacological) therapies are recognized as the most efficient approach to systemic diseases such as COVID-19, computational multi-targeted compound screening has been limited by the scarcity of high-quality experimental data and difficulties in extracting information from molecules. This study introduces MolGNN, a new deep learning model for molecular property prediction. MolGNN applies a graph neural network to computational learning of chemical molecule embeddings. Compared to state-of-the-art approaches that rely heavily on labeled experimental data, our method achieves equivalent or superior prediction performance without manual labels in the pretraining stage, and excellent performance on data with only a few labels. Our results indicate that MolGNN is robust to scarce training data and is hence a powerful few-shot learning tool. MolGNN predicted several multi-targeted molecules against both human Janus kinases and the SARS-CoV-2 main protease, which are preferential targets for drugs aimed, respectively, at alleviating the cytokine-storm symptoms of COVID-19 and suppressing viral replication. We also predicted molecules that potentially inhibit cell death induced by SARS-CoV-2. Several of MolGNN's top predictions are supported by existing experimental and clinical evidence, demonstrating the potential value of our method.
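The abstract does not give implementation details, but the core operation of a graph neural network embedding can be sketched in a few lines. The sketch below is a generic mean-aggregation message-passing layer with a mean-pooling readout, not MolGNN's actual architecture; the toy adjacency matrix, feature sizes, and random weights are all illustrative assumptions.

```python
import numpy as np

def message_passing_layer(adj, h, w):
    """One graph-convolution step: each atom aggregates neighbor features."""
    deg = adj.sum(axis=1, keepdims=True) + 1.0       # +1 accounts for the self-loop
    agg = (adj @ h + h) / deg                        # mean over neighbors plus self
    return np.maximum(agg @ w, 0.0)                  # linear transform + ReLU

def molecule_embedding(adj, atom_feats, weights):
    h = atom_feats
    for w in weights:
        h = message_passing_layer(adj, h, w)
    return h.mean(axis=0)                            # readout: mean over all atoms

# Toy "molecule": 3 atoms in a chain (0-1-2), 4 input features per atom
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
feats = np.eye(3, 4)
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 8)), rng.standard_normal((8, 8))]
emb = molecule_embedding(adj, feats, weights)
print(emb.shape)  # (8,)
```

A fixed-length vector like `emb` is what a downstream property predictor or few-shot classifier consumes, regardless of the molecule's size.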

Author(s):  
George Kolokolnikov ◽  
Anna Borde ◽  
Victor Skuratov ◽  
Roman Gaponov ◽  
Anastasiya Rumyantseva

The paper is devoted to the development of a QRS segmentation system based on a deep learning approach. The considered segmentation problem plays an important role in the automatic analysis of heart rhythms, which makes it possible to identify life-threatening pathologies. The main goal of the research is to choose the best segmentation pipeline in terms of accuracy and time efficiency. The process of ECG signal analysis is described, and the problem of QRS segmentation is discussed. State-of-the-art algorithms are analyzed in the literature review section, and the most prominent are chosen for further research. In the course of the research, four hypotheses about the appropriate deep learning model are checked: an LSTM-based model, a 2-input 1-dimensional CNN model, a “signal-to-picture” approach based on a 2-dimensional CNN, and the simplest 1-dimensional CNN model. All the architectures are tested, and their advantages and disadvantages are discussed. The proposed ECG segmentation pipeline is developed for Holter monitor software.
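As a minimal illustration of the building block shared by the 1-dimensional CNN hypotheses, the sketch below applies a single hand-written 1-D convolution (cross-correlation, as deep-learning "conv" layers compute it) to a toy signal containing one QRS-like spike. The signal, kernel, and sizes are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation: the basic operation of a 1-D CNN layer."""
    n, k = len(signal), len(kernel)
    return np.array([signal[i:i + k] @ kernel for i in range(n - k + 1)])

# Toy ECG: flat baseline with one sharp "QRS-like" spike around sample 50
ecg = np.zeros(100)
ecg[49:52] = [0.5, 1.0, 0.5]
edge_kernel = np.array([-1.0, 0.0, 1.0])  # derivative-like kernel highlights steep slopes
activation = np.abs(conv1d(ecg, edge_kernel))
print(int(np.argmax(activation)))  # 48 (strongest response at the spike's rising edge)
```

A trained CNN learns such kernels from data rather than hand-picking them; stacking several layers with nonlinearities lets the network respond to full QRS morphology rather than single edges.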


2021 ◽  
Author(s):  
Qiao Liu ◽  
Yue Qiu ◽  
Lei Xie

Chemical phenomics, which measures the multiplex chemical-induced phenotypic response of cells or patients, particularly dose-dependent transcriptomics and drug-response curves, provides new opportunities for in silico mechanism-driven phenotype-based drug discovery. However, state-of-the-art computational methods focus only on predicting a single phenotypic readout and are less successful in screening compounds for novel cells or individual patients. We designed a new deep learning model, MultiDCP, to enable, for the first time, high-throughput compound screening based on multiplex chemical phenomics, and to further expand the scope of chemical phenomics to unexplored cells and patients. The novelties of MultiDCP lie in a multi-task learning framework with a novel knowledge-driven autoencoder to integrate incoherent labeled and unlabeled omics data, and a teacher-student training strategy to exploit unreliable data. MultiDCP significantly outperforms the state-of-the-art for novel cell lines. The predicted chemical transcriptomics demonstrate a stronger predictive power than noisy experimental data for downstream tasks. We applied MultiDCP to repurpose individualized drugs for Alzheimer's disease, suggesting that MultiDCP is a potentially powerful tool for personalized medicine.
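The teacher-student strategy mentioned above can be illustrated with a deliberately tiny stand-in model: a teacher fitted on a small labeled set produces pseudo-labels for an unlabeled pool, and a student is then trained on both. The one-parameter "model" and the data here are assumptions for illustration only; MultiDCP itself is a deep multi-task network.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_slope(x, y):
    """Least-squares slope through the origin: a stand-in for 'training a model'."""
    return float(x @ y / (x @ x))

# Small labeled set (true relation y = 2x, with noise) and a larger unlabeled pool
x_lab = rng.uniform(0, 1, 10)
y_lab = 2.0 * x_lab + rng.normal(0, 0.05, 10)
x_unlab = rng.uniform(0, 1, 200)

teacher = fit_slope(x_lab, y_lab)        # teacher trained on labeled data only
pseudo = teacher * x_unlab               # teacher labels the unlabeled pool
student = fit_slope(np.concatenate([x_lab, x_unlab]),
                    np.concatenate([y_lab, pseudo]))  # student sees both
print(round(student, 2))
```

The student recovers a slope close to the true value of 2; with richer models, training on the larger pseudo-labeled pool can regularize the student and exploit data the teacher never had labels for.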


2020 ◽  
Author(s):  
Shruti Koulgi ◽  
Vinod Jani ◽  
Mallikarjunachari Uppuladinne ◽  
Uddhavesh Sonavane ◽  
Asheet Kumar Nath ◽  
...  

<p>The COVID-19 pandemic has been responsible for several deaths worldwide. The causative agent behind this disease is the Severe Acute Respiratory Syndrome – novel Coronavirus 2 (SARS-nCoV2). SARS-nCoV2 belongs to the category of RNA viruses. The main protease, responsible for the cleavage of the viral polyprotein, is considered one of the hot targets for treating COVID-19. Earlier reports suggest the use of HIV anti-viral drugs for targeting the main protease of SARS-CoV, which caused SARS in 2002-03. Hence, a drug repurposing approach may prove useful in targeting the main protease of SARS-nCoV2. The high-resolution crystal structure of 3CL<sup>pro</sup> (main protease) of SARS-nCoV2 (PDB ID: 6LU7) was used as the target. The Food and Drug Administration (FDA)-approved and SWEETLEAD databases of drug molecules were screened. The apo form of the main protease was simulated for a cumulative 150 ns, and 10 μs of open-source simulation data were used, to obtain conformations for ensemble docking. The representative structures for docking were selected using RMSD-based clustering and Markov State Modeling analysis. This ensemble docking approach helped in exploring the conformational variation in the drug-binding site of the main protease, leading to efficient binding of more relevant drug molecules. The drugs obtained as best hits from the ensemble docking possessed anti-bacterial and anti-viral properties. Small molecules with these properties may prove useful for treating symptoms exhibited in COVID-19. This <i>in silico</i> ensemble docking approach would support the identification of potential candidates for repurposing against COVID-19.</p>
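RMSD-based clustering of trajectory frames, one of the selection steps described above, can be sketched as a greedy leader-clustering pass over pairwise RMSD values. The toy coordinates and cutoff below are illustrative assumptions (frames are taken as pre-aligned); the study additionally used Markov State Modeling, which is not shown.

```python
import numpy as np

def rmsd(a, b):
    """Coordinate RMSD between two conformations (assumed pre-aligned)."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def leader_cluster(frames, cutoff):
    """Greedy clustering: a frame joins the first representative within cutoff."""
    reps = []
    for f in frames:
        if not any(rmsd(f, r) < cutoff for r in reps):
            reps.append(f)
    return reps

rng = np.random.default_rng(2)
base = rng.uniform(-5, 5, (10, 3))  # 10 "atoms" in 3-D
# Two conformational states: small fluctuations around two distinct geometries
state_a = [base + rng.normal(0, 0.1, base.shape) for _ in range(5)]
state_b = [base + 3.0 + rng.normal(0, 0.1, base.shape) for _ in range(5)]
reps = leader_cluster(state_a + state_b, cutoff=1.0)
print(len(reps))  # 2 -> one docking target per conformational state
```

Docking against each representative, rather than a single crystal structure, is what lets the ensemble approach capture binding-site flexibility.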


2019 ◽  
Author(s):  
Liwei Cao ◽  
Danilo Russo ◽  
Vassilios S. Vassiliadis ◽  
Alexei Lapkin

<p>A mixed-integer nonlinear programming (MINLP) formulation for symbolic regression was proposed to identify physical models from noisy experimental data. The formulation was tested using numerical models and was found to be more efficient than the previous literature example with respect to the number of predictor variables and training data points. The globally optimal search was extended to identify physical models and to cope with noise in the experimental data predictor variable. The methodology was coupled with the collection of experimental data in an automated fashion, and was proven to be successful in identifying the correct physical models describing the relationship between the shear stress and shear rate for both Newtonian and non-Newtonian fluids, and simple kinetic laws of reactions. Future work will focus on addressing the limitations of the formulation presented in this work, by extending it to be able to address larger complex physical models.</p>
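The idea of searching over symbolic model structures can be illustrated by a drastically simplified stand-in: enumerate a few candidate forms for the shear stress-shear rate relationship, fit each coefficient by least squares, and keep the form with the lowest error. The MINLP formulation in the paper performs this structure search as a global optimization; the candidate set and data below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy "experimental" data from a Newtonian fluid: stress = mu * shear_rate
rate = np.linspace(0.1, 10, 50)
stress = 2.0 * rate + rng.normal(0, 0.05, rate.size)

# Candidate symbolic forms; the MINLP searches such structures globally,
# replaced here by plain enumeration for illustration
candidates = {
    "a*g": rate,
    "a*g**2": rate ** 2,
    "a*sqrt(g)": np.sqrt(rate),
}

def fit_and_score(basis):
    a = float(basis @ stress / (basis @ basis))  # least-squares coefficient
    return a, float(((a * basis - stress) ** 2).mean())

best = min(candidates, key=lambda k: fit_and_score(candidates[k])[1])
print(best)  # a*g  (the Newtonian law is recovered from noisy data)
```

The real difficulty, which motivates the MINLP approach, is that the space of symbolic expressions grows combinatorially and cannot be enumerated exhaustively for larger models.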


2020 ◽  
Vol 17 (6) ◽  
pp. 847-856
Author(s):  
Shengbing Ren ◽  
Xiang Zhang

The problem of synthesizing adequate inductive invariants lies at the heart of automated software verification. State-of-the-art machine learning algorithms for synthesizing invariants have gradually shown excellent performance. However, synthesizing disjunctive invariants is a difficult task. In this paper, we propose a method, k++SVM, integrating k-means++ and Support Vector Machines (SVM) to synthesize conjunctive and disjunctive invariants. First, given a program, we execute it to collect program states. Next, k++SVM adopts k-means++ to cluster the positive samples and then applies SVM to distinguish each positive-sample cluster from all negative samples to synthesize the candidate invariants. Finally, a set of theories founded on Hoare logic is adopted to check whether the candidate invariants are true invariants. If the candidate invariants fail the check, we sample more states and repeat the algorithm. The experimental results show that k++SVM is compatible with the algorithms for Intersection Of Half-space (IOH) and more efficient than the Interproc tool. Furthermore, it is shown that our method can synthesize conjunctive and disjunctive invariants automatically.
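A minimal sketch of the clustering half of the pipeline: positive program states are grouped with a deterministic farthest-first variant of k-means++ seeding plus one Lloyd update, and each cluster is separated from the negatives by a simple midpoint hyperplane standing in for the SVM, yielding a disjunctive candidate invariant. The 2-D states and all numeric choices are illustrative assumptions; the actual SVM training and the Hoare-logic check are omitted.

```python
import numpy as np

rng = np.random.default_rng(4)

def seed_centers(points, k):
    """Farthest-first seeding, a deterministic variant of k-means++ seeding."""
    centers = [points[0]]
    while len(centers) < k:
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d2)])
    return np.array(centers)

# Program states: positives form two disjoint regions, negatives sit between them
pos = np.vstack([rng.normal(-5, 0.5, (30, 2)), rng.normal(5, 0.5, (30, 2))])
neg = rng.normal(0, 0.5, (30, 2))

centers = seed_centers(pos, k=2)
# One Lloyd assignment/update step moves the seeds to the cluster means
labels = np.argmin([((pos - c) ** 2).sum(axis=1) for c in centers], axis=0)
centers = np.array([pos[labels == j].mean(axis=0) for j in range(2)])

def halfspace(center):
    """Midpoint hyperplane between one positive cluster and the negative mean."""
    m = neg.mean(axis=0)
    w = center - m
    b = w @ (center + m) / 2.0
    return lambda x, w=w, b=b: x @ w > b

# Candidate disjunctive invariant: a state satisfies at least one half-space
invariant = [halfspace(c) for c in centers]
holds = lambda x: any(h(x) for h in invariant)
print(all(holds(p) for p in pos), any(holds(n) for n in neg))
```

The disjunction over per-cluster separators is exactly what a single linear classifier cannot express, which is why clustering before separation enables disjunctive invariants.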


2020 ◽  
Vol 227 ◽  
pp. 02012
Author(s):  
R. S. Sidhu ◽  
R. J. Chen ◽  
Yu. A. Litvinov ◽  
Y. H. Zhang ◽  

The re-analysis of experimental data on mass measurements of uranium fission products obtained at the ESR in 2002 is discussed. State-of-the-art data analysis procedures developed for such measurements are employed.


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they rely strongly on the augmentation of unannotated data, a strategy that remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNNs) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only on the most challenging dataset, acoustic scene classification, showing that there is still room for improvement.
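FixMatch's central rule, independent of the data domain, is to pseudo-label an unlabeled example with the model's prediction on a weakly augmented view only when that prediction is confident, and then to train on a strongly augmented view of the same example. The confidence-masking step can be sketched as follows; the probabilities below are illustrative, and 0.95 is the threshold used in the original FixMatch paper.

```python
import numpy as np

def fixmatch_mask(weak_probs, threshold=0.95):
    """Keep an unlabeled sample only if the prediction on its weakly augmented
    view is confident; that prediction becomes its pseudo-label."""
    conf = weak_probs.max(axis=1)
    pseudo = weak_probs.argmax(axis=1)
    return pseudo, conf >= threshold

# Model outputs (softmax probabilities) on weakly augmented unlabeled clips
weak_probs = np.array([
    [0.98, 0.01, 0.01],  # confident -> trained on via its strong augmentation
    [0.50, 0.30, 0.20],  # uncertain -> discarded this step
    [0.02, 0.97, 0.01],
])
pseudo, mask = fixmatch_mask(weak_probs)
print(pseudo[mask])  # [0 1]
```

Because only confident samples contribute to the unsupervised loss, the quality of the augmentations (the focus of the selection approach introduced in this work) directly determines how many samples survive the mask.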


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 567
Author(s):  
Donghun Yang ◽  
Kien Mai Mai Ngoc ◽  
Iksoo Shin ◽  
Kyong-Ha Lee ◽  
Myunggwon Hwang

To design an efficient deep learning model that can be used in the real world, it is important to detect out-of-distribution (OOD) data well. Various studies have been conducted to solve the OOD problem. The current state-of-the-art approach uses a confidence score based on the Mahalanobis distance in a feature space. Although it outperformed previous approaches, its results were sensitive to the quality of the trained model and the complexity of the dataset. Herein, we propose a novel OOD detection method that trains a feature space better suited to OOD detection. The proposed method uses an ensemble of features trained with a softmax-based classifier and a network based on distance metric learning (DML). Through the complementary interaction of these two networks, the trained feature space has a more tightly clustered distribution and fits a Gaussian distribution well for each class. Therefore, OOD data can be efficiently detected by setting a threshold in the trained feature space. To evaluate the proposed method, we applied it to various combinations of image datasets. The results show that the overall performance of the proposed approach is superior to that of other methods, including the state-of-the-art approach, on any combination of datasets.
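The Mahalanobis-distance confidence score that this line of work builds on can be sketched directly: estimate per-class means and a tied covariance in the feature space, and score a sample by its distance to the closest class mean. The 2-D features below are illustrative assumptions; the proposed method's DML ensemble is not shown.

```python
import numpy as np

rng = np.random.default_rng(5)

# Training features for two in-distribution classes in a 2-D feature space
class_feats = [rng.normal([0, 0], 0.5, (200, 2)), rng.normal([4, 4], 0.5, (200, 2))]
means = [f.mean(axis=0) for f in class_feats]
# Shared ("tied") covariance across classes, as in the Mahalanobis-based baseline
centered = np.vstack([f - m for f, m in zip(class_feats, means)])
cov_inv = np.linalg.inv(centered.T @ centered / len(centered))

def ood_score(x):
    """Squared Mahalanobis distance to the closest class mean; large => likely OOD."""
    return min(float((x - m) @ cov_inv @ (x - m)) for m in means)

in_dist, out_dist = np.array([0.1, -0.2]), np.array([10.0, -8.0])
print(ood_score(in_dist) < ood_score(out_dist))  # True
```

The score works well exactly when class features are tightly clustered and roughly Gaussian, which is why the proposed training scheme shapes the feature space toward that structure before applying the threshold.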


2021 ◽  
Vol 11 (11) ◽  
pp. 4894
Author(s):  
Anna Scius-Bertrand ◽  
Michael Jungo ◽  
Beat Wolf ◽  
Andreas Fischer ◽  
Marc Bui

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between the text in a scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems that reduce the required amount of annotation. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages generated with a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.
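The self-training loop can be illustrated with a deliberately simple 1-D stand-in for the detector: a classifier trained on "synthetic" data labels the shifted "real" data, and retraining on its own predictions moves the decision boundary toward the real distribution. All data and the nearest-class-mean classifier below are assumptions for illustration; the paper uses a YOLOv5m object detector.

```python
import numpy as np

rng = np.random.default_rng(6)

# "Synthetic printed" training set: 1-D feature, two character classes
syn_x = np.concatenate([rng.normal(0, 0.3, 50), rng.normal(2, 0.3, 50)])
syn_y = np.array([0] * 50 + [1] * 50)
# "Real manuscript" data is shifted (different writing style); labels unknown
real_x = np.concatenate([rng.normal(0.8, 0.3, 50), rng.normal(2.8, 0.3, 50)])
real_y = np.array([0] * 50 + [1] * 50)  # held out, used only for evaluation

def fit_threshold(x, y):
    """Nearest-class-mean classifier: decision boundary at the midpoint."""
    return (x[y == 0].mean() + x[y == 1].mean()) / 2.0

thr = fit_threshold(syn_x, syn_y)   # trained on synthetic pages only
for _ in range(3):                  # self-training rounds
    pseudo = (real_x > thr).astype(int)  # model labels the real manuscripts
    thr = fit_threshold(np.concatenate([syn_x, real_x]),
                        np.concatenate([syn_y, pseudo]))
acc = float(((real_x > thr).astype(int) == real_y).mean())
print(acc)
```

Each round, the model's own confident structure in the real data pulls the boundary away from its synthetic starting point, which is the same mechanism that improves character detection accuracy in the paper.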


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Rao ◽  
Y Li ◽  
R Ramakrishnan ◽  
A Hassaine ◽  
D Canoy ◽  
...  

Abstract Background/Introduction Predicting incident heart failure has been challenging. Deep learning models, when applied to rich electronic health records (EHR), offer some theoretical advantages. However, empirical evidence for their superior performance is limited, and they commonly remain uninterpretable, hampering their wider use in medical practice. Purpose We developed a deep learning framework for more accurate and yet interpretable prediction of incident heart failure. Methods We used longitudinally linked EHR from practices across England, involving 100,071 patients, 13% of whom had been diagnosed with incident heart failure during follow-up. We investigated the predictive performance of a novel transformer deep learning model, “Transformer for Heart Failure” (BEHRT-HF), and validated it using both an external held-out dataset and an internal five-fold cross-validation mechanism, using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Predictor groups included all outpatient and inpatient diagnoses within their temporal context, medications, age, and calendar year for each encounter. By treating diagnoses as anchors, we alternately removed different modalities (ablation study) to understand the importance of individual modalities to the performance of incident heart failure prediction. Using perturbation-based techniques, we investigated the importance of associations between selected predictors and heart failure to improve model interpretability. Results BEHRT-HF achieved high accuracy with AUROC 0.932 and AUPRC 0.695 for external validation, and AUROC 0.933 (95% CI: 0.928, 0.938) and AUPRC 0.700 (95% CI: 0.682, 0.718) for internal validation. Compared to the state-of-the-art recurrent deep learning model RETAIN-EX, BEHRT-HF performed better by 0.079 AUPRC and 0.030 AUROC. The ablation study showed that medications were strong predictors and that calendar year was more important than age.
Utilising perturbation, we identified and ranked the intensity of associations between diagnoses and heart failure. For instance, the method showed that established risk factors, including myocardial infarction, atrial fibrillation and flutter, and hypertension, were all strongly associated with heart failure prediction. Additionally, when the population was stratified into age groups, the incident occurrence of a given disease generally contributed more to heart failure prediction at younger ages than when diagnosed later in life. Conclusions Our state-of-the-art deep learning framework outperforms the predictive performance of existing models whilst enabling a data-driven way of exploring the relative contribution of a range of risk factors in the context of other temporal information. Funding Acknowledgement Type of funding source: Private grant(s) and/or Sponsorship. Main funding source(s): National Institute for Health Research, Oxford Martin School, Oxford Biomedical Research Centre
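The perturbation-based interpretation described above can be sketched generically: mask one input at a time and measure the drop in the model's risk score. The toy logistic "risk model", its weights, and the feature names below are hypothetical, chosen only to mirror the kind of ranking reported (e.g. myocardial infarction contributing strongly); BEHRT-HF perturbs inputs of a transformer, not a linear model.

```python
import numpy as np

def perturbation_importance(model, x, feature_names):
    """Importance of each input = drop in the model's risk score when that
    input is masked (set to zero), as in perturbation-based interpretation."""
    base = model(x)
    scores = {}
    for i, name in enumerate(feature_names):
        masked = x.copy()
        masked[i] = 0.0
        scores[name] = base - model(masked)
    return scores

# Hypothetical logistic risk model: MI weighs far more than age in this toy
weights = np.array([3.0, 2.0, 0.2])
model = lambda x: float(1 / (1 + np.exp(-(x @ weights - 2.0))))
patient = np.array([1.0, 1.0, 0.5])  # MI: yes, AF: yes, age (scaled)
scores = perturbation_importance(model, patient, ["MI", "AF", "age"])
print(max(scores, key=scores.get))  # MI
```

Ranking these per-feature drops across many patients, and within age strata, yields the kind of association intensities reported in the Results.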

