A new computational drug repurposing method using established disease–drug pair knowledge

2019 ◽  
Vol 35 (19) ◽  
pp. 3672-3678 ◽  
Author(s):  
Nafiseh Saberian ◽  
Azam Peyvandipour ◽  
Michele Donato ◽  
Sahar Ansari ◽  
Sorin Draghici

Abstract. Motivation: Drug repurposing is a potential alternative to the classical drug discovery pipeline. Repurposing involves finding novel indications for already approved drugs. In this work, we present a novel machine learning-based method for drug repurposing. This method explores the anti-similarity between drugs and a disease to uncover new uses for the drugs. More specifically, our proposed method takes into account three sources of information: (i) large-scale gene expression profiles corresponding to human cell lines treated with small molecules, (ii) the gene expression profile of a human disease and (iii) the known relationship between Food and Drug Administration (FDA)-approved drugs and diseases. Using these data, our proposed method learns a similarity metric through a supervised machine learning-based algorithm such that a disease and its associated FDA-approved drugs have a smaller distance than the other disease-drug pairs. Results: We validated our framework by showing that the proposed method, which incorporates a distance metric learning technique, can retrieve FDA-approved drugs for their approved indications. Once validated, we used our approach to identify a few strong candidates for repurposing. Availability and implementation: The R scripts are available on demand from the authors. Supplementary information: Supplementary data are available at Bioinformatics online.
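
The anti-similarity idea in the abstract above can be illustrated with a small sketch. This is not the authors' learned metric (their method is supervised and implemented in R); it only shows the unlearned baseline of ranking drugs by how strongly their expression signature opposes a disease signature. The function names, drug names and toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    # A constant signature carries no direction, so treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def rank_by_anti_similarity(disease_signature, drug_signatures):
    """Return drug names ordered from most anti-correlated to most correlated."""
    scores = {
        name: pearson(disease_signature, signature)
        for name, signature in drug_signatures.items()
    }
    return sorted(scores, key=scores.get)


# Toy log fold-changes over four genes (made-up numbers, for illustration only).
disease = [2.0, 1.5, -1.0, -2.0]
drugs = {
    "drug_a": [-2.0, -1.5, 1.0, 2.0],  # reverses the disease signature
    "drug_b": [2.0, 1.5, -1.0, -2.0],  # mimics the disease signature
    "drug_c": [1.0, -1.0, 1.0, -1.0],  # weakly related
}
ranking = rank_by_anti_similarity(disease, drugs)
```

In this sketch the drug whose signature reverses the disease signature comes first. A learned metric, as described in the abstract, would replace the fixed correlation with a distance fitted to known FDA-approved disease-drug pairs.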

Viruses ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1325
Author(s):  
Yoonjung Choi ◽  
Bonggun Shin ◽  
Keunsoo Kang ◽  
Sungsoo Park ◽  
Bo Ram Beck

Previously, our group predicted commercially available Food and Drug Administration (FDA) approved drugs that can inhibit each step of the replication of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) using a deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI). Unfortunately, additional clinically significant treatment options since the approval of remdesivir are scarce. To overcome the current coronavirus disease 2019 (COVID-19) more efficiently, a treatment strategy that controls not only SARS-CoV-2 replication but also the host entry step should be considered. In this study, we used MT-DTI to predict FDA approved drugs that may have strong affinities for the angiotensin-converting enzyme 2 (ACE2) receptor and the transmembrane protease serine 2 (TMPRSS2) which are essential for viral entry to the host cell. Of the 460 drugs with Kd of less than 100 nM for the ACE2 receptor, 17 drugs overlapped with drugs that inhibit the interaction of ACE2 and SARS-CoV-2 spike reported in the NCATS OpenData portal. Among them, enalaprilat, an ACE inhibitor, showed a Kd value of 1.5 nM against the ACE2. Furthermore, three of the top 30 drugs with strong affinity prediction for the TMPRSS2 are anti-hepatitis C virus (HCV) drugs, including ombitasvir, daclatasvir, and paritaprevir. Notably, of the top 30 drugs, AT1R blocker eprosartan and neuropsychiatric drug lisuride showed similar gene expression profiles to potential TMPRSS2 inhibitors. Collectively, we suggest that drugs predicted to have strong inhibitory potencies to ACE2 and TMPRSS2 through the DTI model should be considered as potential drug repurposing candidates for COVID-19.


2022 ◽  
Vol 02 ◽  
Author(s):  
Sergey Shityakov ◽  
Jane Pei-Chen Chang ◽  
Ching-Fang Sun ◽  
David Ta-Wei Guu ◽  
Thomas Dandekar ◽  
...  

Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic (EPA) and docosahexaenoic (DHA) acids, have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. Protein-protein interaction (PPI) networks were constructed by mapping these DEGs to a human interactome and linking them to specific pathways. Objective: This study aimed to implement supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by PUFAs. Methods: The transcriptional profile GSE12375, which is based on the Affymetrix NuGO array, was obtained from the Gene Expression Omnibus database. The probe cell intensity data were converted into gene expression values, and background correction was performed by the multi-array average algorithm. The LIMMA (Linear Models for Microarray Data) algorithm was used to identify relevant DEGs at baseline and after 26 weeks of supplementation, with a p-value < 0.05. The DAVID web server was used to identify and construct the enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. Finally, machine learning (ML) models, including logistic regression, naïve Bayes, and deep neural networks, were constructed for the analyzed DEGs associated with the specific pathways. Results: The results revealed that up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, ML approaches were able to cluster the EPA/DHA-treated and control groups, with logistic regression performing best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.


2020 ◽  
Author(s):  
Sergey Shityakov ◽  
Jane Pei-Chen Chang ◽  
Ching-Fang Sun ◽  
David Ta-Wei Guu ◽  
Thomas Dandekar ◽  
...  

Abstract. Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA), have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. Protein-protein interaction networks were constructed by mapping these DEGs to a human interactome and linking them to specific pathways. Results: The results revealed that up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, machine learning (ML) approaches were able to cluster the EPA/DHA-treated and control groups, with the logistic regression algorithm performing best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.


2018 ◽  
Author(s):  
Brandon Monier ◽  
Adam McDermaid ◽  
Jing Zhao ◽  
Anne Fennell ◽  
Qin Ma

Abstract. Motivation: Next-generation sequencing has made far more large-scale genomic and transcriptomic data available. Studies with RNA-sequencing (RNA-seq) data typically involve generating gene expression profiles that can be analyzed further, often through differential gene expression (DGE) analysis. This process enables comparison across samples of two or more factor levels. A recurring issue with DGE analyses is the complicated nature of the comparisons to be made, in which a variety of factor combinations, pairwise comparisons, and main or blocked main effects need to be tested. Results: Here we present a tool called IRIS-DGE, which is a server-based DGE analysis tool developed using Shiny. It provides a straightforward, user-friendly platform for performing comprehensive DGE analysis, as well as crucial analyses that help to design hypotheses and determine key genomic features. IRIS-DGE integrates the three most commonly used R-based DGE tools to determine differentially expressed genes (DEGs) and includes numerous methods for performing preliminary analysis on user-provided gene expression information. Additionally, this tool integrates a variety of visualizations, in a highly interactive manner, for improved interpretation of preliminary and DGE analyses. Availability: IRIS-DGE is freely available at http://bmbl.sdstate.edu/IRIS/. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Thai-Hoang Pham ◽  
Yue Qiu ◽  
Jiahui Liu ◽  
Steven Zimmer ◽  
Eric O'Neill ◽  
...  

Chemical-induced gene expression profiles provide critical information on the mode of action, off-target effects, and cellular heterogeneity of chemical actions in a biological system, and thus offer new opportunities for drug discovery, systems pharmacology, and precision medicine. Despite their successful applications in drug repurposing, large-scale analysis that leverages these profiles is limited by the sparseness and low throughput of the data. Several methods have been proposed to predict missing values in gene expression data. However, most of them focus on imputation and classification settings, which have limited applications to real-world drug discovery scenarios. Therefore, a new deep learning framework named chemical-induced gene expression ranking (CIGER) is proposed to target a more realistic but more challenging setting in which the model predicts the rankings of genes in the whole gene expression profiles induced by de novo chemicals. The experimental results show that CIGER significantly outperforms existing methods in both ranking and classification metrics for this prediction task. Furthermore, a new drug screening pipeline based on CIGER is proposed to select approved or investigational drugs for the potential treatment of pancreatic cancer. Our predictions have been validated by experiments, thereby showing the effectiveness of CIGER for phenotypic compound screening in precision drug discovery in practice.


2019 ◽  
Vol 35 (22) ◽  
pp. 4688-4695 ◽  
Author(s):  
Rui Hou ◽  
Elena Denisenko ◽  
Alistair R R Forrest

Abstract. Motivation: Single-cell RNA sequencing (scRNA-seq) measures gene expression at the resolution of individual cells. Massively multiplexed single-cell profiling has enabled large-scale transcriptional analyses of thousands of cells in complex tissues. In most cases, the true identity of individual cells is unknown and needs to be inferred from the transcriptomic data. Existing methods typically cluster (group) cells based on similarities of their gene expression profiles and assign the same identity to all cells within each cluster using the averaged expression levels. However, scRNA-seq experiments typically produce low-coverage sequencing data for each cell, which hinders the clustering process. Results: We introduce scMatch, which directly annotates single cells by identifying their closest match in large reference datasets. We used this strategy to annotate various single-cell datasets and evaluated the impacts of sequencing depth, similarity metric and reference datasets. We found that scMatch can rapidly and robustly annotate single cells with comparable accuracy to another recent cell annotation tool (SingleR), but that it is quicker and can handle larger reference datasets. We demonstrate how scMatch can handle large customized reference gene expression profiles that combine data from multiple sources, thus empowering researchers to identify cell populations in any complex tissue with the desired precision. Availability and implementation: scMatch (Python code) and the FANTOM5 reference dataset are freely available to the research community at https://github.com/forrest-lab/scMatch. Supplementary information: Supplementary data are available at Bioinformatics online.
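
The closest-match strategy described in this abstract can be sketched in a few lines. The code below is not taken from scMatch; it is a minimal illustration that assumes Spearman rank correlation as the similarity metric and uses made-up expression values. The names `annotate`, `reference` and `cell` are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def annotate(cell_profile, reference_profiles):
    """Return the reference label whose profile correlates best with the cell."""
    return max(
        reference_profiles,
        key=lambda label: spearman(cell_profile, reference_profiles[label]),
    )


# Toy reference over five genes, plus one sparse, low-coverage cell.
reference = {
    "T cell": [9.0, 8.0, 1.0, 0.0, 2.0],
    "hepatocyte": [0.0, 1.0, 9.0, 8.0, 3.0],
}
cell = [5.0, 3.0, 0.0, 0.0, 1.0]
label = annotate(cell, reference)
```

Because each cell is compared directly with the reference, no clustering step is needed. A rank-based metric is one plausible choice for sparse counts, since it depends only on the ordering of genes and not on absolute depth.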


2021 ◽  
Vol 14 (10) ◽  
pp. 948
Author(s):  
Jiaying You ◽  
Michael Hsing ◽  
Artem Cherkasov

Aging is considered an inevitable process that causes deleterious effects in the functioning and appearance of cells, tissues, and organs. The recent emergence of large-scale gene expression datasets and significant advances in machine learning techniques have enabled drug repurposing efforts to promote longevity. In this work, we further developed our previous approach, DeepCOP, a quantitative chemogenomic model that predicts gene regulating effects, and extended its application across multiple cell lines in LINCS to predict aging gene regulating effects induced by small molecules. As a result, a quantitative chemogenomic Deep Model was trained using gene ontology labels, molecular fingerprints, and cell line descriptors to predict gene expression responses to chemical perturbations. Other state-of-the-art machine learning approaches were also evaluated as benchmarks. Among those, the deep neural network (DNN) classifier ranked known drugs with beneficial effects on aging genes at the top, and some of these drugs were previously shown to promote longevity, illustrating the potential utility of this methodology. These results further demonstrate the capability of “hybrid” chemogenomic models, incorporating quantitative descriptors from biomarkers, to capture cell-specific drug–gene interactions. Such models can therefore be used for discovering drugs with desired gene regulatory effects associated with longevity.


2020 ◽  
Vol 6 ◽  
pp. e270 ◽  
Author(s):  
Reinel Tabares-Soto ◽  
Simon Orozco-Arias ◽  
Victor Romero-Cano ◽  
Vanesa Segovia Bucheli ◽  
José Luis Rodríguez-Sotelo ◽  
...  

Cancer classification is a topic of major interest in medicine since it allows accurate and efficient diagnosis and facilitates a successful outcome in medical treatments. Previous studies have classified human tumors using large-scale RNA profiling and supervised Machine Learning (ML) algorithms to construct a molecular-based classification of carcinoma cells from breast, bladder, adenocarcinoma, colorectal, gastro esophagus, kidney, liver, lung, ovarian, pancreas, and prostate tumors. These datasets are collectively known as the 11_tumor database. Although this database has been used in several works in the ML field, no comparative studies of different algorithms can be found in the literature. On the other hand, advances in both hardware and software technologies have fostered considerable improvements in the precision of solutions that use ML, such as Deep Learning (DL). In this study, we compare the most widely used algorithms in classical ML and DL to classify the tumors described in the 11_tumor database. We obtained tumor identification accuracies between 90.6% (Logistic Regression) and 94.43% (Convolutional Neural Networks) using k-fold cross-validation. We also show how a tuning process may or may not significantly improve algorithms’ accuracies. Our results demonstrate an efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates tumor type prediction in a multi-cancer-type scenario.
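
The k-fold cross-validation used to report these accuracies can be sketched as follows. This is an illustration only: it assumes a simple nearest-centroid classifier as a stand-in for the algorithms the study compares, interleaved fold assignment, and a made-up two-gene dataset. None of the names below come from the paper.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k interleaved folds."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def fit_centroids(rows, labels):
    """Compute the mean expression vector (centroid) of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        total = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            total[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [value / counts[label] for value in total]
        for label, total in sums.items()
    }


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def cross_val_accuracy(rows, labels, k):
    """Train on k-1 folds, test on the held-out fold, and pool the hits."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_rows = [rows[i] for i in range(len(rows)) if i not in held_out]
        train_labels = [labels[i] for i in range(len(rows)) if i not in held_out]
        centroids = fit_centroids(train_rows, train_labels)
        correct += sum(predict(centroids, rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


# Toy expression matrix: two genes, two tumor types (made-up values).
X = [[5, 1], [1, 5], [6, 0], [0, 6], [5, 2], [2, 5]]
y = ["lung", "liver", "lung", "liver", "lung", "liver"]
accuracy = cross_val_accuracy(X, y, k=3)
```

Every sample is tested exactly once by a model that never saw it during training, which is why cross-validated accuracy is a fairer basis for comparing algorithms than accuracy on the training data.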


2018 ◽  
Vol 14 (2) ◽  
pp. 106-116 ◽  
Author(s):  
Olujide O. Olubiyi ◽  
Maryam O. Olagunju ◽  
James O. Oni ◽  
Abidemi O. Olubiyi
