Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology

ABSTRACT A major cause of failed drug discovery programs is suboptimal target selection, resulting in the development of drug candidates that are potent inhibitors but ineffective at treating the disease. In the genomics era, the availability of large biomedical datasets with genome-wide readouts has the potential to transform target selection and validation. In this study we investigate how computational intelligence methods can be applied to predict novel therapeutic targets in oncology. We compared different machine learning classifiers applied to the task of drug target classification for nine different human cancer types. For each cancer type, a set of “known” target genes was obtained and equally-sized sets of “non-targets” were sampled multiple times from the human protein-coding genes. Models were trained on mutation, gene expression (TCGA), and gene essentiality (DepMap) data. In addition, we generated a numerical embedding of the interaction network of protein-coding genes using deep network representation learning and included the results in the modeling. We assessed feature importance using a random forest classifier and performed feature selection based on measuring permutation importance against a null distribution. Our best models achieved good generalization performance based on the AUROC metric. With the best model for each cancer type, we ran predictions on more than 15,000 protein-coding genes to identify potential novel targets. Our results indicate that this approach may be useful to inform early stages of the drug discovery pipeline.
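
The sampling and evaluation steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names are hypothetical, the add-one empirical p-value is one common way to compare an observed importance against a null distribution (the abstract does not state the exact procedure), and the classifier itself is omitted.

```python
import random

def sample_non_targets(all_genes, known_targets, n_repeats, seed=0):
    """Draw n_repeats non-target sets, each the same size as the known-target set."""
    rng = random.Random(seed)
    targets = set(known_targets)
    # Candidates are all genes that are not known targets.
    candidates = sorted(g for g in all_genes if g not in targets)
    return [rng.sample(candidates, len(targets)) for _ in range(n_repeats)]

def auroc(pos_scores, neg_scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def empirical_p_value(observed, null_importances):
    """Add-one empirical p-value (assumed form): share of null values >= observed."""
    exceed = sum(1 for v in null_importances if v >= observed)
    return (exceed + 1) / (len(null_importances) + 1)

# Hypothetical toy data: 100 genes, of which the first 10 are "known" targets.
all_genes = [f"GENE{i}" for i in range(100)]
known_targets = all_genes[:10]
non_target_sets = sample_non_targets(all_genes, known_targets, n_repeats=5)
```

Each sampled set would be paired with the known targets to train and score one balanced model. Repeating the draw gives a spread of AUROC values instead of a single estimate tied to one arbitrary negative set.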

Artificial intelligence-based computational framework for drug-target prioritization and inference of novel repositionable drugs for Alzheimer’s disease

Alzheimer's Research & Therapy ◽

10.1186/s13195-021-00826-3 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Shingo Tsuji ◽

Takeshi Hase ◽

Ayako Yachie-Kinoshita ◽

Taiko Nishino ◽

Samik Ghosh ◽

...

Keyword(s):

Alzheimer's Disease ◽

Deep Learning ◽

Drug Target ◽

Target Genes ◽

Drug Repositioning ◽

Therapeutic Targets ◽

Machine Learning Techniques ◽

Computational Framework ◽

Novel Therapeutic

Abstract Background: Identifying novel therapeutic targets is crucial for the successful development of drugs. However, the cost of experimentally identifying therapeutic targets is huge, and only approximately 400 genes are targets for FDA-approved drugs. As a result, it is essential to develop powerful computational tools that can identify potential novel therapeutic targets. The human protein-protein interaction network (PIN) could be a useful resource to achieve this objective. Methods: In this study, we developed a deep learning-based computational framework that extracts low-dimensional representations of high-dimensional PIN data. Our computational framework uses latent features and state-of-the-art machine learning techniques to infer potential drug target genes. Results: We applied our computational framework to prioritize novel putative target genes for Alzheimer’s disease and successfully identified key genes that may serve as novel therapeutic targets (e.g., DLG4, EGFR, RAC1, SYK, PTK2B, SOCS1). Furthermore, based on these putative targets, we could infer repositionable candidate compounds for the disease (e.g., tamoxifen, bosutinib, and dasatinib). Conclusions: Our deep learning-based computational framework could be a powerful tool to efficiently prioritize new therapeutic targets and enhance the drug repositioning strategy.

Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology

Scientific Reports ◽

10.1038/s41598-020-67846-1 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Adrián Bazaga ◽

Dan Leggate ◽

Hendrik Weisser

Keyword(s):

Therapeutic Targets ◽

Genome Wide ◽

Novel Therapeutic

Tri-methylation of Histone H3 Lysine 4 Facilitates Gene Expression in Ageing Cells

10.1101/238048 ◽

2017 ◽

Cited By ~ 1

Author(s):

Cristina Cruz ◽

Monica Della Rosa ◽

Christel Krueger ◽

Qian Gao ◽

Lucy Field ◽

...

Keyword(s):

Gene Expression ◽

Budding Yeast ◽

Histone H3 ◽

Specific Factor ◽

Protein Coding ◽

Replicative Lifespan ◽

Protein Coding Genes ◽

Genome Wide ◽

Normal Expression ◽

Transcriptional Induction

Abstract Transcription of protein-coding genes is accompanied by recruitment of COMPASS to promoter-proximal chromatin, which deposits di- and tri-methylation on histone H3 lysine 4 (H3K4) to form H3K4me2 and H3K4me3. Here we determine the importance of COMPASS in maintaining gene expression across lifespan in budding yeast. We find that COMPASS mutations dramatically reduce replicative lifespan and cause widespread gene expression defects. Known repressive functions of H3K4me2 are progressively lost with age, while hundreds of genes become dependent on H3K4me3 for full expression. Induction of these H3K4me3-dependent genes is also impacted in young cells lacking COMPASS components including the H3K4me3-specific factor Spp1. Remarkably, the genome-wide occurrence of H3K4me3 is progressively reduced with age despite widespread transcriptional induction, minimising the normal positive correlation between promoter H3K4me3 and gene expression. Our results provide clear evidence that H3K4me3 is required to attain normal expression levels of many genes across organismal lifespan.

Abstract 5430: Analysis of genome-wide DNA methylation in CCRF-CEM cells confirms hypermethylation at reported epigenetic markers of T-cell lymphoblastic leukemia and identifies novel therapeutic targets with reduced methylation upon treatment with dietary indoles

10.1158/1538-7445.am2012-5430 ◽

2012 ◽

Author(s):

Lyndsey E. Shorey ◽

Pushpinder Kaur ◽

Caprice Rosato ◽

E Andrés Houseman ◽

Emily Ho ◽

...

Keyword(s):

DNA Methylation ◽

T Cell ◽

Lymphoblastic Leukemia ◽

Therapeutic Targets ◽

Epigenetic Markers ◽

Genome Wide ◽

Novel Therapeutic ◽

Dietary Indoles

Genome-wide methylation and transcriptome of blood neutrophils reveal the roles of DNA methylation in affecting transcription of protein-coding genes and miRNAs in E. coli-infected mastitis cows

BMC Genomics ◽

10.1186/s12864-020-6526-z ◽

2020 ◽

Vol 21 (1) ◽

Cited By ~ 4

Author(s):

Zhihua Ju ◽

Qiang Jiang ◽

Jinpeng Wang ◽

Xiuge Wang ◽

Chunhong Yang ◽

...

Keyword(s):

DNA Methylation ◽

Protein Coding ◽

E. coli ◽

Protein Coding Genes ◽

Genome Wide ◽

Blood Neutrophils

E2F8 and its target genes as novel therapeutic targets for lung cancer

Journal of Thoracic Oncology ◽

10.1016/j.jtho.2015.12.048 ◽

2016 ◽

Vol 11 (2) ◽

pp. S29

Author(s):

Sin-Aye Park ◽

Jong Woo Lee ◽

James Platt ◽

Joann Sweasy ◽

Peter Glazer ◽

...

Keyword(s):

Lung Cancer ◽

Target Genes ◽

Therapeutic Targets ◽

Novel Therapeutic

Functional genomic analysis of glioblastoma multiforme through short interfering RNA screening: a paradigm for therapeutic development

Neurosurgical FOCUS ◽

10.3171/2009.10.focus09210 ◽

2010 ◽

Vol 28 (1) ◽

pp. E4 ◽

Cited By ~ 11

Author(s):

Nikhil G. Thaker ◽

Fang Zhang ◽

Peter R. McDonald ◽

Tong Ying Shun ◽

John S. Lazo ◽

...

Keyword(s):

Drug Discovery ◽

Glioblastoma Multiforme ◽

Target Genes ◽

Surgical Approaches ◽

Short Interfering RNA ◽

Discovery Process ◽

Drug Discovery Process ◽

Functional Genomic Analysis ◽

Genome Wide ◽

Interfering RNA

Glioblastoma multiforme (GBM) is a high-grade brain malignancy arising from astrocytes. Despite aggressive surgical approaches, optimized radiation therapy regimens, and the application of cytotoxic chemotherapies, the median survival of patients with GBM from time of diagnosis remains less than 15 months, having changed little in decades. Approaches that target genes and biological pathways responsible for tumorigenesis or potentiate the activity of current therapeutic modalities could improve treatment efficacy. In this regard, several genomic and proteomic strategies promise to impact significantly on the drug discovery process. High-throughput genome-wide screening with short interfering RNA (siRNA) is one strategy for systematically exploring possible therapeutically relevant targets in GBM. Statistical methods and protein-protein interaction network databases can also be applied to the screening data to explore the genes and pathways that underlie the pathological basis and development of GBM. In this study, we highlight several genome-wide siRNA screens and implement these experimental concepts in the T98G GBM cell line to uncover the genes and pathways that regulate GBM cell death and survival. These studies will ultimately influence the development of a new avenue of neurosurgical therapy by placing the drug discovery process in the context of the entire biological system.

Artificial intelligence based computational framework for drug-target prioritization and inference of novel repositionable drugs for Alzheimer’s disease

10.1101/2020.07.17.208116 ◽

2020 ◽

Author(s):

Shingo Tsuji ◽

Takeshi Hase ◽

Ayako Yachie ◽

Taiko Nishino ◽

Samik Ghosh ◽

...

Keyword(s):

Target Genes ◽

Therapeutic Targets ◽

Computational Framework ◽

Computational Tools ◽

Therapeutic Drugs ◽

Latent Space ◽

Latent Features ◽

Low Dimensional ◽

Network Metrics ◽

Novel Therapeutic

Abstract Background: Identification of novel therapeutic targets is key to successful drug development. However, the cost of experimentally identifying therapeutic targets is huge, and only 400 genes are targets for FDA-approved drugs. Therefore, it is essential to develop powerful computational tools to identify potential novel therapeutic targets. Because proteins carry out their functions together with their interacting partners, the human protein-protein interaction network (PIN) could be a useful resource for building computational tools to investigate potential targets for therapeutic drugs. Network embedding methods, especially deep learning-based methods, would be useful tools for extracting an informative low-dimensional latent space that contains enough information to fully represent the original high-dimensional, non-linear PIN data. Results: In this study, we developed a deep learning-based computational framework that extracts the low-dimensional latent space embedded in high-dimensional data of the human PIN and uses the features in the latent space (latent features) to infer potential novel targets for therapeutic drugs. We examined the relationships between the latent features and representative network metrics and found that the network metrics can explain a large number of the latent features, while several latent features do not correlate with any of the network metrics. The results indicate that the latent features are likely to capture information that the representative network metrics cannot capture, while also capturing information obtained from the network metrics. Our computational framework uses the latent features together with state-of-the-art machine learning techniques to infer potential drug target genes. We applied our computational framework to prioritize novel putative target genes for Alzheimer’s disease and successfully identified key genes as potential novel therapeutic targets (e.g., DLG4, EGFR, RAC1, SYK, PTK2B, SOCS1). Furthermore, based on these putative targets, we inferred repositionable candidate compounds for the disease (e.g., tamoxifen, bosutinib, and dasatinib). Discussion: Our computational framework could be a powerful tool for efficiently prioritizing new therapeutic targets and for drug repositioning. Notably, our computational platform is readily applicable to investigating novel potential targets and repositionable compounds for any disease, especially rare diseases.
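
The comparison between latent features and network metrics can be sketched as a correlation check. This is an illustration under stated assumptions, not the authors' analysis: node degree stands in for the "representative network metrics", Pearson correlation is an assumed choice (the abstract does not name the statistic), and the toy network and latent values are hypothetical.

```python
from collections import Counter
from math import sqrt

def node_degrees(edges):
    """Count the interaction partners of each node in an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical toy PIN and one hypothetical latent feature per protein.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
latent_feature = {"A": 0.9, "B": 0.6, "C": 0.6, "D": 0.3}

degrees = node_degrees(edges)
proteins = sorted(latent_feature)
r = pearson([degrees[p] for p in proteins], [latent_feature[p] for p in proteins])
```

A latent feature with a correlation near zero against every metric tested would be a candidate for the kind of information the abstract says the network metrics do not capture.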

Genome-wide identification and comparison of differentially expressed profiles of miRNAs and lncRNAs with associated ceRNA networks in the gonads of Chinese soft-shelled turtle, Pelodiscus sinensis

10.21203/rs.2.10525/v4 ◽

2020 ◽

Author(s):

Xiao Ma ◽

Shuangshuang Cen ◽

Luming Wang ◽

Chao Zhang ◽

Limin Wu ◽

...

Keyword(s):

Messenger RNA ◽

Target Genes ◽

Expression Profiles ◽

Differentially Expressed ◽

Integrated Analysis ◽

Pelodiscus Sinensis ◽

Specific Expression ◽

Protein Coding ◽

Reproductive Regulation ◽

Protein Coding Genes

Abstract Background: The gonad is the major factor affecting animal reproduction. The regulatory mechanism of the expression of protein-coding genes involved in reproduction still remains to be elucidated. Increasing evidence has shown that ncRNAs play key regulatory roles in gene expression in many life processes. The roles of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) in reproduction have been investigated in some species. However, the regulatory patterns of miRNAs and lncRNAs in the sex-biased expression of protein-coding genes remain to be elucidated. In this study, we performed an integrated analysis of miRNA, messenger RNA (mRNA), and lncRNA expression profiles to explore their regulatory patterns in the female ovary and male testis of Chinese soft-shelled turtle, Pelodiscus sinensis. Results: We identified 10 446 mature miRNAs, 20 414 mRNAs and 28 500 lncRNAs in the ovaries and testes, and 633 miRNAs, 11 319 mRNAs, and 10 495 lncRNAs showed differential expression. A total of 2 814 target genes were identified for miRNAs. The predicted target genes of these differentially expressed (DE) miRNAs and lncRNAs included abundant genes related to reproductive regulation. Furthermore, we found that 189 DEmiRNAs and 5 408 DElncRNAs showed sex-specific expression. Of these, 3 DEmiRNAs and 917 DElncRNAs were testis-specific, and 186 DEmiRNAs and 4 491 DElncRNAs were ovary-specific. We further constructed competing endogenous RNA (ceRNA) lncRNA-miRNA-mRNA networks using bioinformatics, including 103 DEmiRNAs, 636 DEmRNAs, and 1 622 DElncRNAs. The target genes for the differentially expressed miRNAs and lncRNAs included abundant genes involved in gonadal development, including Wt1, Creb3l2, Gata4, Wnt2, Nr5a1, Hsd17, Igf2r, H2afz, Lin52, Trim71, Zar1, and Jazf1. Conclusions: In animals, miRNAs and lncRNAs act as master regulators of reproductive processes by controlling the expression of mRNAs. Considering their importance, the identified miRNAs, lncRNAs, and their targets in P. sinensis might be useful for studying the molecular processes involved in sexual reproduction and for genome editing to produce higher-quality aquaculture animals. A thorough understanding of ncRNA-based cellular regulatory networks will aid in the improvement of P. sinensis reproductive traits for aquaculture.
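
The sex-specific counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the dictionary layout and function name are illustrative.

```python
# Counts as reported in the abstract: sex-specific totals and their tissue split.
counts = {
    "DEmiRNA": {"sex_specific": 189, "testis": 3, "ovary": 186},
    "DElncRNA": {"sex_specific": 5408, "testis": 917, "ovary": 4491},
}

def tissue_counts_consistent(entry):
    """Return True when testis-specific plus ovary-specific equals the sex-specific total."""
    return entry["testis"] + entry["ovary"] == entry["sex_specific"]

checks = {name: tissue_counts_consistent(entry) for name, entry in counts.items()}
```

Both checks pass: 3 + 186 = 189 DEmiRNAs and 917 + 4 491 = 5 408 DElncRNAs.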

Genome-wide association analysis and replication in 810,625 individuals identifies novel therapeutic targets for varicose veins

10.1101/2020.05.14.095653 ◽

2020 ◽

Cited By ~ 2

Author(s):

Waheed-Ul-Rahman Ahmed ◽

Akira Wiberg ◽

Michael Ng ◽

Wei Wang ◽

Adam Auton ◽

...

Keyword(s):

Risk Score ◽

Genetic Risk ◽

Varicose Veins ◽

Genetic Risk Score ◽

Therapeutic Targets ◽

Genome Wide Association ◽

Western Society ◽

Susceptibility Loci ◽

Genome Wide ◽

Novel Therapeutic

Abstract Background: Varicose veins (VVs) affect one-third of Western society, with a significant subset of patients developing venous ulceration, and ongoing management of venous leg ulcers costing around $14.9 billion annually in the USA. There is no current medical management for VVs, with approaches limited to compression stockings, ablation techniques, or open surgery for more advanced disease. A significant proportion of patients report a positive family history, and heritability is ~17%, suggesting a strong genetic component. We aimed to identify novel therapeutic targets by improving our understanding of the aetiopathology and genetic architecture of VVs. Methods: We performed the largest two-stage genome-wide association study of VVs in 401,656 subjects from UK Biobank, and replication in 408,969 subjects from 23andMe (total 135,514 varicose veins cases and 675,111 controls). We constructed a genetic risk score for VVs to investigate its use as a prognostic tool. Genes and pathways were prioritised using a suite of bioinformatic tools, and therapeutic targets identified using the Open Targets Platform. Results: We discovered 49 signals at 46 susceptibility loci associated with VVs, including 29 previously unreported genetic associations (28 susceptibility loci). We demonstrated that patients with VVs requiring surgery have a higher genetic risk score than those managed non-surgically. We mapped 237 genes to these loci, many of which are biologically relevant and tractable to therapeutic targeting or repurposing (notably VEGFA, COL27A1, EFEMP1, PPP3R1 and NFATC2). Tissue enrichment analyses implicated vascular tissue, and several genes were enriched in biological pathways relating to extracellular matrix biology, inflammation, angiogenesis, lymphangiogenesis, vascular smooth muscle cell migration, and apoptosis. Conclusions: Genes and pathways identified represent biologically plausible contributors to the pathobiology of VVs, identifying promising candidates for further investigation of venous biology and potential therapeutic targets. We have provided the proof-of-principle that genetic risk score correlates with disease severity, which represents a first step in personalised medicine approaches to varicose veins.
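
A genetic risk score is commonly computed as a weighted sum of risk-allele counts. The sketch below assumes that form, since the abstract does not say how the authors constructed their score, and the dosages and effect sizes are hypothetical. It also confirms that the cohort sizes quoted above add up to the 810,625 individuals in the title.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 per variant); assumed form of the score."""
    if len(dosages) != len(weights):
        raise ValueError("need exactly one weight per variant")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical individual: three variants with hypothetical per-allele effect sizes.
example_score = genetic_risk_score([0, 1, 2], [0.1, 0.2, 0.3])

# Cohort figures as reported in the abstract.
discovery_n = 401_656      # UK Biobank
replication_n = 408_969    # 23andMe
cases_n = 135_514
controls_n = 675_111
total_by_cohort = discovery_n + replication_n
total_by_status = cases_n + controls_n
```

Both totals come to 810,625, so the discovery and replication cohorts and the case and control counts are mutually consistent.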
