Integrative Systems Biology Approaches to Identify and Prioritize Disease and Drug Candidate Genes

Author(s):  
Vivek Kaimal ◽  
Divya Sardana ◽  
Eric E. Bardes ◽  
Ranga Chandra Gudivada ◽  
Jing Chen ◽  
...  
2021 ◽  
Vol 12 ◽  
Author(s):  
Vidya Chidambaran ◽  
Valentina Pilipenko ◽  
Anil G. Jegga ◽  
Kristie Geisler ◽  
Lisa J. Martin

Objectives: Incorporation of genetic factors in psychosocial/perioperative models for predicting chronic postsurgical pain (CPSP) is key for personalization of analgesia. However, single variant associations with CPSP have small effect sizes, making polygenic risk assessment important. Unfortunately, pediatric CPSP studies are not sufficiently powered for unbiased genome-wide association studies (GWAS). We previously leveraged systems biology to identify candidate genes associated with CPSP. The goal of this study was to use systems biology prioritized gene enrichment to generate polygenic risk scores (PRS) for improved prediction of CPSP in a prospectively enrolled clinical cohort. Methods: In a prospectively recruited cohort of 171 adolescents (14.5 ± 1.8 years, 75.4% female) undergoing spine fusion, we collected data about anesthesia/surgical factors, childhood anxiety sensitivity (CASI), acute pain/opioid use, pain outcomes 6–12 months post-surgery and blood (for DNA extraction/genotyping). We previously prioritized candidate genes using computational approaches based on similarity for functional annotations with a literature-derived “training set.” In this study, we tested ranked deciles of 1336 prioritized genes for increased representation of variants associated with CPSP, compared to 10,000 randomly selected control sets. Penalized regression (LASSO) was used to select final variants from enriched variant sets for calculation of PRS. Regression models incorporating PRS were compared with previously published non-genetic models for predictive accuracy. Results: Incidence of CPSP in the prospective cohort was 40.4%. 33,104 case and 252,590 control variants were included for association analyses. The smallest gene set enriched for CPSP had 80/1010 variants associated with CPSP (p < 0.05), significantly higher than in 10,000 randomly selected control sets (p = 0.0004). LASSO selected 20 variants for calculating weighted PRS. The model adjusted for covariates and including PRS had an AUROC of 0.96 (95% CI: 0.92–0.99) for CPSP prediction, compared to 0.70 (95% CI: 0.59–0.82) for the non-genetic model (p < 0.001). Odds ratios and positive regression coefficients for the final model were internally validated using bootstrapping: PRS [OR 1.98 (95% CI: 1.21–3.22); β 0.68 (95% CI: 0.19–0.74)] and CASI [OR 1.33 (95% CI: 1.03–1.72); β 0.29 (95% CI: 0.03–0.38)]. Discussion: Systems biology guided PRS improved predictive accuracy of CPSP risk in a pediatric cohort. Such scores have potential to serve as biomarkers to guide risk stratification and tailored prevention. Findings highlight systems biology approaches for deriving PRS for phenotypes in cohorts less amenable to large-scale GWAS.
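The two calculations named in this abstract, a weighted PRS and an enrichment comparison against randomly drawn control sets, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the variant identifiers, the weights, and the add-one correction in the empirical p-value are all assumptions made for illustration.

```python
def weighted_prs(dosages, weights):
    """Weighted PRS: sum over variants of weight * effect-allele dosage (0, 1 or 2).

    `weights` maps variant id -> coefficient (for example, from a penalized
    regression); variants missing from `dosages` are treated as dosage 0.
    """
    return sum(w * dosages.get(variant, 0) for variant, w in weights.items())


def empirical_p(observed, null_counts):
    """Share of control sets with a count at least as large as the observed one.

    The +1 in numerator and denominator is a common convention that avoids
    reporting p = 0; it is an assumption here, not taken from the abstract.
    """
    hits = sum(1 for count in null_counts if count >= observed)
    return (hits + 1) / (len(null_counts) + 1)


# Hypothetical toy values, for illustration only.
toy_weights = {"variant_a": 0.5, "variant_b": -0.2}
toy_dosages = {"variant_a": 2, "variant_b": 1}
print(weighted_prs(toy_dosages, toy_weights))
print(empirical_p(80, [50, 60, 90, 70]))
```

In practice the null counts would come from many randomly selected control sets, as described in the abstract, rather than from the four toy values used here.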


2020 ◽  
Author(s):  
Muhammad Tahir Khan ◽  
Sajid Ali ◽  
Anwar Sheed Khan ◽  
Arif Ali ◽  
Abbas Khan ◽  
...  

Background: Tuberculosis (TB) is a global public health issue that is worsening due to the emergence of drug resistance. Pyrazinamide (PZA) is a first-line antimicrobial drug used against non-replicating Mycobacterium tuberculosis (MTB). Whole genome sequencing data on PZA resistance (PZA-R) are scarce for the Khyber Pakhtunkhwa (KP) province of Pakistan, a high-burden country. In the current study we aimed to find the most common mutations in PZA-R MTB isolates, in association with other candidate genes, using whole genome sequencing (WGS). Samples were collected from TB suspects and drug susceptibility testing (DST) was performed according to the WHO standards. The resistant samples were subjected to WGS. The sequence data were analyzed with MTBseq and Total Genotyping Solution for Mycobacterium tuberculosis (TGS-TB). The metabolic model was analyzed using the RAST server. Results: Among the three whole genome sequences (NCBI BioProject accessions PRJNA629298 and PRJNA629388), 1997, 1162, and 2053 mutations, including indels, were detected. Diverse variability was detected in the membrane proteins PE and PPE, which modulate the host immune response. Nine mutations in the coding and promoter regions were detected in pncA, including one novel variant (T-4C). Mutations in the other drug candidate genes, KatG and rpoB, were also detected. Conclusion: The metabolic model shows a distinct property. A diversity of variants was detected in the majority of MTB essential genes, with functions ranging from cell growth to cell signaling. The current study provides useful information on geography-specific strains for biomarker development and better management of drug-resistant isolates.


2021 ◽  
Vol 12 ◽  
Author(s):  
Nooshin Ghahramani ◽  
Jalil Shodja ◽  
Seyed Abbas Rafat ◽  
Bahman Panahi ◽  
Karim Hasanpur

Background: Mastitis is the most prevalent disease in dairy cattle and one of the most significant bovine pathologies affecting milk production, animal health, and reproduction. In addition, mastitis is the most common, expensive, and contagious infection in the dairy industry. Methods: A meta-analysis of microarray and RNA-seq data was conducted to identify candidate genes and functional modules associated with mastitis disease. The results were then applied to systems biology analysis via weighted gene coexpression network analysis (WGCNA), Gene Ontology, enrichment analysis for the Kyoto Encyclopedia of Genes and Genomes (KEGG), and modeling using machine-learning algorithms. Results: Microarray and RNA-seq datasets were generated for 2,089 and 2,794 meta-genes, respectively. Between microarray and RNA-seq datasets, a total of 360 meta-genes were found that were significantly enriched as “peroxisome,” “NOD-like receptor signaling pathway,” “IL-17 signaling pathway,” and “TNF signaling pathway” KEGG pathways. The turquoise module (n = 214 genes) and the brown module (n = 57 genes) were identified as critical functional modules associated with mastitis through WGCNA. PRDX5, RAB5C, ACTN4, SLC25A16, MAPK6, CD53, NCKAP1L, ARHGEF2, COL9A1, and PTPRC genes were detected as hub genes in identified functional modules. Finally, using attribute weighting and machine-learning methods, hub genes that are sufficiently informative in Escherichia coli mastitis were used to optimize predictive models. The constructed model proposed the optimal approach for the meta-genes and validated several high-ranked genes as biomarkers for E. coli mastitis using the decision tree (DT) method. Conclusion: The candidate genes and pathways proposed in this study may shed new light on the underlying molecular mechanisms of mastitis disease and suggest new approaches for diagnosing and treating E. coli mastitis in dairy cattle.


2016 ◽  
Author(s):  
James P McCusker ◽  
Michel Dumontier ◽  
Rui Yan ◽  
Sylvia He ◽  
Jonathan S Dordick ◽  
...  

Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-slowing treatments but no known cure. The omics data explosion has created many possible drug candidates; however, filtering criteria remain challenging, and systems biology approaches have become fragmented with many disconnected databases. Using drug, protein, and disease interactions, we built an evidence-weighted knowledge graph of integrated interactions. Our knowledge graph-based system, ReDrugS, can be used via an API or web interface, and has generated 25 high-quality melanoma drug candidates. We show that probabilistic analysis of systems biology graphs increases drug candidate quality compared to non-probabilistic methods. Four of the 25 candidates are novel therapies, three of which have been tested with other cancers. All other candidates have current or completed clinical trials, or have been studied in vivo or in vitro. This approach can be used to identify candidate therapies for use in research or personalized medicine.


2019 ◽  
Vol 14 (5) ◽  
pp. 460-467 ◽  
Author(s):  
Neha Srivastava ◽  
Bhartendu Nath Mishra ◽  
Prachi Srivastava

Background: Neurodevelopmental disorders (NDDs) are impairments of the growth and development of the brain or central nervous system that arise during the developmental stage. They can include developmental brain dysfunction, which can manifest as neuropsychiatric problems or impaired motor function, learning, language or non-verbal communication. They comprise an array of disorders, including autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD). There is no specific diagnosis or cure for NDDs. These disorders appear to result from a combination of genetic, biological, psychosocial and environmental risk factors. A diverse scientific literature reports adverse effects of environmental factors, specifically pesticide exposure, which contributes to a growing number of human pathological conditions; among these, neurodevelopmental disorders are an emerging concern. Objective: The current study focused on in silico identification of potential drug targets for pesticide-induced neurodevelopmental disorders, including ADHD and ASD, and on designing a potential drug molecule for the target through drug discovery approaches. Methods: We identified 139 candidate genes for ADHD and 206 candidate genes for ASD from the NCBI database for detailed study. Protein-protein interaction network analysis was performed to identify key genes/proteins in the network using the STRING 10.0 database and Cytoscape 3.3.0 software. The 3D structure of the target protein was built and validated. Molecular docking was performed against twenty-seven candidate phytochemicals with neuroprotective activity (e.g., beta amyrin, ajmaline, serpentine, ursolic acid, huperzine A). The best-docked compound was identified by the lowest binding energy (BE). Further, drug-likeness prediction and bioactivity analysis of the leads were performed using Molinspiration cheminformatics software. Result & Conclusion: Based on betweenness centrality and node degree as network topological parameters, solute carrier family 6 member 4 (SLC6A4) was identified as a common key protein in both networks. The 3D structure of the SLC6A4 protein was modeled and validated. Based on the lowest binding energy, beta amyrin (BE = -8.54 kcal/mol) was selected as a potential drug candidate against the SLC6A4 protein. Drug-likeness prediction and bioactivity analysis of the leads indicated that the drug candidate is a potential inhibitor. Beta amyrin (CID: 73145) emerged as the most promising therapeutic inhibitor for ASD and ADHD in humans.
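The two topological parameters mentioned in this abstract, node degree and betweenness centrality, can be illustrated on a toy undirected network. The sketch below is a plain-Python version of Brandes' algorithm for unweighted graphs; it is not the STRING/Cytoscape workflow the authors used, and the node names and edges are hypothetical.

```python
from collections import deque


def node_degree(adj):
    """Number of neighbours of each node in an adjacency-set graph."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an unweighted, undirected graph."""
    centrality = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        preds = {node: [] for node in adj}
        n_paths = {node: 0 for node in adj}
        dist = {node: -1 for node in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical star-shaped toy network: one hub linked to three peripheral nodes.
toy_network = {
    "HUB": {"P1", "P2", "P3"},
    "P1": {"HUB"},
    "P2": {"HUB"},
    "P3": {"HUB"},
}
print(node_degree(toy_network))
print(betweenness(toy_network))
```

A node that scores highly on both measures, like the hub in this toy network, is the kind of candidate the abstract describes as a key protein.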


2019 ◽  
Vol 17 (4) ◽  
pp. 352-365 ◽  
Author(s):  
Puneet Talwar ◽  
Renu Gupta ◽  
Suman Kushwaha ◽  
Rachna Agarwal ◽  
Luciano Saso ◽  
...  

Alzheimer’s disease (AD) is genetically complex with multifactorial etiology. Here, we aim to identify the potential viral pathogens leading to aberrant inflammatory and oxidative stress responses in AD, along with potential drug candidates, using a systems biology approach. We retrieved protein interactions of amyloid precursor protein (APP) and tau protein (MAPT) from NCBI, genes for oxidative stress from NetAge, and genes for inflammation from the NetAge and InnateDB databases. Genes implicated in aging were retrieved from the GenAge database and two GEO expression datasets. These genes were individually used to create protein-protein interaction networks (PPINs) using the STRING database (score ≥ 0.7). The interactions of candidate genes with known viruses were mapped using the VirHostNet v2.0 database. Drug molecules targeting candidate genes were retrieved using the Drug-Gene Interaction Database (DGIdb). Data mining resulted in 2095 APP, 116 MAPT, 214 oxidative stress, and 1269 inflammatory genes. After STRING PPIN analysis, 404 APP, 109 MAPT, 204 oxidative stress and 1014 inflammation-related high-confidence proteins were identified. The overlap among all datasets yielded eight common markers (AKT1, GSK3B, APP, APOE, EGFR, PIN1, CASP8 and SNCA). These genes showed association with hepatitis C virus (HCV), Epstein–Barr virus (EBV), human herpes virus 8 and human papillomavirus (HPV). Further, screening of drugs that target candidate genes and that have anti-inflammatory and antiviral activity along with a suggested role in AD pathophysiology yielded 12 potential drug candidates. Our study demonstrated the role of viral etiology in AD pathogenesis by elucidating the interaction of oxidative stress and inflammation-causing candidate genes with common viruses, along with the identification of potential AD drug candidates.
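The overlap step in this abstract, finding markers common to all gene lists, amounts to a set intersection. A minimal Python sketch follows; the function name, the dataset labels, and the placeholder gene identifiers are hypothetical and do not reproduce the study's data.

```python
def shared_genes(gene_sets):
    """Return, in sorted order, the genes present in every named gene set."""
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Placeholder identifiers, for illustration only.
toy_sets = {
    "APP_interactors": {"GENE_A", "GENE_B", "GENE_C"},
    "MAPT_interactors": {"GENE_A", "GENE_B", "GENE_D"},
    "oxidative_stress": {"GENE_A", "GENE_B", "GENE_E"},
    "inflammation": {"GENE_A", "GENE_B", "GENE_F"},
}
print(shared_genes(toy_sets))  # ['GENE_A', 'GENE_B']
```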


2020 ◽  
Vol 52 (1) ◽  
Author(s):  
Marta Gòdia ◽  
Antonio Reverter ◽  
Rayner González-Prendes ◽  
Yuliaxis Ramayo-Caldas ◽  
Anna Castelló ◽  
...  

Background: Genetic pressure in animal breeding is sparking the interest of breeders in selecting elite boars with higher sperm quality to optimize ejaculate doses and fertility rates. However, the molecular basis of sperm quality is not yet fully understood. Our aim was to identify candidate genes, pathways and DNA variants associated with sperm quality in swine by analysing 25 sperm-related phenotypes and integrating genome-wide association studies (GWAS) and RNA-seq under a systems biology framework. Results: By GWAS, we identified 12 quantitative trait loci (QTL) associated with the percentage of head and neck abnormalities, abnormal acrosomes and motile spermatozoa. Candidate genes included CHD2, KATNAL2, SLC14A2 and ABCA1. By RNA-seq, we identified a wide repertoire of mRNAs (e.g. PRM1, OAZ3, DNAJB8, TPPP2 and TNP1) and miRNAs (e.g. ssc-miR-30d, ssc-miR-34c, ssc-miR-30c-5p, ssc-miR-191, members of the let-7 family and ssc-miR-425-5p) with functions related to sperm biology. We detected 6128 significant correlations (P-value ≤ 0.05) between sperm traits and mRNA abundances. By expression (e)GWAS, we identified three trans-expression QTL involving the genes IQCJ, ACTR2 and HARS. Using the GWAS and RNA-seq data, we built a gene interaction network. We considered that the genes and interactions present in both the GWAS and RNA-seq networks had a higher probability of being actually involved in sperm quality and used them to build a robust gene interaction network. In addition, in the final network we included genes with RNA abundances correlated with more than four semen traits and miRNAs interacting with the genes in the network. The final network was enriched for genes involved in gamete generation and development, meiotic cell cycle, DNA repair or embryo implantation. Finally, we designed a panel of 73 SNPs based on the GWAS, eGWAS and final network data that explains between 5% (for sperm cell concentration) and 36% (for percentage of neck abnormalities) of the phenotypic variance of the sperm traits. Conclusions: By applying a systems biology approach, we identified genes that potentially affect sperm quality and constructed a SNP panel that explains a substantial part of the phenotypic variance for semen quality in our study and that should be tested in other swine populations to evaluate its relevance for the pig breeding sector.


2017 ◽  
Vol 3 ◽  
pp. e106 ◽  
Author(s):  
James P. McCusker ◽  
Michel Dumontier ◽  
Rui Yan ◽  
Sylvia He ◽  
Jonathan S. Dordick ◽  
...  

Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-slowing treatments but no known cure. The omics data explosion has created many possible drug candidates; however, filtering criteria remain challenging, and systems biology approaches have become fragmented with many disconnected databases. Using drug, protein and disease interactions, we built an evidence-weighted knowledge graph of integrated interactions. Our knowledge graph-based system, ReDrugS, can be used via an application programming interface or web interface, and has generated 25 high-quality melanoma drug candidates. We show that probabilistic analysis of systems biology graphs increases drug candidate quality compared to non-probabilistic methods. Four of the 25 candidates are novel therapies, three of which have been tested with other cancers. All other candidates have current or completed clinical trials, or have been studied in vivo or in vitro. This approach can be used to identify candidate therapies for use in research or personalized medicine.

