Unsupervised discovery of phenotype-specific multi-omics networks

2019 ◽  
Vol 35 (21) ◽  
pp. 4336-4343 ◽  
Author(s):  
W Jenny Shi ◽  
Yonghua Zhuang ◽  
Pamela H Russell ◽  
Brian D Hobbs ◽  
Margaret M Parker ◽  
...  

Abstract Motivation Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits. Results We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA–mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance compared to popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as lesser-known omics features that may be important to the diseases. Although those applications focus on miRNA–mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas. Availability and implementation The SmCCNet algorithm is written in R, and is freely available on the web at https://cran.r-project.org/web/packages/SmCCNet/index.html. Supplementary information Supplementary data are available at Bioinformatics online.
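As background for the canonical correlation step named in this abstract, the following is a minimal sketch of ordinary (non-sparse) canonical correlation between two data blocks. It is not the SmCCNet algorithm, which adds sparsity penalties and a phenotype term; the function name `canonical_correlations` and the toy `mirna`/`mrna` matrices are illustrative assumptions, not part of the package.

```python
import numpy as np


def canonical_correlations(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return canonical correlations between the columns of x and y.

    Assumes both matrices share the same rows (subjects) and have full
    column rank after centering. Plain CCA only; no sparsity penalty.
    """
    x = x - x.mean(axis=0)  # center each feature
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)  # orthonormal basis for the column space of x
    qy, _ = np.linalg.qr(y)  # orthonormal basis for the column space of y
    # Singular values of qx^T qy are the cosines of the principal angles,
    # i.e. the canonical correlations, in descending order.
    return np.linalg.svd(qx.T @ qy, compute_uv=False)


# Toy data (hypothetical): 50 subjects, 3 "miRNA" and 2 "mRNA" features.
rng = np.random.default_rng(0)
mirna = rng.normal(size=(50, 3))
# The first mRNA feature is an exact linear combination of the miRNA features.
mrna = np.column_stack([mirna @ np.array([1.0, -2.0, 0.5]), rng.normal(size=50)])

rho = canonical_correlations(mirna, mrna)
print(rho)  # leading value is close to 1 because of the planted linear relation
```

In this toy setting the leading canonical correlation is essentially 1 because one mRNA column was constructed from the miRNA columns; real omics data would need regularization, which is what the sparse variants address.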

2019 ◽  
Vol 36 (3) ◽  
pp. 842-850 ◽  
Author(s):  
Cheng Peng ◽  
Jun Wang ◽  
Isaac Asante ◽  
Stan Louie ◽  
Ran Jin ◽  
...  

Abstract Motivation Epidemiologic, clinical and translational studies are increasingly generating multiplatform omics data. Methods that can integrate across multiple high-dimensional data types while accounting for differential patterns are critical for uncovering novel associations and underlying relevant subgroups. Results We propose an integrative model to estimate latent unknown clusters (LUCID) aiming to both distinguish unique genomic, exposure and informative biomarkers/omic effects while jointly estimating subgroups relevant to the outcome of interest. Simulation studies indicate that we can obtain consistent estimates reflective of the true simulated values, accurately estimate subgroups and recapitulate subgroup-specific effects. We also demonstrate the use of the integrated model for future prediction of risk subgroups and phenotypes. We apply this approach to two real data applications to highlight the integration of genomic, exposure and metabolomic data. Availability and Implementation The LUCID method is implemented through the LUCIDus R package available on CRAN (https://CRAN.R-project.org/package=LUCIDus). Supplementary information Supplementary materials are available at Bioinformatics online.


Author(s):  
Mingming Gong ◽  
Peng Liu ◽  
Frank C Sciurba ◽  
Petar Stojanov ◽  
Dacheng Tao ◽  
...  

Abstract Motivation There is growing interest in the biomedical research community to incorporate retrospective data, available in healthcare systems, to shed light on associations between different biomarkers. Understanding the association between various types of biomedical data, such as genetic, blood biomarkers, imaging, etc. can provide a holistic understanding of human diseases. To formally test a hypothesized association between two types of data in Electronic Health Records (EHRs), one requires a substantial sample size with both data modalities to achieve reasonable power. Current association test methods only allow using data from individuals who have both data modalities. Hence, researchers cannot take advantage of much larger EHR samples that include individuals with at least one of the data types, which limits the power of the association test. Results We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and to improve the power of the association test. We study the properties of the new test theoretically and empirically, through a series of simulations and by applying our method to real studies in the context of Chronic Obstructive Pulmonary Disease. We are able to identify an association between the high-dimensional characterization of Computed Tomography chest images and several blood biomarkers as well as the expression of dozens of genes involved in the immune system. Availability and implementation Code is available at https://github.com/batmanlab/Semi-paired-Association-Test. Supplementary information Supplementary data are available at Bioinformatics online.


Biostatistics ◽  
2018 ◽  
Vol 21 (3) ◽  
pp. 561-576 ◽  
Author(s):  
Elin Shaddox ◽  
Christine B Peterson ◽  
Francesco C Stingo ◽  
Nicola A Hanania ◽  
Charmion Cruickshank-Quinn ◽  
...  

Summary In this article, we develop a graphical modeling framework for the inference of networks across multiple sample groups and data types. In medical studies, this setting arises whenever a set of subjects, which may be heterogeneous due to differing disease stage or subtype, is profiled across multiple platforms, such as metabolomics, proteomics, or transcriptomics data. Our proposed Bayesian hierarchical model first links the network structures within each platform using a Markov random field prior to relate edge selection across sample groups, and then links the network similarity parameters across platforms. This enables joint estimation in a flexible manner, as we make no assumptions on the directionality of influence across the data types or the extent of network similarity across the sample groups and platforms. In addition, our model formulation allows the number of variables and number of subjects to differ across the data types, and only requires that we have data for the same set of groups. We illustrate the proposed approach through both simulation studies and an application to gene expression levels and metabolite abundances on subjects with varying severity levels of chronic obstructive pulmonary disease. Bayesian inference; Chronic obstructive pulmonary disease (COPD); Data integration; Gaussian graphical model; Markov random field prior; Spike and slab prior.


2021 ◽  
Author(s):  
Hongyu Zhao ◽  
Wei Liu ◽  
Wenxuan Deng ◽  
Ming Chen ◽  
Zihan Dong ◽  
...  

Abstract Finding disease-relevant tissues and cell types can facilitate the identification and investigation of functional genes and variants. In particular, cell type proportions can serve as potential disease predictive biomarkers. Here, we introduce a novel statistical framework, cell-type Wide Association Study (cWAS), that integrates genetic data with transcriptomics data to identify cell types whose genetically regulated proportions (GRPs) are disease/trait-associated. On simulated and real GWAS data, cWAS showed substantial statistical power with newly identified significant GRP associations in disease-associated tissues. More specifically, GRPs of endothelial cells and myofibroblasts in lung tissue were associated with Idiopathic Pulmonary Fibrosis and Chronic Obstructive Pulmonary Disease, respectively. For breast cancer (BC), the GRP of blood CD8+ T cells was negatively associated with BC risk as well as survival. Overall, cWAS is a powerful tool to reveal cell types associated with complex diseases mediated by GRPs.


2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Hanaa Ahmed Shafiek ◽  
Nashwa Hassan Abd-Elwahab ◽  
Manal Mohammad Baddour ◽  
Mohamed Mabrouk El-Hoffy ◽  
Akram Abd-Elmoneim Degady ◽  
...  

Objective. To study the value of the inflammatory markers (interleukin-6 (IL-6), interleukin-8 (IL-8), and C-reactive protein (CRP)) in predicting the outcome of noninvasive ventilation (NIV) in the management of acute respiratory failure (ARF) on top of chronic obstructive pulmonary disease (COPD) and the role of bacteria in the systemic inflammation. Methods. Thirty-three patients were subjected to standard treatment plus NIV, and accordingly, they were classified into responders and nonresponders. Serum samples were collected for IL-6, IL-8, and CRP analysis. Sputum samples were taken for microbiological evaluation. Results. A wide spectrum of bacteria was revealed; Gram-negative and atypical bacteria were the most common (31% and 28% resp.; single or copathogen). IL-8 and dyspnea grade were significantly higher in the nonresponder group ( and 0.023 resp.). IL-6 correlated positively with the presence of infection and type of pathogen ( and 0.034 resp.). Gram-negative bacteria were associated with significantly higher IL-6 than other pathogens ( pg/dL; ) but did not significantly affect NIV outcome (). Conclusions. High systemic inflammation could predict failure of NIV. Gram-negative bacteria correlated with high IL-6 but did not affect the response to NIV.


2021 ◽  
Author(s):  
Wei Liu ◽  
Wenxuan Deng ◽  
Ming Chen ◽  
Zihan Dong ◽  
Biqing Zhu ◽  
...  

Finding disease-relevant tissues and cell types can facilitate the identification and investigation of functional genes and variants. In particular, cell type proportions can serve as potential disease predictive biomarkers. Here, we introduce a novel statistical framework, cell-type Wide Association Study (cWAS), that integrates genetic data with transcriptomics data to identify cell types whose genetically regulated proportions (GRPs) are disease/trait-associated. On simulated and real GWAS data, cWAS showed substantial statistical power with newly identified significant GRP associations in disease-associated tissues. More specifically, GRPs of endothelial cells and myofibroblasts in lung tissue were associated with Idiopathic Pulmonary Fibrosis and Chronic Obstructive Pulmonary Disease, respectively. For breast cancer (BC), the GRP of blood CD8+ T cells was negatively associated with BC risk as well as survival. Overall, cWAS is a powerful tool to reveal cell types associated with complex diseases mediated by GRPs.


2021 ◽  
Vol 28 ◽  
Author(s):  
Salvatore Fuschillo ◽  
Debora Paris ◽  
Annabella Tramice ◽  
Pasquale Ambrosino ◽  
Letizia Palomba ◽  
...  

Chronic obstructive pulmonary disease (COPD) is an increasing cause of global morbidity and mortality, with poor long-term outcomes and chronic disability. COPD is a condition with a wide spectrum of clinical presentations, with different phenotypes being identified even among patients with comparable degrees of airflow limitation. Considering the burden of COPD in terms of social and economic costs, in recent years growing attention has been given to the need for more personalized approaches and patient-tailored rehabilitation programs. In this regard, the systematic analysis of metabolites in biological matrices, namely metabolomics, may become an essential tool in phenotyping diseases. Through the identification and quantification of the small molecules produced during biological processes, metabolomic profiling of biological samples has thus been proposed as an opportunity to identify novel biomarkers of disease outcome and treatment response. Exhaled breath condensate (EBC) and plasma/serum are fluid pools that can be easily extracted and analyzed. In this review, we discuss the potential clinical applications of the metabolomic profiling of EBC and plasma/serum in COPD.


2018 ◽  
Vol 68 (suppl 1) ◽  
pp. bjgp18X697145
Author(s):  
Carole Gardener ◽  
Caroline Moore ◽  
Morag Farquhar ◽  
Gail Ewing ◽  
Robbie Duschinsky

Background Patients can be reluctant to say that they need support, telling clinicians they are ‘fine’ despite having unmet needs. Research in mental health settings suggests patients who do this are less likely to follow treatment plans, and their carers are at risk of depression. To date these findings have not been explored in patients with advancing physical health conditions, or their carers. Aim To explore the presence, role and impact of assertions of ‘I’m Fine’ in patients with advanced chronic obstructive pulmonary disease (COPD) and their carers. Method Criteria based on Attachment Theory were used to identify ‘I’m Fine’ cases from the Living with Breathlessness Study (LwB) dataset of 235 patients and 115 carers. Quantitative analysis explored variables such as health service use between ‘I’m Fine’ and non-‘I’m Fine’ cases, whilst narrative analysis is being used to explore discourses within cases. Results 21 patients and six carers asserted they were ‘fine’ despite unmet needs. Patients minimised disease impact and symptoms, avoided thinking about the future and used stoical language. Despite ‘I’m Fine’ cases being more likely to report no exacerbations and more likely to score lower on the COPD Assessment Test (CAT), all wanted to see more of their GP. Carers focused on the needs of the patient whilst downplaying their own problems. Conclusion The existence of a sub-group of patients with advanced COPD who assert that they are ‘fine’ may have implications for primary care. This will be explored in planned focus groups with clinicians.


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Robert I. Griffiths ◽  
Michelle L. Gleeson ◽  
José M. Valderas ◽  
Mark D. Danese

Preexisting comorbidity adversely impacts breast cancer treatment and outcomes. We examined the incremental impact of comorbidity undetected until cancer. We followed breast cancer patients in SEER-Medicare from 12 months before to 84 months after diagnosis. Two comorbidity indices were constructed: the National Cancer Institute index, using 12 months of claims before cancer, and a second index for previously undetected conditions, using three months after cancer. Conditions present in the first were excluded from the second. Overall, 6,184 (10.1%) had ≥1 undetected comorbidity. Chronic obstructive pulmonary disease (38%) was the most common undetected condition. In multivariable analyses that adjusted for comorbidity detected before cancer, older age, later stage, higher grade, and poor performance status all were associated with higher odds of ≥1 undetected comorbidity. In stage I–III cancer, undetected comorbidity was associated with lower adjusted odds of receiving adjuvant chemotherapy (Odds Ratio (OR) = 0.81, 95% Confidence Interval (CI) 0.73–0.90, P < 0.0001; OR = 0.38, 95% CI 0.30–0.49, P < 0.0001; index score 1 or ≥2, respectively), and with increased mortality (Hazard Ratio (HR) = 1.45, 95% CI 1.38–1.53, P < 0.0001; HR = 2.38, 95% CI 2.18–2.60, P < 0.0001; index score 1 or ≥2). Undetected comorbidity is associated with less aggressive treatment and higher mortality in breast cancer.


2020 ◽  
Vol 36 (11) ◽  
pp. 3393-3400 ◽  
Author(s):  
V Fortino ◽  
G Scala ◽  
D Greco

Abstract Motivation Omics technologies have the potential to facilitate the discovery of new biomarkers. However, only a few omics-derived biomarkers have been successfully translated into clinical applications to date. Feature selection is a crucial step in this process that identifies small sets of features with high predictive power. Models consisting of a limited number of features are not only more robust in analytical terms, but also ensure cost effectiveness and clinical translatability of new biomarker panels. Here we introduce GARBO, a novel multi-island adaptive genetic algorithm to simultaneously optimize accuracy and set size in omics-driven biomarker discovery problems. Results Compared to existing methods, GARBO enables the identification of biomarker sets that best optimize the trade-off between classification accuracy and number of biomarkers. We tested GARBO and six alternative selection methods on two highly relevant topics in precision medicine: cancer patient stratification and drug sensitivity prediction. We found multivariate biomarker models from different omics data types such as mRNA, miRNA, copy number variation, mutation and DNA methylation. The top performing models were evaluated by using two different strategies: the Pareto-based selection, and the weighted sum between accuracy and set size (w = 0.5). Pareto-based preferences show the ability of the proposed algorithm to search minimal subsets of relevant features that can be used to model accurate random forest-based classification systems. Moreover, GARBO systematically identified, on larger omics data types, such as gene expression and DNA methylation, biomarker panels exhibiting higher classification accuracy or employing a number of features much lower than those discovered with other methods. These results were confirmed on independent datasets. Availability and implementation github.com/Greco-Lab/GARBO. Supplementary information Supplementary data are available at Bioinformatics online.
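The two evaluation strategies named in this abstract, Pareto-based selection and a weighted sum of accuracy and set size with w = 0.5, can be illustrated with a small sketch. The abstract does not give the exact scoring formula, so the normalization in `weighted_score` (set size divided by an assumed `max_size`) is an assumption, and the `Panel` type and the example panels are hypothetical, not taken from GARBO.

```python
from typing import List, NamedTuple


class Panel(NamedTuple):
    name: str
    accuracy: float  # classification accuracy, higher is better
    size: int        # number of biomarkers, lower is better


def pareto_front(panels: List[Panel]) -> List[Panel]:
    """Keep panels that no other panel beats on both accuracy and size."""
    front = []
    for p in panels:
        dominated = any(
            q.accuracy >= p.accuracy
            and q.size <= p.size
            and (q.accuracy > p.accuracy or q.size < p.size)
            for q in panels
        )
        if not dominated:
            front.append(p)
    return front


def weighted_score(p: Panel, max_size: int, w: float = 0.5) -> float:
    """Assumed form: w * accuracy + (1 - w) * (1 - size / max_size)."""
    return w * p.accuracy + (1 - w) * (1 - p.size / max_size)


# Hypothetical candidate panels, for illustration only.
panels = [
    Panel("A", 0.90, 40),
    Panel("B", 0.88, 10),
    Panel("C", 0.85, 25),
    Panel("D", 0.92, 80),
]

front = pareto_front(panels)
best = max(panels, key=lambda p: weighted_score(p, max_size=100))
print([p.name for p in front], best.name)
```

Here panel C drops out of the Pareto front because B is both more accurate and smaller, while the weighted sum collapses the trade-off into a single ranking whose winner depends on the choice of w and on how set size is normalized.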

