Unpaired data empowers association tests

Author(s):  
Mingming Gong ◽  
Peng Liu ◽  
Frank C Sciurba ◽  
Petar Stojanov ◽  
Dacheng Tao ◽  
...  

Abstract Motivation: There is growing interest in the biomedical research community in incorporating retrospective data, available in healthcare systems, to shed light on associations between different biomarkers. Understanding the association between various types of biomedical data, such as genetic data, blood biomarkers and imaging, can provide a holistic understanding of human diseases. To formally test a hypothesized association between two types of data in Electronic Health Records (EHRs), one requires a substantial sample size with both data modalities to achieve reasonable power. Current association test methods only allow using data from individuals who have both data modalities. Hence, researchers cannot take advantage of much larger EHR samples that include individuals with at least one of the data types, which limits the power of the association test. Results: We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and to improve the power of the association test. We study the properties of the new test theoretically and empirically, through a series of simulations and by applying our method to real studies in the context of Chronic Obstructive Pulmonary Disease. We are able to identify an association between the high-dimensional characterization of Computed Tomography chest images and several blood biomarkers as well as the expression of dozens of genes involved in the immune system. Availability and implementation: Code is available on https://github.com/batmanlab/Semi-paired-Association-Test. Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Author(s):  
Mingming Gong ◽  
Peng Liu ◽  
Frank C. Sciurba ◽  
Petar Stojanov ◽  
Dacheng Tao ◽  
...  

Abstract To achieve a holistic view of the underlying mechanisms of human diseases, the biomedical research community is moving toward harvesting retrospective data available in Electronic Healthcare Records (EHRs). The first step for causal understanding is to perform association tests between types of potentially high-dimensional biomedical data, such as genetic, blood biomarkers, and imaging data. To obtain reasonable power, current methods require a substantial sample size of individuals with both data modalities. This prevents researchers from using much larger EHR samples that include individuals with at least one data type, limits the power of the association test, and may result in a higher false discovery rate. We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and, under some conditions, to improve the association test power. We study the properties of SAT theoretically and empirically, through simulations and application to real studies in the context of Chronic Obstructive Pulmonary Disease. Our method identifies an association between the high-dimensional characterization of Computed Tomography (CT) chest images and blood biomarkers as well as the expression of dozens of genes involved in the immune system.


2019 ◽  
Vol 35 (21) ◽  
pp. 4336-4343 ◽  
Author(s):  
W Jenny Shi ◽  
Yonghua Zhuang ◽  
Pamela H Russell ◽  
Brian D Hobbs ◽  
Margaret M Parker ◽  
...  

Abstract Motivation: Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits. Results: We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA–mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance compared to popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as less known omics features that may be important to the diseases. Although those applications focus on miRNA–mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas. Availability and implementation: The SmCCNet algorithm is written in R, and is freely available on the web at https://cran.r-project.org/web/packages/SmCCNet/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Godfred O Antwi ◽  
Darson L Rhodes

Abstract Background: Concern about the health impacts of e-cigarette use is growing; however, limited research exists regarding potential long-term health effects of this behavior. This study explored the relationship between e-cigarette use and COPD in a sample of US adults. Methods: A secondary data analysis using data from the 2018 Behavioral Risk Factor Surveillance Survey in the USA was conducted to examine associations between e-cigarette use and COPD, controlling for conventional cigarette smoking status, past-month leisure physical activity and demographic characteristics including age, sex, education, race, marital status and body mass index. Results: Significant associations between e-cigarette use and COPD among former combustible cigarette smokers and those who reported never using combustible cigarettes were found. Compared with never e-cigarette users, the odds of having COPD were significantly greater for daily e-cigarette users (OR = 1.53; 95% CI: 1.11–2.03), occasional users (OR = 1.43, 95% CI: 1.13–1.80) and former users (OR = 1.46, 95% CI: 1.28–1.67). Conclusions: Findings from this study indicate a potential link between e-cigarette use and COPD. Further research to explore the potential effects of e-cigarette use on COPD is recommended.
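To make the reported quantities concrete, the sketch below shows how an unadjusted odds ratio and a Wald 95% confidence interval can be computed from a 2×2 table. It is an illustration only: the study's estimates come from a regression that controls for covariates, and the function name `odds_ratio_ci` and the counts used here are hypothetical, not taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_ci(a=60, b=940, c=400, d=9600)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the three intervals reported above, corresponds to a statistically significant association at the 5% level.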


Informatics ◽  
2020 ◽  
Vol 7 (4) ◽  
pp. 56
Author(s):  
Fatma Zubaydi ◽  
Assim Sagahyroon ◽  
Fadi Aloul ◽  
Hasan Mir ◽  
Bassam Mahboub

In this work, a mobile application is developed to assist patients suffering from chronic obstructive pulmonary disease (COPD) or asthma. It aims to reduce dependency on hospital- and clinic-based tests and to enable users to better manage their disease through increased self-involvement. Due to the pervasiveness of smartphones, it is proposed to make use of their built-in sensors and ever-increasing computational capabilities to provide patients with a mobile-based spirometer capable of diagnosing COPD or asthma in a reliable and cost-effective manner. Data collected using an experimental setup consisting of an airflow source, an anemometer, and a smartphone is used to develop a mathematical model that relates exhalation frequency to air flow rate. This model allows for the computation of two key parameters known as forced vital capacity (FVC) and forced expiratory volume in one second (FEV1) that are used in the diagnosis of respiratory diseases. The developed platform has been validated using data collected from 25 subjects with various conditions. Results show that an excellent match is achieved between the FVC and FEV1 values computed using a clinical spirometer and those returned by the model embedded in the mobile application.
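Once a flow-rate signal is available, FVC and FEV1 follow from integrating flow over time. The sketch below assumes the flow rate (in litres per second) has already been estimated at known time stamps; it does not reproduce the paper's frequency-to-flow model. The function name `fvc_fev1` and the synthetic exhalation curve are hypothetical.

```python
import numpy as np

def fvc_fev1(time_s, flow_lps):
    """Return (FVC, FEV1) in litres from a sampled forced exhalation.

    time_s:   increasing sample times in seconds, starting at exhalation onset
    flow_lps: flow rate at each sample, in litres per second
    """
    t = np.asarray(time_s, dtype=float)
    q = np.asarray(flow_lps, dtype=float)
    # Cumulative exhaled volume by the trapezoidal rule.
    increments = 0.5 * (q[1:] + q[:-1]) * np.diff(t)
    volume = np.concatenate(([0.0], np.cumsum(increments)))
    fvc = volume[-1]                         # total exhaled volume
    # Volume exhaled one second after onset; if the recording is shorter
    # than one second, np.interp returns the last volume (FEV1 equals FVC).
    fev1 = np.interp(t[0] + 1.0, t, volume)
    return float(fvc), float(fev1)

# Synthetic exhalation: flow decaying exponentially over six seconds.
t = np.linspace(0.0, 6.0, 601)
flow = 8.0 * np.exp(-2.0 * t)
fvc, fev1 = fvc_fev1(t, flow)
print(f"FVC = {fvc:.2f} L, FEV1 = {fev1:.2f} L, FEV1/FVC = {fev1 / fvc:.2f}")
```

The ratio FEV1/FVC printed at the end is a commonly used summary of airflow obstruction.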


2019 ◽  
Vol 61 (2) ◽  
pp. 143-149 ◽  
Author(s):  
Elizabeth A. Regan ◽  
Craig P. Hersh ◽  
Peter J. Castaldi ◽  
Dawn L. DeMeo ◽  
Edwin K. Silverman ◽  
...  

Biostatistics ◽  
2018 ◽  
Vol 21 (3) ◽  
pp. 561-576 ◽  
Author(s):  
Elin Shaddox ◽  
Christine B Peterson ◽  
Francesco C Stingo ◽  
Nicola A Hanania ◽  
Charmion Cruickshank-Quinn ◽  
...  

Summary In this article, we develop a graphical modeling framework for the inference of networks across multiple sample groups and data types. In medical studies, this setting arises whenever a set of subjects, which may be heterogeneous due to differing disease stage or subtype, is profiled across multiple platforms, such as metabolomics, proteomics, or transcriptomics data. Our proposed Bayesian hierarchical model first links the network structures within each platform using a Markov random field prior to relate edge selection across sample groups, and then links the network similarity parameters across platforms. This enables joint estimation in a flexible manner, as we make no assumptions on the directionality of influence across the data types or the extent of network similarity across the sample groups and platforms. In addition, our model formulation allows the number of variables and number of subjects to differ across the data types, and only requires that we have data for the same set of groups. We illustrate the proposed approach through both simulation studies and an application to gene expression levels and metabolite abundances on subjects with varying severity levels of chronic obstructive pulmonary disease.
Keywords: Bayesian inference; Chronic obstructive pulmonary disease (COPD); Data integration; Gaussian graphical model; Markov random field prior; Spike and slab prior.


Neurology ◽  
2017 ◽  
Vol 88 (21) ◽  
pp. 1996-2002 ◽  
Author(s):  
Bojing Liu ◽  
Fang Fang ◽  
Nancy L. Pedersen ◽  
Annika Tillander ◽  
Jonas F. Ludvigsson ◽  
...  

Objective: To examine whether vagotomy decreases the risk of Parkinson disease (PD). Methods: Using data from nationwide Swedish registers, we conducted a matched-cohort study of 9,430 vagotomized patients (3,445 truncal and 5,978 selective) identified between 1970 and 2010 and 377,200 reference individuals from the general population individually matched to vagotomized patients by sex and year of birth with a 40:1 ratio. Participants were followed up from the date of vagotomy until PD diagnosis, death, emigration out of Sweden, or December 31, 2010, whichever occurred first. Vagotomy and PD were identified from the Swedish Patient Register. We estimated hazard ratios (HRs) with 95% confidence intervals (CIs) using Cox models stratified by matching variables, adjusting for country of birth, chronic obstructive pulmonary disease, diabetes mellitus, vascular diseases, rheumatologic disease, osteoarthritis, and comorbidity index. Results: A total of 4,930 cases of incident PD were identified during 7.3 million person-years of follow-up. PD incidence (per 100,000 person-years) was 61.8 among vagotomized patients (80.4 for truncal and 55.1 for selective) and 67.5 among reference individuals. Overall, vagotomy was not associated with PD risk (HR 0.96, 95% CI 0.78–1.17). However, there was a suggestion of lower risk among patients with truncal vagotomy (HR 0.78, 95% CI 0.55–1.09), which may be driven by truncal vagotomy at least 5 years before PD diagnosis (HR 0.59, 95% CI 0.37–0.93). Selective vagotomy was not related to PD risk in any analyses. Conclusions: Although overall vagotomy was not associated with the risk of PD, we found suggestive evidence for a potential protective effect of truncal, but not selective, vagotomy against PD development.
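The incidence figures above are rates per 100,000 person-years, that is, cases divided by person-years of follow-up and scaled by 100,000. As a rough check, the sketch below computes the crude overall rate from the two totals given in the abstract. The person-years figure is rounded in the source, so the result is approximate, and the function name `incidence_per_100k` is hypothetical.

```python
def incidence_per_100k(cases, person_years):
    """Crude incidence rate per 100,000 person-years."""
    return cases / person_years * 100_000

# Totals as reported: 4,930 incident PD cases over about 7.3 million person-years.
overall = incidence_per_100k(4930, 7.3e6)
print(f"{overall:.1f} per 100,000 person-years")
```

The crude overall rate comes out close to the rate reported for reference individuals, which is expected because the reference group outnumbers vagotomized patients 40 to 1.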


2019 ◽  
Vol 7 (30) ◽  
pp. 4-11
Author(s):  
Sariya Wongsaengsak ◽  
Jeff Dennis ◽  
Meily Arevalo ◽  
Somedeb Ball ◽  
Kenneth Nugent

Background: Platelets are important mediators of coagulation, inflammation, and atherosclerosis. We conducted a large population study with National Health and Nutrition Examination Survey (NHANES) data to understand the relationship of total platelet count (TPC) with health and disease in humans. Methods: NHANES is a cross-sectional survey of non-institutionalized United States adults, administered every 2 years by the Centers for Disease Control and Prevention. Participants answer a questionnaire, receive a physical examination, and undergo laboratory tests. TPC values were analyzed for a six-year period of NHANES (2011–2016). Weighted 10th and 90th percentiles were calculated, and logistic regression was used to predict likelihood (Odds ratio [OR]) of being in categories with TPC < 10th percentile or > 90th percentile. Statistical analysis was performed using Stata/SE 15.1, using population weights for complex survey design. Results: The mean TPC for our sample (N = 17,969) was 236 × 10³/μL (SD = 59 × 10³) with the 10th percentile 170 × 10³/μL and the 90th percentile 311 × 10³/μL. Hispanics (other than Mexican Americans) and obese individuals had lower odds of a TPC < 10th percentile. Males, Blacks, adults aged ≥ 45 years, and those with a recent (last 12 months) hospital stay were more likely to have a TPC < 10th percentile. Obese individuals and Mexican Americans had higher odds of having TPC > 90th percentile. Individuals with a congestive heart failure (CHF) or coronary heart disease (CHD) diagnosis had over twice the odds (OR 2.06, 95% CI: 1.50-2.82, p =< 0.001, and 2.11, 95% CI: 1.48-3.01, p =< 0.001, respectively) of having TPC < 10th percentile. Individuals with an emphysema or asthma diagnosis were more likely to have TPC > 90th percentile (OR 1.84, 95% CI: 1.08-3.13, p = 0.026, and 1.25, 95% CI: 1.00-1.56, p = 0.046, respectively). A diagnosis of chronic obstructive pulmonary disease and cancer did not have significant associations with TPC. Conclusions: Our study showed that obese individuals are more likely to have higher TPC. Individuals with CHF and CHD had higher odds of having TPC < 10th percentile, and those with emphysema and asthma were more likely to have TPC > 90th percentile.
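Weighted percentiles of the kind described in the Methods can be illustrated with a simple cumulative-weight rule: sort the values, accumulate the normalized survey weights, and take the first value whose cumulative weight reaches the target fraction. This is one of several common definitions and is not claimed to match the Stata procedure the authors used. The function name `weighted_percentile` and the toy data are hypothetical.

```python
import numpy as np

def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative normalized weight is >= pct / 100."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(v)
    v, w = v[order], w[order]
    cumulative = np.cumsum(w) / w.sum()
    # First index where the cumulative weight reaches the target fraction.
    idx = np.searchsorted(cumulative, pct / 100.0, side="left")
    return float(v[min(idx, len(v) - 1)])

# Toy platelet counts (in 10^3 per microlitre) with made-up survey weights.
counts = [150, 180, 210, 240, 270, 300, 330]
weights = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
p10 = weighted_percentile(counts, weights, 10)
p90 = weighted_percentile(counts, weights, 90)
print(p10, p90)
```

Each observation can then be flagged as below the 10th or above the 90th weighted percentile, which gives the binary outcomes that a logistic regression would model.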

