HASE: Framework for efficient high-dimensional association analyses

2016 ◽  
Author(s):  
G.V. Roshchupkin ◽  
H.H.H. Adams ◽  
M.W. Vernooij ◽  
A. Hofman ◽  
C.M. Van Duijn ◽  
...  

ABSTRACT: Large-scale data collection and processing have facilitated scientific discoveries in fields such as genomics and imaging, but cross-investigations between multiple big datasets remain impractical. Computational requirements of high-dimensional association studies are often too demanding for individual sites. Additionally, the sheer size of intermediate results is unfit for collaborative settings where summary statistics are exchanged for meta-analyses. Here we introduce the HASE framework to perform high-dimensional association studies with a dramatic reduction in both the computational burden and the storage requirements of intermediate results. We implemented a novel meta-analytical method that yields the same power as pooled analyses without the need to share individual participant data. The efficiency of the framework is illustrated by associating 9 million genetic variants with 1.5 million brain imaging voxels in three cohorts (total N=4,034), followed by meta-analysis, on a standard computational infrastructure. These experiments indicate that HASE facilitates high-dimensional association studies, enabling large multicenter association studies for future discoveries.

Author(s):  
Tianye Jia ◽  
Congying Chu ◽  
Yun Liu ◽  
Jenny van Dongen ◽  
Evangelos Papastergios ◽  
...  

Abstract: DNA methylation, which is modulated by both genetic factors and environmental exposures, may offer a unique opportunity to discover novel biomarkers of disease-related brain phenotypes, even when measured in tissues other than the brain, such as blood. A few studies with small sample sizes have revealed associations between blood DNA methylation and neuropsychopathology; however, large-scale epigenome-wide association studies (EWAS) are needed to investigate the utility of DNA methylation profiling as a peripheral marker for the brain. Here, in an analysis of eleven international cohorts, totalling 3337 individuals, we report epigenome-wide meta-analyses of blood DNA methylation with volumes of the hippocampus, thalamus and nucleus accumbens (NAcc), three subcortical regions selected for their associations with disease and heritability and volumetric variability. Analyses of individual CpGs revealed genome-wide significant associations with hippocampal volume at two loci. No significant associations were found for analyses of thalamus and nucleus accumbens volumes. Cluster-based analyses revealed additional differentially methylated regions (DMRs) associated with hippocampal volume. DNA methylation at these loci affected expression of proximal genes involved in learning and memory, stem cell maintenance and differentiation, fatty acid metabolism and type-2 diabetes. These DNA methylation marks, their interaction with genetic variants and their impact on gene expression offer new insights into the relationship between epigenetic variation and brain structure and may provide the basis for biomarker discovery in neurodegeneration and neuropsychiatric conditions.


2020 ◽  
Author(s):  
Jinyoung Byun ◽  
Younghun Han ◽  
Yafang Li ◽  
Jun Xia ◽  
Xiangjun Xiao ◽  
...  

Summary: Lung cancer is the leading cause of cancer death worldwide. Genome-wide association studies have revealed genetic risk factors, highlighting the role of smoking, family history, telomere regulation, and DNA damage-repair in lung cancer etiology. Many studies have focused on a single ethnic group to avoid confounding from variability in allele frequencies across populations; however, comprehensive multi-ethnic analyses may identify variants that are more likely to be causal. This large-scale, multi-ethnic meta-analysis identified 28 novel risk loci achieving genome-wide significance. Leading candidates were further studied using single-cell methods for evaluating DNA damage. DNA-damage-promoting activities were confirmed for selected genes by gene knockdown and overexpression studies.


2019 ◽  
Author(s):  
Harmen Draisma ◽  
Jun Liu ◽  
Igor Pupko ◽  
Ayşe Demirkan ◽  
Zhanna Balkhiyarova ◽  
...  

Abstract. Background: Multi-phenotype genome-wide association studies (MP-GWAS) of correlated traits have greater power to detect genotype–phenotype associations than single-trait GWAS. However, no multi-phenotype analysis method exists for epigenome-wide association studies (EWAS). Results: We extended our previously developed SCOPA approach into the “methylSCOPA” software, written in C++, by ‘reversely’ regressing DNA hyper/hypo-methylation information on a linear combination of phenotypes. We evaluated two models of association between DNA methylation and fasting glucose (FG) and insulin (FI) levels: Model 1, including FG, FI, and three measured potential confounders (body mass index [BMI], fasting serum triglyceride levels [TG], and waist/hip ratio [WHR]), and Model 2, including FG and FI corrected for the effects of BMI, TG, and WHR. Both models were additionally corrected for participant sex and smoking status (current/ever/never). We meta-analyzed the cohort-specific MP-EWAS results with our novel software META-methylSCOPA, mapped genomic locations to GRCh37/hg19, and adopted P<1×10⁻⁷ to denote epigenome-wide significance. We used the Illumina Infinium HumanMethylation450K BeadChip array data from the Northern Finland Birth Cohorts (NFBC) 1966/1986. We quality-controlled the data, regressed out the effects of measured potential confounders, and normalized the methylation signal intensity and FI data. The MP-EWAS included data for 643/457 individuals from NFBC1966 and NFBC1986, respectively (total N=1,100). In Model 1, we detected an epigenome-wide significant association in the MP-EWAS meta-analysis at cg13708645 (chr12:121,974,305; P=1.2×10⁻⁸) within the KDM2B gene. Single-trait effects within KDM2B were on FI, BMI, and WHR. The model with effects on BMI and WHR showed the strongest association at this locus, while the effect on FI in single-phenotype analysis was driven by the effect of adiposity. In Model 2, the strongest association was at cg05063096 (chr3:143,689,810; P=2.3×10⁻⁷), annotated to C3orf58, with the strongest effect on FI in single-trait analysis and a multi-phenotype effect on FI and WHR within Model 1. We characterized the effects of established EWAS loci for diabetes and its risk factors and detected suggestive (p<0.01) associations at six markers, including PHGDH, TXNIP, SLC7A11, CPT1A, MYO5C and ABCG1, through the dissection of the multi-phenotype effects in Model 1. Conclusions: We implemented MP-EWAS in methylSCOPA and demonstrated its enhanced power over single-trait EWAS for correlated phenotypes in large-scale data.


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple-labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


BMJ Open ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. e040481
Author(s):  
Sinead T J McDonagh ◽  
James P Sheppard ◽  
Fiona C Warren ◽  
Kate Boddy ◽  
Leon Farmer ◽  
...  

Introduction: Blood pressure (BP) is normally measured on the upper arm, and guidelines for the diagnosis and treatment of high BP are based on such measurements. Leg BP measurement can be an alternative when brachial BP measurement is impractical, due to injury or disability. Limited data exist to guide interpretation of leg BP values for hypertension management; study-level systematic review findings suggest that systolic BP (SBP) is 17 mm Hg higher in the leg than the arm. However, uncertainty remains about the applicability of this figure in clinical practice due to substantial heterogeneity. Aims: To examine the relationship between arm and leg SBP, develop and validate a multivariable model predicting arm SBP from leg SBP and investigate the prognostic association between leg SBP and cardiovascular disease and mortality. Methods and analysis: Individual participant data (IPD) meta-analyses using arm and leg SBP measurements for 33 710 individuals from 14 studies within the Inter-arm blood pressure difference IPD (INTERPRESS-IPD) Collaboration. We will explore cross-sectional relationships between arm and leg SBP using hierarchical linear regression with participants nested by study, in multivariable models. Prognostic models will be derived for all-cause and cardiovascular mortality and cardiovascular events. Ethics and dissemination: Data originate from studies with prior ethical approval and consent, and data sharing agreements are in place; no further approvals are required to undertake the secondary analyses proposed in this protocol. Findings will be published in peer-reviewed journal articles and presented at conferences. A comprehensive dissemination strategy is in place, integrated with patient and public involvement. PROSPERO registration number: CRD42015031227.


2020 ◽  
Author(s):  
Or Dagan ◽  
Pasco Fearon ◽  
Carlo Schuengel ◽  
Marije Verhage ◽  
Glenn I. Roisman ◽  
...  

Since the seminal 1992 paper by van IJzendoorn, Sagi, and Lambermon, which put forward “the multiple caretaker paradox”, relatively little attention has been given to the potential joint effects of early attachment networks to mother and father on development. Recently, Dagan and Sagi-Schwartz (2018) published a paper that attempts to revive this unsettled issue, calling for research on the subject and offering a framework for posing attachment network hypotheses. This Collaboration for Attachment Research Synthesis project attempts to use an individual participant data meta-analysis to test the hypotheses put forward in Dagan and Sagi-Schwartz (2018). Specifically, we test (a) whether the number of secure attachments (0, 1, or 2) matters in predicting a range of developmental outcomes, and (b) whether the quality of the attachment relationship with one parent contributes more than the other to these outcomes.


2020 ◽  
Author(s):  
Patrick Sin-Chan ◽  
Nehal Gosalia ◽  
Chuan Gao ◽  
Cristopher V. Van Hout ◽  
Bin Ye ◽  
...  

SUMMARY: Aging is characterized by degeneration in cellular and organismal functions leading to increased disease susceptibility and death. Although our understanding of aging biology in model systems has increased dramatically, large-scale sequencing studies to understand human aging are only just beginning. We applied exome sequencing and association analyses (ExWAS) to identify age-related variants in 58,470 participants of the DiscovEHR cohort. Linear mixed model regression analyses of age at last encounter revealed variants in genes known to be linked with clonal hematopoiesis of indeterminate potential, which are associated with myelodysplastic syndromes, as top signals in our analysis, suggestive of age-related somatic mutation accumulation in hematopoietic cells despite patients lacking clinical diagnoses. In addition to APOE, we identified rare DISP2 rs183775254 (p = 7.40×10⁻¹⁰) and ZYG11A rs74227999 (p = 2.50×10⁻⁸) variants that were negatively associated with age in both sexes combined and in females, respectively, and that were replicated with directional consistency in two independent cohorts. Epigenetic mapping showed these variants are located within cell-type-specific enhancers, suggestive of important transcriptional regulatory functions. To discover variants associated with extreme age, we performed exome sequencing on persons of Ashkenazi Jewish descent ascertained for extensive lifespans. Case-control analyses compared 525 Ashkenazi Jewish cases (males ≥ 92 years, females ≥ 95 years) with 482 controls. Our results showed that variants in APOE (rs429358, rs6857) and TMTC2 (rs7976168) passed the Bonferroni-adjusted p-value threshold, as did several nominally associated population-specific variants. Collectively, our Age-ExWAS, the largest performed to date, confirmed and identified previously unreported candidate variants associated with human age.


2016 ◽  
Author(s):  
Hieab HH Adams ◽  
Hadie Adams ◽  
Lenore J Launer ◽  
Sudha Seshadri ◽  
Reinhold Schmidt ◽  
...  

Joint analysis of data from multiple studies in collaborative efforts strengthens scientific evidence, with the gold standard approach being the pooling of individual participant data (IPD). However, sharing IPD often faces legal, ethical, and logistic constraints for sensitive or high-dimensional data, such as in clinical trials, observational studies, and large-scale omics studies. Therefore, meta-analysis of study-level effect estimates is routinely done, but this compromises statistical power, accuracy, and flexibility. Here we propose a novel meta-analytical approach, named partial derivatives meta-analysis, that is mathematically equivalent to using IPD, yet only requires the sharing of aggregate data. It not only yields results identical to pooled IPD analyses, but also allows post-hoc adjustments for covariates and stratification without the need for site-specific re-analysis. Thus, when IPD cannot be shared, partial derivatives meta-analysis still produces gold standard results, which can be used to better inform guidelines and policies on clinical practice.
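The abstract does not spell out which aggregates are exchanged, so the following is only a minimal sketch of the general idea, assuming ordinary least squares with a single predictor. Each site shares a handful of sums (its sufficient statistics), and adding those sums across sites reproduces the pooled fit exactly. The function names, the choice of statistics, and the data are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only (not the authors' implementation): simple linear
# regression y = a + b*x fitted from per-site aggregate sums.
# All names and data below are hypothetical.

def site_aggregates(x, y):
    """Return the sums one site would share instead of individual rows."""
    return {
        "n": len(x),
        "sx": sum(x),
        "sy": sum(y),
        "sxx": sum(v * v for v in x),
        "sxy": sum(a * b for a, b in zip(x, y)),
    }

def combine(aggregates):
    """Add the per-site sums key by key."""
    keys = ("n", "sx", "sy", "sxx", "sxy")
    return {k: sum(a[k] for a in aggregates) for k in keys}

def ols_from_aggregates(agg):
    """Solve the normal equations for (intercept, slope) from the sums."""
    n, sx, sy, sxx, sxy = (agg[k] for k in ("n", "sx", "sy", "sxx", "sxy"))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Two hypothetical sites with made-up (x, y) data.
site_a = ([0.0, 1.0, 2.0, 1.0], [1.1, 2.9, 5.2, 3.1])
site_b = ([2.0, 0.0, 1.0], [4.8, 0.9, 3.0])

# Fit from shared aggregates only.
meta = ols_from_aggregates(
    combine([site_aggregates(*site_a), site_aggregates(*site_b)])
)

# Fit from the pooled individual-level data, for comparison.
pooled = ols_from_aggregates(
    site_aggregates(site_a[0] + site_b[0], site_a[1] + site_b[1])
)
```

In this toy setting `meta` and `pooled` agree up to floating-point rounding, which is the property the abstract describes: no individual-level rows leave a site, yet the combined estimate matches the pooled one.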


Circulation ◽  
2016 ◽  
Vol 133 (suppl_1) ◽  
Author(s):  
James S Floyd ◽  
Colleen Sitlani ◽  
Christy L Avery ◽  
Eric A Whitsel ◽  
Leslie Lange ◽  
...  

Introduction: Sulfonylureas are a commonly used class of diabetes medication that can prolong the QT interval, which is a leading cause of drug withdrawals from the market given the possible risk of life-threatening arrhythmias. Previously, we conducted a meta-analysis of genome-wide association studies of sulfonylurea-genetic interactions on QT interval among 9 European-ancestry (EA) cohorts using cross-sectional data, with null results. To improve our power to identify novel drug-gene interactions, we have included repeated measures of medication use and QT interval and expanded our study to include several additional cohorts, including African-American (AA) and Hispanic-ancestry (HA) cohorts with a high prevalence of sulfonylurea use. To identify potentially differential effects on cardiac depolarization and repolarization, we have also added two phenotypes, the JT and QRS intervals, which together comprise the QT interval. Hypothesis: The use of repeated measures and expansion of our meta-analysis to include diverse ancestry populations will allow us to identify novel pharmacogenomic interactions for sulfonylureas on the ECG phenotypes QT, JT, and QRS. Methods: Cohorts with unrelated individuals used generalized estimating equations to estimate interactions; cohorts with related individuals used mixed effect models clustered on family. For each ECG phenotype (QT, JT, QRS), we conducted ancestry-specific (EA, AA, HA) inverse variance weighted meta-analyses using standard errors based on the t-distribution to correct for small sample inflation in the test statistic. Ancestry-specific summary estimates were combined using MANTRA, an analytic method that accounts for differences in local linkage disequilibrium between ethnic groups. Results: Our study included 65,997 participants from 21 cohorts, including 4,020 (6%) sulfonylurea users, a substantial increase from the 26,986 participants and 846 sulfonylurea users in the previous meta-analysis. Preliminary ancestry-specific meta-analyses have identified genome-wide significant associations (P < 5×10⁻⁸) for each ECG phenotype, and analyses with MANTRA are in progress. Conclusions: In the setting of the largest collection of pharmacogenomic studies to date, we used repeated measurements and leveraged diverse ancestry populations to identify new pharmacogenomic loci for ECG traits associated with cardiovascular risk.
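The inverse variance weighted meta-analysis named in the Methods can be sketched as follows. This is a generic fixed-effect version under stated assumptions: it takes cohort-level effect estimates and standard errors as given, and it does not reproduce the t-distribution-based standard error correction or the MANTRA step described above. The function name and the input numbers are hypothetical.

```python
import math

def inverse_variance_meta(estimates, std_errors):
    """Fixed-effect pooled estimate and standard error.

    Each cohort is weighted by 1 / SE^2, so more precise cohorts
    contribute more to the pooled estimate.
    """
    weights = [1.0 / (se * se) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical interaction estimates and standard errors from two cohorts.
betas = [0.10, 0.30]
ses = [0.10, 0.20]
pooled_beta, pooled_se = inverse_variance_meta(betas, ses)
```

With these made-up inputs the weights are 100 and 25, so the pooled estimate sits closer to the more precise first cohort (about 0.14) and its standard error is smaller than either cohort's own.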


Author(s):  
Colin Baigent ◽  
Richard Peto ◽  
Richard Gray ◽  
Natalie Staplin ◽  
Sarah Parish ◽  
...  

Clinical trials generally need to be able to detect or to refute realistically moderate (but still worthwhile) differences between treatments in long-term disease outcome. Large-scale randomized evidence should be able to detect such effects, but medium-sized trials or medium-sized meta-analyses can, and often do, yield false-negative or exaggeratedly positive results. Hundreds of thousands of premature deaths each year could be avoided by seeking appropriately large-scale randomized evidence about various widely practicable treatments for the common causes of death, and by disseminating this evidence appropriately. This chapter looks at the use of large-scale randomized evidence (produced from trials and meta-analyses of trials) and at how these data should be handled in order to produce accurate results.

