RICOPILI: Rapid Imputation for COnsortias PIpeLIne

Author(s):  
Max Lam ◽  
Swapnil Awasthi ◽  
Hunna J Watson ◽  
Jackie Goldstein ◽  
Georgia Panagiotaropoulou ◽  
...  

Abstract Summary Genome-wide association study (GWAS) analyses, at sufficient sample sizes and power, have successfully revealed biological insights for several complex traits. RICOPILI, an open-source Perl-based pipeline, was developed to address the challenges of rapidly processing large-scale multi-cohort GWAS studies, including quality control (QC), imputation and downstream analyses. The pipeline is computationally efficient and portable to a wide range of high-performance computing (HPC) environments. RICOPILI was created as the Psychiatric Genomics Consortium pipeline for GWAS and has been adopted by other users. The pipeline features (i) technical and genomic QC in case-control and trio cohorts, (ii) genome-wide phasing and imputation, (iii) association analysis, (iv) meta-analysis, (v) polygenic risk scoring and (vi) replication analysis. Notably, as a major differentiator from other GWAS pipelines, RICOPILI leverages automated parallelization and cluster job management for rapid production of imputed genome-wide data. A comprehensive meta-analysis of simulated GWAS data has been incorporated to demonstrate each step of the pipeline. This includes all the associated visualization plots, to ease data interpretation and manuscript preparation. Simulated GWAS datasets are also packaged with the pipeline for user training tutorials and developer work. Availability and implementation RICOPILI has a flexible architecture that allows for ongoing development and incorporation of newly available algorithms, and it is adaptable to various HPC environments (QSUB, BSUB, SLURM and others). Specific links for genomic resources are provided either directly in this paper or via tutorials and external links. The central location hosting scripts and tutorials is https://sites.google.com/a/broadinstitute.org/RICOPILI/home Supplementary information Supplementary data are available at Bioinformatics online.

2019 ◽  
Author(s):  
Max Lam ◽  
Swapnil Awasthi ◽  
Hunna J. Watson ◽  
Jackie Goldstein ◽  
Georgia Panagiotaropoulou ◽  
...  

Abstract Motivation Genome-wide association study (GWAS) analyses, at sufficient sample sizes and power, have successfully revealed biological insights for several complex traits. RICOPILI, an open-source Perl-based pipeline, was developed to address the challenges of rapidly processing large-scale multi-cohort GWAS studies, including quality control, imputation and downstream analyses. The pipeline is computationally efficient and portable to a wide range of high-performance computing (HPC) environments. Summary RICOPILI was created as the Psychiatric Genomics Consortium (PGC) pipeline for GWAS and has been adopted by other users. The pipeline features (i) technical and genomic quality control in case-control and trio cohorts, (ii) genome-wide phasing and imputation, (iii) association analysis, (iv) meta-analysis, (v) polygenic risk scoring and (vi) replication analysis. Notably, as a major differentiator from other GWAS pipelines, RICOPILI leverages automated parallelization and cluster job management for rapid production of imputed genome-wide data. A comprehensive meta-analysis of simulated GWAS data has been incorporated to demonstrate each step of the pipeline. This includes all of the associated visualization plots, to ease data interpretation and manuscript preparation. Simulated GWAS datasets are also packaged with the pipeline for user training tutorials and developer work. Availability and Implementation RICOPILI has a flexible architecture that allows for ongoing development and incorporation of newly available algorithms, and it is adaptable to various HPC environments (QSUB, BSUB, SLURM and others). Specific links for genomic resources are provided either directly in this paper or via tutorials and external links. The central location hosting scripts and tutorials is https://sites.google.com/a/broadinstitute.org/RICOPILI/home Supplementary information Supplementary data are available.


2018 ◽  
Vol 35 (14) ◽  
pp. 2512-2514 ◽  
Author(s):  
Bongsong Kim ◽  
Xinbin Dai ◽  
Wenchao Zhang ◽  
Zhaohong Zhuang ◽  
Darlene L Sanchez ◽  
...  

Abstract Summary We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as found in plant science investigations and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times, can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Lerato E Magosi ◽  
Anuj Goel ◽  
Jemma C Hopewell ◽  
Martin Farrall

Abstract Motivation Common small-effect genetic variants that contribute to human complex traits and disease are typically identified using traditional fixed-effect (FE) meta-analysis methods. However, the power to detect genetic associations under FE models deteriorates with increasing heterogeneity, so that some small-effect heterogeneous loci might go undetected. A modified random-effects meta-analysis approach (RE2) was previously developed that is more powerful than traditional fixed and random-effects methods at detecting small-effect heterogeneous genetic associations; the method was later updated (RE2C) to identify small-effect heterogeneous variants overlooked by traditional fixed-effect meta-analysis. Here, we re-appraise a large-scale meta-analysis of coronary disease with RE2C to search for small-effect genetic signals potentially masked by heterogeneity in an FE meta-analysis. Results Our application of RE2C suggests a high sensitivity but low specificity of this approach for discovering small-effect heterogeneous genetic associations. We recommend that reports of small-effect heterogeneous loci discovered with RE2C be accompanied by forest plots and standardized predicted random-effects statistics to reveal the distribution of genetic effect estimates across component studies of meta-analyses, highlighting overly influential outlier studies with the potential to inflate genetic signals. Availability and implementation Scripts to calculate standardized predicted random-effects statistics and generate forest plots are available in the getspres R package from https://magosil86.github.io/getspres/. Supplementary information Supplementary data are available at Bioinformatics online.
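To make the fixed-effect setting in the abstract above concrete, here is a minimal sketch of a textbook inverse-variance fixed-effect meta-analysis with Cochran's Q and the I² heterogeneity measure. It is not the RE2C statistic and not the getspres implementation; the function name `fixed_effect_meta` and the input values are hypothetical, chosen only for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates (hypothetical inputs)
    ses:   per-study standard errors, all > 0
    Returns the pooled effect, its standard error, the z-score,
    Cochran's Q and I^2 (fraction of variation due to heterogeneity).
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)

    beta_fe = sum(w * b for w, b in zip(weights, betas)) / total_w
    se_fe = math.sqrt(1.0 / total_w)

    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    return {"beta": beta_fe, "se": se_fe, "z": beta_fe / se_fe, "q": q, "i2": i2}


if __name__ == "__main__":
    # Two made-up scenarios: consistent effects vs. conflicting effects.
    print(fixed_effect_meta([0.1, 0.1, 0.1], [0.05, 0.05, 0.05]))
    print(fixed_effect_meta([0.3, 0.0], [0.05, 0.05]))
```

In the second made-up scenario the two studies disagree, so Q and I² are large while the pooled estimate sits between them; this is the kind of situation in which a forest plot of per-study estimates is informative.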


2021 ◽  
Author(s):  
Brian C Zhang ◽  
Arjun Biddanda ◽  
Pier Francesco Palamara

Accurate inference of gene genealogies from genetic data has the potential to facilitate a wide range of analyses. We introduce a method for accurately inferring biobank-scale genome-wide genealogies from sequencing or genotyping array data, as well as strategies to utilize genealogies within linear mixed models to perform association and other complex trait analyses. We use these new methods to build genome-wide genealogies using genotyping data for 337,464 UK Biobank individuals and to detect associations in 7 complex traits. Genealogy-based association detects more rare and ultra-rare signals (N = 133, frequency range 0.0004% - 0.1%) than genotype imputation from ~65,000 sequenced haplotypes (N = 65). In a subset of 138,039 exome sequencing samples, these associations strongly tag (average r = 0.72) underlying sequencing variants, which are enriched for missense (2.3×) and loss-of-function (4.5×) variation. Inferred genealogies also capture additional association signals in higher frequency variants. These results demonstrate that large-scale inference of gene genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.


2018 ◽  
Vol 21 (6) ◽  
pp. 538-545 ◽  
Author(s):  
W. D. Hill

Lam et al. (2018) respond to a commentary on their paper entitled ‘Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets’ (Lam et al., 2017). While Lam et al. (2018) have now provided the recommended quality control metrics for their paper, problems remain. Specifically, Lam et al. (2018) do not dispute that their multi-trait analysis of genome-wide association studies (MTAG) has produced a phenotype with a genetic correlation of one with three measures of education, but they do claim that the associations found are specific to the trait of cognitive ability. In this brief paper, it is empirically demonstrated that the phenotype derived by Lam et al. (2017) is more genetically similar to education than to cognitive ability. In addition, it is shown that of the genome-wide significant loci identified by Lam et al. (2017) are loci that are associated with education rather than with cognitive ability.


2021 ◽  
Author(s):  
Kazuyoshi Ishigaki ◽  
Saori Sakaue ◽  
Chikashi Terao ◽  
Yang Luo ◽  
Kyuto Sonehara ◽  
...  

Abstract Trans-ancestry genetic research promises to improve the power to detect genetic signals, fine-mapping resolution, and the performance of polygenic risk scores (PRS). Here we present a large-scale genome-wide association study (GWAS) of rheumatoid arthritis (RA) that includes 276,020 samples from five ancestral groups. We conducted a trans-ancestry meta-analysis and identified 124 loci (P < 5 × 10⁻⁸), of which 34 were novel. Candidate genes at the novel loci suggested essential roles of the immune system (e.g., TNIP2 and TNFRSF11A) and joint tissues (e.g., WISP1) in RA etiology. Trans-ancestry fine mapping identified putatively causal variants with biological insights (e.g., LEF1). Moreover, PRS based on trans-ancestry GWAS outperformed PRS based on single-ancestry GWAS and had comparable performance between European and East Asian populations. Our study provides multiple insights into the etiology of RA and improves genetic predictability of RA.


PLoS Genetics ◽  
2020 ◽  
Vol 16 (11) ◽  
pp. e1009077
Author(s):  
Jeffery A. Goldstein ◽  
Joshua S. Weinstock ◽  
Lisa A. Bastarache ◽  
Daniel B. Larach ◽  
Lars G. Fritsche ◽  
...  

Phenotypes extracted from Electronic Health Records (EHRs) are increasingly prevalent in genetic studies. EHRs contain hundreds of distinct clinical laboratory test results, providing a trove of health data beyond diagnoses. Such lab data is complex and lacks a ubiquitous coding scheme, making it more challenging to work with than diagnosis data. Here we describe the first large-scale cross-health system genome-wide association study (GWAS) of EHR-based quantitative laboratory-derived phenotypes. We meta-analyzed 70 lab traits matched between the BioVU cohort from the Vanderbilt University Health System and the Michigan Genomics Initiative (MGI) cohort from Michigan Medicine. We show high replication of known associations for these traits, validating EHR-based measurements as high-quality phenotypes for genetic analysis. Notably, our analysis provides the first replication for 699 previous GWAS associations across 46 different traits. We discovered 31 novel associations at genome-wide significance for 22 distinct traits, including the first reported associations for two lab-based traits. We replicated 22 of these novel associations in an independent tranche of BioVU samples. The summary statistics for all association tests are freely available to benefit other researchers. Finally, we performed mirrored analyses in BioVU and MGI to assess competing analytic practices for EHR lab traits. We find that using the mean of all available lab measurements provides a robust summary value, but alternate summarizations can improve power in certain circumstances. This study provides a proof-of-principle for cross-health system GWAS and is a framework for future studies of quantitative EHR lab traits.
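The idea of collapsing repeated lab measurements into one value per individual can be sketched as follows. This is only an illustration under stated assumptions: the record layout `(person_id, value)`, the function name `summarize_lab_values`, and the choice of the median as the "alternate summarization" are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_lab_values(records, how="mean"):
    """Collapse repeated lab measurements to one value per person.

    records: iterable of (person_id, value) pairs (hypothetical layout)
    how:     "mean" or "median" (the median is an assumed alternative)
    Returns a dict mapping person_id to the summary value.
    """
    reducers = {"mean": mean, "median": median}
    if how not in reducers:
        raise ValueError(f"unknown summary: {how!r}")

    # Group every measurement under the person it belongs to.
    by_person = defaultdict(list)
    for person_id, value in records:
        by_person[person_id].append(float(value))

    reduce = reducers[how]
    return {pid: reduce(values) for pid, values in by_person.items()}


if __name__ == "__main__":
    # Made-up measurements; person "A" has one outlying value.
    demo = [("A", 1.0), ("A", 3.0), ("A", 20.0), ("B", 5.0)]
    print(summarize_lab_values(demo, how="mean"))
    print(summarize_lab_values(demo, how="median"))
```

The resulting per-person values would then serve as the quantitative phenotype in an association test; which summary works best for a given trait is an empirical question.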


2020 ◽  
Author(s):  
Dirk Smit

The ENIGMA-EEG working group was established to enable large-scale international collaborations among cohorts that investigate the genetics of brain function measured with electroencephalography (EEG). The collaboration resulted in the largest genome-wide association study to date of oscillatory brain activity in EEG recordings, obtained by meta-analyzing results across five participating cohorts. Our endeavor has resulted in the first genome-wide significant hits for oscillatory brain function, and in significant genes that were previously associated with psychiatric disorders. Our results have provided insight into the influence that psychiatric liability genes have on the functioning brain. In this overview, we also highlight how we have tackled methodological issues surrounding genetic meta-analysis of EEG features, and identify possible sources of heterogeneity across cohorts that could affect the results of our meta-analysis. We discuss the importance of harmonizing EEG signal processing, cleaning, and feature extraction. Finally, we explain our selection of EEG features to be investigated in our future studies, e.g. temporal dynamics of oscillations and the connectivity network based on synchronization of oscillations. We argue that these represent some of the most important characteristics of the functioning brain. We conclude that disentangling the genetics of EEG will elucidate the effects that genes have on brain function, as well as pathways from genes to neurological and psychiatric disorders.


2020 ◽  
Vol 7 (12) ◽  
pp. 1032-1045 ◽  
Author(s):  
Emma C Johnson ◽  
Ditte Demontis ◽  
Thorgeir E Thorgeirsson ◽  
Raymond K Walters ◽  
Renato Polimanti ◽  
...  
