Regional imaging genetic enrichment analysis

Abstract Motivation Brain imaging genetics aims to reveal genetic effects on brain phenotypes, where most studies examine phenotypes defined on anatomical or functional regions of interest (ROIs) given their biologically meaningful interpretation and modest dimensionality compared with voxelwise approaches. Typical ROI-level measures used in these studies are summary statistics from voxelwise measures in the region, without making full use of individual voxel signals. Results In this article, we propose a flexible and powerful framework for mining regional imaging genetic associations via voxelwise enrichment analysis, which embraces the collective effect of weak voxel-level signals and integrates brain anatomical annotation information. Our proposed method achieves three goals at the same time: (i) increase the statistical power by substantially reducing the burden of multiple comparison correction; (ii) employ brain annotation information to enable biologically meaningful interpretation and (iii) make full use of fine-grained voxelwise signals. We demonstrate our method on an imaging genetic analysis using data from the Alzheimer’s Disease Neuroimaging Initiative, where we assess the collective regional genetic effects of voxelwise FDG-positron emission tomography measures between 116 ROIs and 565 373 single-nucleotide polymorphisms. Compared with traditional ROI-wise and voxelwise approaches, our method identified 2946 novel imaging genetic associations in addition to 33 that overlap with the two benchmark methods. In particular, two newly reported variants were further supported by transcriptome evidence from region-specific expression analysis. This demonstrates the promise of the proposed method as a flexible and powerful framework for exploring imaging genetic effects on the brain. Availability and implementation The R code and sample data are freely available at https://github.com/lshen/RIGEA.
Supplementary information Supplementary data are available at Bioinformatics online.
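
The abstract above does not spell out the enrichment statistic, so the following is only a minimal sketch of one common choice: an upper-tail hypergeometric test asking whether voxels that pass a significance cutoff for a given SNP are over-represented inside one ROI. The function name, argument names and the toy numbers are all hypothetical and are not taken from the RIGEA code.

```python
from math import comb


def hypergeom_enrichment_p(total_voxels, significant_voxels, roi_voxels, roi_significant):
    """Upper-tail hypergeometric p-value P(X >= roi_significant).

    Assumption (not from the paper): X is the number of significant voxels
    that fall inside the ROI when `roi_voxels` voxels are drawn without
    replacement from `total_voxels`, of which `significant_voxels` are
    significant for the SNP under study.
    """
    upper = min(roi_voxels, significant_voxels)
    denominator = comb(total_voxels, roi_voxels)
    tail = sum(
        comb(significant_voxels, k)
        * comb(total_voxels - significant_voxels, roi_voxels - k)
        for k in range(roi_significant, upper + 1)
    )
    return tail / denominator


# Toy example: 1000 voxels in the brain mask, 50 significant overall,
# an ROI of 100 voxels that contains 15 of the significant ones.
p_value = hypergeom_enrichment_p(1000, 50, 100, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

Testing one statistic per SNP and ROI pair, rather than one per SNP and voxel pair, is what reduces the multiple-comparison burden described in the abstract.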


Varanto: variant enrichment analysis and annotation

Bioinformatics ◽

10.1093/bioinformatics/btz046 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3154-3156 ◽

Cited By ~ 1

Author(s):

Oskari Timonen ◽

Mikko Särkkä ◽

Tibor Fülöp ◽

Anton Mattsson ◽

Juha Kekäläinen ◽

...

Keyword(s):

Association Studies ◽

Enrichment Analysis ◽

Genetic Variations ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Genome Wide ◽

Shiny App ◽

Specific Trait ◽

Diverse Data

Abstract Summary Genome-wide association studies (GWAS) aim to identify associations of genetic variations such as single-nucleotide polymorphisms (SNPs) with a specific trait or a disease. Identifying common themes such as pathways, biological processes and disease associations is needed to further explore and interpret these results. Varanto is a novel web tool for annotating, visualizing and analyzing human genetic variations using diverse data sources. Varanto can be used to query a set of input variations, retrieve their associated variation and gene level annotations, perform annotation enrichment analysis and visualize the results. Availability and implementation The Varanto web app is developed in R and implemented as a Shiny app with a PostgreSQL database, and is freely available at http://bioinformatics.uef.fi/varanto. Source code for the tool is available at https://github.com/oqe/varanto. Supplementary information Supplementary data are available at Bioinformatics online.


Detection of SNP epistasis effects of quantitative traits using an extended Kempthorne model

Physiological Genomics ◽

10.1152/physiolgenomics.00096.2006 ◽

2006 ◽

Vol 28 (1) ◽

pp. 46-52 ◽

Cited By ~ 31

Author(s):

Yongcai Mao ◽

Nicole R. London ◽

Li Ma ◽

Daniel Dvorkin ◽

Yang Da

Keyword(s):

Sample Size ◽

Candidate Gene ◽

Complex Traits ◽

Statistical Power ◽

Type I Error ◽

Genetic Model ◽

Genetic Effects ◽

Type I ◽

Nucleotide Polymorphisms ◽

Epistasis Effect

Epistasis effects (gene interactions) have been increasingly recognized as important genetic factors underlying complex traits. The existence of a large number of single nucleotide polymorphisms (SNPs) provides opportunities and challenges to screen DNA variations affecting complex traits using a candidate gene analysis. In this article, four types of epistasis effects of two candidate gene SNPs with Hardy-Weinberg disequilibrium (HWD) and linkage disequilibrium (LD) are considered: additive × additive, additive × dominance, dominance × additive, and dominance × dominance. The Kempthorne genetic model was chosen for its appealing genetic interpretations of the epistasis effects. The method in this study consists of extension of Kempthorne's definitions of 35 individual genetic effects to allow HWD and LD, genetic contrasts of the 35 extended individual genetic effects to define the 4 epistasis effects, and a linear model method for testing epistasis effects. Formulas to predict statistical power (as a function of contrast heritability, sample size, and type I error) and sample size (as a function of contrast heritability, type I error, and type II error) for detecting each epistasis effect were derived, and the theoretical predictions agreed well with simulation studies. The accuracy in estimating each epistasis effect and rates of false positives in the absence of all or three epistasis effects were evaluated using simulations. The method for epistasis testing can be a useful tool to understand the exact mode of epistasis, to assemble genome-wide SNPs into an epistasis network, and to assemble all SNP effects affecting a phenotype using pairwise epistasis tests.


Supervised multiblock sparse multivariable analysis with application to multimodal brain imaging genetics

Biostatistics ◽

10.1093/biostatistics/kxx011 ◽

2017 ◽

Vol 18 (4) ◽

pp. 651-665 ◽

Cited By ~ 7

Author(s):

Atsushi Kawaguchi ◽

Fumio Yamashita

Keyword(s):

Multivariable Analysis ◽

Brain Magnetic Resonance Imaging ◽

Brain Regions ◽

Original Method ◽

Imaging Genetics ◽

Data Sets ◽

Nucleotide Polymorphisms ◽

Brain Images ◽

Data Set ◽

Positron Emission

SUMMARY This article proposes a procedure for describing the relationship between high-dimensional data sets, such as multimodal brain images and genetic data. We propose a supervised technique to incorporate the clinical outcome to determine a score, which is a linear combination of variables with hierarchical structures to multimodalities. This approach is expected to obtain interpretable and predictive scores. The proposed method was applied to a study of Alzheimer’s disease (AD). We propose a diagnostic method for AD that involves using whole-brain magnetic resonance imaging (MRI) and positron emission tomography (PET), and we select effective brain regions for the diagnostic probability and investigate the genome-wide association with the regions using single nucleotide polymorphisms (SNPs). The two-step dimension reduction method, which we previously introduced, was considered applicable to such a study and allows us to partially incorporate the proposed method. We show that the proposed method offers classification functions with feasibility and reasonable prediction accuracy based on the receiver operating characteristic (ROC) analysis and reasonable regions of the brain and genomes. Our simulation study based on the synthetic structured data set showed that the proposed method outperformed the original method and provided the characteristic for the supervised feature.


Genome-wide Variant-based Study of Genetic Effects with the Largest Neuroanatomic Coverage

10.21203/rs.3.rs-33088/v2 ◽

2020 ◽

Author(s):

Jin Li ◽

Wenjie Liu ◽

Huang Li ◽

Feng Chen ◽

Haoran Luo ◽

...

Keyword(s):

Genetic Algorithm ◽

Region Of Interest ◽

Genetic Effects ◽

Imaging Genetics ◽

Exhaustive Search ◽

Computational Time ◽

Brain Anatomy ◽

Nucleotide Polymorphisms ◽

Genome Wide ◽

The Brain

Abstract Background: Brain image genetics provides enormous opportunities for examining the effects of genetic variations on the brain. Many studies have shown that the structure, function, and abnormality (e.g., those related to Alzheimer's disease) of the brain are heritable. However, which genetic variations contribute to these phenotypic changes is not completely clear. Advances in neuroimaging and genetics have led us to obtain detailed brain anatomy and genome-wide information. These data offer us new opportunities to identify genetic variations such as single nucleotide polymorphisms (SNPs) that affect brain structure. In this paper, we perform a genome-wide variant-based study, and aim to identify top SNPs or SNP sets which have genetic effects with the largest neuroanatomic coverage at both voxel and region-of-interest (ROI) levels. Based on the voxelwise genome-wide association study (GWAS) results, we used the exhaustive search to find the top SNPs or SNP sets that have the largest voxel-based or ROI-based neuroanatomic coverage. For SNP sets with >2 SNPs, we proposed an efficient genetic algorithm to identify top SNP sets that can cover all ROIs or a specific ROI. Results: We identified an ensemble of top SNPs, SNP-pairs and SNP-sets, whose effects have the largest neuroanatomic coverage. Experimental results on real imaging genetics data show that the proposed genetic algorithm is superior to the exhaustive search in terms of computational time for identifying top SNP-sets. Conclusions: We proposed and applied an informatics strategy to identify top SNPs, SNP-pairs and SNP-sets that have genetic effects with the largest neuroanatomic coverage. The proposed genetic algorithm offers an efficient solution to accomplish the task, especially for identifying top SNP-sets.
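
The exhaustive-search step described in the abstract above can be illustrated with a small sketch. It assumes each SNP has already been mapped to the set of ROIs it covers (for example, ROIs containing at least one significant voxel); that mapping, the function name and the toy SNP and ROI labels are hypothetical and not taken from the paper. The number of candidate sets grows combinatorially with set size, which is the cost the authors' genetic algorithm is meant to avoid; the genetic algorithm itself is not reproduced here.

```python
from itertools import combinations


def best_snp_set(coverage, set_size):
    """Exhaustively find the SNP set of a given size covering the most ROIs.

    coverage: dict mapping a SNP label to the set of ROI labels it covers
    (an assumed, precomputed input). Ties are broken in favour of the set
    that comes first in sorted order.
    """
    best_set, best_count = None, -1
    for candidate in combinations(sorted(coverage), set_size):
        covered = set().union(*(coverage[snp] for snp in candidate))
        if len(covered) > best_count:
            best_set, best_count = candidate, len(covered)
    return best_set, best_count


# Toy coverage map with made-up labels.
toy_coverage = {
    "snpA": {"roi1", "roi2"},
    "snpB": {"roi2", "roi3"},
    "snpC": {"roi4"},
    "snpD": {"roi1", "roi3", "roi5"},
}
top_pair, n_covered = best_snp_set(toy_coverage, 2)
print(top_pair, n_covered)
```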


Identifying diagnosis-specific genotype–phenotype associations via joint multitask sparse canonical correlation analysis and classification

Bioinformatics ◽

10.1093/bioinformatics/btaa434 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i371-i379 ◽

Cited By ~ 1

Author(s):

Lei Du ◽

Fang Liu ◽

Kefei Liu ◽

Xiaohui Yao ◽

Shannon L Risacher ◽

...

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Correlation Coefficients ◽

Imaging Genetics ◽

Supplementary Information ◽

Local Optimum ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Sparse Canonical Correlation Analysis

Abstract Motivation Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders usually exhibit diversity and heterogeneity, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype–phenotype associations. However, most existing SCCA methods are unsupervised, leading to an inability to identify diagnosis-specific genotype–phenotype associations. Results In this article, we propose a new joint multitask learning method, named MT–SCCALR, which absorbs the merits of both SCCA and logistic regression. MT–SCCALR learns genotype–phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype–phenotype pattern. Meanwhile, MT–SCCALR can not only select relevant SNPs and imaging QTs for each diagnostic group alone, but also allows the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT–SCCALR yields better or similar canonical correlation coefficients and classification performances. In addition, it yields much more discriminative canonical weight patterns, which are of great interest, than its competitors. This demonstrates the power and capability of MT–SCCALR in identifying diagnostically heterogeneous genotype–phenotype patterns, which would be helpful to understand the pathophysiology of brain disorders. Availability and implementation The software is publicly available at https://github.com/dulei323/MTSCCALR. Supplementary information Supplementary data are available at Bioinformatics online.


Subset-Based Analysis using Gene-Environment Interactions for Discovery of Genetic Associations across Multiple Studies or Phenotypes

10.1101/326777 ◽

2018 ◽

Cited By ~ 1

Author(s):

Youfei Yu ◽

Lu Xia ◽

Seunggeun Lee ◽

Xiang Zhou ◽

Heather M Stringham ◽

...

Keyword(s):

Association Studies ◽

Testing Procedure ◽

Genetic Effects ◽

P Value ◽

Genome Wide Association Studies ◽

Simulation Studies ◽

Nucleotide Polymorphisms ◽

Genetic Associations ◽

Novel Studies ◽

Gene Environment

Abstract Objectives Classical methods for combining summary data from genome-wide association studies (GWAS) only use marginal genetic effects and power can be compromised in the presence of heterogeneity. We aim to enhance the discovery of novel associated loci in the presence of heterogeneity of genetic effects in sub-groups defined by an environmental factor. Methods We present a p-value Assisted Subset Testing for Associations (pASTA) framework that generalizes the previously proposed association analysis based on subsets (ASSET) method by incorporating gene-environment (G-E) interactions into the testing procedure. We conduct simulation studies and provide two data examples. Results Simulation studies show that our proposal is more powerful than methods based on marginal associations in the presence of G-E interactions and maintains comparable power even in their absence. Both data examples demonstrate that our method can increase power to detect overall genetic associations and identify novel studies/phenotypes that contribute to the association. Conclusions Our proposed method can be a useful screening tool to identify candidate single nucleotide polymorphisms (SNPs) that are potentially associated with the trait(s) of interest for further validation. It also allows researchers to determine the most probable subset of traits that exhibit genetic associations in addition to the enhancement of power.


Genome-Wide Analysis of Sex Disparities in the Genetic Architecture of Lung and Colorectal Cancers

Genes ◽

10.3390/genes12050686 ◽

2021 ◽

Vol 12 (5) ◽

pp. 686

Author(s):

Alireza Nazarian ◽

Alexander M. Kulminski

Keyword(s):

Nucleotide Polymorphisms ◽

Genetic Associations ◽

Significance Level ◽

Association Analyses ◽

Complex Disorders ◽

Specific Effects ◽

Genome Wide ◽

Genetic Mechanisms ◽

Sex Disparities ◽

Almost All

Almost all complex disorders have manifested epidemiological and clinical sex disparities which might partially arise from sex-specific genetic mechanisms. Addressing such differences can be important from a precision medicine perspective which aims to make medical interventions more personalized and effective. We investigated sex-specific genetic associations with colorectal (CRCa) and lung (LCa) cancers using genome-wide single-nucleotide polymorphisms (SNPs) data from three independent datasets. The genome-wide association analyses revealed that 33 SNPs were associated with CRCa/LCa at P < 5.0 × 10⁻⁶ in either males or females. Of these, 26 SNPs had sex-specific effects as their effect sizes were statistically different between the two sexes at a Bonferroni-adjusted significance level of 0.0015. None had proxy SNPs within their ±1 Mb regions and the closest genes to 32 SNPs were not previously associated with the corresponding cancers. The pathway enrichment analyses demonstrated the associations of 35 pathways with CRCa or LCa which were mostly implicated in immune system responses, cell cycle, and chromosome stability. The significant pathways were mostly enriched in either males or females. Our findings provided novel insights into the potential sex-specific genetic heterogeneity of CRCa and LCa at SNP and pathway levels.
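
The Bonferroni-adjusted level of 0.0015 quoted above is consistent with dividing a family-wise error rate of 0.05 by the 33 SNPs tested for sex differences. The abstract states only the resulting threshold, so the 0.05 starting point is an assumption; the short check below makes the arithmetic explicit.

```python
def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return family_wise_alpha / n_tests


# Assumed inputs: alpha = 0.05 (not stated in the abstract) and 33 SNPs.
threshold = bonferroni_threshold(0.05, 33)
print(f"per-test threshold: {threshold:.4f}")
```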


Investigation of allele specific expression in various tissues of broiler chickens using the detection tool VADT

Scientific Reports ◽

10.1038/s41598-021-83459-8 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

M. Joseph Tomlinson ◽

Shawn W. Polson ◽

Jing Qiu ◽

Juniper A. Lake ◽

William Lee ◽

...

Keyword(s):

Broiler Chickens ◽

Nucleotide Polymorphisms ◽

Rna Seq ◽

Specific Expression ◽

Single Nucleotide ◽

Allele Specific Expression ◽

Detection Tool ◽

Commercial Broiler ◽

Significant Phenomenon ◽

Allele Specific

Abstract Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large sub-set of the samples (n = 68). ASE analysis was performed using a custom software tool called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average ~ 174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~ 24,000 (~ 14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~ 83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
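
To make the binomial test mentioned above concrete, here is a minimal sketch of an exact two-sided binomial test on reference and alternate allele read counts at one heterozygous SNP. It assumes a null allelic ratio of 0.5 and is not VADT's implementation, whose filtering and multi-sample handling are not described in the abstract; the function name and read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed reference-read count under Binomial(n, p_null). Intended for
    modest read depths; very deep coverage would need a log-space version.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so that symmetric outcomes count as ties.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-7))
    return min(1.0, total)


# Toy counts: 30 reference reads versus 10 alternate reads at one SNP.
print(f"ASE p-value: {binomial_two_sided_p(30, 10):.3g}")
```

A balanced count such as 20 versus 20 gives a p-value of 1, while strongly skewed counts give small p-values that would then need multiple-testing correction across SNPs.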


Dementia key gene identification with multi-layered SNP-gene-disease network

Bioinformatics ◽

10.1093/bioinformatics/btaa814 ◽

2020 ◽

Vol 36 (Supplement_2) ◽

pp. i831-i839

Author(s):

Dong-gi Lee ◽

Myungjun Kim ◽

Sang Joon Son ◽

Chang Hyung Hong ◽

Hyunjung Shin

Keyword(s):

Candidate Genes ◽

Learning Algorithm ◽

Search Space ◽

Supplementary Information ◽

Gene Identification ◽

Nucleotide Polymorphisms ◽

Disease Network ◽

Single Nucleotide ◽

Key Genes ◽

Significant Attention

Abstract Motivation Recently, various approaches for diagnosing and treating dementia have received significant attention, especially in identifying key genes that are crucial for dementia. If the mutations of such key genes could be tracked, it would be possible to predict the time of onset of dementia and significantly aid in developing drugs to treat dementia. However, gene finding involves tremendous cost, time and effort. To alleviate these problems, research on utilizing computational biology to decrease the search space of candidate genes is actively conducted. In this study, we propose a framework in which diseases, genes and single-nucleotide polymorphisms are represented by a layered network, and key genes are predicted by a machine learning algorithm. The algorithm utilizes a network-based semi-supervised learning model that can be applied to layered data structures. Results The proposed method was applied to a dataset extracted from public databases related to diseases and genes with data collected from 186 patients. A portion of key genes obtained using the proposed method was verified in silico through PubMed literature, and the remaining genes were left as possible candidate genes. Availability and implementation The code for the framework will be available at http://www.alphaminers.net/. Supplementary information Supplementary data are available at Bioinformatics online.


Spectral dynamic causal modelling of resting-state fMRI: an exploratory study relating effective brain connectivity in the default mode network to genetics

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2019-0058 ◽

2020 ◽

Vol 19 (3) ◽

Author(s):

Yunlong Nie ◽

Eugene Opoku ◽

Laila Yasmin ◽

Yin Song ◽

Jie Wang ◽

...

Keyword(s):


Alzheimer's Disease ◽

Default Mode Network ◽

Resting State ◽

Brain Connectivity ◽

Parametric Bootstrap ◽

Imaging Genetics ◽

Nucleotide Polymorphisms ◽

Mixed Effect ◽

Default Mode

Abstract We conduct an imaging genetics study to explore how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer’s disease and mild cognitive impairment. We develop an analysis of longitudinal resting-state functional magnetic resonance imaging (rs-fMRI) and genetic data obtained from a sample of 111 subjects with a total of 319 rs-fMRI scans from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. A Dynamic Causal Model (DCM) is fit to the rs-fMRI scans to estimate effective brain connectivity within the DMN and related to a set of single nucleotide polymorphisms (SNPs) contained in an empirical disease-constrained set which is obtained out-of-sample from 663 ADNI subjects having only genome-wide data. We relate longitudinal effective brain connectivity estimated using spectral DCM to SNPs using both linear mixed effect (LME) models as well as function-on-scalar regression (FSR). In both cases we implement a parametric bootstrap for testing SNP coefficients and make comparisons with p-values obtained from asymptotic null distributions. In both networks at an initial q-value threshold of 0.1 no effects are found. We report on exploratory patterns of associations with relatively high ranks that exhibit stability to the differing assumptions made by both FSR and LME.
