gwas data
Recently Published Documents

Total documents: 180 (five years: 77)
H-index: 24 (five years: 6)

2021 ◽  
Author(s):  
Douglas P. Loesch ◽  
Andrea RVR Horimoto ◽  
Irem Sarihan ◽  
Miguel Inca-Martinez ◽  
Emily Mason ◽  
...  

Background: Large-scale Parkinson's disease (PD) genome-wide association studies (GWAS) and meta-analyses have, until recently, only been conducted on subjects of European ancestry. Consequently, polygenic risk scores (PRS) constructed using PD GWAS data are likely to be less predictive when applied to non-European cohorts. Methods: Using GWAS data from Nalls et al. 2019, we constructed a PD PRS for a Latino PD cohort (LARGE-PD) and tested it for association with PD status. We validated the PRS performance by testing the PD PRS in an independent cohort of Latino PD patients and by repeating the PRS analysis in LARGE-PD with the addition of 440 external Peruvian controls. To explore the global distribution of PD PRS, we utilized 1000 Genomes Project (1KGP) and Peruvian Genome Project (PGP) data to estimate PD risk allele frequencies. We also tested SNCA haplotypes for association with PD risk using logistic regression in LARGE-PD and a European-ancestry PD cohort from the International Parkinson Disease Genomics Consortium (IPDGC). Results: The GWAS-significant PD PRS had an area under the receiver operating characteristic curve (AUC) of 0.668 (95% CI: 0.640-0.695) and explained 2.8% of the phenotypic variance in LARGE-PD as determined via pseudo R2. The inclusion of external Peruvian data as controls mitigated this result, dropping the AUC to 0.632 (95% CI: 0.607-0.657). In 1KGP Latinos, we found the PD PRS to exhibit a bias by ancestry. At the SNCA locus, haplotypes differ by ancestry. Ancestry-specific SNCA haplotypes are significantly associated with PD status in both LARGE-PD and the IPDGC cohort (p-value < 0.05). Apart from rs356182, these haplotypes share as little as 14% of their variants. Conclusion: The PD PRS has potential for PD risk prediction in Latinos, but variability caused by admixture patterns and bias in the PD PRS calculated using only European-ancestry data limits its utility.
The inclusion of diverse subjects can help elucidate PD risk loci and improve risk prediction in non-European cohorts. In the case of the SNCA locus, by leveraging a Latino cohort, we provide orthogonal evidence for rs356182 causality.
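
As a rough illustration of the two quantities this abstract reports, the sketch below computes a polygenic risk score as an effect-size-weighted sum of risk-allele dosages and estimates the AUC as the probability that a randomly chosen case outscores a randomly chosen control. The function names, effect sizes, and genotypes are hypothetical toy values, not data from LARGE-PD or Nalls et al. 2019, and this is not presented as the study's actual pipeline.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], betas: Sequence[float]) -> float:
    """Sum of risk-allele dosages (0, 1, or 2) weighted by GWAS effect sizes."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(d * b for d, b in zip(dosages, betas))


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Fraction of case/control pairs where the case scores higher; ties count 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-variant effect sizes (log odds ratios) for three variants.
toy_betas = [0.2, 0.1, 0.3]

# Hypothetical genotypes: one row per person, one dosage per variant.
toy_cases = [[2, 1, 1], [1, 2, 2]]
toy_controls = [[0, 1, 0], [2, 1, 1]]

case_scores = [polygenic_risk_score(g, toy_betas) for g in toy_cases]
control_scores = [polygenic_risk_score(g, toy_betas) for g in toy_controls]
toy_auc = auc(case_scores, control_scores)
print(toy_auc)  # 0.875
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls at all, which gives some context for the 0.668 and 0.632 values quoted above.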


2021 ◽  
Author(s):  
Martin Zhang ◽  
Kangcheng Hou ◽  
Bogdan Pasaniuc ◽  
Alkes L. Price ◽  
Kushal Dey ◽  
...  

Abstract Gene expression at the individual cell-level resolution, as quantified by single-cell RNA-sequencing (scRNA-seq), can provide unique insights into the pathology and cellular origin of diseases and complex traits. Here, we introduce single-cell Disease Relevance Score (scDRS), an approach that links scRNA-seq with polygenic risk of disease at individual cell resolution; scDRS identifies individual cells that show excess expression levels for genes in a disease-specific gene set constructed from GWAS data. We determined via simulations that scDRS is well-calibrated and powerful in identifying individual cells associated with disease. We applied scDRS to GWAS data from 74 diseases and complex traits (average N=341K) in conjunction with 16 scRNA-seq data sets spanning 1.3 million cells from 31 tissues and organs. At the cell type level, scDRS broadly recapitulated known links between classical cell types and disease, and also produced novel biologically plausible findings. At the individual cell level, scDRS identified subpopulations of disease-associated cells that are not captured by existing cell type labels, including subpopulations of CD4+ T cells associated with inflammatory bowel disease, partially characterized by their effector-like states; subpopulations of hippocampal CA1 pyramidal neurons associated with schizophrenia, partially characterized by their spatial location at the proximal part of the hippocampal CA1 region; and subpopulations of hepatocytes associated with triglyceride levels, partially characterized by their higher ploidy levels. At the gene level, we determined that genes whose expression across individual cells was correlated with the scDRS score (thus reflecting co-expression with GWAS disease genes) were strongly enriched for gold-standard drug target and Mendelian disease genes.


2021 ◽  
Vol 5 ◽  
pp. 287
Author(s):  
Carolyne M. Ndila ◽  
Vysaul Nyirongo ◽  
Alexander W. Macharia ◽  
Anna E. Jeffreys ◽  
Kate Rowlands ◽  
...  

Background: The -α3.7I-thalassaemia deletion is very common throughout Africa because it protects against malaria. When undertaking studies to investigate human genetic adaptations to malaria or other diseases, it is important to account for any confounding effects of α-thalassaemia to rule out spurious associations. Methods: In this study, we have used direct α-thalassaemia genotyping to understand why GWAS data from a large malaria association study in Kilifi, Kenya did not identify the α-thalassaemia signal. We then explored the potential use of a number of new approaches to using GWAS data for imputing α-thalassaemia as an alternative to direct genotyping by PCR. Results: We found very low linkage disequilibrium of the directly typed data with the GWAS SNP markers around α-thalassaemia and across the haemoglobin-alpha (HBA) gene region, which, along with a complex haplotype structure, could explain the lack of an association signal from the GWAS SNP data. Some indirect typing methods gave results that were in broad agreement with those derived from direct genotyping and could identify an association signal, but none were sufficiently accurate to allow correct interpretation compared with direct typing, leading to confusing or erroneous results. Conclusions: We conclude that, going forward, direct typing methods such as PCR will still be required to account for α-thalassaemia in GWAS studies.


2021 ◽  
Vol 12 ◽  
Author(s):  
Triinu Peters ◽  
Jochen Antel ◽  
Roaa Naaresh ◽  
Björn-Hergen Laabs ◽  
Manuel Föcker ◽  
...  

Genetic correlations suggest a coexisting genetic predisposition to both low leptin levels and risk for anorexia nervosa (AN). To investigate the causality and direction of these associations, we performed bidirectional two-sample Mendelian randomization (MR) analyses using data from the most recent genome-wide association study (GWAS) for AN and both a GWAS and an exome-wide association study (EWAS) for leptin levels. Most MR methods with genetic instruments from GWAS showed a causal effect of lower leptin levels on higher risk of AN (e.g. IVW b = −0.923, p = 1.5 × 10⁻⁴). Because most patients with AN are female, we additionally performed analyses using leptin GWAS data of females only. Again, there was a significant effect of leptin levels on the risk of AN (e.g. IVW b = −0.826, p = 1.1 × 10⁻⁴). MR with genetic instruments from EWAS showed no overall effect of leptin levels on the risk for AN. For the opposite direction, MR revealed no causal effect of AN on leptin levels. If our results are confirmed in extended GWAS data sets, a low endogenous leptin synthesis represents a risk factor for developing AN.
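
For readers unfamiliar with the IVW (inverse-variance weighted) estimate quoted in this abstract, the following is a minimal sketch of the standard fixed-effect IVW formula used in two-sample MR: each variant contributes a ratio of its outcome effect to its exposure effect, weighted by the precision of the outcome effect. The function name and the summary statistics are hypothetical toy values, not the leptin or AN GWAS data, and the study may have used additional MR methods or random-effects variants not shown here.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate and its standard error.

    estimate = sum(bx * by / se_y^2) / sum(bx^2 / se_y^2)
    se       = sqrt(1 / sum(bx^2 / se_y^2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / (se * se)  # precision of the variant-outcome effect
        numerator += bx * by * weight
        denominator += bx * bx * weight
    if denominator == 0.0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for two instruments (toy values only).
toy_bx = [0.1, 0.2]    # variant effects on the exposure
toy_by = [-0.1, -0.2]  # variant effects on the outcome
toy_se = [0.05, 0.05]  # standard errors of the outcome effects

toy_estimate, toy_se_ivw = ivw_estimate(toy_bx, toy_by, toy_se)
print(round(toy_estimate, 3), round(toy_se_ivw, 3))
```

A negative estimate, as in the abstract, means that a genetically predicted increase in the exposure is associated with a decrease in the outcome.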


2021 ◽  
Vol 2 (3) ◽  
pp. 100768
Author(s):  
Olivia L. Sabik ◽  
Cheryl L. Ackert-Bicknell ◽  
Charles R. Farber

Author(s):  
Masao Ueki ◽  
Gen Tamiya ◽  

Abstract We propose a genetic prediction modeling approach for genome-wide association study (GWAS) data that can include not only marginal gene effects but also gene-environment (GxE) interaction effects—i.e., multiplicative effects of environmental factors with genes rather than merely additive effects of each. The proposed approach is a straightforward extension of our previous multiple-regression-based method, STMGP (smooth-threshold multivariate genetic prediction), with the new feature being that genome-wide test statistics from a GxE interaction analysis are used to weight the corresponding variants. We develop a simple univariate regression approximation to the GxE interaction effect that allows a direct fit of the STMGP framework without modification. The sparse nature of our model automatically removes irrelevant predictors (including variants and GxE combinations), and the model is able to simultaneously incorporate multiple environmental variables. Simulation studies to evaluate the proposed method in comparison with other modeling approaches demonstrate its superior performance under the presence of GxE interaction effects. We illustrate the usefulness of our prediction model through application to real GWAS data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).


2021 ◽  
Vol 21 ◽  
Author(s):  
Zijun Zhu ◽  
Xudong Han ◽  
Liang Cheng

Type 2 diabetes mellitus (T2DM) is a chronic disease. Molecular diagnosis should be helpful for the treatment of T2DM patients. With the development of sequencing technology, a large number of differentially expressed genes have been identified from expression data. However, machine learning methods can only identify a locally optimal solution as the signature. Inherited mutation information can better reflect the relationship between genes and diseases. Therefore, we need to integrate mutation information to identify the signature more accurately. To this end, we integrated genome-wide association study (GWAS) data and expression data, combined with expression quantitative trait loci (eQTL) technology, to obtain a T2DM predictive signature (T2DMSig-10). First, we used GWAS data to obtain a list of T2DM susceptibility loci. Then, we used eQTL technology to obtain risk single nucleotide polymorphisms (SNPs) and combined these with pancreatic β-cell gene expression data to obtain 10 protein-coding genes. Next, we combined these genes with equal weights. Receiver operating characteristic (ROC) analysis, the single-gene removal and increase method, gene ontology functional enrichment, and protein-protein interaction network analysis were then used to verify the results, which showed that T2DMSig-10 had an excellent predictive effect on T2DM (AUC = 0.99) and was highly robust. In short, we obtained a predictive signature of T2DM and further verified it.


2021 ◽  
Vol 3 (3) ◽  
Author(s):  
Marlena Osipowicz ◽  
Bartek Wilczynski ◽  
Magdalena A Machnicka ◽  

Abstract Despite a great increase in the amount of data from genome-wide association studies (GWAS) and whole-genome sequencing (WGS), the genetic background of the partially heritable Alzheimer’s disease (AD) is not yet fully understood. Machine learning methods are expected to help researchers analyze the large number of SNPs possibly associated with disease onset. To date, a number of such approaches have been applied to genotype-based classification of AD patients and healthy controls using GWAS data, with reported accuracies of 0.65–0.975. However, since the estimated influence of genotype on sporadic AD occurrence is lower than that, these very high classification accuracies may be a result of overfitting. We have explored the possibilities of applying feature selection and classification using random forests to WGS and GWAS data from two datasets. Our results suggest that this approach is prone to overfitting if feature selection is performed before the data are divided into training and testing sets. Therefore, we recommend that features used to build the model not be selected based on data included in the testing set. We suggest that for currently available dataset sizes the expected classifier performance is between 0.55 and 0.7 (AUC) and that higher accuracies reported in the literature are likely a result of overfitting.
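
The ordering recommended in this abstract (split first, then select features using the training portion only) can be sketched as below. This is an illustrative stand-in, not the authors' method: it uses a simple case/control mean-difference filter rather than random forests, and the function names and toy genotype matrix are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def select_top_k(X: Sequence[Sequence[float]], y: Sequence[int], k: int) -> List[int]:
    """Rank features by absolute case/control mean difference; return top-k indices."""
    cases = [row for row, label in zip(X, y) if label == 1]
    controls = [row for row, label in zip(X, y) if label == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present to score features")
    scores = []
    for j in range(len(X[0])):
        case_mean = sum(row[j] for row in cases) / len(cases)
        control_mean = sum(row[j] for row in controls) / len(controls)
        scores.append(abs(case_mean - control_mean))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def split_then_select(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    test_fraction: float,
    k: int,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Hold out a test set first, then choose features from training rows only."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # The test rows are never passed to the feature scorer, so they cannot
    # influence which features the downstream model is allowed to use.
    features = select_top_k([X[i] for i in train_idx], [y[i] for i in train_idx], k)
    return features, train_idx, test_idx


# Hypothetical toy genotypes: only the middle variant differs between classes.
toy_X = [[0, 2, 1]] * 4 + [[0, 0, 1]] * 4
toy_y = [1, 1, 1, 1, 0, 0, 0, 0]

features, train_idx, test_idx = split_then_select(toy_X, toy_y, test_fraction=0.25, k=1)
print(features)  # [1]
```

Calling `select_top_k` on the full matrix before splitting would be the leaky ordering the abstract warns against, because the held-out labels would then help choose the features.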


2021 ◽  
Author(s):  
Rosella Mechelli ◽  
Renato Umeton ◽  
Virginia Rinaldi ◽  
Gianmarco Bellucci ◽  
Rachele Bigi ◽  
...  

We exploited genetic information to assess non-genetic influences in autoimmunity. We isolated gene modules whose products physically interact with environmental exposures related to autoimmunity, and analyzed their nominal statistical evidence of association with autoimmune and non-autoimmune diseases in genome-wide association study (GWAS) data. The interactomes of Epstein-Barr virus (EBV) and other herpesviruses emerged as specifically associated with multiple sclerosis (MS), possibly under common regulatory mechanisms. Analyses of MS blood and brain transcriptomes, cytofluorimetric studies of endogenous EBV-infected lymphoblastoid lines, and lesion immunohistochemistry confirmed a dysregulation of MS-associated EBV interactors, suggesting their contribution to CD40 signaling alterations in MS. These interactors were enriched in modules of genes causing inherited axonopathies, supporting a link between EBV and neurodegeneration in MS, in accord with the observed transcriptomic dysregulations in MS brains. They were also enriched with top-ranked pharmaceutical targets prioritized on a genetic basis. This study delineates a disease-specific influence of herpesviruses on MS biology.

