P1-350: ALZHEIMER'S DISEASE SEQUENCING PROJECT DATA PORTAL

2014 ◽  
Vol 10 ◽  
pp. P441-P441
Author(s):  
Daniel Micah Childress ◽  
Otto Valladares ◽  
Amanda Partch ◽  
Georgy Godynskiy ◽  
Kurt Rodarmer ◽  
...  
2021 ◽  
Author(s):  
Michael E Belloy ◽  
Yann E Le Guen ◽  
Sarah J. Eger ◽  
Valerio Napolioni ◽  
Michael D. Greicius ◽  
...  

Whole-exome sequencing (WES) and whole-genome sequencing (WGS) are expected to be critical to further elucidate the missing genetic heritability of Alzheimer's disease (AD) risk by identifying rare coding and/or noncoding variants that contribute to AD pathogenesis. In the United States, the Alzheimer's Disease Sequencing Project (ADSP) has taken a leading role in sequencing AD-related samples at scale, with the resultant data being made publicly available to researchers to generate new insights into the genetic etiology of AD. To achieve sufficient power, the ADSP has adopted a study design where subsets of larger AD cohorts are collected and sequenced across multiple centers, using a variety of sequencing kits. This approach may lead to variable variant quality across sequencing centers and/or kits. Here, we performed exome-wide and genome-wide association analyses on AD risk using the latest ADSP WES and WGS data releases. We observed that many variants displayed large variation in allele frequencies across sequencing centers/kits and contributed to spurious association signals with AD risk. We also observed that sequencing kit/center adjustment in association models could not fully account for these spurious signals. To address this issue, we designed and implemented novel filters that aim to capture and remove these center/kit-specific artifactual variants. We conclude by deriving a novel, fast, and robust approach to filter variants that represent sequencing center- or kit-related artifacts underlying spurious associations with AD risk in ADSP WES and WGS data. This approach will be important to support future robust genetic association studies on ADSP data, as well as other studies with similar designs.
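The abstract does not spell out the filters themselves, so the following is only an illustrative sketch of the general idea of flagging variants whose allele frequency differs across sequencing centers or kits. It uses a standard chi-square test of homogeneity on per-center allele counts; the function names, the input layout, and any cutoff applied to the resulting p-value are assumptions made for this example, not the authors' method.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        terms = sum(y ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-y) * terms
    # Odd df: complementary error function plus a finite correction sum.
    m = (df - 1) // 2
    tail = sum(y ** (j + 0.5) / math.gamma(j + 1.5) for j in range(m))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


def center_heterogeneity(alt_counts, total_alleles):
    """Test whether one variant's alternate-allele frequency is equal across centers.

    alt_counts[i]    -- alternate alleles observed at center/kit i (hypothetical input)
    total_alleles[i] -- total alleles genotyped at center/kit i
    Returns (chi-square statistic, p-value).
    """
    pooled = sum(alt_counts) / sum(total_alleles)
    if pooled in (0.0, 1.0) or len(alt_counts) < 2:
        return 0.0, 1.0  # monomorphic or single center: nothing to compare
    stat = sum(
        (alt - n * pooled) ** 2 / (n * pooled * (1.0 - pooled))
        for alt, n in zip(alt_counts, total_alleles)
    )
    return stat, chi2_sf(stat, len(alt_counts) - 1)


# Same frequency at both centers versus a tenfold difference.
print(center_heterogeneity([10, 10], [100, 100]))
print(center_heterogeneity([5, 50], [200, 200]))
```

Because case and control proportions can differ between centers, a test like this would plausibly be run within controls only, or within groups of similar ancestry, so that real frequency differences are not mistaken for artifacts.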


2020 ◽  
Author(s):  
Emily Greenfest-Allen ◽  
Conor Klamann ◽  
Prabhakaran Gangadharan ◽  
Amanda Kuzma ◽  
Yuk Yee Leung ◽  
...  

INTRODUCTION: The NIAGADS Alzheimer’s Genomics Database is an interactive knowledgebase for AD genetics that provides access to GWAS summary statistics datasets deposited at NIAGADS, a national genetics data repository for AD and related dementia (ADRD). METHODS: The website makes available >70 genome-wide summary statistics datasets from GWAS and genome sequencing analysis for AD/ADRD. Variants identified from these datasets are mapped to up-to-date variant and gene annotations from a variety of resources and linked to functional genomics data. The database is powered by a relational database optimized for big data and by ontologies that consistently annotate study designs and phenotypes, facilitating data harmonization, efficient real-time data analysis, and variant or gene report generation. RESULTS: Detailed variant reports provide tabular and interactive graphical summaries of known ADRD associations and highlight variants flagged by the Alzheimer’s Disease Sequencing Project (ADSP). Gene reports provide summaries of co-located ADRD risk-associated variants and have been expanded to include meta-analysis results from aggregate association tests performed by the ADSP, allowing us to flag genes with genetic evidence for AD. DISCUSSION: The GenomicsDB makes available >100 million variant annotations, including ~30 million (5 million novel) variants identified as AD-relevant by ADSP, for browsing and real-time mining via the website or programmatically through a REST API. With a newly redesigned, efficient search interface and comprehensive record pages linking summary statistics to variant and gene annotations, this resource makes these data both accessible and interpretable, establishing itself as a valuable tool for AD research.


2017 ◽  
Vol 13 (7) ◽  
pp. P570-P571
Author(s):  
Joshua C. Bis ◽  
Xueqiu Jian ◽  
Brian W. Kunkle ◽  
Kara L. Hamilton ◽  
William S. Bush ◽  
...  

2021 ◽  
Vol 11 ◽  
Author(s):  
Valerio Napolioni ◽  
Marzia A. Scelsi ◽  
Raiyan R. Khan ◽  
Andre Altmann ◽  
Michael D. Greicius

Prior work in late-onset Alzheimer’s disease (LOAD) has resulted in discrepant findings as to whether recent consanguinity and outbred autozygosity are associated with LOAD risk. In the current study, we tested the association between consanguinity and outbred autozygosity with LOAD in the largest such analysis to date, in which 20 LOAD GWAS datasets were retrieved through public databases. Our analyses were restricted to eight distinct ethnic groups: African–Caribbean, Ashkenazi–Jewish European, European–Caribbean, French–Canadian, Finnish European, North-Western European, South-Eastern European, and Yoruba African for a total of 21,492 unrelated subjects (11,196 LOAD and 10,296 controls). Recent consanguinity determination was performed using FSuite v1.0.3, according to subjects’ ancestral background. The level of autozygosity in the outbred population was assessed by calculating inbreeding estimates based on the proportion (FROH) and the number (NROH) of runs of homozygosity (ROHs). We analyzed all eight ethnic groups using a fixed-effect meta-analysis, which showed a significant association of recent consanguinity with LOAD (N = 21,481; OR = 1.262, P = 3.6 × 10⁻⁴), independently of APOE∗4 (N = 21,468, OR = 1.237, P = 0.002), and years of education (N = 9,257; OR = 1.274, P = 0.020). Autozygosity in the outbred population was also associated with an increased risk of LOAD, both for FROH (N = 20,237; OR = 1.204, P = 0.030) and NROH metrics (N = 20,237; OR = 1.019, P = 0.006), independently of APOE∗4 [(FROH, N = 20,225; OR = 1.222, P = 0.029) (NROH, N = 20,225; OR = 1.019, P = 0.007)]. By leveraging the Alzheimer’s Disease Sequencing Project (ADSP) whole-exome sequencing (WES) data, we determined that LOAD subjects do not show an enrichment of rare, risk-enhancing minor homozygote variants compared to the control population.
A two-stage recessive GWAS using ADSP data from 201 consanguineous subjects in the discovery phase followed by validation in 10,469 subjects led to the identification of RPH3AL p.A303V (rs117190076) as a rare minor homozygote variant increasing the risk of LOAD [discovery: Genotype Relative Risk (GRR) = 46, P = 2.16 × 10⁻⁶; validation: GRR = 1.9, P = 8.0 × 10⁻⁴]. These results confirm that recent consanguinity and autozygosity in the outbred population increase risk for LOAD. Subsequent work, with increased sample sizes of consanguineous subjects, should accelerate the discovery of non-additive genetic effects in LOAD.
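The two autozygosity metrics named in this abstract can be illustrated with a minimal sketch: FROH as the fraction of the analyzed genome covered by runs of homozygosity, and NROH as the count of such runs. The segment representation, the genome-length parameter, and the 1 Mb minimum run length are assumptions chosen for the example; the abstract does not state the thresholds or the ROH-calling tool used.

```python
def roh_metrics(segments, genome_length_bp, min_length_bp=1_000_000):
    """Compute (FROH, NROH) for one subject.

    segments         -- list of (start_bp, end_bp) ROH calls (hypothetical input)
    genome_length_bp -- length of the genome covered by the genotyping data
    min_length_bp    -- shorter runs are ignored (assumed cutoff)
    """
    kept = [(start, end) for start, end in segments if end - start >= min_length_bp]
    total_roh_bp = sum(end - start for start, end in kept)
    froh = total_roh_bp / genome_length_bp  # proportion of genome in ROHs
    nroh = len(kept)                        # number of ROHs
    return froh, nroh


# Toy example: three called runs on a 100 Mb "genome"; the 0.5 Mb run is dropped.
calls = [(0, 2_000_000), (5_000_000, 5_500_000), (10_000_000, 13_000_000)]
froh, nroh = roh_metrics(calls, genome_length_bp=100_000_000)
print(f"FROH = {froh:.3f}, NROH = {nroh}")  # FROH = 0.050, NROH = 2
```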


2019 ◽  
Author(s):  
Mark T. W. Ebbert ◽  
Tanner D. Jensen ◽  
Karen Jansen-West ◽  
Jonathon P. Sens ◽  
Joseph S. Reddy ◽  
...  

Background: The human genome contains ‘dark’ gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions that are ‘dark by depth’ (few mappable reads) and others that are ‘camouflaged’ (ambiguous alignment), and we assess how well long-read technologies resolve these regions. We further present an algorithm to resolve most camouflaged regions (including in short-read data) and apply it to the Alzheimer’s Disease Sequencing Project (ADSP; 13142 samples), as a proof of principle. Results: Based on standard whole-genome Illumina sequencing data, we identified 37873 dark regions in 5857 gene bodies (3635 protein-coding) from pathways important to human health, development, and reproduction. Of the 5857 gene bodies, 494 (8.4%) were 100% dark (142 protein-coding) and 2046 (34.9%) were ≥5% dark (628 protein-coding). Exactly 2757 dark regions were in protein-coding exons (CDS) across 744 genes. Long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduced dark CDS regions to approximately 45.1%, 33.3%, and 18.2% respectively. Applying our algorithm to the ADSP, we rescued 4622 exonic variants from 501 camouflaged genes, including a rare, ten-nucleotide frameshift deletion in CR1, a top Alzheimer’s disease gene, found in only five ADSP cases and zero controls. Conclusions: While we could not formally assess the CR1 frameshift mutation in Alzheimer’s disease (insufficient sample size), we believe it merits investigation in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.
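As a rough illustration of the ‘dark by depth’ notion (few mappable reads), the sketch below scans a per-base depth array and reports contiguous low-depth intervals, from which a percent-dark figure for a gene body could be derived. The depth cutoff of 5, the half-open interval convention, and the function name are assumptions for this example; the abstract does not give the exact criteria.

```python
def dark_by_depth_regions(depths, max_depth=5, min_run=1):
    """Return half-open (start, end) intervals where depth <= max_depth.

    depths    -- per-base read depth along a gene body (hypothetical input)
    max_depth -- positions at or below this depth count as dark (assumed cutoff)
    min_run   -- minimum interval length, in bases, to report
    """
    regions = []
    start = None
    for i, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = i  # open a new dark interval
        elif start is not None:
            if i - start >= min_run:
                regions.append((start, i))
            start = None  # close the interval
    # Close an interval that runs to the end of the array.
    if start is not None and len(depths) - start >= min_run:
        regions.append((start, len(depths)))
    return regions


depths = [30, 2, 0, 1, 40, 35, 3, 3]
regions = dark_by_depth_regions(depths)
dark_bases = sum(end - start for start, end in regions)
print(regions)                            # [(1, 4), (6, 8)]
print(100.0 * dark_bases / len(depths))   # 62.5 (percent of bases that are dark)
```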


2021 ◽  
Author(s):  
Bowen Jin ◽  
John A Capra ◽  
Penelope Benchek ◽  
Nicholas R Wheeler ◽  
Adam C Naj ◽  
...  

Over 90% of variants are rare, and 50% of them are singletons in the Alzheimer's Disease Sequencing Project Whole Exome Sequencing (ADSP WES) data. However, both single-variant tests and unit-based tests have limited statistical power to detect associations between rare variants and phenotypes. To best utilize rare variants and investigate their biological effect, we examine their association with phenotypes in the context of protein structure. We developed a protein structure-based approach, POKEMON (Protein Optimized Kernel Evaluation of Missense Nucleotides), which evaluates rare missense variants based on their spatial distribution on the protein rather than allele frequency. The hypothesis behind this is that the three-dimensional spatial distribution of variants within a protein structure provides functional context and improves the power of association tests. POKEMON identified four candidate genes from the ADSP WES data, namely two known Alzheimer's disease (AD) genes (TREM2 and SORL) and two novel genes (DUSP18 and CSF1R). For known AD genes, the signal from the spatial cluster is stable even if we exclude known AD risk variants, indicating the presence of additional low-frequency risk variants within these genes. DUSP18 has a cluster of variants primarily shared by case subjects around the ligand-binding domain, and this cluster is further validated in a replication dataset with a larger sample size. POKEMON is an open-source tool available at https://github.com/bushlab-genomics/POKEMON.


2006 ◽  
Vol 14 (7S_Part_6) ◽  
pp. P333-P334
Author(s):  
Amanda B. Kuzma ◽  
Kelley Faber ◽  
William J. Salerno ◽  
Yuk Yee Leung ◽  
Laura B. Cantwell ◽  
...  
