A simple and fast two-locus quality control test to detect false positives due to batch effects in genome-wide association studies

2010 ◽  
Vol 34 (8) ◽  
pp. 854-862 ◽  
Author(s):  
Sang Hong Lee ◽  
Dale R. Nyholt ◽  
Stuart Macgregor ◽  
Anjali K. Henders ◽  
Krina T. Zondervan ◽  
...  
2012 ◽  
Vol 28 (24) ◽  
pp. 3329-3331 ◽  
Author(s):  
S. M. Gogarten ◽  
T. Bhangale ◽  
M. P. Conomos ◽  
C. A. Laurie ◽  
C. P. McHugh ◽  
...  

2019 ◽  
Author(s):  
Kosuke Hamazaki ◽  
Hiroyoshi Iwata

Abstract

Background: Difficulty in detecting rare variants is one of the problems in conventional genome-wide association studies (GWAS). The problem is closely related to complex gene compositions comprising multiple alleles, such as haplotypes. Several single nucleotide polymorphism (SNP) set approaches have been proposed to solve this problem. These methods, however, have rarely been discussed in connection with haplotypes. In this study, we developed a novel SNP-set GWAS method named "RAINBOW" and applied it to haplotype-based GWAS by regarding a haplotype block as a SNP-set. By combining haplotype block estimation and SNP-set GWAS, haplotype-based GWAS can be conducted without prior information on haplotypes.

Results: We prepared 100 datasets of simulated phenotypic data and real marker genotype data of Oryza sativa subsp. indica, and performed GWAS on the datasets. We compared the power of our method, the conventional single-SNP GWAS, the conventional haplotype-based GWAS, and the conventional SNP-set GWAS. The comparison indicated that the proposed method controlled false positives better than the others. The proposed method was also excellent at detecting causal variants without relying on linkage disequilibrium when the causal variants were genotyped in the dataset. Moreover, the proposed method showed greater power than the other methods, i.e., it detected causal variants that the others missed, especially when the causal variants were located very close to each other and the directions of their effects were opposite.

Conclusion: The proposed method, RAINBOW, is especially superior in controlling false positives, detecting causal variants, and detecting nearby causal variants with opposite effects. By using the SNP-set approach of the proposed method, we expect to be able to detect not only rare variants but also genes with complex mechanisms, such as genes with multiple causal variants.

RAINBOW is implemented as an R package and is available at https://github.com/KosukeHamazaki/RAINBOW.
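The abstract's point about nearby causal variants with opposite effects can be illustrated with a toy calculation. The sketch below is not RAINBOW's test statistic, which the abstract does not specify; it is an assumed ordinary least-squares comparison on hand-made genotypes. The data and the function names `single_snp_r2` and `snp_set_r2` are hypothetical. Two SNPs in strong linkage disequilibrium carry effects of +1 and -1, so each SNP alone explains little phenotypic variance while the two-SNP set explains all of it.

```python
def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def single_snp_r2(genotype, phenotype):
    """R^2 of a simple regression of phenotype on one SNP."""
    g, y = centered(genotype), centered(phenotype)
    return dot(g, y) ** 2 / (dot(g, g) * dot(y, y))


def snp_set_r2(genotype1, genotype2, phenotype):
    """R^2 of a joint regression of phenotype on a two-SNP set."""
    a, b, y = centered(genotype1), centered(genotype2), centered(phenotype)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, y), dot(b, y)
    # Solve the 2x2 normal equations in closed form.
    det = s11 * s22 - s12 * s12
    beta1 = (s22 * s1y - s12 * s2y) / det
    beta2 = (s11 * s2y - s12 * s1y) / det
    return (beta1 * s1y + beta2 * s2y) / dot(y, y)


# Hypothetical genotypes (0/1/2 allele counts) for eight individuals.
# The two SNPs agree for six individuals, so they are in strong LD.
snp1 = [0, 0, 1, 1, 2, 2, 1, 0]
snp2 = [0, 0, 1, 1, 2, 2, 0, 1]

# Noise-free phenotype with opposite effects: +1 for snp1, -1 for snp2.
phenotype = [a - b for a, b in zip(snp1, snp2)]

print("snp1 alone:", round(single_snp_r2(snp1, phenotype), 3))
print("snp2 alone:", round(single_snp_r2(snp2, phenotype), 3))
print("both SNPs :", round(snp_set_r2(snp1, snp2, phenotype), 3))
```

In this toy data each single-SNP fit explains roughly a tenth of the variance, while the joint fit recovers the full signal. That is the cancellation a set-based test is meant to avoid.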


2017 ◽  
Author(s):  
Clare Bycroft ◽  
Colin Freeman ◽  
Desislava Petkova ◽  
Gavin Band ◽  
Lloyd T. Elliott ◽  
...  

Abstract

The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offers novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data, such as population structure and relatedness, that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.


2010 ◽  
Vol 34 (6) ◽  
pp. 591-602 ◽  
Author(s):  
Cathy C. Laurie ◽  
Kimberly F. Doheny ◽  
Daniel B. Mirel ◽  
Elizabeth W. Pugh ◽  
Laura J. Bierut ◽  
...  
