Fitting linear mixed models to a highly-structured dataset effectively controls for population structure in bacterial genome-wide association studies

2019 ◽  
Vol 1 (1A) ◽  
Author(s):  
Samuel Kidman ◽  
Emem-Fong Ukor ◽  
Andres Floto ◽  
Julian Parkhill
2017 ◽  
Author(s):  
Carl Kadie ◽  
David Heckerman

Abstract
We have developed Ludicrous Speed Linear Mixed Models, a version of FaST-LMM optimized for the cloud. The approach can perform a genome-wide association analysis on a dataset of one million SNPs across one million individuals at a cost of about 868 CPU days, with an elapsed time on the order of two weeks. A Python implementation is available at https://fastlmm.github.io/.

Significance
Identifying SNP-phenotype correlations using GWAS is difficult because effect sizes are so small for common, complex diseases. To address this issue, institutions are creating extremely large cohorts with sample sizes on the order of one million. Unfortunately, such cohorts are likely to contain confounding factors such as population structure and family or cryptic relatedness. The linear mixed model (LMM) can often correct for such confounding factors, but at this scale it is too slow to use, even with the algebraic speedups known as FaST-LMM. We present a cloud implementation of FaST-LMM, called Ludicrous Speed LMM, that can process one million samples and one million test SNPs in a reasonable amount of time and at a reasonable cost.
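
To put the quoted cost in perspective, the short Python sketch below works through the arithmetic implied by the abstract's figures. It is only a back-of-the-envelope illustration: the function name `average_concurrency` is hypothetical, and "on the order of two weeks" is assumed to mean exactly 14 days, which the abstract does not state.

```python
def average_concurrency(cpu_days: float, elapsed_days: float) -> float:
    """Average number of CPUs busy at once, given total CPU time and wall-clock time."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return cpu_days / elapsed_days


CPU_DAYS = 868.0         # total compute cost quoted in the abstract
ELAPSED_DAYS = 14.0      # assumption: "on the order of two weeks" read as 14 days
TEST_SNPS = 1_000_000    # one million test SNPs, from the abstract
SECONDS_PER_DAY = 86_400

# Average parallelism needed to finish 868 CPU days of work in 14 days.
avg_cpus = average_concurrency(CPU_DAYS, ELAPSED_DAYS)

# Average CPU time spent per tested SNP (each test spans all one million individuals).
cpu_seconds_per_snp = CPU_DAYS * SECONDS_PER_DAY / TEST_SNPS

print(f"average CPUs in use: {avg_cpus:.0f}")
print(f"CPU seconds per SNP: {cpu_seconds_per_snp:.1f}")
```

Under these assumptions the run keeps roughly 62 CPUs busy on average and spends about 75 CPU seconds per SNP. The real job will not be perfectly parallel, so the actual cluster size may well differ.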


2012 ◽  
Vol 9 (6) ◽  
pp. 525-526 ◽  
Author(s):  
Jennifer Listgarten ◽  
Christoph Lippert ◽  
Carl M Kadie ◽  
Robert I Davidson ◽  
Eleazar Eskin ◽  
...  

2011 ◽  
Vol 8 (10) ◽  
pp. 833-835 ◽  
Author(s):  
Christoph Lippert ◽  
Jennifer Listgarten ◽  
Ying Liu ◽  
Carl M Kadie ◽  
Robert I Davidson ◽  
...  

2014 ◽  
Vol 4 (1) ◽  
Author(s):  
Christian Widmer ◽  
Christoph Lippert ◽  
Omer Weissbrod ◽  
Nicolo Fusi ◽  
Carl Kadie ◽  
...  
