LabWAS: novel findings and study design recommendations from a meta-analysis of clinical labs in two independent biobanks
ABSTRACT
Phenotypes extracted from Electronic Health Records (EHRs) are increasingly prevalent in genetic studies. EHRs contain hundreds of distinct clinical laboratory test results, providing a trove of health data beyond diagnoses. Such lab data is complex and lacks a ubiquitous coding scheme, making it more challenging to work with than diagnosis data. Here we describe the first large-scale cross-health system genome-wide association study (GWAS) of EHR-based quantitative lab measurements. We meta-analyzed 70 labs matched between the BioVU cohort from the Vanderbilt University Health System and the Michigan Genomics Initiative (MGI) cohort from Michigan Medicine. We show high replication of known associations for these labs, validating EHR-based measurements as high-quality phenotypes for genetic analysis. Notably, our analysis provides the first replication for 700 previous GWAS associations across 46 different labs. We discovered 31 novel associations at genome-wide significance for 22 distinct labs, including the first reported associations for two labs. We replicated 22 of these novel associations in an independent tranche of BioVU samples. The summary statistics for all association tests are available through an interactive webtool to benefit other researchers. Finally, we performed mirrored analyses in BioVU and MGI to assess competing analytic practices for lab data. We find that using the mean of all available lab measurements provides a robust summary value, but alternate summarizations can improve power in certain labs. This study provides a proof-of-principle for cross-health system GWAS and a framework for future studies of quantitative traits in EHRs.