Novel Approach for Parallelizing Pairwise Comparison Problems as Applied to Detecting Segments Identical By Decent in Whole-Genome Data

Author(s):  
Emmanuel Sapin ◽  
Matthew C Keller

Abstract. Motivation: Pairwise comparison problems arise in many areas of science. In genomics, datasets are already large and getting larger, so operations that require pairwise comparisons (either on pairs of SNPs or pairs of individuals) are extremely computationally challenging. We propose a generic algorithm for addressing pairwise comparison problems that breaks a large problem (of order n² comparisons) into multiple smaller ones (each of order n comparisons), allowing for massive parallelization. Results: We demonstrated that this approach is very efficient for calling identical-by-descent (IBD) segments between all pairs of individuals in the UK Biobank dataset, with a 250-fold savings in time and a 750-fold savings in memory over the standard approach to detecting such segments across the full dataset. This efficiency should extend to other methods of IBD calling and, more generally, to other pairwise comparison tasks in genomics or other areas of science.
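
The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the authors' partitioning scheme, so everything here is an assumption: the items are split into contiguous blocks of about √n, and each pair of blocks becomes one independent job of roughly n comparisons. The function names (`make_blocks`, `make_jobs`, `pairs_in_job`) are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations_with_replacement


def make_blocks(n, block_size=None):
    """Split indices 0..n-1 into contiguous blocks (assumed size: about sqrt(n))."""
    if block_size is None:
        block_size = max(1, math.isqrt(n))
    return [range(start, min(start + block_size, n))
            for start in range(0, n, block_size)]


def make_jobs(blocks):
    """One job per unordered pair of blocks; a block is also paired with itself."""
    return list(combinations_with_replacement(range(len(blocks)), 2))


def pairs_in_job(blocks, job):
    """Yield every unordered pair (i, j) with i < j that this job covers."""
    a, b = job
    if a == b:
        # Within-block job: compare each item with the later items of the block.
        for i in blocks[a]:
            for j in blocks[a]:
                if i < j:
                    yield (i, j)
    else:
        # Cross-block job: blocks are ordered, so i < j holds automatically.
        for i in blocks[a]:
            for j in blocks[b]:
                yield (i, j)


if __name__ == "__main__":
    n = 100
    blocks = make_blocks(n)
    jobs = make_jobs(blocks)
    total = sum(1 for job in jobs for _ in pairs_in_job(blocks, job))
    print(len(jobs), total)  # 55 jobs covering 4950 = 100 * 99 / 2 pairs
```

Under this assumed scheme no job performs more than about n comparisons, and the jobs share no state, so each could be sent to a separate worker or cluster node.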

2020 ◽  
Author(s):  
Emmanuel Sapin ◽  
Matthew C. Keller

Abstract. Motivation: Pairwise comparison problems arise in many areas of science. In genomics, datasets are already large and getting larger, so operations that require pairwise comparisons (either on pairs of SNPs or pairs of individuals) are extremely computationally challenging. We propose a generic algorithm for addressing pairwise comparison problems that breaks a large problem (of order n² comparisons) into multiple smaller ones (each of order n comparisons), allowing for massive parallelization. Results: We demonstrated that this procedure is very efficient for calling identical-by-descent (IBD) segments between all pairs of individuals in the UK Biobank dataset, with a user-time savings of roughly 180-fold over the traditional (non-parallel) approach to detecting such segments. This efficiency should extend to other methods of IBD calling and, more generally, to other pairwise comparison tasks in genomics or other areas of science.


2015 ◽  
Author(s):  
Po-Ru Loh ◽  
Pier Francesco Palamara ◽  
Alkes L Price

Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here, we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N=150K samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈0.3%, corresponding to perfect phase in most 10Mb segments). We also observed that when used within an imputation pipeline, Eagle pre-phasing improved downstream imputation accuracy compared to pre-phasing in batches using existing methods (as necessary to achieve comparable computational cost).


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ruhollah Shemirani ◽  
Gillian M. Belbin ◽  
Christy L. Avery ◽  
Eimear E. Kenny ◽  
Christopher R. Gignoux ◽  
...  

Abstract. The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly, leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of magnitude on genomic datasets, making IBD estimation tractable for millions of individuals. We apply iLASH to the PAGE dataset of ~52,000 multi-ethnic participants, including several founder populations with elevated IBD sharing, identifying IBD segments in ~3 minutes per chromosome compared to over 6 days for a state-of-the-art algorithm. iLASH enables efficient analysis of very large-scale datasets, as we demonstrate by computing IBD across the UK Biobank (~500,000 individuals), detecting 12.9 billion pairwise connections.


2019 ◽  
Author(s):  
Elizabeth Curtis ◽  
Justin Liu ◽  
Kate Ward ◽  
Karen Jameson ◽  
Zahra Raisi-Estabragh ◽  
...  

2020 ◽  
Author(s):  
John E. McGeary ◽  
Chelsie Benca-Bachman ◽  
Victoria Risner ◽  
Christopher G Beevers ◽  
Brandon Gibb ◽  
...  

Twin studies indicate that 30-40% of the disease liability for depression can be attributed to genetic differences. Here, we assess the explanatory ability of polygenic scores (PGS) based on broad- (PGSBD) and clinical- (PGSMDD) depression summary statistics from the UK Biobank using independent cohorts of adults (N=210; 100% European Ancestry) and children (N=728; 70% European Ancestry) who have been extensively phenotyped for depression and related neurocognitive phenotypes. PGS associations with depression severity and diagnosis were generally modest, and larger in adults than children. Polygenic prediction of depression-related phenotypes was mixed and varied by PGS. Higher PGSBD, in adults, was associated with a higher likelihood of having suicidal ideation, increased brooding and anhedonia, and lower levels of cognitive reappraisal; PGSMDD was positively associated with brooding and negatively related to cognitive reappraisal. Overall, PGS based on both broad and clinical depression phenotypes have modest utility in adult and child samples of depression.


SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A273-A273
Author(s):  
Xi Zheng ◽  
Ma Cherrysse Ulsa ◽  
Peng Li ◽  
Lei Gao ◽  
Kun Hu

Abstract. Introduction: While there is emerging evidence for acute sleep disruption in the aftermath of coronavirus disease 2019 (COVID-19), it is unknown whether sleep traits contribute to mortality risk. In this study, we tested whether earlier-life sleep duration, chronotype, insomnia, napping or sleep apnea were associated with increased 30-day COVID-19 mortality. Methods: We included 34,711 participants from the UK Biobank who presented for COVID-19 testing between March and October 2020 (mean age at diagnosis: 69.4±8.3; range 50.2–84.6). Self-reported sleep duration (less than 6h/6-9h/more than 9h), chronotype (“morning”/“intermediate”/“evening”), daytime dozing (often/rarely), insomnia (often/rarely), napping (often/rarely) and presence of sleep apnea (ICD-10 or self-report) were obtained between 2006 and 2010. Multivariate logistic regression models were used to adjust for age, sex, education, socioeconomic status, and relevant risk factors (BMI, hypertension, diabetes, respiratory diseases, smoking, and alcohol). Results: The mean time between sleep measures and COVID-19 testing was 11.6±0.9 years. Overall, 5,066 (14.6%) were positive. Of those who were positive, 355 (7.0%) died within 30 days (median = 8) after diagnosis. Long sleepers (>9h vs. 6-9h) [20/103 (19.4%) vs. 300/4,573 (6.6%); OR 2.09, 95% CI: 1.19–3.64, p=0.009], frequent daytime dozers (OR 1.68, 95% CI: 1.04–2.72, p=0.03), and nappers (OR 1.52, 95% CI: 1.04–2.23, p=0.03) were at greater odds of mortality. A prior diagnosis of sleep apnea was also associated with two-fold increased odds (OR 2.07, 95% CI: 1.25–3.44, p=0.005). No associations were seen for short sleepers, chronotype or insomnia with COVID-19 mortality. Conclusion: Data across all current waves of infection show that prior sleep traits/disturbances, in particular long sleep duration, daytime dozing, napping and sleep apnea, are associated with increased 30-day mortality after COVID-19, independent of health-related risk factors. While sleep health traits may reflect unmeasured poor health, further work is warranted to examine the exact underlying mechanisms, and to test whether sleep health optimization offers resilience to severe illness from COVID-19. Support (if any): NIH [T32GM007592 and R03AG067985 to L.G.; RF1AG059867 and RF1AG064312 to K.H.], the BrightFocus Foundation (A2020886S to P.L.), and the Foundation of Anesthesia Education and Research (MRTG-02-15-2020 to L.G.).
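
As a reading aid for the Results, the sketch below computes a crude (unadjusted) odds ratio from the long-sleeper counts quoted in the abstract, assuming that 20/103 and 300/4,573 are deaths over group size in the >9h and 6-9h groups respectively. The helper `crude_odds_ratio` is a hypothetical name, and this is not the authors' analysis: the reported OR of 2.09 comes from a logistic regression adjusted for covariates, so it is expected to differ from the crude value.

```python
def crude_odds_ratio(events_exposed, total_exposed, events_ref, total_ref):
    """Unadjusted odds ratio from event counts and group sizes."""
    odds_exposed = events_exposed / (total_exposed - events_exposed)
    odds_ref = events_ref / (total_ref - events_ref)
    return odds_exposed / odds_ref


# Counts as quoted in the abstract (assumed to be deaths / group size).
or_long_sleep = crude_odds_ratio(20, 103, 300, 4573)
print(round(or_long_sleep, 2))  # 3.43
```

The gap between the crude value and the adjusted 2.09 is consistent with part of the raw association being accounted for by the covariates listed in the Methods.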

