A Variable-Sized Sliding-Window Approach for Genetic Association Studies via Principal Component Analysis

2009 ◽  
Vol 73 (6) ◽  
pp. 631-637 ◽  
Author(s):  
Rui Tang ◽  
Tao Feng ◽  
Qiuying Sha ◽  
Shuanglin Zhang
Author(s):  
Zhongxue Chen ◽  
Shizhong Han ◽  
Kai Wang

Abstract: Many gene- and pathway-based association tests have been proposed in the literature. Among them, the sequence kernel association test (SKAT) is widely used, especially for rare-variant association studies. In this paper, we investigate the connection between SKAT and principal component analysis. This investigation leads to a procedure that encompasses SKAT as a special case. Through simulation studies and real data applications, we compare the proposed method with some existing tests.
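For readers unfamiliar with the statistic, the following is a minimal sketch of the SKAT statistic in its usual linear weighted kernel form for a quantitative trait without covariates. It is not the procedure proposed in the paper, whose details the abstract does not give. The function names, the toy data and the choice of weights are illustrative assumptions. The statistic is computed twice, as a weighted sum of squared per-variant scores and as the quadratic form r'Kr with K = GWG'. Because K is symmetric, an eigendecomposition of K rewrites the same quantity as an eigenvalue-weighted sum of squared projections of the residuals onto the eigenvectors of K, which is one way a link to principal components can arise.

```python
def residuals(phenotypes):
    """Residuals under the null model with an intercept only (assumption)."""
    mean_y = sum(phenotypes) / len(phenotypes)
    return [y - mean_y for y in phenotypes]


def skat_score_form(genotypes, phenotypes, weights):
    """Q = sum_j w_j * S_j**2, where S_j = sum_i g_ij * r_i."""
    resid = residuals(phenotypes)
    q = 0.0
    for j, w in enumerate(weights):
        score = sum(row[j] * r for row, r in zip(genotypes, resid))
        q += w * score ** 2
    return q


def skat_kernel_form(genotypes, phenotypes, weights):
    """Same statistic written as r' K r with K = G W G'."""
    resid = residuals(phenotypes)
    n = len(resid)
    kernel = [
        [sum(w * genotypes[a][j] * genotypes[b][j] for j, w in enumerate(weights))
         for b in range(n)]
        for a in range(n)
    ]
    return sum(resid[a] * kernel[a][b] * resid[b]
               for a in range(n) for b in range(n))


# Toy data (hypothetical): 4 samples, 2 variants coded as minor-allele counts.
G = [[0, 1], [1, 0], [2, 0], [0, 0]]
y = [1.0, 2.0, 3.0, 2.0]
w = [1.0, 4.0]  # illustrative weights, not the Beta-density weights often used

print(skat_score_form(G, y, w), skat_kernel_form(G, y, w))
```

Both forms return the same value on the toy data. In practice the residuals would come from a null regression model with covariates, and a p-value would be obtained from the null distribution of Q, which this sketch does not attempt.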


2014 ◽  
Vol 94 (5) ◽  
pp. 662-676 ◽  
Author(s):  
Hugues Aschard ◽  
Bjarni J. Vilhjálmsson ◽  
Nicolas Greliche ◽  
Pierre-Emmanuel Morange ◽  
David-Alexandre Trégouët ◽  
...  

2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Hao-Chih Lee ◽  
Osamu Ichikawa ◽  
Benjamin S. Glicksberg ◽  
Aparna A. Divaraniya ◽  
Christine E. Becker ◽  
...  

2019 ◽  
Author(s):  
Daiwei Zhang ◽  
Rounak Dey ◽  
Seunggeun Lee

Abstract: Population stratification (PS) is a major confounder in genome-wide association studies (GWAS) and can lead to false positive associations. To adjust for PS, principal component analysis (PCA)-based ancestry prediction has been widely used. Simple projection (SP) based on principal component loadings and the recently developed data augmentation-decomposition-transformation (ADP) methods, such as LASER and TRACE, are popular for predicting PC scores. However, they are either biased or computationally expensive. The predicted PC scores from SP can be biased toward NULL. ADP, on the other hand, requires running PCA separately for each study sample on the augmented data set, so its computational cost is high. To address these problems, we develop and propose two alternative approaches, bias-adjusted projection (AP) and online ADP (OADP). Using random matrix theory, AP asymptotically estimates and adjusts for the bias of SP. OADP uses computationally efficient online singular value decomposition, which can greatly reduce the computation cost of ADP. We carried out extensive simulation studies to show that these alternative approaches are unbiased and can be 10-100 times faster than ADP. We applied our approaches to UK-Biobank data of 488,366 study samples with 2,492 samples from the 1000 Genomes data as the reference. AP and OADP required 7 and 75 CPU hours, respectively, while the projected computation time of ADP is 2,534 CPU hours. Furthermore, when we used only the European reference samples in the 1000 Genomes to infer sub-European ancestry, SP clearly showed bias, unlike the proposed approaches. By using AP and OADP, we can infer ancestry and adjust for PS robustly and efficiently.
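The simple projection (SP) baseline described in the abstract can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions: only the top principal component is fitted, power iteration stands in for a full PCA, and the function names and toy genotypes are hypothetical. It does not reproduce FRAPOSA, AP or OADP, and it applies no bias correction.

```python
import random


def fit_top_pc(reference, iters=500):
    """Return (column means, unit-length top loading) of the reference panel."""
    n, p = len(reference), len(reference[0])
    means = [sum(row[j] for row in reference) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in reference]
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(iters):
        # One power-iteration step on X'X: v <- X'(Xv), then renormalise.
        scores = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = sum(a * a for a in v) ** 0.5
        v = [a / norm for a in v]
    return means, v


def simple_projection(sample, means, loading):
    """SP score: centre the study sample with reference means, then project."""
    return sum((g - m) * l for g, m, l in zip(sample, means, loading))


# Toy reference panel (hypothetical): 4 samples, 4 variants.
reference = [[0, 0, 1, 2], [0, 1, 1, 2], [2, 2, 0, 0], [2, 1, 0, 1]]
means, loading = fit_top_pc(reference)

reference_scores = [simple_projection(row, means, loading) for row in reference]
study_score = simple_projection([1, 2, 0, 1], means, loading)
print(reference_scores, study_score)
```

The sign of the loading vector is arbitrary, so only relative positions along the component are meaningful. The shrinkage toward zero that the abstract attributes to SP appears when the number of variants is large relative to the reference sample size, which this toy panel is too small to show.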


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1531
Author(s):  
Shanshan Huang ◽  
Yikun Yang ◽  
Xin Jin ◽  
Ya Zhang ◽  
Qian Jiang ◽  
...  

Multi-sensor image fusion combines the complementary information of source images from multiple sensors. Conventional image fusion schemes based on signal processing techniques have been studied extensively, and machine learning-based techniques have recently been introduced into image fusion because of their prominent advantages. In this work, a new multi-sensor image fusion method based on the support vector machine and principal component analysis is proposed. First, the key features of the source images are extracted by combining the sliding window technique and five effective evaluation indicators. Second, a trained support vector machine model is used to extract the focus region and the non-focus region of the source images according to the extracted image features, which yields a fusion decision for each source image. Then, a consistency verification operation is used to absorb isolated singular points in the decisions of the trained classifier. Finally, a novel method based on principal component analysis and the multi-scale sliding window is proposed to handle the disputed areas in the fusion decision pair. Experiments are performed to verify the performance of the new combined method.
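The abstract does not spell out how principal component analysis is applied to the disputed areas. As an illustration only, the sketch below shows the classic PCA weighting scheme for fusing two co-registered patches: the leading eigenvector of their 2x2 covariance matrix, normalised to sum to one, gives the blending weights. The function names and the toy patches are assumptions, and taking absolute values of the eigenvector components is a simplification for negatively correlated patches. A sliding-window variant would call the same function once per window.

```python
import math


def pca_fusion_weights(patch_a, patch_b):
    """Blend weights from the leading eigenvector of the 2x2 covariance.

    Patches are flat lists of pixel values with at least two pixels each.
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    var_a = sum((x - mean_a) ** 2 for x in patch_a) / (n - 1)
    var_b = sum((y - mean_b) ** 2 for y in patch_b) / (n - 1)
    cov_ab = sum((x - mean_a) * (y - mean_b)
                 for x, y in zip(patch_a, patch_b)) / (n - 1)

    # Largest eigenvalue of [[var_a, cov_ab], [cov_ab, var_b]] in closed form.
    lam = (var_a + var_b) / 2 + math.hypot((var_a - var_b) / 2, cov_ab)

    if cov_ab != 0:
        vec = (cov_ab, lam - var_a)  # eigenvector for the largest eigenvalue
    elif var_a > var_b:
        vec = (1.0, 0.0)
    elif var_b > var_a:
        vec = (0.0, 1.0)
    else:
        vec = (1.0, 1.0)  # no preferred direction: weight both equally

    total = abs(vec[0]) + abs(vec[1])
    return abs(vec[0]) / total, abs(vec[1]) / total


def fuse(patch_a, patch_b):
    """Pixel-wise weighted average using the PCA weights."""
    w_a, w_b = pca_fusion_weights(patch_a, patch_b)
    return [w_a * x + w_b * y for x, y in zip(patch_a, patch_b)]


# Toy patches (hypothetical): A has more contrast than B.
sharp = [0.0, 10.0, 0.0, 10.0]
blurred = [4.0, 6.0, 4.0, 6.0]
print(pca_fusion_weights(sharp, blurred))
print(fuse(sharp, blurred))
```

With this scheme the patch that carries more variance receives the larger weight, which is why it is often used as a tie-breaker between a sharper and a blurrier source.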


2020 ◽  
Vol 36 (11) ◽  
pp. 3439-3446 ◽  
Author(s):  
Daiwei Zhang ◽  
Rounak Dey ◽  
Seunggeun Lee

Abstract
Motivation: Population stratification (PS) is a major confounder in genome-wide association studies (GWAS) and can lead to false-positive associations. To adjust for PS, principal component analysis (PCA)-based ancestry prediction has been widely used. Simple projection (SP) based on principal component loadings and the recently developed data augmentation, decomposition and Procrustes (ADP) transformation, such as LASER and TRACE, are popular methods for predicting PC scores. However, the predicted PC scores from SP can be biased toward NULL. On the other hand, ADP has a high computation cost because it requires running PCA separately for each study sample on the augmented dataset.
Results: We develop and propose two alternative approaches: bias-adjusted projection (AP) and online ADP (OADP). Using random matrix theory, AP asymptotically estimates and adjusts for the bias of SP. OADP uses a computationally efficient online singular value decomposition algorithm, which can greatly reduce the computation cost of ADP. We carried out extensive simulation studies to show that these alternative approaches are unbiased and the computation speed can be 16–16 000 times faster than ADP. We applied our approaches to the UK Biobank data of 488 366 study samples with 2492 samples from the 1000 Genomes data as the reference. AP and OADP required 0.82 and 21 CPU hours, respectively, while the projected computation time of ADP was 1628 CPU hours. Furthermore, when inferring sub-European ancestry, SP clearly showed bias, unlike the proposed approaches.
Availability and implementation: The OADP and AP methods, as well as SP and ADP, have been implemented in the open-source Python software FRAPOSA, available at github.com/daviddaiweizhang/fraposa.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.

