Imputation and quality control steps for combining multiple genome-wide datasets

2014 ◽  
Vol 5 ◽  
Author(s):  
Shefali S. Verma ◽  
Mariza de Andrade ◽  
Gerard Tromp ◽  
Helena Kuivaniemi ◽  
Elizabeth Pugh ◽  
...  
Author(s):  
Priyanka Barman ◽  
Rwik Sen ◽  
Amala Kaja ◽  
Jannatul Ferdoush ◽  
Shalini Guha ◽  
...  

San1 ubiquitin ligase is involved in nuclear protein quality control via its interaction with intrinsically disordered proteins for ubiquitylation and proteasomal degradation. Since several transcription/chromatin regulatory factors contain intrinsically disordered domains and can be inhibitory to transcription when in excess, San1 might be involved in transcription regulation. To address this, we analyzed the role of San1 in genome-wide association of TBP [that nucleates pre-initiation complex (PIC) formation for transcription initiation] and RNA polymerase II (Pol II). Our results reveal the roles of San1 in regulating TBP recruitment to the promoters and Pol II association with the coding sequences, and hence PIC formation and coordination of elongating Pol II, respectively. Consistently, transcription is altered in the absence of San1. Such transcriptional alteration is associated with impaired ubiquitylation and proteasomal degradation of Spt16 and gene association of Paf1, but not the incorporation of centromeric histone, Cse4, into the active genes in Δsan1. Collectively, our results demonstrate distinct functions of a nuclear protein quality control factor in regulating the genome-wide PIC formation and elongating Pol II (and hence transcription), thus unraveling new gene regulatory mechanisms.


2020 ◽  
Author(s):  
Celine Charon ◽  
Rodrigue Allodji ◽  
Vincent Meyer ◽  
Jean-François Deleuze

Quality control methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1,031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1,089 NCBI recorded individuals for additional validation. Without variant pre-filtration based on quality control (QC), we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E-04-1E-03) and rare variants (1E-03-5E-03) (p < 1E-04). Increasing the post-filtration imputation quality score from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) <0.001 2.5-fold with or without QC pre-filtration and halved the number of very rare variants (5E-04). As a result, to maintain confidence and enough SNVs, we propose here a 2-step post-filtration approach to increase the number of very rare and rare variants compared to conservative post-filtration methods.
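
The effect of the post-filtration threshold described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the `Variant` record, the toy values, and the helper `count_passing` are hypothetical, and the frequency bins only mirror the ranges quoted above. It counts how many variants per frequency bin survive a lenient (0.3) versus a strict (0.8) imputation quality cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    vid: str     # hypothetical variant identifier
    maf: float   # minor allele frequency
    info: float  # imputation quality score in [0, 1]


# Frequency ranges quoted in the abstract (lower bound inclusive).
BINS = {
    "very_rare": (5e-4, 1e-3),
    "rare": (1e-3, 5e-3),
}


def bin_of(maf):
    """Return the name of the frequency bin containing maf, or 'other'."""
    for name, (lo, hi) in BINS.items():
        if lo <= maf < hi:
            return name
    return "other"


def count_passing(variants, threshold):
    """Count variants per frequency bin whose quality score meets the threshold."""
    counts = {}
    for v in variants:
        if v.info >= threshold:
            b = bin_of(v.maf)
            counts[b] = counts.get(b, 0) + 1
    return counts


# Toy data, invented for illustration only.
variants = [
    Variant("v1", 6e-4, 0.35),
    Variant("v2", 8e-4, 0.85),
    Variant("v3", 2e-3, 0.50),
    Variant("v4", 4e-3, 0.90),
    Variant("v5", 0.10, 0.95),
]

lenient = count_passing(variants, 0.3)
strict = count_passing(variants, 0.8)
print(lenient)
print(strict)
```

In this toy set the stricter cutoff removes half of the very rare and rare variants while leaving the common variant untouched, which is the kind of loss the abstract reports on real data.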


2012 ◽  
Vol 28 (24) ◽  
pp. 3329-3331 ◽  
Author(s):  
S. M. Gogarten ◽  
T. Bhangale ◽  
M. P. Conomos ◽  
C. A. Laurie ◽  
C. P. McHugh ◽  
...  

PLoS ONE ◽  
2012 ◽  
Vol 7 (1) ◽  
pp. e30088 ◽  
Author(s):  
Jun Cheng ◽  
Muhammad Awais Khan ◽  
Wen-Ming Qiu ◽  
Jing Li ◽  
Hui Zhou ◽  
...  

2020 ◽  
Vol 79 (4) ◽  
pp. 588-602.e6 ◽  
Author(s):  
Sezen Meydan ◽  
Nicholas R. Guydosh

2014 ◽  
Vol 32 (3_suppl) ◽  
pp. 42-42
Author(s):  
Eric Morgen ◽  
Xiaowei Shen ◽  
Thomas L. Vaughan ◽  
David Whiteman ◽  
Anna H. Wu ◽  
...  

Background: Methods of stratifying esophageal adenocarcinoma patients into prognostic groups are needed, as are new insights into genetic determinants of disease behaviour. Prognosis is likely to have non-negligible genetic influences, as mediated by host responses to tumor, resistance to therapeutic side-effects, and/or an influence on tumor development. Prior studies have used candidate-gene approaches. We took an alternative, unbiased genome-wide approach, with novel analytic methods that may be better able to detect multi-gene interactions, which may contribute the majority of genetic effects for many clinical phenotypes. Methods: Germline DNA from a Toronto-based cohort of EAC patients (n=270) was analyzed by Omni1 Quad microarray as part of the BEAGESS initiative. Quality control and analysis were performed using the PLINK, R, and GenABEL software packages. A Cox proportional hazards (CPH) model for progression-free survival tested each polymorphism for independent effects at a genome-wide significance level of P < 1E-07, adjusting for population stratification. While classical analysis has limited ability to detect gene-gene interactions, a Random Survival Forest algorithm was used to detect effects based on the complex interactions among the top 1,000 polymorphisms by p-value ranking. Results: After data cleaning and standard GWAS quality control procedures, there were 735,309 SNPs and 245 patients remaining for analysis. The CPH model, adjusted for population stratification, produced a satisfactory Q-Q plot, and showed one SNP (rs7844673, Chr 8) that was significant at p=7.8E-8. In addition, Random Forest based variable selection produced a set of 20 polymorphisms that (1) reproduced 86% of the predictive ability of the full 1,000 variables, and (2) also included the #3 ranked polymorphism by CPH modeling (rs9290822, Chr 3) upstream of the IGF2BP2 gene.
Conclusions: A genome-wide approach has discovered two previously undescribed SNPs with a potential influence on EAC prognosis via a combination of independent and interactive effects. Validation in an independent cohort is currently being pursued.
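
The two selection steps named in this abstract (a genome-wide significance cutoff of P < 1E-07, and ranking polymorphisms by p-value to pick the top candidates for a second-stage model) can be sketched as below. This is an illustration under stated assumptions, not the study's code: the function names and the identifiers `snp_a`, `snp_b`, and `snp_c` with their p-values are hypothetical; only the threshold and the p-value for rs7844673 come from the abstract.

```python
GENOME_WIDE_ALPHA = 1e-7  # significance level stated in the abstract
TOP_N = 1000              # number of top-ranked polymorphisms passed on


def significant(pvalues, alpha=GENOME_WIDE_ALPHA):
    """Return the SNP ids whose p-value is below the genome-wide threshold."""
    return sorted(snp for snp, p in pvalues.items() if p < alpha)


def top_ranked(pvalues, n=TOP_N):
    """Return up to n SNP ids ordered from smallest to largest p-value."""
    ranked = sorted(pvalues.items(), key=lambda kv: kv[1])
    return [snp for snp, _ in ranked[:n]]


# Per-SNP p-values; all entries except rs7844673 are invented placeholders.
pvalues = {
    "rs7844673": 7.8e-8,
    "snp_a": 3.0e-6,
    "snp_b": 4.2e-5,
    "snp_c": 0.2,
}

hits = significant(pvalues)
top_two = top_ranked(pvalues, 2)
print(hits)
print(top_two)
```

Ranking is kept separate from thresholding because the second-stage model in the abstract takes the top-ranked set regardless of whether each member reaches genome-wide significance on its own.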


BMC Genetics ◽  
2003 ◽  
Vol 4 (Suppl 1) ◽  
pp. S102 ◽  
Author(s):  
Ellen L Goode ◽  
Michael D Badzioch ◽  
Helen Kim ◽  
France Gagnon ◽  
Laura S Rozek ◽  
...  
