Combining an Evolution-guided Clustering Algorithm and Haplotype-based LRT in Family Association Studies

BMC Genetics ◽  
2011 ◽  
Vol 12 (1) ◽  
pp. 48 ◽  
Author(s):  
Mei-Hsien Lee ◽  
Jung-Ying Tzeng ◽  
Su-Yun Huang ◽  
Chuhsing Hsiao
2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Tae-Joon Park ◽  
Lyong Heo ◽  
Sanghoon Moon ◽  
Young Jin Kim ◽  
Ji Hee Oh ◽  
...  

Exome-based genotyping arrays are cost-effective and have recently been used as alternative platforms to whole-exome sequencing. However, the automated clustering algorithm used with exome arrays calls rare and low-frequency variants inaccurately. To address this shortcoming, we present a practical approach for accurate genotype calling using the Illumina Infinium HumanExome BeadChip, together with comparison results and a statistical summary of our genotype data sets. Our data set comprises 14,647 Korean samples. To overcome the limitation of automated clustering, we performed manual genotype clustering for the targeted identification of 46,076 variants that were identified using GenomeStudio software. To evaluate the effects of applying custom cluster files, we tested the cluster files using 804 independent Korean samples and the same platform. Our study suggests, for the first time, practical guidelines for exome chip quality control in Asian populations and provides valuable insight into association studies using exome chips.


2012 ◽  
Vol 51 (2) ◽  
pp. 185-190 ◽  
Author(s):  
Alex J. Cannon

Abstract: Regression-guided clustering is introduced as a means of constructing circulation-to-environment synoptic climatological classifications. Rather than applying an unsupervised clustering algorithm to synoptic-scale atmospheric circulation data, one instead augments the atmospheric circulation dataset with predictions from a supervised regression model linking circulation to environment. The combined dataset is then entered into the clustering algorithm. The level of influence of the environmental dataset can be controlled by a simple weighting factor. The method is generic in that the choice of regression model and clustering algorithm is left to the user. Examples are given using standard multivariate linear regression models and the k-means clustering algorithm, both established methods in synoptic climatology. Results for southern British Columbia, Canada, indicate that model performance can be made to range between that of a fully unsupervised algorithm and a fully supervised algorithm.
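
The augmentation step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names (`fit_linear`, `kmeans`, `regression_guided_clusters`), the way the weighting factor is applied (a plain multiplier on the predicted column), and the absence of any standardization are all assumptions made for illustration. It fits an ordinary least-squares model, appends the weighted predictions to the circulation features, and runs a basic k-means on the combined data.

```python
import random

def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved by Gauss-Jordan elimination."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented normal equations [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, X):
    """Apply the fitted coefficients (intercept first) to each row of X."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in X]

def kmeans(points, k, iters=50, seed=0):
    """Basic Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def regression_guided_clusters(X, y, k, weight):
    """Cluster X augmented with weight * (regression prediction of y from X)."""
    y_hat = predict(fit_linear(X, y), X)
    augmented = [list(x) + [weight * p] for x, p in zip(X, y_hat)]
    return kmeans(augmented, k)
```

In this sketch, `weight = 0` reproduces plain k-means on the circulation features, while larger weights let the regression predictions dominate the distances, which mirrors the unsupervised-to-supervised range the abstract describes.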


2021 ◽  
Author(s):  
Ghislain Rocheleau ◽  
Iain S Forrest ◽  
Áine Duffy ◽  
Shantanu Bafna ◽  
Amanda Dobbyn ◽  
...  

Background: Phenome-wide association studies conducted in electronic health record (EHR)-linked biobanks have uncovered a large number of genomic loci associated with traits and diseases. However, interpreting the complex relationships between associated genes and phenotypes is challenging. Results: We constructed a tissue-level phenome-wide network map of colocalized genes and phenotypes. First, we generated colocalized expression quantitative trait loci from 48 tissues of the Genotype-Tissue Expression project and from publicly available genome-wide association study summary statistics from the UK Biobank. We identified 9,151 colocalized genes for 1,411 phenotypes across 48 tissues. Then, we constructed a bipartite network using the colocalized signals to establish links between genes and phenotypes in each tissue. The majority of links are observed in a single tissue, whereas only a few are present in all tissues. Finally, we applied the biLouvain clustering algorithm in each tissue-specific bipartite network to identify co-clusters of non-overlapping genes and phenotypes. The majority of co-clusters contain a small number of genes and phenotypes, and 88.6% of co-clusters are found in only one tissue. To demonstrate the functionality of the phenome-wide map, we tested whether these co-clusters were enriched with known biological and functional gene classes and observed several significant enrichments. Furthermore, we observed that tissue-specific co-clusters are enriched with reported drug side effects for the corresponding drug target genes in clinical trial data. Conclusions: The phenome-wide map provides links between genes, phenotypes and tissues across a wide spectrum of biological classes and can yield biological and clinical discoveries. The phenome-wide map is publicly available at https://rstudio-connect.hpc.mssm.edu/biPheMap/.


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Qin Wu ◽  
Xingqin Qi ◽  
Eddie Fuller ◽  
Cun-Quan Zhang

In graph theory and network analysis, the centrality of a vertex measures its relative importance within a graph. Centrality plays a key role in network analysis and has been widely studied using different methods. Inspired by the idea of vertex centrality, a novel centrality guided clustering (CGC) algorithm is proposed in this paper. Unlike traditional clustering methods, which usually choose the initial center of a cluster randomly, the CGC algorithm starts from a “LEADER” (a vertex with the highest centrality score) and adds a new “member” to the same cluster as the “LEADER” when some criterion is satisfied. The CGC algorithm also supports overlapping membership. Experiments on three benchmark social network data sets are presented, and the results indicate that the proposed CGC algorithm works well in social network clustering.
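
The leader-first idea can be sketched as follows. The abstract names neither the centrality measure nor the membership criterion, so this sketch assumes degree centrality and the hypothetical criterion "is a still-unassigned neighbor of the leader"; it also produces non-overlapping clusters, unlike the CGC algorithm itself. The function name `leader_clusters` is illustrative only.

```python
def leader_clusters(adjacency):
    """Greedy leader-based clustering of an undirected graph.

    adjacency maps each vertex to the set of its neighbors.
    Returns a list of (leader, members) pairs.
    """
    unassigned = set(adjacency)
    clusters = []
    while unassigned:
        # Leader: the unassigned vertex with the highest degree
        # (assumed centrality); ties go to the smallest vertex name.
        leader = max(sorted(unassigned), key=lambda v: len(adjacency[v]))
        # Assumed criterion: unassigned neighbors join the leader's cluster.
        members = {leader} | (adjacency[leader] & unassigned)
        clusters.append((leader, members))
        unassigned -= members
    return clusters
```

Swapping in another centrality score only requires changing the `key` function; supporting overlapping membership would mean no longer removing members from the candidate pool.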


2011 ◽  
Vol 28 (1) ◽  
pp. 134-135 ◽  
Author(s):  
Céline Bellenguez ◽  
Amy Strange ◽  
Colin Freeman ◽  
Peter Donnelly ◽  
Chris C.A. Spencer ◽  
...  

2018 ◽  
Vol 19 (8) ◽  
pp. 2267 ◽  
Author(s):  
Xia Cao ◽  
Guoxian Yu ◽  
Jie Liu ◽  
Lianyin Jia ◽  
Jun Wang

Identifying single nucleotide polymorphism (SNP) interactions is considered a popular and crucial way of explaining the missing heritability of complex diseases in genome-wide association studies (GWAS). Many approaches have been proposed to detect SNP interactions. However, existing approaches generally suffer from high computational complexity resulting from the explosion of candidate high-order interactions. In this paper, we propose a two-stage approach (called ClusterMI) to detect high-order genome-wide SNP interactions based on significant pairwise SNP combinations. In the screening stage, to alleviate the huge computational burden, ClusterMI first applies a clustering algorithm combined with mutual information to divide SNPs into different clusters. Then, ClusterMI uses conditional mutual information to screen significant pairwise SNP combinations in each cluster. In this way, there is a higher probability of identifying significant two-locus combinations in each group, and the computational load for the follow-up search can be greatly reduced. In the search stage, two different search strategies (exhaustive search and improved ant colony optimization search) are provided to detect high-order SNP interactions based on the cardinality of significant two-locus combinations. Extensive simulation experiments show that ClusterMI performs better than other related and competitive approaches. Experiments on two real case-control datasets from the Wellcome Trust Case Control Consortium (WTCCC) also demonstrate that ClusterMI is more capable of identifying high-order SNP interactions from genome-wide data.
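
The two information-theoretic quantities named in this abstract can be estimated from discrete data with a short sketch. This is only the textbook plug-in estimator, not ClusterMI's code; the function names are illustrative, and how ClusterMI thresholds or combines these scores is not stated in the abstract.

```python
from collections import Counter
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more equally long discrete sequences."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in Counter(rows).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)
```

For example, with `x` and `y` as genotype codes of two SNPs and `z` as case/control status, `conditional_mutual_information(x, y, z)` measures how strongly the two SNPs depend on each other within each phenotype class.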


2016 ◽  
Vol 25 (11) ◽  
pp. 5252-5265 ◽  
Author(s):  
Sheng He ◽  
Petros Samara ◽  
Jan Burgers ◽  
Lambert Schomaker

Crisis ◽  
2001 ◽  
Vol 22 (2) ◽  
pp. 54-60 ◽  
Author(s):  
Lisheng Du ◽  
Gabor Faludi ◽  
Miklos Palkovits ◽  
David Bakish ◽  
Pavel D. Hrdina

Summary: Several lines of evidence indicate that abnormalities in the functioning of the central serotonergic system are involved in the pathogenesis of depressive illness and suicidal behavior. Studies have shown that the number of brain and platelet serotonin transporter binding sites is reduced in patients with depression and in suicide victims, and that the density of 5-HT2A receptors is increased in brain regions of depressed suicide victims and in platelets of depressed suicidal patients. Genes that code for proteins involved in regulating serotonergic neurotransmission, such as tryptophan hydroxylase, the 5-HT transporter, and the 5-HT2A receptor, have thus been major candidate genes for association studies of suicide and suicidal behavior. Recent studies by our group and by others have shown that genetic variations in the serotonin-system-related genes might be associated with suicidal ideation and completed suicide. We have shown that the 102 C allele in the 5-HT2A receptor gene was significantly associated with suicidal ideation (χ² = 8.5, p < .005) in depressed patients. Patients with a 102 C/C genotype had a significantly higher mean HAMD item #3 score (indication of suicidal ideation) than T/C or T/T genotype patients. Our results suggest that the 102T/C polymorphism in the 5-HT2A receptor gene is primarily associated with suicidal ideation in patients with major depression and not with depression itself. We also found that the 5-HT transporter gene S/L polymorphism was significantly associated with completed suicide. The frequency of the L/L genotype in depressed suicide victims was almost double that found in the control group (48.6% vs. 26.2%). The odds ratio for the L allele was 2.1 (95% CI 1.2-3.7). The association between polymorphisms in serotonergic genes and suicidality supports the hypothesis that genetic factors can modulate suicide risk by influencing serotonergic activity.
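
For readers unfamiliar with the statistic, an odds ratio and its confidence interval can be computed from a 2x2 allele-count table as sketched below. The abstract does not report the underlying counts or say which interval method was used, so this sketch assumes the common Wald (log-scale) interval, and the function name `odds_ratio_ci` is illustrative.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with the allele      b: cases without it
    c: controls with the allele   d: controls without it
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

An interval that excludes 1, as the reported 1.2-3.7 does, indicates an association at the corresponding significance level.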

