Identifying and Assessing Interesting Subgroups in a Heterogeneous Population

2015 ◽  
Vol 2015 ◽  
pp. 1-13
Author(s):  
Woojoo Lee ◽  
Andrey Alexeyenko ◽  
Maria Pernemalm ◽  
Justine Guegan ◽  
Philippe Dessen ◽  
...  

Biological heterogeneity is common in many diseases and is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability, which is the basis of cluster generation, is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems, we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm, and to assess significance we use a false discovery rate (FDR)-based measure. Under the heterogeneity condition, the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
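For orientation, a conventional FDR-style adjustment can be sketched as below. The abstract does not say which estimator the authors treat as the "standard FDR estimate", nor how their improved procedure works, so the Benjamini-Hochberg step-up adjustment used here is an assumption chosen only because it is a widely known baseline. The function name `bh_qvalues` is hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order.

    Illustrative baseline only; not the estimator from the paper.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

In a subgroup analysis, one might apply such an adjustment to per-gene p-values computed within a candidate cluster; whether that matches the paper's workflow is not stated in the abstract.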

2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Yaohui Liu ◽  
Dong Liu ◽  
Fang Yu ◽  
Zhengming Ma

Clustering is widely used in data analysis, and density-based methods have developed rapidly over the past 10 years. Although state-of-the-art density peak clustering algorithms are efficient and can detect clusters of arbitrary shape, they are essentially nonspherical centroid-based methods. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are identified as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is assigned to one of the initial clusters or labeled as noise. Results of experiments on synthetic and real data sets show that RNN-LDH outperforms current clustering methods based on density peaks or reverse nearest neighbors.
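The reverse nearest neighbor relation that the abstract builds on can be illustrated with a brute-force sketch. This covers only the generic definition (point i is a reverse neighbor of j when j is among the k nearest neighbors of i); it is not RNN-LDH itself, whose core-region and density rules are not given in the abstract. The function names are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, the indices of its k nearest neighbors (Euclidean)."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(others[:k])
    return result


def reverse_nearest_neighbors(points, k):
    """For each point j, the list of points that count j among their k-NN."""
    rnn = [[] for _ in points]
    for i, neighbors in enumerate(knn_indices(points, k)):
        for j in neighbors:
            rnn[j].append(i)
    return rnn
```

A point with many reverse neighbors tends to sit in a dense region, while an isolated point often has none, which is why this relation is a natural starting point for density-based clustering.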


Author(s):  
Naghmeh Niroomand ◽  
Christian Bach ◽  
Miriam Elser

Passenger car sizes and types have grown continuously worldwide over the past few decades. To assess the development of vehicle specifications in this context and to evaluate changes in powertrain technologies depending on surrounding framework conditions, such as charging stations and vehicle taxation policy, we need a detailed understanding of the vehicle fleet composition. This paper therefore introduces a novel mathematical approach to segment passenger vehicles based on dimension features using a fuzzy clustering algorithm, Fuzzy C-means (FCM), and a non-fuzzy clustering algorithm, K-means (KM). We analyze the performance of the proposed algorithms and compare them with Swiss expert segmentation. Experiments on real data sets demonstrate that the FCM classifier correlates better with the expert segmentation than KM. Furthermore, the outputs from FCM with five clusters show that the proposed algorithm performs better for accurate vehicle categorization because of its capacity to recognize and consolidate dimension attributes from the unsupervised data set. Its performance in categorizing vehicles was promising, with an average accuracy rate of 79% and an average positive predictive value of 75%.
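The difference between the two algorithms compared above comes down to soft versus hard assignment, which a small sketch can make concrete. It uses the textbook FCM membership formula with a fuzzifier `m` and fixed, made-up cluster centers. The authors' features, settings, and center updates are not given in the abstract, so everything here is illustrative.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Textbook Fuzzy C-means membership of one point in each cluster (m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def kmeans_assignment(point, centers):
    """K-means hard assignment: index of the closest center."""
    dists = [math.dist(point, c) for c in centers]
    return dists.index(min(dists))
```

For a vehicle whose dimensions fall between two segments, FCM would report partial membership in both, whereas K-means commits to a single segment.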


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Qi Diao ◽  
Yaping Dai ◽  
Qichao An ◽  
Weixing Li ◽  
Xiaoxue Feng ◽  
...  

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most conventional clustering approaches work only with round-shaped clusters. Arbitrary shapes can be handled by clustering by fast search and find of density peaks (DPC), but in some cases DPC is limited by its density peak definition and allocation strategy. To overcome these limitations, two improvements are proposed in this paper. To describe the clustering center more comprehensively, the definitions of local density and relative distance are fused with multiple distances, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). A similarity-first search algorithm is designed to search for the best-matching cluster centers for noncenter points in a weighted KNN graph. Extensive comparison with several existing clustering methods, e.g., the traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means, has been carried out. Experiments based on synthetic data and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of clustering accuracy (ACC), adjusted mutual information (AMI), and adjusted Rand index (ARI).
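The two quantities that the abstract refines, local density and relative distance, can be sketched in their traditional DPC form. This is the baseline definition with a cutoff distance `d_c`, not the KNN/SNN-fused variant proposed in the paper, whose formulas the abstract does not give. The function name is hypothetical.

```python
import math


def dpc_density_and_distance(points, d_c):
    """Traditional DPC quantities for each point.

    rho[i]   = number of other points closer than the cutoff d_c
    delta[i] = distance to the nearest point of strictly higher density,
               or the largest distance from i when no such point exists
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta
```

Points with both high `rho` and high `delta` are the usual candidates for cluster centers, and the remaining points are then allocated to centers, which is the step the paper's similarity-first search is meant to improve.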


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kim Hoa Ho ◽  
Annarita Patrizi

Choroid plexus (ChP), a vascularized secretory epithelium located in all brain ventricles, plays critical roles in development, homeostasis and brain repair. Reverse transcription quantitative real-time PCR (RT-qPCR) is a popular and useful technique for measuring gene expression changes and is also widely used in ChP studies. However, the reliability of RT-qPCR data is strongly dependent on the choice of reference genes, which are supposed to be stable across all samples. In this study, we validated the expression of 12 well-established housekeeping genes in ChP in 2 independent experimental paradigms by using popular stability testing algorithms: BestKeeper, DeltaCq, geNorm and NormFinder. Rer1 and Rpl13a were identified as the most stable genes throughout mouse ChP development, while Hprt1 and Rpl27 were the most stable genes across conditions in a mouse sensory deprivation experiment. In addition, Rpl13a, Rpl27 and Tbp were among the top five most stable genes in both experiments. Normalisation of Ttr and Otx2 expression levels using different housekeeping gene combinations demonstrated the profound effect of reference gene choice on target gene expression. Our study emphasizes the importance of validating and selecting stable housekeeping genes under specific experimental conditions.
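A small sketch shows why the choice of reference genes matters. It uses the common delta-Cq form of normalization and assumes 100% amplification efficiency (a doubling per cycle), which the abstract does not state. The Cq values in the usage note are invented for illustration and are not data from the study.

```python
def relative_expression(cq_target, cq_references):
    """Target expression relative to the mean Cq of the reference genes.

    Assumes perfect doubling per cycle; averaging Cq values corresponds to
    the geometric mean of the reference quantities under that assumption.
    """
    mean_ref = sum(cq_references) / len(cq_references)
    return 2.0 ** (mean_ref - cq_target)
```

With a hypothetical target Cq of 22.0, normalizing against a single reference at Cq 20.0 gives 0.25, while normalizing against two references at 20.0 and 18.0 gives 0.125, so the reported expression level halves purely because of the reference choice.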


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Ridwan Dwi Saputro ◽  
Hanggoro Tri Rinonce ◽  
Yayuk Iramawasita ◽  
Muhammad Rasyid Ridho ◽  
Maria Fransiska Pudjohartono ◽  
...  

Objective: Biomarker mRNA levels have been suggested to be predictors of patient survival and therapy response in melanoma cases. This study aimed to investigate the correlations between the mRNA expression levels of PD-L1 and NKG2A in melanoma tissue and clinicopathologic characteristics and survival in Indonesian primary nodular melanoma patients. Results: Thirty-one tissue samples were obtained; two were excluded from survival analysis due to Breslow depth of less than 4 mm. The median survival times of patients with upregulated and normoregulated PD-L1 were 15.800 ± 2.345 and 28.945 ± 4.126 months, respectively. However, this difference was not statistically significant (p = 0.086). Upregulated and normoregulated NKG2A patients differed very little in median survival time (25.943 ± 7.415 vs 26.470 ± 3.854 months; p = 0.981). Expression levels of PD-L1 and NKG2A were strongly correlated (rs = 0.787, p < 0.001). No clinicopathologic associations with PD-L1 and NKG2A mRNA levels were observed. These results suggest that PD-L1 may have potential as a prognostic factor. Although an unlikely prognostic factor, NKG2A may become an adjunct target for therapy. The strong correlation between PD-L1 and NKG2A suggests that anti-PD-1 and anti-NKG2A agents could be effective in patients with PD-L1 upregulation. The mRNA levels of these two genes may help direct choice of immunotherapy and predict patient outcomes.

