Multi-view document clustering based on geometrical similarity measurement

Author(s):  
Bassoma Diallo ◽  
Jie Hu ◽  
Tianrui Li ◽  
Ghufran Ahmad Khan ◽  
Ahmed Saad Hussein
GEOMATICA ◽  
2015 ◽  
Vol 69 (4) ◽  
pp. 385-394
Author(s):  
Zhanlong Chen ◽  
Yongyang Xu ◽  
Liang Wu

Because a similarity measure must be invariant to rotation, translation, and scaling, it is difficult to measure the similarity between GIS planar elements. To describe shapes precisely, we define the “vertex betweenness” of a vertex from the number of shortest paths on which it occurs; this measures the importance of each vertex in a graph. The higher the vertex betweenness, the more important the vertex. We propose a contour feature point extraction method that uses Fourier descriptors. We normalize the first n-order coefficients of the Fourier descriptors; the similarity between polygons is then obtained by comparing the cosine values for every pair of descriptor vectors. The experiments are conducted at two data scales, 1:50 000 and 1:250 000. Combined with an analysis of the factors that affect similarity measurement, the experimental results show that the contour feature point extraction method can effectively measure the geometrical similarity between GIS planar elements.
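The Fourier-descriptor comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vertex-betweenness step for choosing feature points is omitted, the function names (fourier_descriptor, shape_similarity) are hypothetical, and both contours are assumed to be sampled with the same number of points in the same traversal direction.

```python
import numpy as np

def fourier_descriptor(points, n_coeffs=8):
    """Low-order Fourier magnitudes of a closed contour (illustrative only)."""
    pts = np.asarray(points, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]          # contour as a complex signal
    if len(z) < 3:
        raise ValueError("need at least 3 contour points")
    mags = np.abs(np.fft.fft(z))            # dropping phase removes rotation
    n = min(n_coeffs, (len(z) - 1) // 2)    # number of low orders available
    # Skip index 0 (the centroid term) so translation has no effect.
    low = np.concatenate([mags[1:n + 1], mags[-n:]])
    norm = np.linalg.norm(low)
    if norm == 0:
        raise ValueError("degenerate contour")
    return low / norm                        # unit length removes scale

def shape_similarity(points_a, points_b, n_coeffs=8):
    """Cosine of the angle between two normalized descriptor vectors."""
    a = fourier_descriptor(points_a, n_coeffs)
    b = fourier_descriptor(points_b, n_coeffs)
    return float(np.dot(a, b))               # both vectors already have norm 1

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
print(shape_similarity(square, rectangle))
```

A rotated, scaled, and shifted copy of a polygon yields a similarity of 1 up to floating-point error under these assumptions, while a polygon with different proportions scores lower.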


Author(s):  
Laith Mohammad Abualigah ◽  
Essam Said Hanandeh ◽  
Ahamad Tajudin Khader ◽  
Mohammed Abdallh Otair ◽  
Shishir Kumar Shandilya

Background: Given the increasing volume of text documents on Internet pages, handling such a large amount of information has become highly complex. Text clustering is a common optimization problem in which a large amount of text information is organized into a set of comparable and coherent clusters. Aims: This paper presents a novel local clustering technique, β-hill climbing, which solves the text document clustering problem by partitioning similar documents into the same cluster. Methods: The β parameter is the primary innovation of the β-hill climbing technique. It is introduced to balance local and global search. Local search methods such as k-medoid and k-means have been successfully applied to the text document clustering problem. Results: Experiments were conducted on eight standard benchmark text datasets with different characteristics taken from the Laboratory of Computational Intelligence (LABIC). The results showed that the proposed β-hill climbing achieved better results than the original hill climbing technique in solving the text clustering problem. Conclusion: Adding the β operator to hill climbing improves text clustering performance.
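The role of the β parameter can be sketched in a few lines. This is one common formulation of β-hill climbing, not necessarily the paper's exact procedure: the objective (sum of squared distances to cluster centroids), the single-document neighbourhood move, and all names below are assumptions made for illustration.

```python
import random

def cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

def beta_hill_climbing(points, k, beta=0.05, iters=2000, seed=0):
    rng = random.Random(seed)
    current = [rng.randrange(k) for _ in points]   # random initial clustering
    current_cost = cost(points, current, k)
    for _ in range(iters):
        candidate = current[:]
        # Local move: reassign one randomly chosen document.
        candidate[rng.randrange(len(points))] = rng.randrange(k)
        # Beta operator: each label is re-drawn at random with probability
        # beta, which adds a global (exploratory) component to the search.
        for i in range(len(candidate)):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        candidate_cost = cost(points, candidate, k)
        # Keep the candidate only if it is no worse than the current solution.
        if candidate_cost <= current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost

docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, final_cost = beta_hill_climbing(docs, k=2)
print(labels, final_cost)
```

Setting beta to 0 reduces the loop to plain hill climbing with single-document moves, which makes the contribution of the β operator easy to compare in this sketch.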


Author(s):  
Ruina Bai ◽  
Ruizhang Huang ◽  
Yanping Chen ◽  
Yongbin Qin

2021 ◽  
pp. 106907
Author(s):  
Sahar Behpour ◽  
Mohammadmahdi Mohammadi ◽  
Mark V. Albert ◽  
Zinat S. Alam ◽  
Lingling Wang ◽  
...  

Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 184
Author(s):  
Xia Que ◽  
Siyuan Jiang ◽  
Jiaoyun Yang ◽  
Ning An

Many mixed datasets with both numerical and categorical attributes have been collected in fields such as medicine and biology. Designing appropriate similarity measurements plays an important role in clustering these datasets. Many traditional measurements treat all attributes equally when measuring similarity. However, different attributes may contribute differently because the amount of information they contain can vary considerably. In this paper, we propose a similarity measurement with entropy-based weighting for clustering mixed datasets. The numerical data are first transformed into categorical data by an automatic categorization technique. Then, an entropy-based weighting strategy is applied to reflect the differing importance of the attributes. We incorporate the proposed measurement into an iterative clustering algorithm, and extensive experiments show that this algorithm outperforms the OCIL and K-Prototype methods with 2.13% and 4.28% improvements, respectively, in terms of accuracy on six mixed datasets from UCI.
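The two steps named in this abstract, categorizing numerical attributes and weighting attributes by entropy, can be sketched as follows. The abstract does not specify either procedure, so this sketch assumes equal-width binning and weights proportional to each attribute's Shannon entropy; the function names are hypothetical and the paper's actual scheme may differ.

```python
import math
from collections import Counter

def discretize(values, bins=3):
    """Equal-width binning (an assumed stand-in for the paper's technique)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0 for _ in values]
    return [min(int((v - lo) / width), bins - 1) for v in values]

def entropy(column):
    """Shannon entropy (in bits) of a categorical column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())

def entropy_weights(columns):
    """Weights proportional to entropy, normalized to sum to 1 (assumption)."""
    ents = [entropy(col) for col in columns]
    total = sum(ents)
    if total == 0:
        return [1 / len(columns)] * len(columns)
    return [e / total for e in ents]

def weighted_similarity(row_a, row_b, weights):
    """Sum of the weights of the attributes on which two records agree."""
    return sum(w for a, b, w in zip(row_a, row_b, weights) if a == b)

ages = discretize([1.0, 2.0, 9.0, 10.0])      # numerical attribute
colour = ["red", "red", "red", "red"]         # carries no information
columns = [ages, colour]
weights = entropy_weights(columns)
rows = list(zip(*columns))
print(weights, weighted_similarity(rows[0], rows[1], weights))
```

In this toy data the constant attribute receives zero weight, so only agreement on the binned numerical attribute contributes to the similarity.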


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 696
Author(s):  
Haipeng Chen ◽  
Zeyu Xie ◽  
Yongping Huang ◽  
Di Gai

The fuzzy C-means clustering (FCM) algorithm is widely used in medical image segmentation and is suitable for segmenting brain tumors. Therefore, an intuitionistic fuzzy C-means algorithm based on membership information transferring and similarity measurements (IFCM-MS) is proposed in this paper to segment brain tumor magnetic resonance images (MRI). The original FCM lacks spatial information, which leads to high noise sensitivity. To address this issue, a membership information transfer model is adopted in IFCM-MS. Specifically, neighborhood information and the similarity of adjacent iterations are incorporated into the clustering process. In addition, FCM uses simple distance measurements to calculate the membership degree, which gives unsatisfactory results. A similarity measurement method is therefore designed in IFCM-MS to improve the membership calculation, in which gray information and distance information are fused adaptively. Furthermore, the complex structure of the brain results in MRIs containing tissues with uncertain boundaries. To overcome this problem, an intuitionistic fuzzy attribute is embedded into IFCM-MS. Experiments performed on real brain tumor images demonstrate that IFCM-MS has low noise sensitivity and high segmentation accuracy.
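For context, the baseline that this abstract improves on can be sketched directly. The code below is standard FCM on one-dimensional gray values with a plain distance-based membership update; it is not IFCM-MS and contains none of its spatial, similarity, or intuitionistic components. The function name and parameter defaults are illustrative assumptions.

```python
import numpy as np

def fcm(values, n_clusters=2, m=2.0, iters=50, seed=0):
    """Standard fuzzy C-means on scalar intensities (baseline sketch)."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)      # shape (n, 1)
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), n_clusters))
    u /= u.sum(axis=1, keepdims=True)                       # rows sum to 1
    for _ in range(iters):
        um = u ** m
        # Cluster centers are membership-weighted means of the data.
        centers = (um.T @ x) / um.sum(axis=0)[:, None]      # shape (c, 1)
        # Plain absolute distance; the small constant avoids division by zero.
        dist = np.abs(x - centers.T) + 1e-12                # shape (n, c)
        # Membership update: u_ij is proportional to d_ij^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers.ravel()

gray = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
memberships, centers = fcm(gray)
print(np.round(centers, 3))
print(memberships.argmax(axis=1))
```

Because each membership depends only on the distance between a pixel's gray value and the centers, a noisy pixel is assigned without regard to its neighbors, which is the sensitivity the abstract attributes to the original FCM.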

