On the Matrices of Pairwise Frequencies of Categorical Attributes for Objects Classification

2019, Vol 11 (04), pp. 65-75
Author(s): Vladimir N. Shats
Algorithms, 2021, Vol 14 (6), pp. 184
Author(s): Xia Que, Siyuan Jiang, Jiaoyun Yang, Ning An

Many mixed datasets with both numerical and categorical attributes have been collected in various fields, including medicine and biology. Designing appropriate similarity measurements plays an important role in clustering these datasets. Many traditional measurements treat all attributes equally when measuring similarity. However, different attributes may contribute differently, because the amount of information they contain can vary considerably. In this paper, we propose a similarity measurement with entropy-based weighting for clustering mixed datasets. The numerical data are first transformed into categorical data by an automatic categorization technique. Then, an entropy-based weighting strategy is applied to reflect the differing importance of the attributes. We incorporate the proposed measurement into an iterative clustering algorithm, and extensive experiments show that this algorithm outperforms the OCIL and K-Prototype methods by 2.13% and 4.28%, respectively, in terms of accuracy on six mixed datasets from UCI.
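The abstract above does not give the actual formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes equal-width binning as a stand-in for the unspecified categorization technique, Shannon entropy normalized to sum to one as the attribute weight, and a weighted count of matching attributes as the similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins=3):
    """Discretize numeric values into equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def shannon_entropy(column):
    """Shannon entropy (in bits) of the value distribution in one attribute."""
    n = len(column)
    return sum((c / n) * math.log2(n / c) for c in Counter(column).values())


def entropy_weights(columns):
    """Weight each attribute by its share of the total entropy (assumption)."""
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        # Every attribute is constant: fall back to equal weights.
        return [1 / len(columns)] * len(columns)
    return [h / total for h in entropies]


def weighted_similarity(a, b, weights):
    """Sum the weights of the attributes on which two objects agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)


# Toy data: one numeric attribute and one constant categorical attribute.
ages = equal_width_bins([23, 35, 47, 59], n_bins=2)
colors = ["red", "red", "red", "red"]
weights = entropy_weights([ages, colors])
```

In this toy example the constant attribute carries no information and so receives zero weight, which illustrates why attributes need not be treated equally.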


2020, Vol 9 (7), pp. 440
Author(s): Junfang Gong, Jay Lee, Shunping Zhou, Shengwen Li

Human activity events are often recorded with their geographic locations and timestamps, which form spatial patterns of the events during individual time periods. Temporal attributes of these events help us understand the evolution of spatial processes over time. A challenge that researchers still face is that existing methods tend to treat all events as the same when evaluating the spatiotemporal pattern of events that have different properties. This article suggests a method for assessing the level of spatiotemporal clustering or spatiotemporal autocorrelation that may exist in a set of human activity events when they are associated with different categorical attributes. This method extends the Voronoi structure from 2D to 3D and integrates a sliding-window model as an approach to spatiotemporal tessellation of a space-time volume defined by a study area and time period. Furthermore, an index was developed to evaluate the partial spatiotemporal clustering level of one of two event categories against the other. The proposed method was applied to simulated data and to a real-world dataset as a case study. Experimental results show that the method effectively measures the level of spatiotemporal clustering among human activity events of multiple categories. Because of its computational efficiency, the method can be applied to the analysis of large volumes of human activity events.


Author(s): Pragathi Penikalapati, A. Nagaraja Rao

Compatibility issues among the characteristics of data involving both numerical and categorical (mixed) attributes pose many challenges in the field of pattern recognition. Clustering is often used to group similar elements and to find structure in data. However, clustering categorical data poses some notable challenges. Clustering diversified (mixed) data in particular constitutes a bigger challenge because of its range of attributes. Computations on such data are complex because the scales of numerical and categorical values are difficult to match, owing to their ranges and conversions. This chapter covers clustering algorithms from the literature in the context of mixed-attribute unlabelled data. It further covers the types and state-of-the-art methodologies that help in separating data by satisfying inter- and intracluster similarity. It also identifies challenges and future research directions for state-of-the-art clustering algorithms, along with notable research gaps.

