Universality of Logarithmic Loss in Fixed-Length Lossy Compression

Entropy, 2019, Vol 21 (6), pp. 580
Author(s): Albert No

We establish the universality of logarithmic loss over a finite alphabet as a distortion criterion in fixed-length lossy compression. For any fixed-length lossy-compression problem under an arbitrary distortion criterion, we show that there is an equivalent lossy-compression problem under logarithmic loss. The equivalence holds in a strong sense: finding good schemes for the corresponding problem under logarithmic loss is essentially equivalent to finding good schemes for the original problem. This equivalence also induces an algebraic structure on the reconstruction alphabet, which allows us to use known techniques from the clustering literature. Furthermore, our result naturally suggests a new clustering algorithm for the categorical data-clustering problem.
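For readers unfamiliar with the distortion criterion named above, logarithmic loss is usually defined as follows. The symbols are assumptions for illustration, since the abstract fixes no notation: $\mathcal{X}$ is the finite source alphabet, and the reconstruction $q$ is a probability distribution on $\mathcal{X}$ rather than a single symbol.

```latex
\[
  \ell(x, q) = \log \frac{1}{q(x)},
  \qquad x \in \mathcal{X},\; q \in \mathcal{P}(\mathcal{X}),
\]
% \mathcal{P}(\mathcal{X}) denotes the set of probability distributions on \mathcal{X} (assumed notation).
```

Under this definition the loss is zero when $q$ puts all its mass on the true symbol $x$, and it grows without bound as $q(x)$ approaches zero.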

2006, Vol 3 (1), pp. 23-32
Author(s): Zengyou He, Xiaofei Xu, Shenchun Deng

This paper presents an improved Squeezer algorithm for categorical data clustering that gives greater weight to uncommon attribute value matches in similarity computations. Experimental results on real-life datasets show that the modified algorithm is superior to the original Squeezer algorithm and other clustering algorithms with respect to clustering accuracy.
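The one-pass scheme behind a Squeezer-style algorithm can be sketched roughly as below. This is a minimal illustration and not the authors' implementation. The function name `squeezer_sketch`, the cluster dictionary layout, and the `weight` hook are all hypothetical, and the abstract does not state the actual weighting formula, so the default weight is simply 1.0 for every match.

```python
from collections import Counter


def squeezer_sketch(rows, threshold, weight=lambda attr_index, value: 1.0):
    """One-pass clustering of categorical tuples (Squeezer-style sketch).

    rows      : sequence of equal-length tuples of categorical values
    threshold : minimum similarity needed to join an existing cluster
    weight    : hypothetical hook returning a weight for a matched value;
                the improved algorithm's real weighting is not given here
    """
    clusters = []  # each: {"size": int, "counts": [Counter, ...], "members": [int, ...]}
    for idx, row in enumerate(rows):
        best, best_sim = None, -1.0
        for cluster in clusters:
            # Similarity: per attribute, the fraction of cluster members that
            # share this row's value, optionally scaled by the weight hook.
            sim = sum(
                weight(j, v) * cluster["counts"][j][v] / cluster["size"]
                for j, v in enumerate(row)
            )
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < threshold:
            # No cluster is similar enough: start a new one with this row.
            clusters.append(
                {"size": 1, "counts": [Counter([v]) for v in row], "members": [idx]}
            )
        else:
            # Join the most similar cluster and update its value counts.
            best["size"] += 1
            for j, v in enumerate(row):
                best["counts"][j][v] += 1
            best["members"].append(idx)
    return clusters


rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
clusters = squeezer_sketch(rows, threshold=1.0)
```

To favour uncommon matches, one possible choice would be to pass a `weight` function that returns larger values for attribute values that are rare in the whole dataset; that choice is an assumption, not something the abstract specifies.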


In data mining, many techniques use distance-based measures for data clustering. Improving clustering performance is the fundamental goal in clustering-related tasks. Many techniques are available for clustering numerical as well as categorical data. Clustering is an unsupervised learning technique in which objects are grouped based on their similarity. This paper proposes a new cluster similarity measure, the cosine-like cluster similarity measure (CLCSM), and uses it for data classification. Extensive experiments are conducted on UCI machine learning datasets. The experimental results show that the proposed cosine-like cluster similarity measure is superior to many existing cluster similarity measures for data classification.


2011, pp. 154-159
Author(s): Thomas R. Shultz, Scott E. Fahlman, Susan Craw, Periklis Andritsos, Panayiotis Tsaparas, ...
