Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search

2020 ◽  
Vol 34 (01) ◽  
pp. 51-58 ◽  
Author(s):  
Xinyan Dai ◽  
Xiao Yan ◽  
Kelvin K. W. Ng ◽  
Jiu Liu ◽  
James Cheng

Vector quantization (VQ) techniques are widely used in similarity search for data compression, computation acceleration, etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle on the quantization error, decomposing it into norm error and direction error. We show that quantization errors in norm have a much higher influence on inner products than quantization errors in direction, and that a small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ) — a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of the items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
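The norm/direction decomposition above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the norm codebook here is a simple 16-level quantile quantizer, and the direction quantizer is left as a pass-through stub where NEQ would plug in an existing VQ technique (PQ, OPQ, RQ, AQ).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: decompose each item into its norm and unit direction.
X = rng.normal(size=(1000, 32))
norms = np.linalg.norm(X, axis=1)
dirs = X / norms[:, None]

# Explicit scalar quantizer for the norms (quantile codewords).
codebook = np.quantile(norms, np.linspace(0, 1, 16))
q_norms = codebook[np.argmin(np.abs(norms[:, None] - codebook), axis=1)]

# Direction quantization is a stub here: NEQ reuses any existing VQ
# technique unmodified for the direction vectors.
q_dirs = dirs

# Inner products are estimated as quantized_norm * (direction . query).
query = rng.normal(size=32)
approx_ip = q_norms * (q_dirs @ query)
exact_ip = X @ query
```

Because the norm is quantized with its own dedicated codebook, the dominant source of inner-product error is controlled directly, independent of whichever VQ handles the directions.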

2014 ◽  
Vol 644-650 ◽  
pp. 2185-2188
Author(s):  
Qiang Li ◽  
Xiao Hong Zhang ◽  
Qing Yu Niu

To reduce the bit rate while maintaining good distortion performance, this paper proposes an LSF quantization method based on unvoiced/voiced speech classification. The method trains codebooks on differential LSF parameters drawn from separate unvoiced and voiced databases, which suppresses the quantization error propagation caused by direct vector quantization of the LSF parameters. Experimental results show that, at the same bit allocation, this method quantizes the LSF parameters with better quality.
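A minimal sketch of the split-codebook idea, under stated assumptions: the "LSF" frames are synthetic, the differential parameters are taken as frame-to-frame differences (the paper may define them differently), and the codebooks are trained with a tiny hand-rolled k-means rather than the authors' training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans_codebook(data, k, iters=20):
    # Minimal k-means used here to train a VQ codebook.
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((data[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def vq(vec, codebook):
    # Nearest-codeword vector quantization.
    idx = np.argmin(((codebook - vec) ** 2).sum(-1))
    return idx, codebook[idx]

# Synthetic 10-dim "LSF" frames (monotonically increasing, as LSFs are),
# each tagged with a voiced/unvoiced flag.
lsf = np.cumsum(rng.uniform(0.05, 0.3, size=(500, 10)), axis=1)
voiced = rng.random(500) < 0.5
diff_lsf = np.diff(lsf, axis=0)   # differential LSF parameters
v_flag = voiced[1:]

# Separate codebooks trained on voiced vs. unvoiced differential frames.
cb_voiced = kmeans_codebook(diff_lsf[v_flag], k=32)
cb_unvoiced = kmeans_codebook(diff_lsf[~v_flag], k=32)

# Quantize one frame with the codebook matching its classification.
frame = diff_lsf[0]
idx, rec = vq(frame, cb_voiced if v_flag[0] else cb_unvoiced)
```

Splitting the training data by voicing lets each codebook specialize, so the same total bit budget (here, 5 bits per frame plus the voicing flag) covers each class more densely.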


2020 ◽  
Vol 12 (2) ◽  
pp. 206-215
Author(s):  
Hang Zou ◽  
Fengjun Zhao ◽  
Xiaoxue Jia ◽  
Heng Zhang ◽  
Wei Wang

2013 ◽  
Vol 333-335 ◽  
pp. 1106-1109
Author(s):  
Wei Wu

Palm vein pattern recognition is one of the newest biometric techniques under research today. This paper proposes projecting the palm vein image matrix directly with independent component analysis, computing the Euclidean distances between projection matrices, and classifying by the nearest distance. Experiments were conducted on a self-built palm vein database. The results show that independent component analysis is suitable for palm vein recognition and that its recognition performance is practical.
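The pipeline (ICA projection, then nearest-neighbor classification by Euclidean distance) can be sketched as below. Everything here is an assumption-laden stand-in: the "images" are synthetic 64-dim vectors for two identities, and the ICA is a minimal symmetric FastICA (tanh nonlinearity) written in numpy, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def fast_ica_basis(X, k, iters=100):
    # Minimal symmetric FastICA: whiten with PCA, then fixed-point
    # iteration with tanh nonlinearity and symmetric decorrelation.
    Xc = X - X.mean(axis=0)
    d, E = np.linalg.eigh(Xc.T @ Xc / len(Xc))
    top = np.argsort(d)[::-1][:k]
    K = E[:, top] / np.sqrt(d[top])           # whitening matrix (p, k)
    Z = Xc @ K                                # whitened data (n, k)
    W = np.linalg.qr(rng.normal(size=(k, k)))[0]
    for _ in range(iters):
        G = np.tanh(Z @ W.T)
        W_new = G.T @ Z / len(Z) - np.diag((1 - G ** 2).mean(axis=0)) @ W
        U, _, Vt = np.linalg.svd(W_new)
        W = U @ Vt                            # symmetric decorrelation
    return K @ W.T                            # projection matrix (p, k)

# Synthetic stand-in for flattened palm vein images: two identities.
means = rng.normal(size=(2, 64)) * 3.0
X = np.vstack([rng.normal(m, 1.0, size=(20, 64)) for m in means])
y = np.repeat([0, 1], 20)

P = fast_ica_basis(X, k=8)
mu = X.mean(axis=0)
feats = (X - mu) @ P                          # gallery projections

def classify(img):
    # Project the probe with the learned ICA basis, then return the
    # label of the nearest gallery sample under Euclidean distance.
    f = (img - mu) @ P
    return y[np.argmin(np.linalg.norm(feats - f, axis=1))]

probe = rng.normal(means[1], 1.0, size=64)
pred = classify(probe)
```

Real use would flatten each palm vein image to a row of `X`; the nearest-distance rule then works unchanged on the ICA projections.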


2021 ◽  
Author(s):  
Changyi Ma ◽  
Fangchen Yu ◽  
Yueyao Yu ◽  
Wenye Li

2018 ◽  
Vol 8 (4) ◽  
pp. 3203-3208
Author(s):  
P. N. Smyrlis ◽  
D. C. Tsouros ◽  
M. G. Tsipouras

Classification-via-clustering (CvC) is a widely used approach that performs classification tasks through a clustering procedure. In this paper, a novel K-Means-based CvC algorithm is presented, analysed and evaluated. Two additional techniques are employed to mitigate the limitations of K-Means: a hypercube of constraints is defined for each centroid, and weights are acquired for each attribute of each class so that a weighted Euclidean distance serves as the similarity criterion in the clustering procedure. Experiments were conducted on 42 well-known classification datasets. The experimental results demonstrate that the proposed algorithm outperforms CvC with plain K-Means.
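The two additions (per-centroid hypercube constraints and per-class attribute weights in a weighted Euclidean distance) can be sketched as follows. This is a simplified illustration with one centroid per class and inverse-variance weights; the paper's actual constraint and weight-learning schemes may differ.

```python
import numpy as np

rng = np.random.default_rng(3)

def cvc_fit(X, y, iters=10, margin=1.0):
    classes = np.unique(y)
    cents, weights, boxes = [], [], []
    for c in classes:
        Xc = X[y == c]
        # Per-class attribute weights: inverse in-class variance, so
        # stable attributes count more in the weighted distance.
        w = 1.0 / (Xc.var(axis=0) + 1e-6)
        weights.append(w / w.sum())
        # Hypercube constraint: the centroid must stay inside the
        # class's bounding box, expanded by a small margin.
        boxes.append((Xc.min(0) - margin, Xc.max(0) + margin))
        cents.append(Xc.mean(0))
    cents, weights = np.array(cents), np.array(weights)
    for _ in range(iters):
        # Assign each point to the nearest centroid under that
        # centroid's own weighted Euclidean distance.
        d = np.array([((X - cents[j]) ** 2 * weights[j]).sum(1)
                      for j in range(len(classes))])
        lab = d.argmin(0)
        for j in range(len(classes)):
            members = X[lab == j]
            if len(members):
                lo, hi = boxes[j]
                cents[j] = np.clip(members.mean(0), lo, hi)  # enforce hypercube
    return cents, weights, classes

def cvc_predict(x, cents, weights, classes):
    d = ((x - cents) ** 2 * weights).sum(1)
    return classes[d.argmin()]

# Toy 2-class, 2-attribute data: attribute 1 is noisy, attribute 0 informative.
X = np.vstack([rng.normal([0, 0], [1, 3], size=(50, 2)),
               rng.normal([5, 0], [1, 3], size=(50, 2))])
y = np.repeat([0, 1], 50)
cents, weights, classes = cvc_fit(X, y)
pred = cvc_predict(np.array([4.8, 0.5]), cents, weights, classes)
```

The hypercube clamp stops a centroid from drifting outside its class's region during the clustering iterations, and the weights down-rank the high-variance attribute in the distance computation.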


Author(s):  
Jun Zhou ◽  
Longfei Li ◽  
Ziqi Liu ◽  
Chaochao Chen

Recently, the Factorization Machine (FM) has become increasingly popular for recommendation systems due to its effectiveness in finding informative interactions between features. Usually, the weights of the interactions are learned as a low-rank weight matrix, formulated as the inner product of two low-rank matrices. This low-rank matrix helps improve the generalization ability of the Factorization Machine. However, choosing the rank properly usually requires running the algorithm multiple times with different ranks, which is clearly inefficient for large-scale datasets. To alleviate this issue, we propose an Adaptive Boosting framework for Factorization Machines (AdaFM), which can adaptively search for proper ranks on different datasets without re-training. Instead of using a fixed rank, the proposed algorithm gradually increases the rank according to its performance, until the performance stops improving. Extensive experiments are conducted to validate the proposed method on multiple large-scale datasets. The experimental results demonstrate that the proposed method can be more effective than state-of-the-art Factorization Machines.
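The rank-growing loop can be sketched as follows. The key property that makes it work is that the FM second-order term is additive over factor columns, so a new rank-1 component can be fit to the residual, boosting-style, without re-training the existing columns. This is a minimal sketch on synthetic data with plain gradient descent, not the AdaFM algorithm itself; the stopping rule and optimizer are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def fm_pairwise(X, V):
    # FM second-order term 0.5 * sum_f ((X V_f)^2 - X^2 V_f^2);
    # additive over the columns of V, which is what lets the rank
    # be grown one column at a time.
    return 0.5 * (((X @ V) ** 2).sum(1) - ((X ** 2) @ (V ** 2)).sum(1))

def fit_one_column(X, resid, steps=500, lr=0.02):
    # Fit a single rank-1 interaction factor to the current residual
    # by gradient descent on squared error.
    v = rng.normal(scale=0.1, size=X.shape[1])
    for _ in range(steps):
        err = fm_pairwise(X, v[:, None]) - resid
        Xv = X @ v
        grad = (err[:, None] * (Xv[:, None] * X - (X ** 2) * v)).mean(0)
        v -= lr * grad
    return v

# Synthetic regression target generated by a rank-2 FM interaction.
n, d = 400, 6
X = rng.normal(size=(n, d))
y = fm_pairwise(X, rng.normal(size=(d, 2)))
Xtr, Xva, ytr, yva = X[:300], X[300:], y[:300], y[300:]

# Grow the rank until validation error stops improving.
V = np.zeros((d, 0))
best = np.mean(yva ** 2)            # error of the empty (rank-0) model
for _ in range(5):                   # rank cap
    resid = ytr - fm_pairwise(Xtr, V)
    cand = np.column_stack([V, fit_one_column(Xtr, resid)])
    err = np.mean((yva - fm_pairwise(Xva, cand)) ** 2)
    if err >= best * 0.99:           # no meaningful improvement: stop
        break
    V, best = cand, err
```

Each accepted column increases the effective rank by one; when a new column no longer reduces the validation error, the search stops, so the rank adapts to the dataset in a single training pass.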

