Indexing and mining large-scale neuron databases using maximum inner product search

Recently, the Factorization Machine (FM) has become increasingly popular for recommendation systems due to its effectiveness in finding informative interactions between features. Usually, the weights for the interactions are learned as a low-rank weight matrix, which is formulated as an inner product of two low-rank matrices. This low-rank matrix can help improve the generalization ability of the Factorization Machine. However, choosing the rank properly usually requires running the algorithm many times with different ranks, which is inefficient for large-scale datasets. To alleviate this issue, we propose an Adaptive Boosting framework of Factorization Machine (AdaFM), which can adaptively search for proper ranks for different datasets without re-training. Instead of using a fixed rank for FM, the proposed algorithm gradually increases its rank according to its performance until the performance stops improving. Extensive experiments are conducted to validate the proposed method on multiple large-scale datasets. The experimental results demonstrate that the proposed method can be more effective than state-of-the-art Factorization Machines.
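As a rough illustration of the low-rank interaction term mentioned in this abstract, the sketch below computes the standard second-order FM term, the sum over feature pairs i < j of <v_i, v_j> x_i x_j, in two ways: directly, and through the well-known linear-time reformulation. It does not implement the AdaFM boosting procedure; the function names and the toy values of `x` and `V` are assumptions made for this example.

```python
from itertools import combinations


def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], computed pair by pair."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Same quantity in O(n * k) time, where k is the rank (columns of V).

    Uses the identity
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    """
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s_sq
    return 0.5 * total


# Toy input (hypothetical): 3 features, rank-2 factor matrix.
x = [1.0, 0.0, 2.0]
V = [[0.5, -1.0], [1.0, 1.0], [0.25, 2.0]]

naive_value = fm_pairwise_naive(x, V)
fast_value = fm_pairwise_fast(x, V)
```

In this picture, a higher rank means more columns in `V`, which is the quantity the abstract describes as being grown adaptively.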

Simple Yet Efficient Algorithms for Maximum Inner Product Search via Extreme Order Statistics

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining ◽ 10.1145/3447548.3467345 ◽ 2021

Author(s): Ninh Pham

Keyword(s): Order Statistics ◽ Efficient Algorithms ◽ Inner Product ◽ Extreme Order Statistics ◽ Product Search

Joint Geosequential Preference and Distance Metric Factorization for Point-of-Interest Recommendation

Mathematical Problems in Engineering ◽ 10.1155/2020/6582676 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-14

Author(s): Chunyang Liu ◽ Chao Liu ◽ Haiqiang Xin ◽ Jian Wang ◽ Jiping Liu ◽ ...

Keyword(s): Metric Space ◽ Matrix Factorization ◽ Euclidean Distance ◽ Large Scale ◽ Inner Product ◽ Interaction Matrix ◽ Distance Metric ◽ Point Of Interest ◽ POI Recommendation ◽ Real World Datasets

Point-of-interest (POI) recommendation is a valuable service to help users discover attractive locations in location-based social networks (LBSNs). It focuses on capturing users’ movement patterns and location preferences by using massive historical check-in data. In the past decade, matrix factorization has become a mature and widely used technology in POI recommendation. However, the inner product of latent vectors adopted in matrix factorization methods does not satisfy the triangle inequality property, which may limit the expressiveness and lead to suboptimal solutions. Besides, the extreme sparsity of check-in data makes it challenging to capture users’ movement preferences accurately. In this paper, we propose a joint geosequential preference and distance metric factorization framework, called GeoSeDMF, for POI recommendation. First, we introduce a distance metric factorization method that is capable of learning users’ personalized preferences from a position and distance perspective in the metric space. Specifically, we convert the user-POI interaction matrix into a distance matrix and factorize it into user and POI dense embeddings. Additionally, we measure users’ personalized preference for the POI by using the Euclidean distance metric instead of the inner product. Then, we model the users’ geospatial preference by applying a geographic weight coefficient and model the users’ sequential preference by using the Euclidean distance of continuous check-in locations. Moreover, a pointwise loss strategy and AdaGrad algorithm are adopted to optimize the positions and relationships of users and POIs in a metric space. Finally, experimental results on three large-scale real-world datasets demonstrate the effectiveness and superiority of the proposed method.
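The claim that the inner product does not satisfy the triangle inequality can be checked on a small example. The sketch below treats the negated inner product as a dissimilarity score, which is an assumption made only for illustration and not GeoSeDMF's formulation, and compares it with the Euclidean distance on three hand-picked 2-D points. All names and values are hypothetical.

```python
import math


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def neg_inner(u, v):
    """Negated inner product, used here as a stand-in dissimilarity score."""
    return -inner(u, v)


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def satisfies_triangle(d, a, b, c):
    """Check d(a, c) <= d(a, b) + d(b, c), with a small float tolerance."""
    return d(a, c) <= d(a, b) + d(b, c) + 1e-12


# Three toy points (hypothetical values).
a, b, c = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

inner_ok = satisfies_triangle(neg_inner, a, b, c)      # 0 <= -1 + -1 fails
euclidean_ok = satisfies_triangle(euclidean, a, b, c)  # sqrt(2) <= 1 + 1 holds
```

Here `b` scores as close to both `a` and `c` under the inner product, yet `a` and `c` are not scored as close to each other, which is the kind of inconsistency a metric-space embedding rules out.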

Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i01.5333 ◽ 2020 ◽ Vol 34 (01) ◽ pp. 51-58 ◽ Cited By ~ 1

Author(s): Xinyan Dai ◽ Xiao Yan ◽ Kelvin K. W. Ng ◽ Jiu Liu ◽ James Cheng

Keyword(s): Data Compression ◽ Vector Quantization ◽ Similarity Search ◽ Euclidean Distance ◽ Quantization Error ◽ Experimental Results ◽ Inner Product ◽ Direction Error ◽ Product Search ◽ Norm Error

Vector quantization (VQ) techniques are widely used in similarity search for data compression, computation acceleration, and other purposes. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have a much higher influence on inner products than quantization errors in direction, and that a small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ), a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
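The idea of quantizing norm and direction separately can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the NEQ algorithm from the paper: the direction quantizer is a nearest-codeword lookup in a tiny hand-written codebook of unit vectors (standing in for PQ, OPQ, RQ or AQ), and the norm quantizer snaps the norm to the nearest of a few fixed levels. The names `neq_encode`, `neq_decode`, `DIR_CODEBOOK` and `NORM_LEVELS` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def neq_encode(x, dir_codebook, norm_levels):
    """Quantize the norm and the direction of x separately."""
    n = norm(x)
    if n == 0.0:
        raise ValueError("the zero vector has no direction to quantize")
    unit = [a / n for a in x]
    # Direction: codeword with the largest cosine similarity to x.
    direction = max(dir_codebook, key=lambda c: dot(unit, c))
    # Norm: nearest scalar level, stored explicitly.
    level = min(norm_levels, key=lambda s: abs(s - n))
    return level, direction


def neq_decode(level, direction):
    """Reconstruct the approximation as quantized norm times unit codeword."""
    return [level * a for a in direction]


# Toy codebooks (assumed values, not learned from data).
H = math.sqrt(0.5)
DIR_CODEBOOK = [(1.0, 0.0), (0.0, 1.0), (H, H)]
NORM_LEVELS = [1.0, 2.0, 4.0]

level, direction = neq_encode([3.0, 2.5], DIR_CODEBOOK, NORM_LEVELS)
approx = neq_decode(level, direction)
```

Because the norm is stored as its own code, the error in the reconstructed norm is bounded by the spacing of `NORM_LEVELS` regardless of how coarse the direction codebook is.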

AdaLSH: Adaptive LSH for Solving c-Approximate Maximum Inner Product Search Problem

IEICE Transactions on Information and Systems ◽ 10.1587/transinf.2020edp7132 ◽ 2021 ◽ Vol E104.D (1) ◽ pp. 138-145

Author(s): Kejing LU ◽ Mineichi KUDO

Keyword(s): Inner Product ◽ Search Problem ◽ Product Search

Understanding and Improving Proximity Graph Based Maximum Inner Product Search

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i01.5344 ◽ 2020 ◽ Vol 34 (01) ◽ pp. 139-146

Author(s): Jie Liu ◽ Xiao Yan ◽ Xinyan Dai ◽ Zhirong Li ◽ James Cheng ◽ ...

Keyword(s): State Of The Art ◽ Small World ◽ Inner Product ◽ Robust Performance ◽ Small Norm ◽ Order Of Magnitude ◽ Product Search ◽ Proximity Graph ◽ Product Proximity ◽ Strong Norm

The inner-product navigable small world graph (ip-NSW) represents the state-of-the-art method for approximate maximum inner product search (MIPS), and it can achieve an order of magnitude speedup over the fastest baseline. However, to date it is still unclear where its exceptional performance comes from. In this paper, we show that there is a strong norm bias in the MIPS problem, which means that large-norm items are very likely to become the result of MIPS. We then explain the good performance of ip-NSW as matching the norm bias of the MIPS problem: large-norm items have big in-degrees in the ip-NSW proximity graph, and a walk on the graph spends the majority of its computation on these items, thus effectively avoiding unnecessary computation on small-norm items. Furthermore, we propose the ip-NSW+ algorithm, which improves ip-NSW by introducing an additional angular proximity graph. Search is first conducted on the angular graph to find the angular neighbors of a query, and then the MIPS neighbors of these angular neighbors are used to initialize the candidate pool for search on the inner-product proximity graph. Experimental results show that ip-NSW+ consistently and significantly outperforms ip-NSW and provides more robust performance under different data distributions.
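The norm bias described in this abstract can be seen even with an exact linear scan. The sketch below is a brute-force MIPS baseline on made-up 2-D data, not ip-NSW or ip-NSW+; the function name `mips` and all item and query values are assumptions chosen so that one item has a much larger norm than the others.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def mips(query, items):
    """Exact MIPS by linear scan: index of the item with the largest inner product."""
    return max(range(len(items)), key=lambda i: inner(query, items[i]))


# Toy dataset (hypothetical): item 2 has a much larger norm than items 0 and 1.
items = [(1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
queries = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -0.5), (-1.0, 0.2)]

winners = [mips(q, items) for q in queries]
large_norm_share = winners.count(2) / len(queries)
```

On this toy data the large-norm item wins four of the five queries, including queries that point exactly along the two small-norm items.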

Maximum inner-product search using cone trees

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '12 ◽ 10.1145/2339530.2339677 ◽ 2012 ◽ Cited By ~ 41

Author(s): Parikshit Ram ◽ Alexander G. Gray

Keyword(s): Inner Product ◽ Product Search

Maximum inner product search for morphological retrieval of large-scale neuron data

Learning Sparse Binary Code for Maximum Inner Product Search

Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search

A Boosting Framework of Factorization Machine
