A Meta-Algorithm for Improving Top-N Prediction Efficiency of Matrix Factorization Models in Collaborative Filtering

Author(s):  
A. Murat Yagci, Tevfik Aytekin, Fikret S. Gurgen

Matrix factorization models often reveal the low-dimensional latent structure in high-dimensional spaces while bringing space efficiency to large-scale collaborative filtering problems. Improving the training and prediction time efficiency of these models is also important, since an accurate model may raise practical concerns if it is too slow to capture the changing dynamics of the system. For the training task, powerful improvements have been proposed, especially using SGD, ALS, and their parallel versions. In this paper, we focus on the prediction task and combine matrix factorization with approximate nearest neighbor search methods to improve the efficiency of top-N prediction queries. Our efforts result in a meta-algorithm, MMFNN, which can employ various common matrix factorization models, drastically improve their prediction efficiency, and still perform comparably to, and sometimes better than, standard prediction approaches in terms of predictive power. Using various batch, online, and incremental matrix factorization models, we present detailed empirical analysis results on many large implicit feedback datasets from different application domains.
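
As a rough illustration of the idea behind this kind of approach (not the paper's MMFNN algorithm itself), the sketch below factorizes a toy implicit feedback matrix, reduces the maximum inner product search to a Euclidean nearest neighbor problem via the standard vector-augmentation trick, and answers top-N queries against an index over the item factors. All data, ranks, and index choices here are illustrative placeholders.

```python
# Minimal sketch of combining matrix factorization with nearest neighbor
# search for top-N prediction. Illustrative only: truncated SVD stands in for
# SGD/ALS-trained factors, and an exact ball tree stands in for an approximate
# index (LSH, HNSW, ...), which is where the speedup would come from.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
R = (rng.random((500, 2000)) < 0.01).astype(float)  # toy implicit feedback matrix

# Low-rank factorization: R ~= user_factors @ item_factors.T
k = 32
U, s, Vt = np.linalg.svd(R, full_matrices=False)
user_factors = U[:, :k] * s[:k]   # (n_users, k)
item_factors = Vt[:k].T           # (n_items, k)

# Reduce maximum inner product search to Euclidean NN search: augment each
# item vector with one extra coordinate so that Euclidean ranking against an
# augmented query (padded with 0) matches inner-product ranking.
norms = np.linalg.norm(item_factors, axis=1)
aug = np.sqrt(norms.max() ** 2 - norms ** 2)
items_aug = np.hstack([item_factors, aug[:, None]])

index = NearestNeighbors(algorithm="ball_tree").fit(items_aug)

def top_n(user_id, n=10):
    """Return the top-n item ids for a user via one nearest neighbor query."""
    q = np.append(user_factors[user_id], 0.0)[None, :]
    _, ids = index.kneighbors(q, n_neighbors=n)
    return ids[0]

print(top_n(0))
```

The point of the augmentation is that minimizing Euclidean distance to the augmented item vectors is equivalent to maximizing the user-item inner product, so an off-the-shelf (approximate) nearest neighbor index can serve top-N queries directly.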

Author(s):  
Qiang Fu, Xu Han, Xianglong Liu, Jingkuan Song, Cheng Deng

Building multiple hash tables has proven to be a successful technique for indexing massive databases and can guarantee a desired level of overall performance. However, existing hash-based multi-indexing methods suffer from heavy redundancy, lacking strong table complementarity and effective hash code learning. To address these problems, this paper proposes a complementary binary quantization (CBQ) method that jointly learns multiple hash tables. It exploits the power of incomplete binary coding based on prototypes to align the original space and the Hamming space, and further utilizes the nature of multi-indexing search to jointly reduce the quantization loss based on the prototype-based hash functions. Our alternating optimization adaptively discovers the complementary prototype sets and the corresponding code sets of varying sizes in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes. Extensive experiments on two popular large-scale tasks, Euclidean and semantic nearest neighbor search, demonstrate that the proposed CBQ method enjoys strong table complementarity and significantly outperforms the state of the art, with relative performance gains of up to 57.76%.
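
For a concrete feel of multi-table, prototype-based indexing (a generic stand-in, not the authors' CBQ optimization), the sketch below quantizes each point to its nearest prototype in every table, probes one bucket per table at query time, and re-ranks the union of candidates exactly. Here the tables differ only through their k-means initializations; the abstract's point is precisely that CBQ instead learns the prototype and code sets jointly so that the tables complement each other.

```python
# Generic multi-table prototype quantization index (illustrative, not CBQ).
# Each table: k-means prototypes define buckets. Query: probe one bucket per
# table, union the candidates, re-rank exactly by Euclidean distance.
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = rng.standard_normal((5000, 64)).astype(np.float32)

n_tables, n_prototypes = 4, 64
tables = []
for t in range(n_tables):
    km = KMeans(n_clusters=n_prototypes, n_init=4, random_state=t).fit(X)
    buckets = defaultdict(list)
    for i, c in enumerate(km.labels_):
        buckets[c].append(i)
    tables.append((km, buckets))

def search(q, k=10):
    """Probe one bucket per table, then re-rank the candidate union exactly."""
    cand = set()
    for km, buckets in tables:
        c = int(km.predict(q[None, :])[0])
        cand.update(buckets[c])
    cand = np.fromiter(cand, dtype=int)
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]

print(search(X[0]))
```

Even in this naive form, the union over tables recovers neighbors that a single quantizer misses, which is the intuition behind demanding complementarity between the tables rather than letting them be redundant.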


2019, Vol. 10(1), p. 19
Author(s):
Frank Zalkow, Meinard Müller

Cross-version music retrieval aims at identifying all versions of a given piece of music using a short query audio fragment. One previous approach, which is particularly suited for Western classical music, is based on a nearest neighbor search using short sequences of chroma features, also referred to as audio shingles. From the viewpoint of efficiency, indexing and dimensionality reduction are important aspects. In this paper, we extend previous work by adapting two embedding techniques: one based on classical principal component analysis, and the other based on neural networks with a triplet loss. Furthermore, we report on systematically conducted experiments with Western classical music recordings and discuss the trade-off between retrieval quality and embedding dimensionality. As one main result, we show that, using neural networks, one can reduce the audio shingles from 240 to fewer than 8 dimensions with only a moderate loss in retrieval accuracy. In addition, we present extended experiments with databases of different sizes and different query lengths to test the scalability and generalizability of the dimensionality reduction methods. We also provide a more detailed view into the retrieval problem by analyzing the distances that appear in the nearest neighbor search.
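
As a minimal sketch of the shingle-and-embed pipeline described here (with random placeholder data, and PCA standing in for both embedding variants), the code below stacks consecutive 12-dimensional chroma frames into 240-dimensional shingles, projects them to 8 dimensions, and runs a nearest neighbor search in the embedded space.

```python
# Illustrative shingle-and-embed retrieval pipeline: chroma frames ->
# 240-dim shingles -> low-dimensional embedding -> nearest neighbor search.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(2)
chroma = rng.random((10000, 12))  # placeholder for real chroma features

frames_per_shingle = 20  # 20 frames x 12 bins = 240 dimensions, as in the paper
shingles = np.lib.stride_tricks.sliding_window_view(
    chroma, (frames_per_shingle, 12)
)[:, 0].reshape(-1, frames_per_shingle * 12)

# Embed to a handful of dimensions and index the database shingles.
pca = PCA(n_components=8).fit(shingles)
db = pca.transform(shingles)
index = NearestNeighbors(n_neighbors=5).fit(db)

query = pca.transform(shingles[:1])  # a query shingle embedded the same way
print(index.kneighbors(query, n_neighbors=5)[1][0])
```

The paper's triplet-loss network would replace the PCA step; the retrieval machinery around it stays the same, which is what makes the quality-versus-dimensionality trade-off straightforward to study.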

