CTC-Based Learning of Chroma Features for Score-Audio Music Retrieval

Author(s):  
Frank Zalkow ◽  
Meinard Mueller
Signals ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 336-352
Author(s):  
Frank Zalkow ◽  
Julian Brandner ◽  
Meinard Müller

Flexible retrieval systems are required for conveniently browsing large music collections. In one content-based music retrieval scenario, the user provides a query audio snippet, and the retrieval system returns music recordings from the collection that are similar to the query. In this scenario, a fast response from the system is essential for a positive user experience. Low response times require index structures that facilitate efficient search operations. One such index structure is the k-d tree, which has already been used in music retrieval systems. As an alternative, we propose to use a modern graph-based index, the Hierarchical Navigable Small World (HNSW) graph. As our main contribution, we explore its potential in the context of a cross-version music retrieval application. In particular, we report on systematic experiments comparing graph- and tree-based index structures in terms of retrieval quality, disk space requirements, and runtimes. Although the HNSW index provides only an approximate solution to the nearest neighbor search problem, we demonstrate that it has almost no negative impact on the retrieval quality in our application. As our main result, we show that the HNSW-based retrieval is several orders of magnitude faster than the tree-based alternative. Furthermore, the graph structure also works well with high-dimensional index items, unlike the tree-based structure. Given these merits, we highlight the practical relevance of the HNSW graph for music information retrieval (MIR) applications.
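The tree-based baseline mentioned in this abstract can be illustrated with a minimal sketch of exact nearest-neighbor search in a k-d tree, checked against a brute-force scan. This is not the authors' implementation: the names `Node`, `build_kdtree`, `nearest`, and `brute_force` are hypothetical, the points are random toy data rather than audio features, and the HNSW graph itself is not shown, since it would normally come from a dedicated library.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


class Node:
    """One k-d tree node: a point, its split axis, and two subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced k-d tree by splitting at the median, cycling axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Exact nearest neighbor: descend the near side, prune the far side."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # The far side can only help if the splitting plane is closer than best.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


def brute_force(points: List[Point], query: Point) -> Point:
    """Reference answer: linear scan over all points."""
    return min(points, key=lambda p: sq_dist(p, query))


random.seed(0)
toy_points = [tuple(random.random() for _ in range(3)) for _ in range(500)]
toy_query = tuple(random.random() for _ in range(3))
toy_tree = build_kdtree(toy_points)
print(nearest(toy_tree, toy_query) == brute_force(toy_points, toy_query))
```

The pruning test in `nearest` skips the far subtree only when the splitting plane lies farther away than the best match so far. In high-dimensional spaces this condition rarely holds, which is one common explanation for why tree-based indexes lose their advantage there.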


2008 ◽  
Vol 81 (7) ◽  
pp. 1065-1080 ◽  
Author(s):  
Seungmin Rho ◽  
Byeong-jun Han ◽  
Eenjun Hwang ◽  
Minkoo Kim

2009 ◽  
Vol 03 (02) ◽  
pp. 209-234 ◽  
Author(s):  
YI YU ◽  
KAZUKI JOE ◽  
VINCENT ORIA ◽  
FABIAN MOERCHEN ◽  
J. STEPHEN DOWNIE ◽  
...  

Research on audio-based music retrieval has primarily concentrated on refining audio features to improve search quality. However, much less work has been done on improving the time efficiency of music audio searches. Representing music audio documents in an indexable format provides a mechanism for achieving efficiency. To address this issue, this work proposes Exact Locality Sensitive Mapping (ELSM) to join the concatenated feature sets and soft hash values. On this basis, we propose two audio-based music indexing techniques, ELSM and Soft Locality Sensitive Hash (SoftLSH), using an optimized Feature Union (FU) set of extracted audio features. Two contributions are made here. First, the principle of similarity-invariance is applied in summarizing audio feature sequences and utilized in training semantic audio representations based on regression. Second, soft hash values are pre-calculated to help locate the search range more accurately and to improve the collision probability among features that are similar to each other. Our algorithms are implemented in a demonstration system to show how to retrieve and evaluate multi-version audio documents. Experimental evaluation over a real "multi-version" audio dataset confirms the practicality of ELSM and SoftLSH with FU and shows that our algorithms are effective for both multi-version detection (online query, one-query vs. multi-object) and same-content detection (batch queries, multi-queries vs. one-object).


2019 ◽  
Vol 36 (1) ◽  
pp. 52-62 ◽  
Author(s):  
Meinard Mueller ◽  
Andreas Arzt ◽  
Stefan Balke ◽  
Matthias Dorfer ◽  
Gerhard Widmer

2008 ◽  
Vol 16 (6) ◽  
pp. 1152-1162 ◽  
Author(s):  
I. Karydis ◽  
A. Nanopoulos ◽  
A. Papadopoulos ◽  
D. Katsaros ◽  
Y. Manolopoulos
