A Content-Sensitive Approach to Search in Shared File Storages
The article presents a novel approach to search in shared audio file storages such as P2P-based systems. The proposed method recognizes specific patterns in the audio content, and thereby extends the search capability from the description-based model to the content-based model. The contents of the targeted shared file storages seem to change rather unexpectedly. This volatile nature led our development to use real-time capable methods for the search process. The importance of real-time pattern recognition algorithms applied to audio data for content-sensitive searching in stream media has been growing for over a decade (Liu, Wang, & Chen, 1998). The main problem of many algorithms is the optimal selection of the reference patterns (soundprints in our approach) used in the recognition procedure. The proposed method is based on distance maximization and can quickly choose the pattern that the pattern recognition algorithms will later use as reference (Richly, Kozma, Kovács, & Hosszú, 2001). The presented method, called EMESE (Experimental MEdia-Stream rEcognizer), is an important part of a lightweight content-searching method that is suitable for the investigation of network-wide shared file storages. This method was initially applied to real-time monitoring of the occurrence of known sound materials in broadcast audio. The experimental measurement data shown in the article demonstrate the efficiency of the procedure, which was the reason for using it in a shared audio database environment.
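
The idea of choosing a reference pattern by distance maximization can be illustrated with a minimal sketch. It is not the EMESE implementation: the abstract does not specify the features, the distance measure, or the selection criterion, so the sketch assumes that each candidate segment is already described by a small numeric feature vector, that Euclidean distance is used, and that the most suitable soundprint is the candidate whose nearest neighbour among the other candidates is farthest away. The names `euclidean` and `select_soundprint` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_soundprint(candidates):
    """Pick the most distinctive candidate (illustrative criterion only).

    For every candidate, find the distance to its nearest other candidate,
    then return the index and score of the candidate for which that
    nearest-neighbour distance is largest.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidate patterns")

    best_index, best_score = -1, -1.0
    for i, candidate in enumerate(candidates):
        nearest = min(
            euclidean(candidate, other)
            for j, other in enumerate(candidates)
            if j != i
        )
        if nearest > best_score:
            best_index, best_score = i, nearest
    return best_index, best_score


# Toy feature vectors (assumed values, not measurement data):
# three similar segments and one clearly different segment.
candidates = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0]]
index, score = select_soundprint(candidates)
print(index, round(score, 3))
```

In this toy input the fourth segment lies far from the other three, so it is the one selected as reference. The pairwise comparison costs quadratic time in the number of candidates, which is acceptable for a short candidate list but would need a cheaper criterion for long recordings.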