Machine Learning Techniques for Adaptive Multimedia Retrieval


Total documents: 15 (five years: 0)
H-index: 1 (five years: 0)
Published by IGI Global
ISBN: 9781616928599, 9781616928612

Author(s):  
Clement H.C. Leung, Jiming Liu, Alfredo Milani, Alice W.S. Chan

With the rapid advancement of music compression and storage technologies, digital music can be easily created, shared, and distributed, not only on computers but also on numerous portable digital devices. Music often constitutes a key component of many multimedia databases, and as these grow in size and complexity, their meaningful search and retrieval become important and necessary. Music Information Retrieval (MIR) is a relatively young and challenging research area that emerged in the late 1990s. Although some forms of music retrieval are available on the Internet, they tend to be inflexible and have significant limitations. Most current music retrieval systems rely only on low-level music information content (e.g., metadata, album title, lyrics); in this chapter, the authors present an adaptive indexing approach to searching and discovering music information. Experimental results show that, through such an indexing architecture, high-level music semantics may be incorporated into search strategies.
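The abstract does not give the indexing algorithm, but the core idea of an adaptive index — low-level metadata terms indexed up front, high-level semantic terms accumulated from user feedback — can be sketched as follows. Class and method names are illustrative, not the chapter's actual design.

```python
class AdaptiveMusicIndex:
    """Minimal sketch of an adaptive music index: metadata terms are indexed
    up front, and high-level semantic terms (e.g. "relaxing") are linked to
    songs incrementally as users confirm search hits."""

    def __init__(self):
        self.index = {}  # term -> set of song ids

    def add_song(self, song_id, metadata_terms):
        # Low-level content available at ingest time: title, album, lyrics terms.
        for term in metadata_terms:
            self.index.setdefault(term, set()).add(song_id)

    def feedback(self, query_term, song_id):
        # A confirmed hit links the semantic query term to the song,
        # so future searches on that term find it directly.
        self.index.setdefault(query_term, set()).add(song_id)

    def search(self, term):
        return sorted(self.index.get(term, set()))
```

Repeated feedback thus grows a semantic layer on top of the metadata layer without re-indexing the collection.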


Author(s):  
Min Chen

The fast proliferation of video data archives has increased the need for automatic video content analysis and semantic video retrieval. Since temporal information is critical in conveying video content, this chapter proposes an effective temporal-based event detection framework to support high-level video indexing and retrieval. Its core is a temporal association mining process that systematically captures characteristic temporal patterns to help identify and define interesting events. The framework effectively tackles the challenges caused by loose video structure and class imbalance, and one of its unique characteristics is strong generality and extensibility, with the capability of discovering representative event patterns with little human intervention. The temporal information and event detection results can then be fed into the proposed distributed video retrieval system to support high-level semantic querying, selective video browsing, and event-based video retrieval.
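The chapter's mining process is not specified in the abstract, but the basic ingredient of temporal association mining — counting ordered label pairs that occur within a bounded temporal window, keeping those above a support threshold — can be sketched under those assumptions:

```python
from collections import Counter

def mine_temporal_pairs(shot_labels, window=2, min_support=2):
    """Count ordered pairs (a -> b) where label b follows label a within
    `window` shots, and keep pairs meeting `min_support`. A toy stand-in
    for the chapter's temporal association mining, not its algorithm."""
    counts = Counter()
    for i, a in enumerate(shot_labels):
        # Look ahead at most `window` shots for the consequent label.
        for j in range(i + 1, min(i + 1 + window, len(shot_labels))):
            counts[(a, shot_labels[j])] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}
```

Frequent pairs such as ("goal", "replay") would then serve as candidate signatures for defining events.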


Author(s):  
Zhiyong Wang, Dagan Feng

Visual information is used extensively in various domains such as the web, education, health, and digital libraries, owing to advances in computing technologies. Meanwhile, users find it increasingly difficult to locate desired visual content such as images. Although traditional content-based retrieval (CBR) systems allow users to access visual information through query-by-example with low-level visual features (e.g., color, shape, and texture), the semantic gap is widely recognized as a hurdle to the practical adoption of CBR systems. The wealth of visual information (e.g., user-generated visual content) enables us to derive new knowledge at a large scale, which will significantly facilitate visual information management. Besides semantic concept detection, semantic relationships among concepts can also be explored in the visual domain, rather than only in the traditional textual domain. Therefore, this chapter provides an overview of the state of the art in discovering semantics in the visual domain from two aspects: semantic concept detection, and knowledge discovery from visual information at the semantic level. For the first aspect, various facets of visual information annotation are discussed, including content representation, machine-learning-based annotation methodologies, and widely used datasets. For the second aspect, a novel data-driven approach is introduced to discover semantic relevance among concepts in the visual domain. Future research topics are also outlined.
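One simple data-driven way to score semantic relevance between concepts in the visual domain is tag co-occurrence over an image collection. The sketch below uses a Jaccard-style score as an assumed measure; the chapter's actual approach is not detailed in the abstract.

```python
from collections import Counter
from itertools import combinations

def concept_relevance(tagged_images):
    """Jaccard-style relevance between concept pairs from image tag sets:
    co-occurrence count divided by the size of the union of occurrences.
    An illustrative baseline, not the chapter's method."""
    single = Counter()
    pair = Counter()
    for tags in tagged_images:
        tags = set(tags)
        single.update(tags)
        pair.update(tuple(sorted(p)) for p in combinations(tags, 2))
    return {p: c / (single[p[0]] + single[p[1]] - c) for p, c in pair.items()}
```

Concepts that always appear together (e.g. "beach" and "sea") score 1.0, while incidental co-occurrences score low.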


Author(s):  
Wing-Yin Chau, Chia-Hung Wei, Yue Li

With the rapid increase in the number of registered trademarks around the world, trademark image retrieval (TIR) has been developed to handle the vast number of trademark images in trademark registration systems. Many approaches have been developed over the years in attempts to build an effective TIR system. Several conventional approaches from content-based image retrieval, such as moment invariants, Zernike moments, Fourier descriptors, and curvature scale space descriptors, have also been widely applied to TIR. These approaches, however, exhibit major deficiencies when addressing the TIR problem. This chapter therefore proposes a novel approach to overcome them, combining Zernike moment descriptors with the centroid distance representation and the curvature representation. Experimental results show that the proposed approach outperforms the conventional approaches in several circumstances. Details of the proposed approach, as well as of the conventional approaches, are presented in this chapter.
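Of the representations named above, the centroid distance signature is the simplest to illustrate: each contour point is described by its distance to the shape centroid, normalized for scale invariance. A minimal sketch (the chapter's exact formulation may differ):

```python
import math

def centroid_distance(contour):
    """Centroid distance shape signature: distances from contour points to
    the shape centroid, normalized by the maximum so the signature is
    scale-invariant. A simplified sketch of the representation."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    peak = max(dists)
    return [d / peak for d in dists]
```

Signatures of two trademark contours can then be compared point-wise (or via their Fourier coefficients) to rank retrieval candidates.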


Author(s):  
Jiaxiong Pi, Yong Shi, Zhengxin Chen

Image content analysis plays an important role in adaptive multimedia retrieval. In this chapter, the authors present their work on using a useful spatial data structure, the R*-tree, for similarity analysis and cluster analysis of image content. First, they describe an R*-tree-based similarity analysis tool for similarity retrieval of images. They then discuss R*-tree-based clustering methods for images, which has been a tricky issue: although objects stored in the same R*-tree leaf node enjoy spatial proximity, it is well known that R*-trees cannot be used directly for cluster analysis. Nevertheless, the R*-tree's indexing feature can be used to assist existing cluster analysis methods and thus improve cluster quality. The authors report their progress in using R*-trees to improve the well-known k-means and hierarchical clustering methods. Based on the R*-tree's feature of indexing minimum bounding boxes (MBBs) according to spatial proximity, they extend the R*-tree's application to cluster analysis of image data. Two improved algorithms, KMeans-R and Hierarchy-R, are proposed, and experiments show that both achieve better clustering quality.
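The intuition behind using the index to assist clustering is that points sharing a leaf MBB are already spatially close, so leaf centroids make informed k-means seeds. The sketch below substitutes a uniform grid for the R*-tree's leaf partition purely for brevity; it illustrates the seeding idea, not the chapter's KMeans-R algorithm.

```python
def mbb_seeds(points, k, cell_size):
    """Pick k-means seeds from the centroids of the k densest spatial cells.
    A uniform grid stands in here for an R*-tree's leaf-level MBB partition
    (illustrative only)."""
    cells = {}
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append((x, y))
    # Densest cells first: their centroids approximate cluster cores.
    dense = sorted(cells.values(), key=len, reverse=True)[:k]
    return [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            for c in dense]
```

Seeding k-means this way tends to start each centroid inside a genuine dense region, which is the kind of quality gain the chapter attributes to KMeans-R.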


Author(s):  
Kristoffer Jensen

Most music is published in a cluster of songs called an album, although many, if not most, people enjoy individual songs, commonly called singles. This study investigates whether there is a reason for assembling and enjoying full albums. Two different approaches are undertaken, both based on audio features calculated from the music and related to the common music dimensions of rhythm, timbre, and chroma. In the first experiment, automatic segmentation is performed on full music albums. If the segmentation falls on song boundaries, which is to be expected since different fade-ins and fade-outs are employed, then songs are the homogeneous units; if boundaries are found within songs, then other homogeneous units also exist. A second experiment, on sorting music by similarity, reveals findings on the sorting complexity of music albums: if the sorting complexity is high, the album is unordered; otherwise, it is ordered with respect to the features. A discussion of the evaluation of the segment boundaries and sorting complexity reveals interesting findings.
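A crude form of the segmentation step can be sketched as a novelty detector: mark a boundary wherever consecutive feature vectors (rhythm/timbre/chroma frames) jump by more than a threshold. This is a baseline under assumed inputs, not the study's segmentation method.

```python
def segment_boundaries(features, threshold):
    """Mark index i as a boundary when the Euclidean distance between
    feature vectors i-1 and i exceeds `threshold`. A toy novelty-based
    segmenter, illustrative of the idea only."""
    bounds = []
    for i in range(1, len(features)):
        dist = sum((a - b) ** 2 for a, b in zip(features[i - 1], features[i])) ** 0.5
        if dist > threshold:
            bounds.append(i)
    return bounds
```

Comparing detected boundaries against known track boundaries is then what distinguishes "songs are the homogeneous units" from "other homogeneous units exist".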


Author(s):  
Zhen Guo, Christos Faloutsos, Zhongfei (Mark) Zhang

This chapter presents a highly scalable and adaptable co-learning framework for multimodal data mining in a multimedia database, based on multiple instance learning theory. The framework enjoys strong scalability in the sense that the query time complexity is constant and the mining effectiveness is independent of the database scale, facilitating multimodal querying of very large multimedia databases. At the same time, it enjoys strong adaptability in the sense that the database indexing can be updated incrementally, with a constant-time operation, when the database is dynamically extended with new information. Hence, this framework surpasses many existing multimodal data mining methods in the literature that are neither scalable nor adaptable. Theoretical analysis and empirical evaluations demonstrate the advantages of this strong scalability and adaptability. While the framework is general for multimodal data mining in any specific domain, the authors evaluate its mining performance on the Berkeley Drosophila ISH embryo image database and compare it with a state-of-the-art multimodal data mining method to demonstrate its effectiveness and promise.
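The multiple instance learning assumption the framework builds on is standard and easy to state in code: a bag (e.g. an image with its region-level instances) is positive iff at least one instance is positive. The threshold classifier below is a toy stand-in, not the chapter's learner.

```python
def mil_predict(bag, instance_classifier):
    """Standard multiple-instance assumption: a bag is labeled positive
    iff at least one of its instances is classified positive."""
    return any(instance_classifier(x) for x in bag)

# Toy instance classifier over a 1-D feature (illustrative assumption).
def is_salient(score):
    return score > 0.5
```

Under this rule, bag-level labels (such as image captions) supervise instance-level learning without instance-level annotation, which is what makes the co-learning setting practical for large databases.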


Author(s):  
Hua-Tsung Chen, Suh-Yin Lee

The explosive proliferation of multimedia data necessitates the development of automatic systems and tools for content-based multimedia analysis. Recently, sports video analysis has attracted more and more attention owing to its potential commercial benefits, entertainment value, and mass-audience requirements. Much research on shot classification, highlight extraction, and event detection in sports video has been done to provide the general audience with interactive video viewing systems for quick browsing, indexing, and summarization. More keenly than ever, audiences desire professional insights into the games, and coaches and players demand automatic tactics analysis and performance evaluation aided by multimedia information retrieval technologies. It is also a growing trend to provide computer-assisted umpiring in sports, such as the well-known Hawk-Eye system used in tennis. Sports video analysis is therefore a research issue worth investigating. In this chapter, the authors review current research and offer insight into sports video analysis, together with a discussion of potential applications and encouraging future work.


Author(s):  
Bogdan Ionescu, Patrick Lambert, Didier Coquin, Alexandru Marin, Constantin Vertan

In this chapter the authors tackle the analysis and characterization of artistic animated movies with a view to building an automatic content-based retrieval system. First, they deal with temporal segmentation, proposing cut, fade, and dissolve detection methods adapted to the constraints of this domain. They then discuss a fuzzy linguistic approach to automatic symbolic/semantic content annotation in terms of color techniques and action content, and test its potential in automatic video classification. The browsing issue is addressed with methods for both static and dynamic video abstraction: for a quick browse of a movie's visual content the authors create a storyboard-like summary, while for a "sneak peek" at its exciting action content they propose a trailer-like video skim. Finally, the authors discuss the architecture of a prototype client-server 3D virtual environment for interactive video retrieval. Several experimental results are presented.
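A common baseline for the cut-detection step is histogram comparison between consecutive frames: an abrupt drop in overlap signals a hard cut (fades and dissolves need gradual-transition logic instead). The sketch below assumes normalized color histograms as input; the chapter's detectors, adapted to animation, are more elaborate.

```python
def detect_cuts(frame_histograms, threshold=0.5):
    """Hard-cut detection by histogram intersection between consecutive
    frames: low overlap means an abrupt content change. A baseline sketch,
    assuming each histogram is normalized to sum to 1."""
    cuts = []
    for i in range(1, len(frame_histograms)):
        overlap = sum(min(a, b)
                      for a, b in zip(frame_histograms[i - 1], frame_histograms[i]))
        if overlap < threshold:
            cuts.append(i)
    return cuts
```

Animated movies complicate this baseline because deliberate color changes within a shot can mimic cuts, which is why domain-adapted thresholds and transition models are needed.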


Author(s):  
Isak Taksa, Sarah Zelikovitz, Amanda Spink

Background knowledge has been actively investigated as a possible means of improving the performance of machine learning algorithms. Research has shown that background knowledge plays an especially critical role in three atypical text categorization tasks: short-text classification, learning from limited labeled data, and non-topical classification. This chapter explores the use of machine learning for non-hierarchical classification of search queries and presents an approach to background knowledge discovery using information retrieval techniques. Two sets of background knowledge obtained from the World Wide Web, one in 2006 and one in 2009, are used with the proposed approach to classify a commercial corpus of web query data by the age of the user. In the process, various classification scenarios are generated and executed, providing insight into the choice, significance, and range of tuning parameters, and exploring the impact of the dynamic web on classification results.
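The role of background knowledge for short queries can be illustrated by expansion: a one-word query carries too few terms to classify, so terms are borrowed from background documents that mention it before matching against class profiles. Function names, the overlap scoring, and the example data are all illustrative assumptions, not the chapter's classifier.

```python
def classify_query(query, background_docs, class_profiles):
    """Expand a short query with terms from background documents that share
    a term with it, then pick the class whose term profile overlaps most.
    A toy sketch of background-knowledge-aided classification."""
    terms = set(query.split())
    for doc in background_docs:
        doc_terms = set(doc.split())
        if terms & doc_terms:  # the document mentions the query
            terms |= doc_terms
    return max(class_profiles, key=lambda c: len(terms & class_profiles[c]))
```

Because the expansion depends on what the web says at harvest time, the same query can classify differently against the 2006 and 2009 knowledge sets, which is the "dynamic web" effect the chapter studies.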

