Graph-Based Methods in Computer Vision
Latest Publications


Total documents: 16 (five years: 0)
H-index: 1 (five years: 0)

Published By IGI Global

9781466618916, 9781466618923

Author(s): Shikui Wei, Yao Zhao, Zhenfeng Zhu

With the growing popularity of video sharing websites and editing tools, it is easy for people to incorporate video content from different sources into their own work, which raises copyright problems. Content-based video copy detection uses video analysis techniques to track the usage of copyright-protected video content. It addresses not only whether a copy occurs in a query video stream but also where the copy is located and where it originates. While much work has addressed the problem with good performance, less effort has been devoted to copy detection in a continuous query stream, which requires precise temporal localization and robustness to complex video transformations such as frame insertion and video editing. In this chapter, the authors attack the problem by employing a graphical model to support the frame-fusion-based video copy detection approach. The key idea is to convert the frame fusion problem into a graph model decoding problem with a temporal consistency constraint and three relaxed constraints. The work employs a hidden Markov model (HMM) to perform frame fusion and proposes a Viterbi-like algorithm to speed up the frame fusion process.
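The decoding idea in this abstract can be illustrated with a minimal sketch of standard Viterbi decoding. This is not the authors' Viterbi-like variant, and it omits their relaxed constraints. It assumes each hidden state is a candidate reference-frame index, that per-frame similarity scores are already available, and that transitions reward moving to the next reference frame (temporal consistency). The function name `viterbi` and all toy numbers are hypothetical.

```python
import math

def viterbi(obs_scores, trans, init):
    """Return the highest-scoring state path (all scores are log-scores).

    obs_scores[t][s]: score of state s for query frame t
    trans[p][s]:      score of moving from state p to state s
    init[s]:          score of starting in state s
    """
    n_states = len(init)
    score = [init[s] + obs_scores[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(obs_scores)):
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + trans[p][s])
            new_score.append(score[best_prev] + trans[best_prev][s] + obs_scores[t][s])
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Toy data (assumed): 3 query frames, 3 candidate reference frames.
# Frame-level matching alone would pick reference frame 0 for query frame 1.
similarity = [[0.9, 0.1, 0.1],
              [0.5, 0.4, 0.1],
              [0.1, 0.1, 0.8]]
obs = [[math.log(v) for v in row] for row in similarity]
# Temporal consistency: strongly prefer advancing by exactly one reference frame.
trans = [[math.log(0.8 if s == p + 1 else 0.1) for s in range(3)] for p in range(3)]
init = [math.log(1 / 3)] * 3

path = viterbi(obs, trans, init)
frame_level = [max(range(3), key=lambda s: row[s]) for row in similarity]
```

In this toy case the decoded path follows the reference frames in order, whereas independent per-frame matching does not, which is the motivation for fusing frame-level results under a temporal constraint.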


Author(s): Bao Bing-Kun, Yan Shuicheng

Graph-based learning provides a useful approach for modeling data in image annotation problems. In this chapter, the authors describe how to construct a region-based graph to annotate large-scale multi-label images. It is well recognized that analysis at the semantic region level can greatly improve image annotation performance compared with analysis at the whole-image level. However, the region-level approach increases the data scale by several orders of magnitude and poses new challenges for most existing algorithms. To address this, each image is first encoded as a Bag-of-Regions based on multiple image segmentations. All image regions are then assembled into a large k-nearest-neighbor graph using an efficient Locality Sensitive Hashing (LSH) method. Finally, a sparse and region-aware image-based graph is fed into the multi-label extension of the entropic graph regularized semi-supervised learning algorithm (Subramanya & Bilmes, 2009). In combination, these steps make the framework capable of handling large-scale datasets. Extensive experiments on the NUS-WIDE (260k images) and COREL-5k datasets validate the effectiveness and efficiency of the framework for region-aware and scalable multi-label propagation.
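The general pattern of building a k-nearest-neighbor graph and propagating labels over it can be sketched as follows. This is a simplified stand-in, not the chapter's method: it uses brute-force neighbor search instead of LSH, and plain neighbor averaging with clamped seeds instead of entropic graph regularization. The names `knn_graph` and `propagate` and the toy feature vectors are assumptions made for illustration.

```python
def knn_graph(points, k):
    """Brute-force symmetric kNN graph as {node: set of neighbor nodes}.
    (Assumption: stands in for the LSH-based construction.)"""
    n = len(points)
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            nbrs[i].add(j)
            nbrs[j].add(i)  # symmetrize the graph
    return nbrs

def propagate(nbrs, seeds, n_labels, iters=20):
    """Simple label propagation: seed nodes stay clamped to their label scores,
    every other node repeatedly takes the mean score of its neighbors."""
    scores = {i: list(seeds.get(i, [0.0] * n_labels)) for i in nbrs}
    for _ in range(iters):
        new = {}
        for i, ns in nbrs.items():
            if i in seeds:
                new[i] = list(seeds[i])
            else:
                new[i] = [sum(scores[j][c] for j in ns) / len(ns)
                          for c in range(n_labels)]
        scores = new
    return scores

# Toy region features (assumed): two well-separated clusters.
regions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = knn_graph(regions, k=2)
# One labeled region per cluster; two labels in total.
seeds = {0: [1.0, 0.0], 3: [0.0, 1.0]}
scores = propagate(graph, seeds, n_labels=2)
predicted = {i: max(range(2), key=lambda c: scores[i][c]) for i in graph}
```

Because scores are kept per label rather than as a single class index, a node can end up with high scores for several labels at once, which is what a multi-label setting requires.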


Author(s): Shang Liu, Xiao Bai

In this chapter, the authors present a new method to improve the performance of current bag-of-words-based image classification. After feature extraction, they introduce a pairwise image matching scheme to select discriminative features. Only the label information from the training sets is used to update the feature weights via an iterative matching process. The selected features correspond to the foreground content of the images and thus highlight high-level category knowledge. Visual words are then constructed from these selected features. The method can be used as a refinement step for existing image classification and retrieval pipelines. The authors demonstrate the efficiency of their method on three tasks: supervised image classification, semi-supervised image classification, and image retrieval.


Author(s): Jiangjian Xiao

Given a video sequence, obtaining accurate layer segmentation and alpha matting is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar or surface regions, this chapter describes a robust approach to automatically detect the region clusters and perform accurate layer segmentation of the scene. The approach starts from an optical flow field or small corresponding seed regions and applies clustering to estimate the number of layers and their support regions. It then uses a graph cut algorithm combined with a general occlusion constraint to solve the pixel assignment over multiple frames, which yields more accurate segmentation boundaries and identifies the occluded pixels. For ambiguous non-textured regions, an alpha matting technique further refines the segmentation and resolves the ambiguities by determining proper alpha values for the foreground and background. Based on the alpha mattes, the foreground object can be transferred into another video sequence to generate a virtual video. The author's experiments show that the proposed approach is effective and robust on challenging real and synthetic sequences.


Author(s): Mario Vento, Pasquale Foggia

Many computer vision applications require a comparison between two objects, or between an object and a reference model. When the objects or scenes are represented by graphs, this comparison can be performed using some form of graph matching. The aim of this chapter is to introduce the main graph matching techniques that have been used in computer vision, and to relate each application to the techniques best suited to it.


Author(s): Horst Bunke, Kaspar Riesen

The domain of graphs contains very little mathematical structure. That is, most of the basic mathematical operations required by many standard computer vision and pattern recognition algorithms are not available for graphs. One of the few mathematical concepts that has been successfully transferred from the vector space to the graph domain is distance computation between graphs, commonly referred to as graph matching. Yet distance-based pattern recognition is essentially limited to nearest-neighbor classification. This chapter reviews a novel approach to graph embedding in vector spaces built upon the concept of graph matching. The key idea of the proposed embedding method is to use the distances from an input graph to a number of training graphs, termed prototypes, as a vectorial description of the graph. Consequently, any graph matching procedure proposed in the literature over the last decades can be employed in this embedding framework. The rationale for such a graph embedding is to bridge the gap between the high representational power and flexibility of graphs and the large number of algorithms available for objects represented as feature vectors. Hence, the proposed framework can be considered a contribution towards unifying the domains of structural and statistical pattern recognition.
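The prototype-based embedding described above can be sketched in a few lines. The sketch assumes graphs with uniquely labeled nodes, stored as a pair of frozensets, and uses a deliberately crude placeholder dissimilarity (size of the symmetric difference of node and edge sets) where a real system would plug in graph edit distance or another graph matching procedure. The names `toy_graph_distance`, `embed`, and `make_graph` are hypothetical.

```python
def make_graph(nodes, edges):
    """Build a graph as (frozenset of nodes, frozenset of undirected edges)."""
    return frozenset(nodes), frozenset(frozenset(e) for e in edges)

def toy_graph_distance(g1, g2):
    """Placeholder dissimilarity (assumption): number of nodes and edges
    present in exactly one of the two graphs. Not a graph edit distance."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return len(nodes1 ^ nodes2) + len(edges1 ^ edges2)

def embed(graph, prototypes, distance=toy_graph_distance):
    """Map a graph to the vector of its distances to the prototype graphs."""
    return [distance(graph, p) for p in prototypes]

triangle = make_graph("abc", [("a", "b"), ("b", "c"), ("a", "c")])
path = make_graph("abc", [("a", "b"), ("b", "c")])
single_edge = make_graph("ab", [("a", "b")])

prototypes = [triangle, single_edge]
path_vec = embed(path, prototypes)          # one coordinate per prototype
triangle_vec = embed(triangle, prototypes)
```

Once every graph is a fixed-length vector, any vector-based classifier or clustering algorithm can be applied to it. The `distance` argument is left pluggable because, as the abstract notes, the framework does not depend on a particular graph matching procedure.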


Author(s): Wang Jinjun

Exemplar-based image super-resolution (SR) approaches decompose low-resolution (LR) images into multiple overlapping local image patches and find the best high-resolution (HR) counterpart for each LR patch to generate the HR output image. The super-resolving process models these HR/LR patches in a Markov network with two kinds of constraints: a confidence constraint between each LR patch and the HR patch selected from the database, and a harmonic constraint between neighboring HR patches. Such a graphical structure, however, makes the optimization process extremely slow, and extensive research on improving the efficiency of exemplar-based SR methods has therefore been reported. This chapter focuses on methods that aim to generate high-quality HR patches from the database while ignoring the harmonic constraint to speed up processing, such as those that model the problem as an embedding process or as a feature selection process. As shown in this chapter, these approaches can all be regarded as coding systems. The contributions of the chapter are twofold. First, it introduces a coding system with a resolution-invariance property, so that it can handle continuous-scale image resizing, whereas traditional methods support only a single integer upscaling factor. Second, the author generalizes the graphical model so that the typical non-linear coding process is approximated by an easier-to-compute function. In this way, the SR process can be highly parallelized on modern computer hardware. As demonstrated in the chapter, the proposed system gives very promising image SR results in various respects.


Author(s): Xiang Bai, Chunyuan Li, Xingwei Yang, Longin Jan Latecki

Skeleton-based representation is well known to be superior to contour-based representation when shapes have large nonlinear variability, especially articulation. However, approaches to shape similarity based on skeletons suffer from the instability of skeletons, and matching of skeleton graphs is still an open problem. To deal with this problem for shape retrieval, the authors first propose to match skeleton graphs by comparing the geodesic paths between skeleton endpoints. In contrast to typical tree or graph matching methods, they do not explicitly consider the topological graph structure. Their approach is motivated by the fact that visually similar skeleton graphs may have completely different topological structures, while the paths between their end nodes remain similar. The proposed comparison of geodesic paths between endpoints of skeleton graphs yields correct matching results in such cases. The experimental results demonstrate that the method produces correct results in the presence of articulations, stretching, and contour deformations. The authors also use the geodesic skeleton paths for shape classification. As in shape retrieval, direct graph matching algorithms such as graph edit distance have great difficulty with the instability of the skeleton graph structure, whereas the representation based on skeleton paths remains stable. As a result, a simple Bayesian classifier is able to obtain excellent shape classification results.


Author(s): Guoxing Zhao, Jixin Ma

Many graph matching algorithms follow the approach of node-similarity measurement, that is, matching graphs by comparing corresponding pairs of their nodes. Based on this idea, the authors propose a high-level schema for node-to-node graph matching, the N2N graph matching algorithm schema. The chapter shows that this schema is versatile enough to subsume most of the representative node-to-node graph matching algorithms. It also shows that algorithms derived from the schema can improve on the corresponding existing algorithms. In addition, the authors point out the limitations and constraints of the proposed schema and suggest some possible remedies.


Author(s): Guangyu Zhu, Shuicheng Yan, Tony X. Han, Changsheng Xu

Activity understanding plays an essential role in video content analysis and remains a challenging open problem. Most previous research is limited either by excessively localized features that do not sufficiently capture the interaction context, or by a focus on purely discriminative models that ignore interaction patterns altogether. In this chapter, a new approach is proposed to recognize human group activities. First, the authors design a new quaternion descriptor that describes the interactions within an activity in terms of appearance, dynamics, causality, and feedback. Together with conventional velocity and position features, this descriptor can delineate the individual and pairwise interactions in the activities. Second, considering both activity category and interaction variety, the authors propose an extended pLSA (probabilistic Latent Semantic Analysis) model with two hidden variables. This extended probabilistic graphical model, built on the quaternion descriptors, supports effective inference of activity categories as well as the exploration of activity interaction patterns. Extensive experiments on realistic movie and human group activity datasets validate that the multilevel features are effective for representing activity interactions and demonstrate that the graphical model is a promising paradigm for activity recognition.

