Constructing and Utilizing Video Ontology for Accurate and Fast Retrieval

Author(s):  
Kimiaki Shirahama ◽  
Kuniaki Uehara

This paper examines video retrieval based on the Query-By-Example (QBE) approach, where shots relevant to a query are retrieved from large-scale video data based on their similarity to example shots. This involves two crucial problems. The first is that similarity in features does not necessarily imply similarity in semantic content. The second is the high computational cost of computing the similarity of a huge number of shots to the example shots. The authors have developed a method that can filter out a large number of shots irrelevant to a query, based on a video ontology, that is, a knowledge base about the concepts displayed in a shot. The method utilizes various concept relationships (e.g., generalization/specialization, sibling, part-of, and co-occurrence) defined in the video ontology. In addition, although the video ontology assumes that shots are accurately annotated with concepts, accurate annotation is difficult due to the diversity of forms and appearances of the concepts. Dempster-Shafer theory is used to account for the uncertainty in determining the relevance of a shot based on its inaccurate annotation. Experimental results on TRECVID 2009 video data validate the effectiveness of the method.
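To illustrate how Dempster-Shafer theory can fuse uncertain evidence about a shot's relevance, the following is a minimal sketch of Dempster's rule of combination. It is not the authors' implementation: the two-element frame {relevant, irrelevant}, the function name `combine`, and the mass values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; track the conflicting mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so that the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


REL = frozenset({"relevant"})
IRR = frozenset({"irrelevant"})
EITHER = REL | IRR  # mass left on the whole frame expresses ignorance

# Hypothetical evidence from two noisy concept annotations of one shot.
m_a = {REL: 0.6, EITHER: 0.4}
m_b = {REL: 0.3, IRR: 0.5, EITHER: 0.2}

fused = combine(m_a, m_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

Mass assigned to the whole frame (`EITHER`) lets a source state that it does not know, which is what distinguishes this representation from a plain probability over the two outcomes.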


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chen Zhang ◽  
Bin Hu ◽  
Yucong Suo ◽  
Zhiqiang Zou ◽  
Yimu Ji

In this paper, we study the challenge of image-to-video retrieval, which uses the query image to search relevant frames from a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of the key-frame extraction algorithm and the feature aggregation strategy. Specifically, the key-frame extraction algorithm takes advantage of the clustering idea so that redundant information is removed in video data and storage cost is greatly reduced. The feature aggregation strategy adopts average pooling to encode deep local convolutional features followed by coarse-to-fine retrieval, which allows rapid retrieval in the large-scale video database. The results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
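The aggregation and search steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's framework: the abstract does not specify how the coarse and fine stages are defined, so here the coarse stage is assumed to rank videos by one pooled descriptor each and the fine stage to rank key frames inside the best videos. The names `average_pool`, `cosine`, and `coarse_to_fine`, and the tiny 2-D feature vectors, are hypothetical.

```python
import math


def average_pool(vectors):
    """Average a list of equal-length feature vectors into one vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coarse_to_fine(query, videos, top_videos=2):
    """Return (score, video_id, frame_index) tuples, best match first.

    videos maps a video id to the list of its key-frame descriptors.
    """
    # Coarse stage (assumed): compare the query with one descriptor per video.
    shortlist = sorted(
        videos,
        key=lambda vid: cosine(query, average_pool(videos[vid])),
        reverse=True,
    )[:top_videos]
    # Fine stage (assumed): compare the query with each key frame of the shortlist.
    hits = [
        (cosine(query, frame), vid, index)
        for vid in shortlist
        for index, frame in enumerate(videos[vid])
    ]
    return sorted(hits, reverse=True)


# Toy data: the query image has two local features that are pooled into one.
query = average_pool([[1.0, 0.0], [0.8, 0.2]])
videos = {
    "a": [[1.0, 0.0], [0.9, 0.1]],
    "b": [[0.0, 1.0], [0.1, 0.9]],
    "c": [[0.5, 0.5], [0.6, 0.4]],
}
hits = coarse_to_fine(query, videos, top_videos=2)
print(hits[0][1], hits[0][2])
```

The coarse stage never touches the key frames of videos outside the shortlist, which is where a two-stage search saves work on a large collection.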


2015 ◽  
Vol 3 (3) ◽  
pp. 1-13 ◽  
Author(s):  
Hiroki Nomiya ◽  
Atsushi Morikuni ◽  
Teruhisa Hochin

A lifelog video retrieval framework is proposed for the better utilization of a large amount of lifelog video data. The proposed method retrieves emotional scenes, such as scenes in which a person in the video is smiling, on the assumption that an important event is likely to occur in most emotional scenes. An emotional scene is detected on the basis of facial expression recognition using a wide variety of facial features. The authors adopt an unsupervised learning approach called ensemble clustering to recognize the facial expressions, because supervised learning approaches require sufficient training data, which makes them quite troublesome to apply to large-scale video databases. The retrieval performance of the proposed method is evaluated by means of an emotional scene detection experiment from the viewpoints of accuracy and efficiency. In addition, a prototype retrieval system is implemented based on the proposed emotional scene detection method.


Author(s):  
Lilac Al-Safadi ◽  
Janusz Getta

The advancement of multimedia technologies has enabled electronic processing of information to be recorded in formats that are different from the standard text format. These include image, audio and video formats. The video format is a rich and expressive form of media used in many areas of our everyday life, such as in education, medicine and engineering. The expressiveness of video documents is the main reason for their domination in future information systems. Therefore, effective and efficient access to video information that supports video-based applications has become a critical research area. This has led to the development of, for example, new digitizing and compression tools and technology, video data models and query languages, video data management systems and video analyzers. With applications of a vast amount of stored video data, such as news archives and digital television, video retrieval became, and still is, an active area of research.


1996 ◽  
Vol 6 (2) ◽  
pp. 167-188 ◽  
Author(s):  
Simon Ambler

Argumentation is a proof theoretic paradigm for reasoning under uncertainty. Whereas a ‘proof’ establishes its conclusion outright, an ‘argument’ can only lend a measure of support. Thus, the process of argumentation consists of identifying all the arguments for a particular hypothesis φ, and then calculating the support for φ from the weight attached to these individual arguments. Argumentation has been incorporated as the inference mechanism of a large scale medical expert system, the ‘Oxford System of Medicine’ (OSM), and it is therefore important to demonstrate that the approach is theoretically justified. This paper provides a formal semantics for the notion of argument embodied in the OSM. We present a categorical account in which arguments are the arrows of a semilattice enriched category. The axioms of a cartesian closed category are modified to give the notion of an ‘evidential closed category’, and we show that this provides the correct enriched setting in which to model the connectives of conjunction (&) and implication (⇒). Finally, we develop a theory of ‘confidence measures’ over such categories, and relate this to the Dempster-Shafer theory of evidence.


Author(s):  
Hoang-Viet Tran ◽  
Pham Ngoc Hung

Assume-guarantee reasoning, a well-known approach in component-based software (CBS) verification, is in fact a language containment problem whose computational cost depends on the sizes of languages of the software components under checking and the assumption to be generated. Therefore, the smaller language assumptions, the more computational cost we can reduce in software verification. Moreover, strong assumptions are more important in CBS verification in the context of software evolution because they can be reused many times in the verification process. For this reason, this paper presents a method for generating locally strongest assumptions with locally smallest languages during CBS verification. The key idea of this method is to create a variant technique for answering membership queries of the Teacher when responding to the Learner in the L–based assumption learning process. This variant technique is then integrated into an algorithm in order to generate locally strongest assumptions. These assumptions will effectively reduce the computational cost when verifying CBS, especially for large-scale and evolving ones. The correctness proof, experimental results, and some discussions about the proposed method are also presented.
Keywords: Assume-guarantee reasoning, Model checking, Component-based software verification, Locally strongest assumptions, Locally smallest language assumptions.


Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3374
Author(s):  
Ting-Yu Hsu ◽  
Xiang-Ju Kuo

Computer vision-based approaches are very useful for dynamic displacement measurement, damage detection, and structural health monitoring. However, for the application using a large number of existing cameras in buildings, the computational cost of videos from dozens of cameras using a centralized computer becomes a huge burden. Moreover, when a manual process is required for processing the videos, prompt safety assessment of tens of thousands of buildings after a catastrophic earthquake striking a megacity becomes very challenging. Therefore, a decentralized and fully automatic computer vision-based approach for prompt building safety assessment and decision-making is desired for practical applications. In this study, a prototype of a novel stand-alone smart camera system for measuring interstory drifts was developed. The proposed system is composed of a single camera, a single-board computer, and two accelerometers with a microcontroller unit. The system is capable of compensating for rotational effects of the camera during earthquake excitations. Furthermore, by fusing the camera-based interstory drifts with the accelerometer-based ones, the interstory drifts can be measured accurately even when residual interstory drifts exist. Algorithms used to compensate for the camera’s rotational effects, algorithms used to track the movement of three targets within three regions of interest, artificial neural networks used to convert the interstory drifts to engineering units, and some necessary signal processing algorithms, including interpolation, cross-correlation, and filtering algorithms, were embedded in the smart camera system. As a result, online processing of the video data and acceleration data using decentralized computational resources is achieved in each individual smart camera system to obtain interstory drifts. Using the maximum interstory drifts measured during an earthquake, the safety of a building can be assessed right after the earthquake excitation. 
We validated the feasibility of the prototype of the proposed smart camera system through the use of large-scale shaking table tests of a steel building. The results show that the proposed smart camera system had very promising results in terms of assessing the safety of steel building specimens after earthquake excitations.


Author(s):  
Jibiao Zhou ◽  
Xinhua Mao ◽  
Yiting Wang ◽  
Minjie Zhang ◽  
Sheng Dong

Urban Large-scale Public Spaces (ULPS) are important areas of urban culture and economic development, but they are also places with potential safety hazards. ULPS safety assessment plays a crucial role in the theory and practice of urban sustainable development. The primary objective of this study is to explore the interaction between ULPS safety risk and its influencing factors. In the first stage, an index sensitivity analysis method was applied to calculate and identify the safety risk assessment index system. Next, a Delphi method and an information entropy method were applied to collect and calculate the weights of the risk assessment indicators. In the second stage, a Dempster-Shafer Theory (DST) method with an evidence fusion technique was utilized to analyze the interaction between the ULPS safety risk level and the multiple-index variables, measured by four observed performance indicators, i.e., environmental factor, human factor, equipment factor, and management factor. Finally, an empirical study of the DST approach for ULPS safety performance analysis was presented.
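As a rough illustration of how an information entropy method can turn indicator scores into weights, here is a sketch of the standard entropy weight method. Whether the study uses exactly this formulation is an assumption; the function name `entropy_weights` and the score table are hypothetical and are not data from the paper.

```python
import math


def entropy_weights(scores):
    """Compute entropy-based weights for the columns of a score table.

    scores: list of rows (one per assessed site), each a list of positive
    indicator values. At least two rows are required.
    """
    n = len(scores)
    divergences = []
    for column in zip(*scores):
        total = sum(column)
        shares = [value / total for value in column]
        # Normalised Shannon entropy of the column, in [0, 1].
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Indicators that vary more across sites carry more information.
        divergences.append(max(0.0, 1.0 - entropy))
    divergence_sum = sum(divergences)
    if divergence_sum == 0:
        # All indicators are uniform, so fall back to equal weights.
        return [1.0 / len(divergences)] * len(divergences)
    return [d / divergence_sum for d in divergences]


# Hypothetical scores for four sites on four indicators, in the order
# environmental, human, equipment, management.
scores = [
    [5.0, 2.0, 7.0, 4.0],
    [5.0, 8.0, 6.0, 1.0],
    [5.0, 3.0, 7.0, 9.0],
    [5.0, 9.0, 5.0, 2.0],
]
weights = entropy_weights(scores)
print([round(w, 3) for w in weights])
```

An indicator with identical scores at every site (the first column here) has maximal entropy and therefore receives a weight of essentially zero, since it cannot distinguish one site from another.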

