RECENT ADVANCES IN VIDEO CONTENT ANALYSIS: FROM VISUAL FEATURES TO SEMANTIC VIDEO SEGMENTS

2001 ◽  
Vol 01 (01) ◽  
pp. 63-81 ◽  
Author(s):  
ALAN HANJALIC ◽  
REGINALD L. LAGENDIJK ◽  
JAN BIEMOND

This paper addresses the problem of automatically partitioning a video into semantic segments using only low-level visual features. Semantic segments may be understood as the content building blocks of a video with a clear sequential content structure. Examples are reports in a news program, episodes in a movie, scenes of a situation comedy or topic segments of a documentary. In some video genres, like news programs or documentaries, using different media (visual, audio, speech, text) may be beneficial or even unavoidable for reliably detecting the boundaries between semantic segments. In many other genres, however, the pay-off of using different media for high-level segmentation is not high. On the one hand, relating the audio, speech or text to the semantic temporal structure of video content is generally very difficult. This is especially so in "acting" video genres like movies and situation comedies. On the other hand, the information contained in the visual stream of these video genres often seems to provide the major clue about the position of semantic segment boundaries. Partitioning a video into semantic segments can be performed by measuring the coherence of the content along neighboring video shots of a sequence. The segment boundaries are then found at places (e.g., shot boundaries) where the values of content coherence are sufficiently low. On the basis of two state-of-the-art techniques for content coherence modeling, we illustrate in this paper the current possibilities for detecting the boundaries of semantic segments using only low-level visual features.
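
The idea of placing segment boundaries where content coherence is low can be illustrated with a minimal sketch. It is not one of the two techniques the paper reviews. It assumes that each shot is summarized by a normalized color histogram, that coherence at a shot boundary is the best histogram-intersection match between a few shots on either side, and that a fixed threshold decides what counts as "sufficiently low". The function names, the window size and the threshold are all hypothetical.

```python
from typing import List, Sequence


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def coherence_values(shots: List[Sequence[float]], window: int = 2) -> List[float]:
    """Coherence at each boundary between shot i and shot i + 1.

    Assumption: coherence is the best match between any of the `window`
    shots before the boundary and any of the `window` shots after it.
    """
    values = []
    for i in range(len(shots) - 1):
        before = shots[max(0, i - window + 1): i + 1]
        after = shots[i + 1: i + 1 + window]
        values.append(
            max(histogram_intersection(p, q) for p in before for q in after)
        )
    return values


def segment_boundaries(
    shots: List[Sequence[float]], window: int = 2, threshold: float = 0.5
) -> List[int]:
    """Indices of shots that start a new semantic segment."""
    coherence = coherence_values(shots, window)
    return [i + 1 for i, c in enumerate(coherence) if c < threshold]


# Six shots with three-bin histograms: the first three are mostly "red",
# the last three mostly "blue".
shots = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.1, 0.9],
    [0.0, 0.0, 1.0],
]
print(segment_boundaries(shots))
```

Looking at several shots on each side, rather than only the two adjacent ones, keeps a single deviating shot inside a segment from triggering a boundary, as long as a nearby shot still matches.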

2015 ◽  
Vol 2015 ◽  
pp. 1-10
Author(s):  
Shao-nian Huang ◽  
Dong-jun Huang ◽  
Mansoor Ahmed Khuhro

Video event detection is a challenging problem in many applications, such as video surveillance and video content analysis. In this paper, we propose a new framework that derives high-level codewords by analyzing the temporal relationships between different channels of video features. Low-level vocabulary words are first generated from the extracted audio and visual features. A weighted undirected graph is then constructed by exploring the Granger causality between low-level words. Next, a greedy agglomerative graph-partitioning method is used to discover groups of low-level words that have similar temporal patterns. The high-level codebook representation is obtained by quantizing the low-level word groups. Finally, multiple kernel learning, combined with our high-level codewords, is used to detect the video event. Extensive experimental results show that the proposed method achieves favorable results in video event detection.
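
The grouping step can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm. The symmetric matrix `weights` stands in for Granger-causality scores between low-level words and is simply given here, not estimated. The merge rule (average link weight) and the stopping threshold `min_link` are hypothetical choices.

```python
from itertools import combinations
from typing import List


def average_link(weights: List[List[float]], group_a: List[int], group_b: List[int]) -> float:
    """Mean edge weight between two groups of word indices."""
    total = sum(weights[i][j] for i in group_a for j in group_b)
    return total / (len(group_a) * len(group_b))


def greedy_group(weights: List[List[float]], min_link: float = 0.5) -> List[List[int]]:
    """Greedily merge the two most strongly linked groups until no pair
    of groups has an average link weight of at least `min_link`."""
    groups = [[i] for i in range(len(weights))]
    while len(groups) > 1:
        # Find the pair of groups with the highest average link weight.
        (a, b), best = max(
            (
                ((a, b), average_link(weights, groups[a], groups[b]))
                for a, b in combinations(range(len(groups)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < min_link:
            break
        groups[a] = sorted(groups[a] + groups[b])
        del groups[b]  # safe because a < b
    return sorted(groups)


# Hypothetical pairwise scores for four low-level words.
weights = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
print(greedy_group(weights))
```

Each resulting group would then play the role of one high-level codeword, with the low-level words in it sharing a similar temporal pattern.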


2021 ◽  
Author(s):  
Shi Pui Donald Li ◽  
Michael F. Bonner

The scene-preferring portion of the human ventral visual stream, known as the parahippocampal place area (PPA), responds to scenes and landmark objects, which tend to be large in real-world size, fixed in location, and inanimate. However, the PPA also exhibits preferences for low-level contour statistics, including rectilinearity and cardinal orientations, that are not directly predicted by theories of scene- and landmark-selectivity. It is unknown whether these divergent findings of both low- and high-level selectivity in the PPA can be explained by a unified computational theory. To address this issue, we fit hierarchical computational models of mid-level tuning to the image-evoked fMRI responses of the PPA, and we performed a series of high-throughput experiments on these models. Our findings show that hierarchical encoding models of the PPA exhibit emergent selectivity across multiple levels of complexity, giving rise to high-level preferences along dimensions of real-world size, fixedness, and naturalness/animacy as well as low-level preferences for rectilinear shapes and cardinal orientations. These results reconcile disparate theories of PPA function in a unified model of mid-level visual representation, and they demonstrate how multifaceted selectivity profiles naturally emerge from the hierarchical computations of visual cortex and the natural statistics of images.


2017 ◽  
Vol 1 (1) ◽  
Author(s):  
Roseilla Nora Izaach

This study aimed to describe the level of grit among students of Nursing Academy X in the Aru Islands. Grit is one of the more recent constructs in Positive Psychology. It emphasizes two important aspects, perseverance of effort and consistency of interest, that determine the success of individuals in achieving their life goals. The goal of achieving future success through education is the reason this research was conducted. Respondents in this study were students in 2014. There were 51 respondents, all of them female. The measuring instrument was a 12-item grit scale with a reliability of 0.85 and validity coefficients ranging from 0.44 to 0.82 (Duckworth et al., 2007). Descriptive analysis showed that the majority of respondents (86.3%) had a low level of grit. On the perseverance of effort aspect, the majority of respondents (90.2%) had a low level, and on the consistency of interest aspect, the majority of respondents (66.7%) had a high level. The socioeconomic status of the students, based on the parents' type of work, did not show a tendency to be related to the degree of grit. Further research could investigate more deeply how personality factors, differences in cultural background and demographics affect grit. Keywords: Grit, socioeconomic status, demographics


Author(s):  
Min Chen

The fast proliferation of video data archives has increased the need for automatic video content analysis and semantic video retrieval. Since temporal information is critical in conveying video content, in this chapter, an effective temporal-based event detection framework is proposed to support high-level video indexing and retrieval. The core is a temporal association mining process that systematically captures characteristic temporal patterns to help identify and define interesting events. This framework effectively tackles the challenges caused by loose video structure and class imbalance issues. One of the unique characteristics of this framework is that it offers strong generality and extensibility with the capability of exploring representative event patterns with little human interference. The temporal information and event detection results can then be input into our proposed distributed video retrieval system to support the high-level semantic querying, selective video browsing and event-based video retrieval.


Author(s):  
Nathan Saraiva ◽  
Nazrul Islam ◽  
Danny Alex Lachos Perez ◽  
Christian Esteve Rothenberg

Video traffic over the Internet keeps growing year after year. Video streaming over best-effort networks is considered inefficient and inappropriate to meet the expected Quality of Experience (QoE) of the new generation of multimedia services. Over the past few years, a number of technologies have emerged to improve the state of the art of video delivery, including HTTP Adaptive Streaming (HAS), which adapts the bitrate according to network conditions. At the same time, Software Defined Networking (SDN) offers options to meet Quality of Service (QoS) objectives for improved video quality by exploiting end-to-end programmability of network behaviour. However, traditional SDN approaches require dealing with low-level details of the underlying infrastructure, which hinders the automation and agility of service deployments. To alleviate these issues and provide a simpler approach overall, Intent-Based Networking (IBN) is being proposed to abstract low-level configurations through high-level policy interfaces. In this paper, we explore such an approach by implementing intent-based control loops for video service assurance. The proposed methods dynamically reconfigure the network for service-specific requirements, using IBN to define the high-level behavior. We experimentally evaluate a use case where video traffic is rerouted based on network conditions to improve the QoS. The Proof-of-Concept results point to the potential of improving video content delivery through QoS-aware intent-based approaches.
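
One iteration of an intent-based control loop for the rerouting use case might look like the sketch below. It is only an illustration: the `Intent` and `PathState` types, their fields and the selection rule (keep the current path while it satisfies the intent, otherwise move to the lowest-latency path that does) are hypothetical and do not come from the paper or from any SDN controller API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intent:
    """High-level requirement stated by the operator (hypothetical fields)."""
    service: str
    min_bandwidth_mbps: float
    max_latency_ms: float


@dataclass(frozen=True)
class PathState:
    """Measured condition of one candidate network path (hypothetical fields)."""
    name: str
    bandwidth_mbps: float
    latency_ms: float


def satisfies(path: PathState, intent: Intent) -> bool:
    """True if the measured path state meets the intent."""
    return (
        path.bandwidth_mbps >= intent.min_bandwidth_mbps
        and path.latency_ms <= intent.max_latency_ms
    )


def control_step(intent: Intent, current: str, paths: List[PathState]) -> str:
    """Return the name of the path the service should use next."""
    state = {p.name: p for p in paths}
    # Keep the current path while it still meets the intent.
    if current in state and satisfies(state[current], intent):
        return current
    candidates = [p for p in paths if satisfies(p, intent)]
    if not candidates:
        return current  # nothing better is available; stay put
    return min(candidates, key=lambda p: p.latency_ms).name


intent = Intent(service="video", min_bandwidth_mbps=5.0, max_latency_ms=50.0)
paths = [
    PathState("a", bandwidth_mbps=3.0, latency_ms=20.0),
    PathState("b", bandwidth_mbps=8.0, latency_ms=40.0),
    PathState("c", bandwidth_mbps=10.0, latency_ms=30.0),
]
print(control_step(intent, current="a", paths=paths))
```

The operator states only the requirement. Which path is programmed into the network is decided by the loop from measured conditions, which is the abstraction that IBN is meant to provide.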


2016 ◽  
Vol 6 (3) ◽  
pp. 137-154 ◽  
Author(s):  
Hui Wei

We have two motivations. First, the semantic gap is a tough problem that puzzles almost all sub-fields of Artificial Intelligence. We think the semantic gap is the conflict between the abstractness of high-level symbolic definitions and the detail and diversity of low-level stimuli. Second, in object recognition, a pre-defined prototype of an object is crucial and indispensable for bi-directional perceptual processing. On the one hand, this prototype is learned from perceptual experience; on the other hand, it should be able to guide future downward processing. Humans can do this very well, so the physiological mechanism is simulated here. We use a mechanism of classical and non-classical receptive fields (nCRF) to design a hierarchical model and form a multi-layer prototype of an object. This is also a realistic definition of a concept and a representation for denoting semantics. We regard this model as the most fundamental infrastructure that can ground semantics. Here an AND-OR tree is constructed to record the prototypes of a concept, in which either raw data at the low level or symbols at the high level are feasible, and explicit production rules are also available. For the sake of pixel processing, knowledge should be represented in a data form; for the sake of scene reasoning, knowledge should be represented in a symbolic form. The physiological mechanism happens to be the bridge that can join them together seamlessly. This offers a possible solution to the semantic gap problem and prevents discontinuity in low-order structures.
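
A generic AND-OR tree of the kind mentioned above can be sketched in a few lines. This is an assumption-laden illustration, not the paper's model: the `Node` type, the example "cup" prototype and the set of observed part labels are all hypothetical. AND nodes require every child to match, OR nodes require at least one, and leaves are checked against the observed low-level evidence.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    label: str
    kind: str = "leaf"  # one of "and", "or", "leaf"
    children: List["Node"] = field(default_factory=list)


def matches(node: Node, observed: Set[str]) -> bool:
    """Check whether the observed part labels satisfy the prototype."""
    if node.kind == "leaf":
        return node.label in observed
    results = (matches(child, observed) for child in node.children)
    return all(results) if node.kind == "and" else any(results)


# Hypothetical prototype: a cup is a body AND (a loop handle OR a bar handle).
cup = Node("cup", "and", [
    Node("body"),
    Node("handle", "or", [Node("loop_handle"), Node("bar_handle")]),
])

print(matches(cup, {"body", "bar_handle"}))
print(matches(cup, {"body"}))
```

Read top-down, the same tree acts as a set of production rules (cup → body, handle; handle → loop_handle | bar_handle), which is how one structure can serve both symbolic reasoning and data-level matching.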


2021 ◽  
Vol 7 (2) ◽  
pp. 88-104
Author(s):  
Iia Gordiienko-Mytrofanova ◽  
Serhii Sauta

The purpose of the article is to describe in specific terms and enrich the psychological structure of fugitive as a component of playfulness / ludic competence on the basis of theoretical, methodological and empirical research. The study results allowed us to draw the following conclusions: 1) based on an analysis of the use of the word “fugue” in scientific discourse in different fields, we assume that “fugue” was used by the Japanese colleagues for one of the playfulness scales as a paronym of “fugitive”; 2) by generalizing dictionary definitions, we determined the need to replace the term “fugue” as a component of playfulness with “fugitive”; 3) the levels of playfulness distinguished and described in the examined literature, video content and cases allowed us to rethink the content of fugitive and to articulate such a component as the ability to “acquire” a new identity through simulation of feigned states; 4) an “acquired” new identity determines the genre specification of the “Holy Fool” ludic position: on the one hand, the variability of its cognitive, affective and behavioural manifestations (in general) and its verbal and non-verbal characteristics (in particular), and on the other hand, the stereotyped behaviour imitating the “symptoms” of feigned states; 5) the criteria for the development of fugitive can be: a high level of playfulness, tolerance for uncertainty, openness to new experience, resistance to shame, creativity, the ability for self-observation, and an aggressive style of humour. We define fugitive, a component of playfulness, as an ability to “acquire” a new identity through simulation of feigned states, for example, another intellectual level - genius / stupidity / insanity; another stage of moral development; altered states of consciousness - alcoholic (or narcotic) intoxication / trance / ecstasy; a state with a reduced / absent response to the world around us - sleep / fainting / death.
At the same time, feigned behaviour reflected by the player him/herself and observed by the Other is aimed at enhancing the sense of identity.


Author(s):  
Max Hoffmann ◽  
Christof Paar

Hardware obfuscation is widely used in practice to counteract reverse engineering. In recent years, low-level obfuscation via camouflaged gates has been increasingly discussed in the scientific community and industry. In contrast to classical high-level obfuscation, such gates result in recovery of an erroneous netlist. This technology has so far been regarded as a purely defensive tool. We show that low-level obfuscation is in fact a double-edged sword that can also enable stealthy malicious functionalities. In this work, we present Doppelganger, the first generic design-level obfuscation technique that is based on low-level camouflaging. Doppelganger obstructs central control modules of digital designs, e.g., Finite State Machines (FSMs) or bus controllers, resulting in two different design functionalities: an apparent one that is recovered during reverse engineering and the actual one that is executed during operation. Notably, both functionalities are under the designer’s control. In two case studies, we apply Doppelganger to a universal cryptographic coprocessor. First, we show the defensive capabilities by presenting the reverse engineer with a different mode of operation than the one that is actually executed. Then, for the first time, we demonstrate the considerable threat potential of low-level obfuscation. We show how an invisible, remotely exploitable key-leakage Trojan can be injected into the same cryptographic coprocessor just through obfuscation. In both applications of Doppelganger, the resulting design size is indistinguishable from that of an unobfuscated design, depending on the choice of encodings.

