News Video Indexing and Abstraction by Specific Visual Cues

Author(s):  
Fan Jiang ◽  
Yu-Jin Zhang

This chapter addresses the tasks of providing semantic structure for, and generating abstractions of, broadcast news content. Based on the extraction of two specific visual cues, the Main Speaker Close-Up (MSC) and the news caption, a hierarchical news video index is automatically constructed for efficient access to multi-level content. In addition, a unique MSC-based video abstraction is proposed to support news preview and key-person highlighting. Experiments on news clips from the MPEG-7 video content sets yield encouraging results that demonstrate the effectiveness of the proposed video indexing and abstraction scheme.
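The chapter's cue-based hierarchy can be pictured as grouping shots into stories whenever a structuring cue (an MSC shot, possibly carrying a caption) appears. The sketch below is a minimal illustration under our own assumptions, not the authors' implementation; the `Shot`/`Story` names and the rule "open a new story at each MSC shot" are simplifications introduced here.

```python
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class Shot:
    start: float                    # seconds
    end: float
    is_msc: bool = False            # Main Speaker Close-Up shot?
    caption: Optional[str] = None   # OCR'd news caption, if any

@dataclass
class Story:
    title: str
    shots: List[Shot] = field(default_factory=list)

def build_index(shots):
    """Group a shot sequence into stories, opening a new story at each
    MSC shot -- a simplified stand-in for cue-based news segmentation."""
    stories, current = [], None
    for s in shots:
        if s.is_msc or current is None:
            title = s.caption or f"story@{s.start:.0f}s"
            current = Story(title)
            stories.append(current)
        current.shots.append(s)
    return stories
```

With this rule, a clip whose shots are [MSC("Election"), report, MSC("Weather"), report] indexes into two stories of two shots each, giving a two-level structure (story list, then shots per story).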

2021 ◽  
Author(s):  
Judith M. Varkevisser ◽  
Ralph Simon ◽  
Ezequiel Mendoza ◽  
Martin How ◽  
Idse van Hijlkema ◽  
...  

Abstract
Bird song and human speech are learned early in life, and in both cases engagement with live social tutors generally leads to better learning outcomes than passive audio-only exposure. Real-world tutor–tutee relations are normally not uni- but multimodal, and observations suggest that visual cues related to sound production might enhance vocal learning. We tested this hypothesis by pairing appropriate, colour-realistic, high frame-rate videos of a singing adult male zebra finch tutor with song playbacks and presenting these stimuli to juvenile zebra finches (Taeniopygia guttata). Juveniles exposed to song playbacks combined with video presentation of a singing bird approached the stimulus more often and spent more time close to it than juveniles exposed to audio playback only or to audio playback combined with pixelated and time-reversed videos. However, higher engagement with the realistic audio–visual stimuli was not predictive of better song learning. Thus, although multimodality increased stimulus engagement and biologically relevant video content was more salient than colour- and movement-equivalent videos, the higher engagement with the realistic audio–visual stimuli did not lead to enhanced vocal learning. Whether the lack of three-dimensionality of a video tutor and/or the lack of meaningful social interaction makes it less suitable for facilitating song learning than audio–visual exposure to a live tutor remains to be tested.


2016 ◽  
Vol 40 (3) ◽  
pp. 885-895 ◽  
Author(s):  
Xuanpeng Li ◽  
Emmanuel Seignez

Driver inattention, whether drowsiness or distraction, is a major contributor to serious traffic crashes. Most research on this topic studies driver drowsiness and distraction separately, and is often conducted in well-controlled, simulated environments. Given the reliability and flexibility required of real-time driver monitoring systems, driver inattention can instead be evaluated by fusing multiple selected cues in real-life scenarios. This paper presents a real-time, visual-cue-based driver monitoring system that tracks multi-level driver drowsiness and distraction simultaneously. A set of visual cues is derived from analysis of the driver's physical behaviour and driving performance. Driver drowsiness is evaluated on a multi-level scale by applying evidence theory, and a general framework of extensive hierarchical combinations is used to generate a probabilistic evaluation of driving risk in real time. This multimodal-fusion inattention monitoring system was shown to improve the accuracy of risk evaluation and to reduce the false-alarm rate, and the results support its adoption.
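The evidence-theory fusion mentioned above is typically Dempster–Shafer combination: each cue contributes a mass function over sets of hypotheses (drowsiness levels), and masses are multiplied, intersected, and renormalised by the conflict. The sketch below is a minimal, generic Dempster's-rule implementation under our own assumptions; the cue names, drowsiness levels, and mass values are illustrative, not the paper's.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset hypotheses
    to mass) with Dempster's rule: multiply masses of every pair of
    focal sets, keep non-empty intersections, renormalise by 1 - conflict."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical drowsiness levels and two visual-cue sources
ALERT, DROWSY, ASLEEP = "alert", "drowsy", "asleep"
# Eyelid-closure cue: mostly suggests drowsiness
m_eyes = {frozenset([DROWSY]): 0.6,
          frozenset([DROWSY, ASLEEP]): 0.3,
          frozenset([ALERT, DROWSY, ASLEEP]): 0.1}
# Head-pose cue: weakly suggests alertness, mostly uncommitted
m_head = {frozenset([ALERT]): 0.4,
          frozenset([ALERT, DROWSY, ASLEEP]): 0.6}

fused = dempster_combine(m_eyes, m_head)
```

Fusing the two cues concentrates most of the combined belief on the "drowsy" hypothesis, illustrating how conflicting evidence sources are reconciled into a single graded assessment.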


Author(s):  
C.G.M. Snoek ◽  
M. Worring ◽  
J. Geusebroek ◽  
D.C. Koelma ◽  
F.J. Seinstra ◽  
...  
Keyword(s):  

2000 ◽  
Vol 43 (2) ◽  
pp. 64-70 ◽  
Author(s):  
Jean-Luc Gauvain ◽  
Lori Lamel ◽  
Gilles Adda
