Classifying web videos using a global video descriptor

2012 ◽  
Vol 24 (7) ◽  
pp. 1473-1485 ◽  
Author(s):  
Berkan Solmaz ◽  
Shayan Modiri Assari ◽  
Mubarak Shah

2014 ◽  
Vol 5 (3) ◽  
pp. 1-22 ◽  
Author(s):  
Yicheng Song ◽  
Yongdong Zhang ◽  
Juan Cao ◽  
Jinhui Tang ◽  
Xingyu Gao ◽  
...  

2018 ◽  
Vol 28 (10) ◽  
pp. 3019-3029 ◽  
Author(s):  
Nicolas Chesneau ◽  
Karteek Alahari ◽  
Cordelia Schmid

2014 ◽  
Vol 29 (5) ◽  
pp. 785-798 ◽  
Author(s):  
Zhi-Neng Chen ◽  
Chong-Wah Ngo ◽  
Wei Zhang ◽  
Juan Cao ◽  
Yu-Gang Jiang

Author(s):  
Felix Weninger ◽  
Claudia Wagner ◽  
Martin Wöllmer ◽  
Björn Schuller ◽  
Louis-Philippe Morency

Author(s):  
Jens Eder

Affective image operations are attempts to influence behaviour and stimulate action by evoking affects through images. The chapter explores their forms and uses in political conflict, from video activism to war propaganda. Drawing together interdisciplinary research, it develops a theoretical framework for analysing the affective and political force of still and moving images, arguing that the affective structure of images has four layers: political affects and emotions are triggered by the specific interplay of visual forms, worlds, messages, and reflections. On the basis of this framework, several frequent types of affective image operations can be distinguished and illustrated through brief case studies of political web videos.


Symmetry ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 52 ◽  
Author(s):  
Xianzhang Pan ◽  
Wenping Guo ◽  
Xiaoying Guo ◽  
Wenshu Li ◽  
Junjie Xu ◽  
...  

The proposed method has 30 streams: 15 spatial streams and 15 temporal streams, with each spatial stream paired with a temporal stream; this pairing is what relates the work to the symmetry concept. Classifying facial expressions in video is difficult owing to the gap between visual descriptors and emotions. To bridge this gap, a new video descriptor for facial expression recognition is presented that aggregates spatial and temporal convolutional features across the entire extent of a video. The designed framework integrates a state-of-the-art 30-stream network with a trainable spatial–temporal feature aggregation layer and is end-to-end trainable for video-based facial expression recognition. It can therefore avoid overfitting to the limited emotional video datasets, and the trainable aggregation strategy learns a better representation of an entire video. Different schemas for pooling spatial–temporal features are investigated, and the spatial and temporal streams are best aggregated with the proposed method. Extensive experiments on two public databases, BAUM-1s and eNTERFACE05, show that the framework performs well and outperforms state-of-the-art strategies.
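The central idea, a single trainable layer that pools features from all spatial and temporal streams so the whole pipeline can be trained end to end, can be sketched roughly as follows. This is a minimal illustration assuming PyTorch and attention-style pooling; the 30-stream count follows the abstract, but the feature dimension, the number of emotion classes, and the pooling scheme itself are assumptions for illustration, not the paper's exact layer.

```python
# Hypothetical sketch of a trainable spatial-temporal aggregation layer.
# The 30 streams (15 spatial + 15 temporal) follow the abstract; the
# attention-style pooling, feature size, and class count are assumptions.
import torch
import torch.nn as nn


class StreamAggregator(nn.Module):
    """Pools per-stream features with learned attention weights, then classifies."""

    def __init__(self, feat_dim=512, num_classes=6):
        super().__init__()
        # One scalar attention score per stream, computed from its feature vector.
        self.attention = nn.Linear(feat_dim, 1)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, stream_feats):
        # stream_feats: (batch, num_streams, feat_dim)
        scores = self.attention(stream_feats)           # (batch, num_streams, 1)
        weights = torch.softmax(scores, dim=1)          # normalized over streams
        pooled = (weights * stream_feats).sum(dim=1)    # (batch, feat_dim)
        return self.classifier(pooled)                  # (batch, num_classes)


# Example: 8 videos, 30 streams (15 spatial + 15 temporal), 512-d features each.
feats = torch.randn(8, 30, 512)
logits = StreamAggregator()(feats)
print(logits.shape)  # torch.Size([8, 6])
```

Because the pooling weights are learned jointly with the classifier, such a layer can emphasise the most informative streams for a given video, which is the kind of behaviour the abstract attributes to its trainable aggregation.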


2010 ◽  
Vol 48 (6) ◽  
pp. 430-431
Author(s):  
Robert Ehrlich
