Action Key Frames Extraction Using L1-Norm and Accumulative Optical Flow for Compact Video Shot Summarisation

Author(s):  
Manar Abduljabbar Ahmad Mizher ◽  
Mei Choo Ang ◽  
Siti Norul Huda Sheikh Abdullah ◽  
Kok Weng Ng
Keyword(s):  
L1 Norm ◽  

Key frame extraction is a critical technique in computer vision fields such as video search, video identification, and video forgery detection. The extracted key frames should preserve the main actions in a video while providing a compact representation. The objective of this work is to improve our previous action key frame extraction algorithm (AKF) by adapting the threshold that selects action key frames as the final key frames. The threshold adaptation was achieved by using the mean value, the standard deviation, and the L1-norm instead of the comparison-of-user-summaries evaluation method, yielding a fully automatic video summarisation algorithm, and by eliminating the conditions in selecting the final key frames to reduce the complexity of the algorithm. We validated the proposed Improved AKF on complex colour video shots instead of simple grey-level video shots. The Improved AKF algorithm was able to extract a compact number of action key frames by preventing redundant key frames, reduce processing complexity, and preserve sufficient information about the main actions in a video shot. We then compared the Improved AKF algorithm with state-of-the-art algorithms in terms of compression ratio using the Paul videos and the Shih-Tang dataset. The evaluation results showed that the Improved AKF algorithm achieved a better compression ratio and retained sufficient information in the extracted action key frames under different testing video shots. Therefore, the Improved AKF algorithm is a suitable technique for applications in computer vision fields such as passive object-based video authentication systems.
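The mean-plus-standard-deviation threshold over L1-norm scores described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: in place of accumulative optical flow, each frame is scored here by the L1-norm of its difference from the previous frame, and frames whose score exceeds mean + std are kept.

```python
import numpy as np

def select_action_key_frames(frames):
    """Select action key frames from a video shot.

    `frames` is a sequence of greyscale frames (2-D numpy arrays).
    Illustrative stand-in for the paper's accumulative optical flow:
    score each frame by the L1-norm of its difference from the
    previous frame, then keep frames whose score exceeds the
    adaptive threshold mean + standard deviation.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # L1-norm of successive frame differences (one score per transition)
    scores = np.abs(np.diff(frames, axis=0)).sum(axis=(1, 2))
    threshold = scores.mean() + scores.std()
    # indices of frames that follow a large change
    return [i + 1 for i, s in enumerate(scores) if s > threshold]
```

Because the threshold is derived from the shot's own score statistics, no user-supplied summary is needed, which is the sense in which the method becomes fully automatic.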


2011 ◽  
Vol 10 (03) ◽  
pp. 247-259 ◽  
Author(s):  
Dianting Liu ◽  
Mei-Ling Shyu ◽  
Chao Chen ◽  
Shu-Ching Chen

With the popularity of home video recorders and the rise of Web 2.0, the growing volume of video has made the management and integration of video information an urgent and important issue in video retrieval. Key frames, as a high-quality summary of videos, play an important role in video browsing, searching, categorisation, and indexing. An effective set of key frames should include the major objects and events of the video sequence and contain minimal content redundancy. In this paper, an innovative key frame extraction method is proposed to select representative key frames for a video. By analysing the differences between frames and utilising a clustering technique, a set of key frame candidates (KFCs) is first selected at the shot level, and then the information within a video shot and between video shots is used to filter the candidate set and generate the final set of key frames. Experimental results on the TRECVID 2007 video dataset demonstrate the effectiveness of the proposed key frame extraction method in terms of the percentage of extracted key frames and the retrieval precision.
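The shot-level candidate selection step can be sketched with a small clustering example. This is an assumption-laden sketch, not the paper's algorithm: a minimal k-means groups per-frame feature vectors (e.g. colour histograms), and the frame nearest each cluster centre is taken as a key frame candidate.

```python
import numpy as np

def key_frame_candidates(features, k, iters=20):
    """Pick one key frame candidate (KFC) per cluster of frames.

    `features` is an (n_frames, d) array of per-frame feature
    vectors. A minimal k-means stands in for the clustering step;
    returns the index of the frame nearest each cluster centre,
    in temporal order.
    """
    X = np.asarray(features, dtype=np.float64)
    # deterministic init: centres spread evenly across the shot
    centres = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(iters):
        # assign each frame to its nearest centre
        d = np.linalg.norm(X[:, None] - centres[None], axis=2)
        labels = d.argmin(axis=1)
        # move each centre to the mean of its assigned frames
        centres = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    d = np.linalg.norm(X[:, None] - centres[None], axis=2)
    # one representative frame index per cluster
    return sorted({int(d[:, j].argmin()) for j in range(k)})
```

The paper's subsequent intra-shot and inter-shot filtering would then prune this candidate set down to the final key frames.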


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 400
Author(s):  
Sheng Lu ◽  
Zhaojie Luo ◽  
Feng Gao ◽  
Mingjie Liu ◽  
KyungHi Chang ◽  
...  

Lane detection is a significant technology for autonomous driving. In recent years, a number of lane detection methods have been proposed. However, the performance of fast and slim methods is not satisfactory in sophisticated scenarios, and some robust methods are not fast enough. Consequently, we propose a fast and robust lane detection method that combines a semantic segmentation network and an optical flow estimation network. Specifically, the work is divided into three parts: lane segmentation, lane discrimination, and mapping. For lane segmentation, a robust semantic segmentation network is proposed to segment key frames, and a fast and slim optical flow estimation network is used to track non-key frames. In the second part, density-based spatial clustering of applications with noise (DBSCAN) is adopted to discriminate lanes. Finally, we propose a mapping method that maps lane pixels from the pixel coordinate system to the camera coordinate system and fits lane curves in the camera coordinate system, which can provide feedback for autonomous driving. Experimental results verified that the proposed method can speed up the robust semantic segmentation network by up to three times while accuracy fell by at most 2%. In the best case, the fitted lane curve yielded a feedback error of 3%.
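The final curve-fitting step can be illustrated with a short sketch. This is an assumption, not the paper's exact pipeline: it takes lane points already discriminated (e.g. by DBSCAN) and mapped into the camera coordinate system, and fits a polynomial lane curve with `numpy.polyfit`.

```python
import numpy as np

def fit_lane_curve(points, degree=2):
    """Fit a polynomial lane curve x = f(y) to lane points.

    `points` is an (n, 2) array of (x, y) positions of one lane,
    assumed to be already mapped into the camera coordinate system.
    Fitting x as a function of y is the conventional choice because
    lanes are near-vertical in a forward-facing view. Returns the
    polynomial coefficients, highest degree first.
    """
    pts = np.asarray(points, dtype=np.float64)
    return np.polyfit(pts[:, 1], pts[:, 0], degree)
```

Evaluating the fitted polynomial at chosen distances ahead of the vehicle gives the lateral offsets that can be fed back to the driving controller.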


2019 ◽  
Vol 18 (2) ◽  
pp. 143-166
Author(s):  
Manar Abduljabbar Ahmad Mizher ◽  
Ang Mei Choo ◽  
Siti Norul Huda Sheikh Abdullah ◽  
Kok Weng Ng

2013 ◽  
Vol 29 (1) ◽  
pp. 109-117
Author(s):  
MIRCEA DAN RUS ◽  

The aim of this paper is to present a new approach for solving the minimization problem for a large class of energy functionals that appear in differential models of optical flow estimation and are expressed using the discrete l1-norm. The choice of l1-energy minimization is motivated by the fact that quadratic l2 optimization is not robust to outliers, and the l1-norm is a better choice for modeling real problems involving discrete signals. The method described in this paper is very general and thus has the advantage of being applicable to almost every differential model that has been proposed so far for the optical flow estimation problem. In order to test and validate our method, a MATLAB implementation on several optical flow models is currently under development. A multi-core implementation on GP-GPU is also to be considered in the near future.
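The robustness argument for the l1-norm can be made concrete with a one-variable example (an illustration of the general principle, not of the paper's method): the minimizer of the l2 energy sum (x_i − c)^2 is the mean, while the minimizer of the l1 energy sum |x_i − c| is the median, so a single outlier shifts the l2 estimate dramatically but barely moves the l1 estimate.

```python
import numpy as np

# Five samples of a signal whose true value is 1.0, with one outlier.
signal = np.array([1.0, 1.1, 0.9, 1.0, 100.0])

# l2 minimiser of sum (x - c)^2: the mean, dragged towards the outlier.
l2_estimate = signal.mean()

# l1 minimiser of sum |x - c|: the median, nearly unaffected.
l1_estimate = np.median(signal)
```

The same contrast carries over to the data and smoothness terms of optical flow energies, where occlusions and motion discontinuities act as the outliers.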


Author(s):  
Jharna Majumdar ◽  
Darshan K M ◽  
Abhijith Vijayendra

Video has become an interactive medium of communication in everyday life. The sheer volume of video makes it extremely difficult to browse through and find the required data. Hence, extraction of key frames that represent the abstract of the entire video becomes necessary. The aim of video shot detection is to find the positions of shot boundaries, so that key frames can be selected from each shot for subsequent processing such as video summarization and indexing. For most surveillance applications, such as video summary and face recognition, a hardware (real-time) implementation of these algorithms becomes necessary. In this paper, we present an architecture for simultaneous access to consecutive frames, which is then used for the implementation of various video shot detection algorithms. We also present the real-time implementation of three video shot detection algorithms using the above-mentioned architecture on an FPGA (Field Programmable Gate Array).
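For reference, a classic shot boundary detector of the kind such architectures accelerate can be sketched in software. This is a hedged illustration, not one of the paper's three algorithms: it compares normalised intensity histograms of consecutive frames and flags a boundary when their L1 distance exceeds a threshold (the bin count and threshold here are illustrative choices).

```python
import numpy as np

def detect_shot_boundaries(frames, bins=16, threshold=0.5):
    """Histogram-difference shot boundary detection.

    `frames` is a sequence of greyscale frames with values in
    [0, 256). Computes a normalised intensity histogram per frame
    and reports the index of every frame whose histogram differs
    from its predecessor's by more than `threshold` in L1 distance.
    """
    hists = []
    for f in frames:
        h, _ = np.histogram(np.asarray(f), bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    return [i + 1 for i in range(len(hists) - 1)
            if np.abs(hists[i + 1] - hists[i]).sum() > threshold]
```

Because each frame's histogram depends only on that frame, consecutive frames can be processed in parallel, which is what makes the simultaneous-frame-access architecture effective for real-time operation.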


2010 ◽  
Vol 129-131 ◽  
pp. 95-98
Author(s):  
Yong Liang Xiao ◽  
Shao Ping Zhu

Key frames play a very important role in video indexing and retrieval. In this paper, we propose a novel method to extract key frames based on information theory. We use an improved Bayesian Information Criterion to determine the number of key frames, and then automatically extract key frames to represent the video shot based on the information bottleneck clustering method. Experimental results and comparisons with other methods on various types of video sequences show the effectiveness of the proposed method.
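The role of the BIC in choosing the number of key frames can be sketched with the standard least-squares form of the criterion (an illustrative form, not the paper's improved variant): BIC = n·ln(RSS/n) + k·ln(n), so the ln(n) penalty discourages adding clusters that barely reduce the residual.

```python
import numpy as np

def bic(n, rss, k_params):
    """Standard least-squares BIC: n * ln(RSS / n) + k * ln(n).

    Lower is better; `n` is the number of frames, `rss` the residual
    sum of squares of the clustering, `k_params` the model size.
    """
    return n * np.log(rss / n) + k_params * np.log(n)

def choose_num_key_frames(rss_by_k, n):
    """Given residuals rss_by_k[k] from clustering `n` frames into
    k groups, return the k with the lowest BIC."""
    return min(rss_by_k, key=lambda k: bic(n, rss_by_k[k], k))
```

In a key frame pipeline, each candidate k would come from clustering the frames (the paper uses the information bottleneck method), with the winning k fixing how many key frames represent the shot.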

