Rule-based scene boundary detection for semantic video segmentation

Author(s):  
Yue Feng ◽  
R. Ren ◽  
J. Jose
Author(s):  
Junaid Baber ◽ 
Nitin Afzulpurkar ◽ 
Shin'ichi Satoh

The rapid growth of video databases has forced the industry to develop efficient and effective frameworks for video retrieval and indexing. Segmenting video into scenes is widely used for video summarization, partitioning, indexing, and retrieval. In this paper, we propose a framework for scene detection based mainly on entropy and Speeded Up Robust Features (SURF). First, we detect fade and abrupt boundaries using frame entropy analysis and SURF feature matching. Fade boundaries, which in many videos and dramas are a strong indication that a scene is beginning or ending, are detected by frame entropy analysis. Before abrupt boundary detection, frames that are clearly not abrupt boundaries, such as blank screens, frames dominated by high intensity, and sliding credits, are removed. Candidate boundaries are then selected so that SURF matching remains efficient, and SURF features between each candidate boundary and its adjacent frames are used to detect the abrupt boundaries. Second, key frames are extracted from the abrupt shots; we evaluate our key frame extraction against other well-known algorithms and show the effectiveness of the extracted key frames. Finally, scene boundaries are detected using a sliding window of size K over the key frames in temporal order. In an experimental evaluation on the TRECVID-2007 shot boundary test set, the shot boundary algorithm achieves substantial improvements over state-of-the-art methods, with a precision of 99% and a recall of 97.8%. Experimental results for segmenting video into scenes are also promising compared to well-known state-of-the-art techniques.
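
To make the two detection cues concrete, here is a minimal sketch: low frame entropy flags fade/blank frames, and a low SURF match ratio between adjacent frames flags abrupt cuts. The thresholds ENTROPY_T and MATCH_T, the SURF Hessian value of 400, and the frame-by-frame matching (the paper matches only at pre-filtered candidate boundaries) are illustrative assumptions, not the paper's parameters.

```python
import cv2
import numpy as np

ENTROPY_T = 2.0  # assumed threshold: frames below this entropy are fade/blank candidates
MATCH_T = 0.2    # assumed threshold: match ratio below this suggests an abrupt cut

def frame_entropy(gray):
    """Shannon entropy of the grayscale intensity histogram."""
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def detect_boundaries(path):
    surf = cv2.xfeatures2d.SURF_create(400)  # requires opencv-contrib-python
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    cap = cv2.VideoCapture(path)
    boundaries, prev, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if frame_entropy(gray) < ENTROPY_T:
            boundaries.append((idx, "fade"))          # low entropy: fade/blank frame
        elif prev is not None:
            kp_a, des_a = surf.detectAndCompute(prev, None)
            _, des_b = surf.detectAndCompute(gray, None)
            if des_a is not None and des_b is not None and len(des_b) >= 2:
                pairs = matcher.knnMatch(des_a, des_b, k=2)
                good = [p for p in pairs
                        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
                if len(good) / len(kp_a) < MATCH_T:   # few surviving matches: cut
                    boundaries.append((idx, "abrupt"))
        prev = gray
        idx += 1
    cap.release()
    return boundaries
```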


Temporal video segmentation is the primary step of content-based video retrieval, which spans the whole process of video management, including video indexing, retrieval, and summarization. In this paper, we propose a computationally efficient and discriminative shot boundary detection method that uses a local feature descriptor named Local Contrast and Ordering (LCO) for feature extraction. The results of experiments conducted on the TRECVid video dataset are analyzed and compared with existing shot boundary detection methods. The proposed method gives promising results, even in the presence of illumination changes and image rotation.
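
The LCO descriptor itself is not specified in this abstract, so the sketch below uses OpenCV's ORB as a stand-in local descriptor to show the shape of a descriptor-matching shot boundary detector; the dissimilarity measure and the threshold DISSIM_T are likewise assumptions.

```python
import cv2

DISSIM_T = 0.8  # assumed: adjacent-frame dissimilarity above this marks a shot boundary

def shot_boundaries(path):
    orb = cv2.ORB_create(nfeatures=500)  # ORB stands in for the paper's LCO descriptor
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    cap = cv2.VideoCapture(path)
    cuts, prev_des, prev_n, idx = [], None, 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        kp, des = orb.detectAndCompute(gray, None)
        if prev_des is not None and des is not None and prev_n > 0:
            matched = len(matcher.match(prev_des, des))
            dissim = 1.0 - matched / max(prev_n, len(kp))  # 0 = identical, 1 = no overlap
            if dissim > DISSIM_T:
                cuts.append(idx)
        prev_des, prev_n = des, len(kp)
        idx += 1
    cap.release()
    return cuts
```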


2019 ◽  
Vol 107 ◽  
pp. 1-17 ◽  
Author(s):  
Pravin Bhaskar Ramteke ◽  
Shashidhar G. Koolagudi

Author(s):  
Shanmukhappa Angadi ◽  
Vilas Naik

Shot Boundary Detection (SBD) is an early step for most video applications involving the understanding, indexing, characterization, or categorization of video. SBD is a form of temporal video segmentation and has been an active research topic in content-based video analysis, with research efforts producing a variety of algorithms. The major methods used for shot boundary detection are based on pixel intensity, histograms, edges, and motion vectors. Recently, researchers have turned to graph-theoretic methods for shot boundary detection. The proposed algorithm is one such graph-based model and employs a graph partitioning mechanism to detect shot boundaries. The graph partition model is a graph-theoretic segmentation algorithm that clusters data using a graph: pairwise similarities between all data objects are used to construct a weighted graph, represented as an adjacency matrix (weighted similarity matrix) that contains all the information needed for clustering. Representing the data set as an edge-weighted graph thus converts the data clustering problem into a graph partitioning problem. The algorithm is evaluated on sports and movie videos, and the results indicate promising performance.
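
A minimal instance of that graph model, assuming per-frame feature vectors (e.g. colour histograms), a Gaussian similarity kernel, and a single normalized-cut bipartition via the Fiedler vector; the paper's actual features, similarity measure, and any recursive partitioning scheme may differ.

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(features, sigma=0.5):
    """Split temporally ordered frame features with one normalized cut.

    features: (n_frames, d) array, e.g. per-frame colour histograms.
    sigma: assumed kernel width for the pairwise similarity weights.
    """
    # Weighted adjacency (similarity) matrix over all frame pairs.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalized Laplacian: L = I - D^(-1/2) W D^(-1/2).
    inv_sqrt_d = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - (inv_sqrt_d[:, None] * W) * inv_sqrt_d[None, :]
    # The eigenvector of the second-smallest eigenvalue (the Fiedler vector)
    # yields the two-way partition approximating the minimum normalized cut.
    _, vecs = eigh(L)
    labels = (vecs[:, 1] > 0).astype(int)
    # For frames in temporal order, label changes are candidate boundaries.
    return np.nonzero(np.diff(labels))[0] + 1
```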


2015 ◽  
Author(s):  
Sankaranaryanan Piramanayagam ◽  
Eli Saber ◽  
Nathan D. Cahill ◽  
David Messinger

Author(s):  
Vyacheslav Parshin ◽  
Liming Chen

Automatic video segmentation into semantic units is important for organizing effective content-based access to long videos. In this work we focus on segmenting video into narrative units called scenes: aggregates of shots unified by a common dramatic event or locale. We derive a statistical video scene segmentation approach that detects scene boundaries in a single pass, fusing multi-modal audio-visual features in a symmetric and scalable manner. The approach properly handles the variability of real-valued features and models their conditional dependence on the context; it also integrates prior information about scene duration. Two kinds of features, extracted in the visual and audio domains, are proposed. The results of experimental evaluations carried out on ground-truth video show that our approach effectively fuses multiple modalities, achieving higher performance than an alternative rule-based fusion technique.
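
As a rough illustration of the symmetric fusion idea, the sketch below sums per-modality log-likelihood ratios and adds a duration term. The Gaussian class models, the parameter names, and the linear-ramp duration term are all assumptions standing in for the paper's learned models and prior.

```python
import math

def _log_gauss(x, mu, sd):
    """Log density of a univariate Gaussian."""
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))

def boundary_score(visual_dissim, audio_dissim, time_since_scene,
                   v_models, a_models, mean_scene_len):
    """Fuse audio-visual evidence for a scene boundary at a shot transition.

    v_models / a_models: ((mu_b, sd_b), (mu_n, sd_n)) Gaussian parameters for
    the 'boundary' and 'non-boundary' classes of each modality (assumed form).
    Returns a score; > 0 favours declaring a scene boundary.
    """
    score = 0.0
    for x, ((mu_b, sd_b), (mu_n, sd_n)) in ((visual_dissim, v_models),
                                            (audio_dissim, a_models)):
        # Each modality contributes symmetrically via its log-likelihood ratio,
        # so adding a modality just adds another term to the sum.
        score += _log_gauss(x, mu_b, sd_b) - _log_gauss(x, mu_n, sd_n)
    # A linear ramp in elapsed time stands in for the scene-duration prior:
    # boundaries grow more plausible as a scene approaches its typical length.
    score += time_since_scene / mean_scene_len - 1.0
    return score
```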

