Multimodal Summarization of User-Generated Videos

2021 ◽  
Vol 11 (11) ◽  
pp. 5260
Author(s):  
Theodoros Psallidas ◽  
Panagiotis Koromilas ◽  
Theodoros Giannakopoulos ◽  
Evaggelos Spyrou

The exponential growth of user-generated content has increased the need for efficient video summarization schemes. However, most approaches underestimate the power of aural features and are designed to work mainly on commercial/professional videos. In this work, we present an approach that uses both aural and visual features in order to create video summaries from user-generated videos. Our approach produces dynamic video summaries, that is, summaries comprising the most “important” parts of the original video, arranged so as to preserve their temporal order. We use supervised knowledge from both of the aforementioned modalities and train a binary classifier, which learns to recognize the important parts of videos. Moreover, we present a novel user-generated dataset that contains videos from several categories. Every 1-second segment of each video in our dataset has been annotated by more than three annotators as being important or not. We evaluate our approach using several classification strategies based on audio, video and fused features. Our experimental results illustrate the potential of our approach.
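As a rough illustration of the segment-level classification idea, the sketch below fuses per-segment audio and visual descriptors by concatenation and trains a binary "importance" classifier with scikit-learn. The feature dimensions, the random placeholder data, and the SVM choice are assumptions for illustration, not the authors' exact pipeline.

```python
# Hedged sketch: early fusion of audio and visual segment features with a
# binary "importance" classifier. Feature extraction is abstracted away; the
# arrays below are placeholders for per-1-second-segment descriptors.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_segments = 1000
audio_feats = rng.normal(size=(n_segments, 34))    # e.g. MFCCs, energy, ZCR (assumed)
visual_feats = rng.normal(size=(n_segments, 128))  # e.g. CNN frame embeddings (assumed)
labels = rng.integers(0, 2, size=n_segments)       # annotator majority vote (placeholder)

# Early fusion: concatenate the two modalities per segment.
fused = np.hstack([audio_feats, visual_feats])

X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print("F1 on held-out segments:", f1_score(y_te, clf.predict(X_te)))

# A dynamic summary would then keep the segments predicted "important",
# concatenated in their original temporal order.
```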

Entropy ◽  
2018 ◽  
Vol 20 (10) ◽  
pp. 748 ◽  
Author(s):  
Shazia Saqib ◽  
Syed Kazmi

Multimedia information requires large repositories of audio-video data. Retrieval and delivery of video content is a very time-consuming process and a great challenge for researchers. Video summarization is an efficient approach for faster browsing of large video collections and for more efficient content indexing and access. Compression of data through the extraction of keyframes is one solution to these challenges. A keyframe is a representative frame that captures the salient features of the video, and the output frames must represent the original video in temporal order. The proposed research presents a method of keyframe extraction using the mean of k consecutive frames of video data. A sliding window of size k/2 is employed to select the frame that matches the median entropy value of the sliding window. This is called the Median of Entropy of Mean Frames (MME) method: mean-based keyframe selection using the median entropy of the sliding window. The method was tested on more than 500 videos of sign language gestures and showed satisfactory results.
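A minimal sketch of the mean/entropy selection idea described above, assuming grayscale frames are already loaded (e.g., with OpenCV): mean frames are formed over blocks of k consecutive frames, a histogram entropy is computed per mean frame, and within each window of size k/2 the frame closest to the window's median entropy is kept. The block/window handling here is a simplified reading of the method, not the authors' reference implementation.

```python
# Hedged sketch loosely following the MME idea: average consecutive blocks of
# k frames, compute each mean frame's entropy, then step a window of size k//2
# over those entropies and keep the frame nearest the window's median entropy.
# `frames` is assumed to be a list of 2-D grayscale numpy arrays.
import numpy as np

def frame_entropy(frame, bins=256):
    """Shannon entropy of a grayscale frame's intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mme_keyframes(frames, k=10):
    # Mean frame of every consecutive block of k frames.
    means = [np.mean(frames[i:i + k], axis=0)
             for i in range(0, len(frames) - k + 1, k)]
    entropies = np.array([frame_entropy(m) for m in means])

    keyframes = []
    w = max(k // 2, 1)
    for start in range(0, len(means), w):
        window = entropies[start:start + w]
        # Index (within the window) of the entropy closest to the median.
        pick = start + int(np.argmin(np.abs(window - np.median(window))))
        keyframes.append(means[pick])
    return keyframes
```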


Author(s):  
D. Minola Davids ◽  
C. Seldev Christopher

The visual data obtained from single-camera or multi-view surveillance camera networks is increasing exponentially every day. Identifying the important shots that faithfully represent the original video is the major task in video summarization. To perform efficient video summarization for surveillance systems, an optimization algorithm, LFOB-COA, is proposed in this paper. The proposed method consists of five steps: data collection, pre-processing, deep feature extraction (FE), shot segmentation using JSFCM, and classification using a Rectified Linear Unit-activated BLSTM with LFOB-COA. Finally, a post-processing step is applied. To demonstrate the proposed method's effectiveness, the results are compared with existing methods.
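Of the five steps, only the classification stage is concrete enough to sketch here: a hedged Keras example of a ReLU-activated bidirectional LSTM scoring shot-level deep features as summary-worthy or not. The input shapes and the sigmoid output head are assumptions; the segmentation, feature-extraction and LFOB-COA optimization stages are not reproduced.

```python
# Hedged sketch of a ReLU-activated BLSTM classifier over shot-level deep
# features. Shapes are arbitrary assumptions, not the paper's configuration.
import tensorflow as tf

n_shots, feat_dim = 32, 2048  # assumed: 32 shots per video, 2048-d deep features

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_shots, feat_dim)),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, activation="relu", return_sequences=True)),
    # Per-shot probability of belonging to the summary.
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1, activation="sigmoid")),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```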


2021 ◽  
pp. 648-654
Author(s):  
Rubel Biswas ◽  
Deisy Chaves ◽  
Laura Fernández-Robles ◽  
Eduardo Fidalgo ◽  
Enrique Alegre

Identifying key content from a video is essential for many security applications such as motion/action detection, person re-identification and recognition. Moreover, summarizing the key information from Child Sexual Exploitation Materials, especially videos, which mainly contain distinctive scenes including people’s faces, is crucial to speed up the investigations of Law Enforcement Agencies. In this paper, we present a video summarization strategy that combines perceptual hashing and face detection algorithms to keep the most relevant frames of a video containing people’s faces that may correspond to victims or offenders. Due to legal constraints on access to Child Sexual Abuse datasets, we evaluated the performance of the proposed strategy on the detection of adult pornography content with the NDPI-800 dataset. We also assessed the capability of our strategy to create video summaries preserving frames with distinctive faces from the original video using ten additional manually labeled short videos. Results showed that our approach can detect pornography content with an accuracy of 84.15% at a speed of 8.05 ms/frame, making it appropriate for real-time applications.
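A hedged sketch of how perceptual hashing and face detection could be combined to keep distinctive face-bearing frames: OpenCV's Haar cascade and the imagehash package are illustrative library choices, and the hash-distance threshold is an assumption, not the authors' exact strategy.

```python
# Hedged sketch: keep only frames that contain a detected face and are not
# perceptually near-duplicates of frames already kept.
import cv2
import imagehash
from PIL import Image

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def summarize(video_path, hash_threshold=8):
    cap = cv2.VideoCapture(video_path)
    kept, hashes = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if len(face_detector.detectMultiScale(gray, 1.1, 5)) == 0:
            continue  # keep only frames with at least one detected face
        h = imagehash.phash(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
        # Skip frames perceptually similar to one already kept.
        if all(h - kept_h > hash_threshold for kept_h in hashes):
            hashes.append(h)
            kept.append(frame)
    cap.release()
    return kept
```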


A video surveillance system uses video cameras to capture images and videos that can be compressed, stored, and sent to a location with a limited set of monitors. Nowadays, public places such as banks, educational institutions, offices, and hospitals are equipped with multiple surveillance cameras with overlapping fields of view for security and environment monitoring purposes. Video summarization is a technique for generating a summary of the entire video content, either as still images or as a video skim. The summarized video should be shorter than the original video and should cover as much of its information as possible. Video summarization studies concentrating on monocular videos cannot be applied directly to multi-view videos due to the redundancy across views. Generating summaries for surveillance videos is more challenging because videos captured by surveillance cameras are long, contain uninteresting events, and record the same scene from different views, leading to inter-view dependencies and variations in illumination. In this paper, we present a survey of the research work carried out on video summarization techniques for videos captured from multiple views. The generated video summaries can be used to analyze post-accident scenarios and to identify suspicious events and thefts in public places, supporting crime departments in their investigations.


2013 ◽  
Vol 433-435 ◽  
pp. 297-300
Author(s):  
Zong Yue Wang

Video summaries provide a compact video representation preserving the essential activities of the original video, but the summaries may be confusing when different activities are mixed together. A clustered-summaries methodology, which shows similar activities simultaneously, makes viewing much easier and more efficient. However, generating such summaries is very time-consuming, especially when calculating motion distance and collision cost. To improve the efficiency of summary generation, a parallel video synopsis generation algorithm based on GPGPU is proposed. The experimental results show that generation efficiency is greatly improved through GPU parallel computing. The acceleration ratio reaches 5.75 when the data size is above 1600×960×30000.
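A hedged sketch of the kind of collision-cost computation the paper parallelizes: activity tubes are modeled here simply as per-frame bounding boxes, and the cost sums the per-frame overlap area for a candidate temporal shift. The vectorized NumPy form below is one way to map such a computation onto a GPU, by substituting the NumPy-compatible CuPy library; the exact cost terms used in the paper are assumptions here.

```python
# Hedged sketch of a pairwise collision cost between two activity tubes.
# tube_*: arrays of shape (T, 4) holding [x1, y1, x2, y2] per frame.
# Replacing `import numpy as np` with `import cupy as np` runs the same
# vectorised arithmetic on a GPU (CuPy mirrors this subset of the NumPy API).
import numpy as np

def collision_cost(tube_a, tube_b, shift):
    """Total bounding-box overlap area when tube_a is offset by `shift` frames."""
    t = min(len(tube_a) - shift, len(tube_b))
    if t <= 0:
        return 0.0
    a, b = tube_a[shift:shift + t], tube_b[:t]
    # Vectorised per-frame intersection widths and heights (clipped at zero).
    ix = np.clip(np.minimum(a[:, 2], b[:, 2]) - np.maximum(a[:, 0], b[:, 0]), 0, None)
    iy = np.clip(np.minimum(a[:, 3], b[:, 3]) - np.maximum(a[:, 1], b[:, 1]), 0, None)
    return float(np.sum(ix * iy))
```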


1998 ◽  
Author(s):  
Janne Saarela ◽  
Bernard Merialdo

Author(s):  
Senthil Kumar Narayanasamy ◽  
Dinakaran Muruganantham

The exponential growth of data emerging from social media is causing challenges for decision-making systems and poses a critical hindrance to searching for potentially useful information. The major objective of this chapter is to convert the unstructured data in social media into a meaningful structured format, which in turn brings robustness to the information extraction process. Further, the approach has the inherent capability to extract named entities from the unstructured data and store them in a knowledge base of important facts. In this chapter, the authors explain the methods used to identify named entities from Twitter streams and the techniques for linking them to appropriate knowledge sources such as DBpedia.
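A hedged sketch of the two stages described above: named-entity recognition over a tweet followed by a lookup against DBpedia to link each entity to a knowledge-base resource. spaCy's small English model and the public DBpedia Lookup endpoint, including its JSON response fields, are illustrative assumptions, not necessarily the chapter's exact techniques.

```python
# Hedged sketch: NER with spaCy, then entity linking via the DBpedia Lookup
# service. The endpoint URL, query parameters and response fields ("docs",
# "resource") are assumptions about the public Lookup API; the spaCy model
# `en_core_web_sm` is assumed to be installed.
import requests
import spacy

nlp = spacy.load("en_core_web_sm")

def link_entities(tweet_text):
    links = {}
    for ent in nlp(tweet_text).ents:
        resp = requests.get(
            "https://lookup.dbpedia.org/api/search",
            params={"query": ent.text, "maxResults": 1, "format": "json"},
            timeout=10,
        )
        docs = resp.json().get("docs", []) if resp.ok else []
        links[ent.text] = docs[0]["resource"][0] if docs else None
    return links

print(link_entities("Tim Berners-Lee spoke at MIT about the Semantic Web."))
```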

