Interactive Indexing and Retrieval of Multimedia Content

Author(s):  
Marcel Worring ◽  
Andrew Bagdanov ◽  
Jan van Gemert ◽  
Jan-Mark Geusebroek ◽  
Hoang Minh ◽  
...  
Author(s):  
Manolis Wallace ◽  
Yannis Avrithis ◽  
Giorgos Stamou ◽  
Stefanos Kollias

Author(s):  
Ioannis Kompatsiaris ◽  
Vasileios Mezaris ◽  
Michael G. Strintzis

The rapid growth of multimedia technology has produced an enormous volume of multimedia content. It is therefore important to access only the video content of interest rather than the whole video. Content-Based Video Retrieval (CBVR) is used to index and retrieve this content of interest effectively. Shot boundary detection is one of the most important and necessary steps: it partitions a video into shots, which are the units for indexing and retrieval. Segmentation therefore plays a significant role in digital image and media processing, computer vision, and pattern recognition. This paper presents recent developments in shot boundary detection.
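A classic cue for shot boundary detection is a sharp change in the colour or intensity distribution between consecutive frames. The sketch below illustrates this histogram-difference idea only; it is a hypothetical minimal example, not one of the specific methods surveyed in the paper, and the bin count and threshold are illustrative assumptions.

```python
import numpy as np

def detect_shot_boundaries(frames, threshold=0.5):
    """Flag frame indices where the grayscale histogram changes sharply,
    a classic cue for hard cuts between shots (illustrative sketch)."""
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=32, range=(0, 256))
        hist = hist / hist.sum()  # normalise to a probability distribution
        if prev_hist is not None:
            # L1 distance between consecutive frame histograms
            dist = np.abs(hist - prev_hist).sum()
            if dist > threshold:
                boundaries.append(i)
        prev_hist = hist
    return boundaries

# Synthetic clip: 5 dark frames followed by 5 bright frames -> cut at index 5
clip = [np.full((8, 8), 20) for _ in range(5)] + [np.full((8, 8), 230) for _ in range(5)]
print(detect_shot_boundaries(clip))  # -> [5]
```

Real detectors must also distinguish gradual transitions (fades, dissolves) from hard cuts, which is why adaptive thresholds and multi-frame features dominate the recent work the survey covers.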


Author(s):  
B. Aparna ◽  
S. Madhavi ◽  
G. Mounika ◽  
P. Avinash ◽  
S. Chakravarthi

We propose a new design for large-scale multimedia content protection systems. Our design leverages cloud infrastructures to provide cost efficiency, rapid deployment, scalability, and elasticity to accommodate varying workloads. The proposed system can be used to protect different multimedia content types, including videos, images, audio clips, songs, and music clips. The system can be deployed on private and/or public clouds. Our system has two novel components: (i) a method to create signatures of videos, and (ii) a distributed matching engine for multimedia objects. The signature method creates robust and representative signatures of videos that capture the depth signals in these videos; it is computationally efficient to compute and compare, and it requires little storage. The distributed matching engine achieves high scalability and is designed to support different multimedia objects. We implemented the proposed system and deployed it on two clouds: the Amazon cloud and our private cloud. Our experiments with more than 11,000 videos and 1 million images show the high accuracy and scalability of the proposed system. In addition, we compared our system to the protection system used by YouTube; our results show that the YouTube protection system fails to detect most copies of videos, while our system detects more than 98% of them.
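The copy-detection idea behind such a system is to reduce each video to a compact signature and compare signatures instead of raw frames. The following is a deliberately simplified, hypothetical sketch using per-frame block means; the paper's actual signatures additionally capture depth signals, and all names and parameters here are assumptions.

```python
import numpy as np

def video_signature(frames, grid=4):
    """Reduce each grayscale frame to a grid x grid array of block means;
    the stacked per-frame vectors form a compact signature.
    (Simplified illustration, not the paper's depth-based method.)"""
    sig = []
    for frame in frames:
        f = np.asarray(frame, dtype=float)
        h, w = f.shape
        bh, bw = h // grid, w // grid
        blocks = f[:bh * grid, :bw * grid].reshape(grid, bh, grid, bw)
        sig.append(blocks.mean(axis=(1, 3)).ravel())
    return np.vstack(sig)

def match_score(sig_a, sig_b):
    """Mean absolute difference between aligned signatures; lower = closer."""
    n = min(len(sig_a), len(sig_b))
    return float(np.abs(sig_a[:n] - sig_b[:n]).mean())

# A slightly brightened copy scores much closer than an unrelated clip.
orig = [np.full((16, 16), v) for v in (10, 50, 90)]
copy = [f + 2 for f in orig]
print(match_score(video_signature(orig), video_signature(copy)))  # -> 2.0
```

A distributed matching engine would shard such signatures across nodes and answer nearest-neighbour queries against them, which is what gives the design its scalability.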


1996 ◽  
Author(s):  
Vikrant Kobla ◽  
David Doermann ◽  
King-Ip Lin ◽  
Christos Faloutsos

2021 ◽  
pp. 1-11
Author(s):  
P. N. R. L. Chandra Sekhar ◽  
T. N. Shankar

In the era of digital technology, it has become easy to share photographs and videos with loved ones using smartphones and social networking sites. At the same time, many photo editing tools have evolved that make it effortless to alter multimedia content, and people have grown accustomed to modifying their photographs or videos, whether for fun or to attract attention from others. This altering calls into question the validity and integrity of multimedia content shared over the internet when it is used as evidence in journalism and in courts of law. In multimedia forensics, intense research has been underway over the past two decades to bring trustworthiness to multimedia content. This paper proposes an efficient way of identifying the manipulated region of a spliced image based on noise-level inconsistencies. The spliced image is segmented into irregular objects, and noise features are extracted in both the pixel and residual domains. The manipulated region is then exposed based on the cosine similarity of noise levels among pairs of individual objects. The experimental results reveal the effectiveness of the proposed method over other state-of-the-art methods.
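The underlying intuition is that a region spliced in from another photo carries a different sensor-noise profile than the rest of the image. The sketch below is a hypothetical simplification: it uses regular blocks instead of the paper's irregular segmented objects, and a crude first-difference residual as the noise feature; the trailing 1.0 in the feature vector is an assumed trick to keep cosine similarity sensitive to noise magnitude.

```python
import numpy as np

def noise_features(block):
    """Crude noise descriptor: std of the horizontal first-difference
    residual, plus a constant 1.0 so that cosine similarity still
    reflects noise magnitude (illustrative stand-in for the paper's
    richer pixel- and residual-domain features)."""
    b = np.asarray(block, dtype=float)
    resid = np.diff(b, axis=1)
    return np.array([resid.std(), 1.0])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def flag_inconsistent_blocks(image, size=32, sim_threshold=0.9):
    """Split the image into size x size blocks and flag blocks whose
    noise features disagree with the per-image median noise profile."""
    img = np.asarray(image, dtype=float)
    feats, coords = [], []
    for y in range(0, img.shape[0] - size + 1, size):
        for x in range(0, img.shape[1] - size + 1, size):
            feats.append(noise_features(img[y:y + size, x:x + size]))
            coords.append((y, x))
    ref = np.median(np.vstack(feats), axis=0)  # typical noise profile
    return [c for c, f in zip(coords, feats) if cosine(f, ref) < sim_threshold]

# A forged quadrant with 10x stronger noise stands out from the rest.
rng = np.random.default_rng(0)
img = rng.normal(0, 1, (64, 64))
img[32:, 32:] = rng.normal(0, 10, (32, 32))
print(flag_inconsistent_blocks(img))  # -> [(32, 32)]
```

The paper's pairwise comparison over segmented objects refines this idea: comparing every object pair localises the splice even when no single global reference noise level exists.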


Data ◽  
2021 ◽  
Vol 6 (8) ◽  
pp. 87
Author(s):  
Sara Ferreira ◽  
Mário Antunes ◽  
Manuel E. Correia

Deepfake and manipulated digital photos and videos are being increasingly used in a myriad of cybercrimes. Ransomware, the dissemination of fake news, and digital kidnapping-related crimes are the most recurrent, with tampered multimedia content as the primordial disseminating vehicle. Digital forensic analysis tools are widely used in criminal investigations to automate the identification of digital evidence in seized electronic equipment. The number of files to be processed and the complexity of the crimes under analysis have highlighted the need for efficient digital forensics techniques grounded on state-of-the-art technologies. Machine Learning (ML) researchers have been challenged to apply techniques and methods to improve the automatic detection of manipulated multimedia content. However, such methods have not yet been widely incorporated into digital forensic tools, mostly due to the lack of realistic and well-structured datasets of photos and videos. The diversity and richness of the datasets are crucial to benchmark ML models and to evaluate their appropriateness for real-world digital forensics applications. An example is the development of third-party modules for the widely used Autopsy digital forensic application. This paper presents a dataset obtained by extracting a set of simple features from genuine and manipulated photos and videos, which are part of state-of-the-art existing datasets. The resulting dataset is balanced, and each entry comprises a label and a vector of numeric values corresponding to the features extracted through a Discrete Fourier Transform (DFT). The dataset is available in a GitHub repository, and the total number of photos and video frames is 40,588 and 12,400, respectively.
The dataset was validated and benchmarked with deep learning Convolutional Neural Network (CNN) and Support Vector Machine (SVM) methods; however, a plethora of other existing methods can be applied. Generically, the results show a better F1-score for CNN than for SVM, for both photo and video processing. CNN achieved an F1-score of 0.9968 and 0.8415 for photos and videos, respectively. Regarding SVM, the results obtained with 5-fold cross-validation are 0.9953 and 0.7955, respectively, for photo and video processing. A set of methods written in Python is available to researchers, namely to preprocess and extract the features from the original photo and video files and to build the training and testing sets. Additional methods are also available to convert the original PKL files into CSV and TXT, which gives ML researchers more flexibility to use the dataset with existing ML frameworks and tools.
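DFT-based features of this kind typically summarise an image's magnitude spectrum, since manipulation and generation artefacts often show up as anomalies in the frequency domain. The sketch below computes a radially averaged magnitude spectrum as a hypothetical approximation of such features; the bin count and log compression are assumptions, not the dataset's exact recipe.

```python
import numpy as np

def dft_features(image, n_bins=20):
    """Radially averaged magnitude spectrum of a grayscale image, a
    common compact DFT feature for manipulation detection (hypothetical
    stand-in for the dataset's exact feature extraction)."""
    img = np.asarray(image, dtype=float)
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))  # centred spectrum
    cy, cx = np.array(spec.shape) // 2
    y, x = np.indices(spec.shape)
    r = np.hypot(y - cy, x - cx)                      # radius of each bin
    bins = np.linspace(0, r.max() + 1e-9, n_bins + 1)
    idx = np.digitize(r.ravel(), bins) - 1
    feats = np.zeros(n_bins)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            feats[b] = spec.ravel()[mask].mean()
    return np.log1p(feats)  # log compresses the dynamic range

# A flat image has no high-frequency energy; a noisy one has plenty.
flat = np.full((32, 32), 100.0)
noisy = np.random.default_rng(1).normal(0, 50, (32, 32))
print(len(dft_features(flat)))  # -> 20
```

Fixed-length vectors like this are what make the dataset directly usable by both SVMs and small CNNs, as the benchmark in the paper demonstrates.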

