Enhancing Feature Point-Based Video Watermarking against Geometric Attacks with Template

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Zhongze Lv ◽  
Hu Guan ◽  
Ying Huang ◽  
Shuwu Zhang ◽  
Yang Zheng

As the Internet and communication technologies have developed rapidly, the spread and usage of online video content have become easier, resulting in major infringement problems. While video watermarking may be a viable solution for digital video copyright protection, overcoming geometric attacks remains a significant challenge. Although feature point-based watermarking algorithms are expected to be highly resistant to these attacks, they are sensitive to feature region localization errors, which degrade watermark extraction accuracy. To solve this issue, we introduce a template to enhance the localization accuracy of feature point-based watermarking. Furthermore, a scene change-based frame allocation method is presented, which arranges for the template and the watermark to be embedded into different frames, eliminating their mutual interference and enhancing the performance of the proposed algorithm. According to the experimental results, our algorithm outperforms state-of-the-art methods in robustness against geometric attacks at comparable imperceptibility.
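The frame allocation method described in this abstract presupposes a scene boundary detector. As a purely illustrative sketch (not the authors' method), scene changes can be flagged when consecutive frame histograms diverge; the bin count and the 0.4 distance threshold here are hypothetical:

```python
# Minimal scene-change detection sketch: compare normalized grayscale
# histograms of consecutive frames with an L1 distance in [0, 2].
# Frames are modeled as flat lists of pixel intensities (0..255).

def histogram(frame, bins=16, max_val=256):
    """Normalized grayscale histogram of a frame."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // max_val] += 1
    total = len(frame)
    return [c / total for c in counts]

def scene_changes(frames, threshold=0.4):
    """Indices where a new scene starts, based on histogram distance."""
    changes = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        dist = sum(abs(a - b) for a, b in zip(prev, cur))  # L1 distance
        if dist > threshold:
            changes.append(i)
        prev = cur
    return changes
```

A real watermarking pipeline would operate on decoded luma planes and tune the threshold per content, but the boundary indices returned here are the kind of signal the frame-allocation step would consume.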

Author(s):  
Zeyang Yang ◽  
Mark Griffiths ◽  
Zhihao Yan ◽  
Wenting Xu

Watching online videos (including short-form videos) has become the most popular leisure activity in China. However, a few studies have reported the potential negative effects of online video watching behaviors (including the potential for ‘addiction’) among a minority of individuals. The present study investigated online video watching behaviors, motivational factors for watching online videos, and potentially addictive indicators of watching online videos. Semi-structured interviews were conducted among 20 young Chinese adults. Qualitative data were analyzed using thematic analysis. Eight themes were identified comprising: (i) content is key; (ii) types of online video watching; (iii) platform function hooks; (iv) personal interests; (v) watching becoming habitual; (vi) social interaction needs; (vii) reassurance needs; and (viii) addiction-like symptoms. Specific video content (e.g., mukbang, pornography), platform-driven continuous watching, and short-form videos were perceived by some participants as being potentially addictive. Specific features or content on Chinese online video platforms (e.g., ‘Danmu’ scrolling comments) need further investigation. Future studies should explore users’ addictive-like behaviors in relation to specific types of online video content and their social interaction on these platforms.


Author(s):  
Bingqian Lu ◽  
Jianyi Yang ◽  
Weiwen Jiang ◽  
Yiyu Shi ◽  
Shaolei Ren

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been common practice in the state of the art, this is a very time-consuming process that lacks scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost it. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS, and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
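Latency monotonicity of the kind this abstract describes is conventionally quantified with Spearman's rank correlation between per-architecture latencies measured on two devices. A minimal pure-Python sketch, assuming no tied latencies (production code would use a library routine that handles ties):

```python
# Spearman rank correlation between latencies of the same set of
# architectures on two devices. rho near 1.0 means the devices rank
# architectures almost identically (strong latency monotonicity).

def ranks(values):
    """Rank of each value (0 = smallest), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r

def spearman(xs, ys):
    """Spearman rho via the difference-of-ranks formula."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

Under this view, a proxy device is reusable when rho between its latencies and the target device's latencies stays high across the search space.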


Author(s):  
Yongyi Tang ◽  
Lin Ma ◽  
Lianqiang Zhou

Appearance and motion are two key components used to depict and characterize video content. Currently, two-stream models achieve state-of-the-art performance on video classification. However, extracting motion information, specifically in the form of optical flow features, is extremely computationally expensive, especially for large-scale video classification. In this paper, we propose a motion hallucination network, namely MoNet, to imagine the optical flow features from the appearance features, without relying on optical flow computation. Specifically, MoNet models the temporal relationships of the appearance features and exploits the contextual relationships of the optical flow features with concurrent connections. Extensive experimental results demonstrate that the proposed MoNet can effectively and efficiently hallucinate the optical flow features, which, together with the appearance features, consistently improve video classification performance. Moreover, MoNet can cut almost half of the computational and data-storage burden of two-stream video classification. Our code is available at: https://github.com/YongyiTang92/MoNet-Features


Author(s):  
Evangelos Alexiou ◽  
Irene Viola ◽  
Tomás M. Borges ◽  
Tiago A. Fonseca ◽  
Ricardo L. de Queiroz ◽  
...  

Recent trends in multimedia technologies indicate the need for richer imaging modalities to increase user engagement with the content. Among other alternatives, point clouds denote a viable solution that offers an immersive content representation, as witnessed by current activities in the JPEG and MPEG standardization committees. As a result of such efforts, MPEG is at the final stages of drafting an emerging standard for point cloud compression, which we consider the state of the art. In this study, the entire set of encoders developed in the MPEG committee is assessed through an extensive and rigorous quality analysis. We initially focus on the assessment of encoding configurations defined by experts in MPEG for their core experiments. Then, two additional experiments are designed and carried out to address some of the identified limitations of the current approach. As part of the study, state-of-the-art objective quality metrics are benchmarked to assess their capability to predict the visual quality of point clouds under a wide range of radically different compression artifacts. To carry out the subjective evaluation experiments, a web-based renderer is developed and described. The subjective and objective quality scores, along with the rendering software, are made publicly available to facilitate and promote research in the field.
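One family of objective metrics commonly benchmarked in point cloud quality studies is the point-to-point (D1) error, where each point is matched to its nearest neighbor in the other cloud and the error is taken symmetrically over both directions. A brute-force illustrative sketch (practical evaluations use k-d trees and typically report a PSNR derived from this error):

```python
import math

def nn_mse(src, dst):
    """Mean squared nearest-neighbor distance from src to dst."""
    total = 0.0
    for p in src:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in dst)
    return total / len(src)

def d1_rmse(cloud_a, cloud_b):
    """Symmetric point-to-point RMSE (worst of the two directions)."""
    return math.sqrt(max(nn_mse(cloud_a, cloud_b), nn_mse(cloud_b, cloud_a)))
```

Taking the maximum over the two directions is one common symmetrization; averaging is another, and the choice matters when the clouds have very different densities.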


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4558
Author(s):  
Yiping Xu ◽  
Hongbing Ji ◽  
Wenbo Zhang

Detecting and removing ghosts is an important challenge for moving object detection, because ghosts remain forever once formed, degrading the overall detection performance. To deal with this issue, we first classified ghosts into two categories according to the way they are formed. Then, a sample-based two-layer background model and the histogram similarity of ghost areas were proposed to detect and remove the two types of ghosts, respectively. Furthermore, three important parameters in the two-layer model, i.e., the distance threshold, the similarity threshold of the local binary similarity pattern (LBSP), and the time sub-sampling factor, were automatically determined from the spatial-temporal information of each pixel to adapt rapidly to scene changes. The experimental results on the CDnet 2014 dataset demonstrated that our proposed algorithm not only effectively eliminates ghost areas, but is also superior to state-of-the-art approaches in terms of overall performance.
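The histogram-similarity test mentioned in this abstract can be sketched as comparing the intensity histogram of a candidate foreground region against that of the corresponding background region: a region that closely matches the background is likely a ghost rather than a real object. A minimal sketch using histogram intersection; the bin count and the 0.7 acceptance threshold are hypothetical, not the paper's values:

```python
# Histogram-intersection ghost test. Pixels are modeled as flat lists
# of grayscale intensities (0..255) for the candidate region and the
# co-located background region.

def norm_hist(pixels, bins=32, max_val=256):
    """Normalized intensity histogram of a pixel region."""
    counts = [0] * bins
    for px in pixels:
        counts[px * bins // max_val] += 1
    return [c / len(pixels) for c in counts]

def is_ghost(region_pixels, background_pixels, threshold=0.7):
    """High histogram intersection => region matches background => ghost."""
    h1 = norm_hist(region_pixels)
    h2 = norm_hist(background_pixels)
    similarity = sum(min(a, b) for a, b in zip(h1, h2))  # in [0, 1]
    return similarity > threshold
```

A detected ghost region would then be absorbed back into the background model instead of being reported as a moving object.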


First Monday ◽  
2015 ◽  
Author(s):  
Tatiana Pontes ◽  
Elizeu Santos-Neto ◽  
Jussara Almeida ◽  
Matei Ripeanu

Multimedia content is central to our experience on the Web. Specifically, users frequently search for and watch videos online. The textual features that accompany such content (e.g., title, description, and tags) can generally be optimized to attract more search traffic and ultimately to increase advertisement-generated revenue. This study investigates whether automating tag selection for online video content with the goal of increasing viewership is feasible. In summary, it shows that content producers can lower their operational costs for tag selection using a hybrid approach that combines dedicated personnel (often known as ‘channel managers’), crowdsourcing, and automatic tag suggestions. More concretely, this work provides the following insights: first, it offers evidence that the existing tags for a sample of YouTube videos can be improved; second, it shows that an automated tag recommendation process can be efficient in practice; and, finally, it explores the impact of using information mined from various data sources associated with content items on the quality of the resulting tags.
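The automated tag-suggestion step can be sketched, in its simplest form, as mining candidate tags from the text already associated with a video (title and description) and ranking terms by frequency after stop-word removal. This is a toy baseline for illustration only, not the study's recommender; the stop-word list is ad hoc:

```python
# Toy tag suggester: rank non-stop-words by frequency, breaking
# ties alphabetically, and return the top k candidates.

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}

def suggest_tags(text, k=3):
    """Return up to k candidate tags mined from free text."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word and word not in STOPWORDS:
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=lambda w: (-counts[w], w))[:k]
```

A practical system, as the study suggests, would combine such automatic candidates with signals mined from other data sources and with human curation by channel managers.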


2018 ◽  
Vol 24 (1) ◽  
pp. 524-526 ◽  
Author(s):  
Hendy Kasim ◽  
Asnan Furinto ◽  
Ronnie Resdianto Masman

2021 ◽  
Vol 11 (16) ◽  
pp. 7472
Author(s):  
Mario Montagud ◽  
Cristian Hurtado ◽  
Juan Antonio De Rus ◽  
Sergi Fernández

All multimedia services must be accessible. Accessibility for multimedia content is typically provided by means of access services, of which subtitling is likely the most widespread approach. To date, numerous recommendations and solutions for subtitling classical 2D audiovisual services have been proposed. Similarly, recent efforts have been devoted to devising adequate subtitling solutions for VR360 video content. This paper, for the first time, extends the existing approaches to address the challenges remaining for efficiently subtitling 3D Virtual Reality (VR) content by exploring two key requirements: presentation modes and guiding methods. By leveraging insights from earlier work on VR360 content, this paper proposes novel presentation modes and guiding methods, to not only provide the freedom to explore omnidirectional scenes, but also to address the additional specificities of 3D VR compared to VR360 content: depth, 6 Degrees of Freedom (6DoF), and viewing perspectives. The obtained results prove that always-visible subtitles and a novel proposed comic-style presentation mode are significantly more appropriate than state-of-the-art fixed-positioned subtitles, particularly in terms of immersion, ease and comfort of reading, and identification of speakers, when applied to professional pieces of content with limited displacement of speakers and limited 6DoF (i.e., users are not expected to navigate around the virtual environment). Similarly, even in such limited movement scenarios, the results show that the use of indicators (arrows), as a guiding method, is well received. Overall, the paper provides relevant insights and paves the way for efficiently subtitling 3D VR content.
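The arrow-based guiding method evaluated in this abstract can be sketched as a small geometric decision: compare the viewer's yaw with the direction of the off-screen speaker and point left or right accordingly. A simplified yaw-only sketch (real 3D VR guidance would also account for pitch, depth, and 6DoF position; the 90° field of view is an assumption):

```python
# Decide which guiding arrow (if any) to render for a speaker given
# the viewer's current yaw. Angles in degrees; yaw differences are
# wrapped into (-180, 180] so the shorter turning direction wins.

def guide_arrow(viewer_yaw_deg, speaker_yaw_deg, fov_deg=90):
    """Return 'left', 'right', or None if the speaker is in the viewport."""
    diff = (speaker_yaw_deg - viewer_yaw_deg + 180) % 360 - 180
    if abs(diff) <= fov_deg / 2:
        return None  # speaker already visible; no arrow needed
    return "right" if diff > 0 else "left"
```

Hiding the arrow once the speaker enters the viewport matches the general idea reported above that indicators are well received only when actually needed.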

