EURASIP Journal on Image and Video Processing
Latest Publications


TOTAL DOCUMENTS

927
(FIVE YEARS 175)

H-INDEX

33
(FIVE YEARS 8)

Published By Springer (Biomed Central Ltd.)

1687-5281, 1687-5176

2022 ◽  
Vol 2022 (1) ◽  
Author(s):  
Shahi Dost ◽  
Faryal Saud ◽  
Maham Shabbir ◽  
Muhammad Gufran Khan ◽  
Muhammad Shahid ◽  
...  

AbstractWith the growing demand for image and video-based applications, the requirements of consistent quality assessment metrics of image and video have increased. Different approaches have been proposed in the literature to estimate the perceptual quality of images and videos. These approaches can be divided into three main categories; full reference (FR), reduced reference (RR) and no-reference (NR). In RR methods, instead of providing the original image or video as a reference, we need to provide certain features (i.e., texture, edges, etc.) of the original image or video for quality assessment. During the last decade, RR-based quality assessment has been a popular research area for a variety of applications such as social media, online games, and video streaming. In this paper, we present review and classification of the latest research work on RR-based image and video quality assessment. We have also summarized different databases used in the field of 2D and 3D image and video quality assessment. This paper would be helpful for specialists and researchers to stay well-informed about recent progress of RR-based image and video quality assessment. The review and classification presented in this paper will also be useful to gain understanding of multimedia quality assessment and state-of-the-art approaches used for the analysis. In addition, it will help the reader select appropriate quality assessment methods and parameters for their respective applications.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Qiang Ma ◽  
Ling Xing

AbstractPerceptual video hashing represents video perceptual content by compact hash. The binary hash is sensitive to content distortion manipulations, but robust to perceptual content preserving operations. Currently, boundary between sensitivity and robustness is often ambiguous and it is decided by an empirically defined threshold. This may result in large false positive rates when received video is to be judged similar or dissimilar in some circumstances, e.g., video content authentication. In this paper, we propose a novel perceptual hashing method for video content authentication based on maximized robustness. The developed idea of maximized robustness means that robustness is maximized on condition that security requirement of hash is first met. We formulate the video hashing as a constrained optimization problem, in which coefficients of features offset and robustness are to be learned. Then we adopt a stochastic optimization method to solve the optimization. Experimental results show that the proposed hashing is quite suitable for video content authentication in terms of security and robustness.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Jin Su Kim ◽  
Min-Gu Kim ◽  
Sung Bum Pan

AbstractConventional surveillance systems for preventing accidents and incidents do not identify 95% thereof after 22 min when one person monitors a plurality of closed circuit televisions (CCTV). To address this issue, while computer-based intelligent video surveillance systems have been studied to notify users of abnormal situations when they happen, it is not commonly used in real environment because of weakness of personal information leaks and high power consumption. To address this issue, intelligent video surveillance systems based on small devices have been studied. This paper suggests implement an intelligent video surveillance system based on embedded modules for intruder detection based on information learning, fire detection based on color and motion information, and loitering and fall detection based on human body motion. Moreover, an algorithm and an embedded module optimization method are applied for real-time processing. The implemented algorithm showed performance of 88.51% for intruder detection, 92.63% for fire detection, 80% for loitering detection and 93.54% for fall detection. The result of comparison before and after optimization about the algorithm processing time showed 50.53% of decrease, implying potential real-time driving of the intelligent image monitoring system based on embedded modules.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Ling Zhu ◽  
Hongqing Zhu ◽  
Suyi Yang ◽  
Pengyu Wang ◽  
Yang Yu

AbstractAccurate segmentation and classification of pulmonary nodules are of great significance to early detection and diagnosis of lung diseases, which can reduce the risk of developing lung cancer and improve patient survival rate. In this paper, we propose an effective network for pulmonary nodule segmentation and classification at one time based on adversarial training scheme. The segmentation network consists of a High-Resolution network with Multi-scale Progressive Fusion (HR-MPF) and a proposed Progressive Decoding Module (PDM) recovering final pixel-wise prediction results. Specifically, the proposed HR-MPF firstly incorporates boosted module to High-Resolution Network (HRNet) in a progressive feature fusion manner. In this case, feature communication is augmented among all levels in this high-resolution network. Then, downstream classification module would identify benign and malignant pulmonary nodules based on feature map from PDM. In the adversarial training scheme, a discriminator is set to optimize HR-MPF and PDM through back propagation. Meanwhile, a reasonably designed multi-task loss function optimizes performance of segmentation and classification overall. To improve the accuracy of boundary prediction crucial to nodule segmentation, a boundary consistency constraint is designed and incorporated in the segmentation loss function. Experiments on publicly available LUNA16 dataset show that the framework outperforms relevant advanced methods in quantitative evaluation and visual perception.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Yuanyuan Tian ◽  
Jingyu Cao

AbstractTo accurately identify fatigued driving, establishing a monitoring system is one of the important guarantees of improving traffic safety and reducing traffic accidents. Among many research methods, electrooculogram signal (EOG) has unique advantages. This paper presents a systematic literature review of these technologies and summarizes a basic framework of fatigue driving monitoring system based on EOGs. Then we summarize the advantages and disadvantages of existing technologies. In addition, 80 primary references published during the last decade were identified. The multi-feature fusion technique based on EOGs performs better than other traditional methods due to its low cost, low power consumption and low intrusion, while its application is still limited which needs more efforts to obtain good and generalizable results. And then, an overview of the literature on technology is given, revealing a premier and unbiased survey of the existing empirical research of classification techniques that have been applied to fatigue driving analysis. Finally, this paper adds value to the current literature by investigating the application of EOG signals in fatigued driving and the design of related systems, future guidelines have been provided to practitioners and researchers to grasp the major contributions and challenges in the state-of-the-art research.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Ricardo Sanchez-Matilla ◽  
Andrea Cavallaro

AbstractOut-of-home audience measurement aims to count and characterize the people exposed to advertising content in the physical world. While audience measurement solutions based on computer vision are of increasing interest, no commonly accepted benchmark exists to evaluate and compare their performance. In this paper, we propose the first benchmark for digital out-of-home audience measurement that evaluates the vision-based tasks of audience localization and counting, and audience demographics. The benchmark is composed of a novel, dataset captured at multiple locations and a set of performance measures. Using the benchmark, we present an in-depth comparison of eight open-source algorithms on four hardware platforms with GPU and CPU-optimized inferences and of two commercial off-the-shelf solutions for localization, count, age, and gender estimation. This benchmark and related open-source codes are available at http://ava.eecs.qmul.ac.uk.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Lifang Yu ◽  
Gang Cao ◽  
Huawei Tian ◽  
Peng Cao ◽  
Zhenzhen Zhang ◽  
...  

AbstractQuick Response (QR) codes are designed for information storage and high-speed reading applications. To store additional information, Two-Level QR (2LQR) codes replace black modules in standard QR codes with specific texture patterns. When the 2LQR code is printed, texture patterns are blurred and their sizes are smaller than$$0.5{\mathrm{cm}}^{2}$$ 0.5 cm 2 . Recognizing small-sized blurred texture patterns is challenging. In original 2LQR literature, recognition of texture patterns is based on maximizing the correlation between print-and-scanned texture patterns and the original digital ones. When employing desktop printers with large pixel extensions and low-resolution capture devices, the recognition accuracy of texture patterns greatly reduces. To improve the recognition accuracy under this situation, our work presents a dictionary learning based scheme to recognize printed texture patterns. To our best knowledge, it is the first attempt to use dictionary learning to promote the recognition accuracy of printed texture patterns. In our scheme, dictionaries for all kinds of texture patterns are learned from print-and-scanned texture modules in the training stage. And these learned dictionaries are employed to represent each texture module in the testing stage (extracting process) to recognize their texture pattern. Experimental results show that our proposed algorithm significantly reduces the recognition error of small-sized printed texture patterns.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Khurram Khan ◽  
Abid Imran ◽  
Hafiz Zia Ur Rehman ◽  
Adnan Fazil ◽  
Muhammad Zakwan ◽  
...  

AbstractMultiple-license plate recognition is gaining popularity in the Intelligent Transport System (ITS) applications for security monitoring and surveillance. Advancements in acquisition devices have increased the availability of high definition (HD) images, which can capture images of multiple vehicles. Since license plate (LP) occupies a relatively small portion of an image, therefore, detection of LP in an image is considered a challenging task. Moreover, the overall performance deteriorates when the aforementioned factor combines with varying illumination conditions, such as night, dusk, and rainy. As it is difficult to locate a small object in an entire image, this paper proposes a two-step approach for plate localization in challenging conditions. In the first step, the Faster-Region-based Convolutional Neural Network algorithm (Faster R-CNN) is used to detect all the vehicles in an image, which results in scaled information to locate plates. In the second step, morphological operations are employed to reduce non-plate regions. Meanwhile, geometric properties are used to localize plates in the HSI color space. This approach increases accuracy and reduces processing time. For character recognition, the look-up table (LUT) classifier using adaptive boosting with modified census transform (MCT) as a feature extractor is used. Both proposed plate detection and character recognition methods have significantly outperformed conventional approaches in terms of precision and recall for multiple plate recognition.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Yu Su ◽  
Shuijie Wang ◽  
Qianqian Cheng ◽  
Yuhe Qiu

AbstractWith regard to video streaming services under wireless networks, how to improve the quality of experience (QoE) has always been a challenging task. Especially after the arrival of the 5G era, more attention has been paid to analyze the experience quality of video streaming in more complex network scenarios (such as 5G-powered drone video transmission). Insufficient buffer in the video stream transmission process will cause the playback to freeze [1]. In order to cope with this defect, this paper proposes a buffer starvation evaluation model based on deep learning and a video stream scheduling model based on reinforcement learning. This approach uses the method of machine learning to extract the correlation between the buffer starvation probability distribution and the traffic load, thereby obtaining the explicit evaluation results of buffer starvation events and a series of resource allocation strategies that optimize long-term QoE. In order to deal with the noise problem caused by the random environment, the model introduces an internal reward mechanism in the scheduling process, so that the agent can fully explore the environment. Experiments have proved that our framework can effectively evaluate and improve the video service quality of 5G-powered UAV.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Jianfang Cao ◽  
Yiming Jia ◽  
Minmin Yan ◽  
Xiaodong Tian

AbstractA stable enhanced superresolution generative adversarial network (SESRGAN) algorithm was proposed in this study to address the low-resolution and blurred texture details in ancient murals. This algorithm makes improvements on the basis of GANs, which use dense residual blocks to extract image features. After two upsampling steps, the feature information of the image is input into the high-resolution (HR) image space to realize an improvement in resolution, and the reconstructed HR image is finally generated. The discriminator network uses VGG as its basic framework to judge the authenticity of the input image. This study further optimized the details of the network model. In addition, three loss optimization models, i.e., the perceptual loss, content loss, and adversarial loss models, were integrated into the proposed algorithm. The Wasserstein GAN-gradient penalty (WGAN-GP) theory was used to optimize the adversarial loss of the model when calculating the perceptual loss and when using the preactivation feature information for calculation purposes. In addition, public data sets were used to pretrain the generative network model to achieve a high-quality initialization. The simulation experiment results showed that the proposed algorithm outperforms other related superresolution algorithms in terms of both objective and subjective evaluation indicators. A subjective perception evaluation was also conducted, and the reconstructed images produced by our algorithm were more in line with the general public’s visual perception than those produced by the other compared algorithms.


Sign in / Sign up

Export Citation Format

Share Document