Perceptual Quality: Recently Published Documents

Total documents: 612 (last five years: 192)
H-index: 27 (last five years: 4)

2022, Vol 12 (2), pp. 824
Author(s): Kamran Javed, Nizam Ud Din, Ghulam Hussain, Tahir Farooq

Face photographs taken on a bright sunny day or under floodlights contain unwanted shadows of objects on the face. Most previous works deal with removing shadows from scene images and struggle to do so for facial images. Faces have a complex semantic structure, which makes shadow removal challenging. The aim of this research is to remove the shadow of an object in facial images. We propose a novel generative adversarial network (GAN) based image-to-image translation approach for shadow removal in face images. The first stage of our model automatically produces a binary segmentation mask for the shadow region. The second stage, a GAN-based network, then removes the object shadow and synthesizes the affected region. The generator network of our GAN has two parallel encoders: a standard convolution path and a partial convolution path. We find that this combination in the generator not only learns an integrated semantic structure but also disentangles visual discrepancies under the shadow area. In addition to the GAN loss, we exploit a low-level L1 loss, a structural-level SSIM loss, and a perceptual loss from a pre-trained network for better texture and perceptual quality. Since there is no paired dataset for the shadow removal problem, we created a synthetic shadow dataset for training our network in a supervised manner. The proposed approach effectively removes shadows from real and synthetic test samples while retaining complex facial semantics. Experimental evaluations consistently show the advantages of the proposed method over several representative state-of-the-art approaches.
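The training objective described in this abstract combines an adversarial term with L1, SSIM, and perceptual terms. A minimal sketch of such a weighted combination follows; the weights and the use of (1 - SSIM) as a loss are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error over pixels (the low-level reconstruction term)."""
    return np.mean(np.abs(pred - target))

def combined_loss(adv, l1, ssim, perc,
                  w_adv=1.0, w_l1=100.0, w_ssim=1.0, w_perc=10.0):
    """Weighted sum of the four training terms. The weights are assumed
    for illustration. SSIM is a similarity in [0, 1], so (1 - ssim)
    is used as the corresponding loss term."""
    return w_adv * adv + w_l1 * l1 + w_ssim * (1.0 - ssim) + w_perc * perc

pred = np.zeros((4, 4))
target = np.full((4, 4), 0.5)
loss = combined_loss(adv=0.7, l1=l1_loss(pred, target), ssim=0.9, perc=0.2)
```

In practice each term would come from the network outputs during training; here scalar stand-ins show only how the terms are blended.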


2022, Vol 15
Author(s): Chenxi Feng, Long Ye, Qin Zhang

This work proposes an end-to-end cross-domain feature similarity guided deep neural network for perceptual quality assessment. Our proposed blind image quality assessment (BIQA) approach is based on the observation that feature similarity across different domains (e.g., semantic recognition and quality prediction) correlates well with subjective quality annotations. We validate this phenomenon by thoroughly analyzing the intrinsic interaction between an object recognition task and a quality prediction task in terms of the characteristics of the human visual system. Based on this observation, we design an explicable and self-contained cross-domain feature similarity guided BIQA framework. Experimental results on both authentic and synthetic image quality databases demonstrate the superiority of our approach compared to state-of-the-art models.
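The core observation is that similarity between features from two task domains tracks perceptual quality. The exact similarity measure used in the paper is not given here; a common choice, cosine similarity between flattened feature vectors, is sketched below as an assumption.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two flattened feature vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Hypothetical features extracted for the same image by a semantic
# recognition branch and a quality prediction branch.
f_recog = np.array([0.2, 0.8, 0.1])
f_qual = np.array([0.2, 0.8, 0.1])
score = cosine_similarity(f_recog, f_qual)  # identical features give ~1.0
```

A high score would indicate that both branches "see" the image consistently, which the abstract reports is correlated with subjective quality.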


2022, Vol 2022 (1)
Author(s): Shahi Dost, Faryal Saud, Maham Shabbir, Muhammad Gufran Khan, Muhammad Shahid, et al.

Abstract. With the growing demand for image- and video-based applications, the requirements for consistent quality assessment metrics of images and videos have increased. Different approaches have been proposed in the literature to estimate the perceptual quality of images and videos. These approaches can be divided into three main categories: full reference (FR), reduced reference (RR), and no-reference (NR). In RR methods, instead of providing the original image or video as a reference, we need to provide certain features (i.e., texture, edges, etc.) of the original image or video for quality assessment. During the last decade, RR-based quality assessment has been a popular research area for a variety of applications such as social media, online games, and video streaming. In this paper, we present a review and classification of the latest research work on RR-based image and video quality assessment. We also summarize the different databases used in the field of 2D and 3D image and video quality assessment. This paper will help specialists and researchers stay well-informed about recent progress in RR-based image and video quality assessment. The review and classification presented here will also be useful for gaining an understanding of multimedia quality assessment and the state-of-the-art approaches used for the analysis. In addition, it will help the reader select appropriate quality assessment methods and parameters for their respective applications.
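An RR method transmits only compact features of the reference (texture, edges, etc.) and compares them to the same features of the received image. A minimal sketch under assumed choices: a normalized gradient-magnitude histogram as the transmitted feature, and an L1 distance between histograms as the degradation proxy. Neither choice is prescribed by this survey.

```python
import numpy as np

def edge_histogram(img, bins=8):
    """Compact RR feature: normalized histogram of gradient magnitudes,
    a stand-in for the texture/edge features an RR method might transmit."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0.0, mag.max() + 1e-8))
    return hist / hist.sum()

def rr_distance(ref_feat, dist_feat):
    """Quality degradation proxy: L1 distance between feature histograms."""
    return float(np.abs(ref_feat - dist_feat).sum())

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
noisy = ref + 0.5 * rng.standard_normal((32, 32))
d_self = rr_distance(edge_histogram(ref), edge_histogram(ref))    # 0 for identical images
d_noisy = rr_distance(edge_histogram(ref), edge_histogram(noisy))  # grows with distortion
```

Only the 8-bin histogram needs to travel with the video, which is the point of the RR setting: far less side information than a full reference.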


Author(s): Charles Bonnineau, Wassim Hamidouche, Jerome Fournier, Naty Sidaty, Jean-Francois Travers, et al.

2021, Vol 13 (4), pp. 1-24
Author(s): Jessica Chen, Henry Milner, Ion Stoica, Jibin Zhan

The HTTP adaptive streaming technique opened the door to coping with fluctuating network conditions during the streaming process by dynamically adjusting the volume of future chunks to be downloaded. The bitrate selection in this adjustment inevitably involves predicting the future throughput of a video session, for which various heuristic solutions have been explored. The ultimate goal of the present work is to explore the theoretical upper bounds of the QoE that any adaptive bitrate (ABR) algorithm can possibly reach, thereby providing an essential step toward benchmarking the performance of ABR algorithms. In our setting, the QoE is defined as a linear combination of the average perceptual quality and the buffering ratio. The optimization problem is proven to be NP-hard when the perceptual quality is defined by chunk size, and conditions are given under which the problem becomes polynomially solvable. Enriched by a global lower bound, a pseudo-polynomial time algorithm based on dynamic programming is presented. When minimum buffering is given higher priority than higher perceptual quality, the problem is also shown to be NP-hard, and the above algorithm is simplified and enhanced by a sequence of lower bounds on the completion time of chunk downloading, which, according to our experiments, brings a 36.0% improvement in computation time. To handle large amounts of data more efficiently, a polynomial-time algorithm is also introduced to approximate the optimal values when minimum buffering is prioritized. Besides its performance guarantee, this algorithm is shown to reach 99.938% of the optimal results while taking only 0.024% of the computation time of the exact dynamic programming algorithm.
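The QoE objective above is a linear combination of average perceptual quality and buffering ratio. A minimal sketch of that combination is shown here; the weights alpha and beta are placeholder assumptions, since the paper's actual coefficients are not given in the abstract.

```python
def qoe(chunk_qualities, buffering_time, playback_time, alpha=1.0, beta=4.0):
    """QoE = alpha * (average perceptual quality) - beta * (buffering ratio).
    alpha and beta are illustrative weights, not the paper's values."""
    avg_quality = sum(chunk_qualities) / len(chunk_qualities)
    buffering_ratio = buffering_time / (buffering_time + playback_time)
    return alpha * avg_quality - beta * buffering_ratio

# Three chunks with per-chunk quality scores, 2 s of stalls in a 20 s session.
score = qoe([3.0, 4.0, 5.0], buffering_time=2.0, playback_time=18.0)
```

An ABR algorithm would choose chunk bitrates to maximize this score subject to predicted throughput, which is the optimization the paper proves NP-hard in the general case.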


2021, Vol 14 (1), pp. 24
Author(s): Yuan Hu, Lei Chen, Zhibin Wang, Xiang Pan, Hao Li

Deep-learning-based radar echo extrapolation methods have achieved remarkable progress in the precipitation nowcasting field. However, they suffer from a common, notorious problem: they tend to produce blurry predictions. Although some efforts have been made in recent years, the blurring problem is still under-addressed. In this work, we propose three effective strategies to help deep-learning-based radar echo extrapolation methods achieve more realistic and detailed predictions. Specifically, we propose a spatial generative adversarial network (GAN) and a spectrum GAN to improve image fidelity. The spatial and spectrum GANs penalize the distribution discrepancy between generated and real images in the spatial and spectral domains, respectively. In addition, a masked style loss is devised to further enhance details by transferring the detailed texture of ground-truth radar sequences to extrapolated ones. We apply a foreground mask to prevent background noise from transferring to the outputs. Moreover, we design a new metric termed the power spectral density score (PSDS) to quantify perceptual quality from a frequency perspective. The PSDS metric can be applied as a complement to other visual evaluation metrics (e.g., LPIPS) to achieve a comprehensive measurement of image sharpness. We test our approaches with both ConvLSTM and U-Net baselines, and comprehensive ablation experiments on the SEVIR dataset show that the proposed approaches produce much more realistic radar images than the baselines. Most notably, our methods can be readily applied to any deep-learning-based spatiotemporal forecasting model to acquire more detailed results.
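A frequency-domain sharpness measure like PSDS starts from the power spectrum of the image: blurry predictions lose high-frequency power. The sketch below computes a radially averaged power spectrum, a common building block for such metrics; the paper's exact PSDS formula may differ.

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged power spectral density of a 2-D field.
    Blurry images show suppressed power at large radii (high frequencies)."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    # Mean power within each integer-radius frequency bin.
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / counts

rng = np.random.default_rng(1)
field = rng.standard_normal((64, 64))
spec = radial_power_spectrum(field)  # one value per radial frequency bin
```

Comparing such spectra between generated and ground-truth radar images quantifies, per frequency band, how much fine-scale variability the model reproduces.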


2021, pp. 1-11
Author(s): Haoran Wu, Fazhi He, Yansong Duan, Xiaohu Yan

Pose transfer, which synthesizes a new image of a target person in a novel pose, is valuable in several applications. Generative adversarial network (GAN) based pose transfer is a new way to perform person re-identification (re-ID). Typical perceptual metrics, like Detection Score (DS) and Inception Score (IS), are employed only to assess visual quality after generation in the pose transfer task. Thus, the existing GAN-based methods do not directly benefit from these metrics, which are highly associated with human ratings. In this paper, a perceptual metrics guided GAN (PIGGAN) framework is proposed to intrinsically optimize the generation process for the pose transfer task. Specifically, a novel and general Evaluator model that matches the GAN well is designed. Accordingly, a new Sort Loss (SL) is constructed to optimize perceptual quality. Moreover, PIGGAN is highly flexible and extensible and can incorporate both differentiable and non-differentiable metrics to optimize the pose transfer process. Extensive experiments show that PIGGAN can generate photo-realistic results and quantitatively outperforms state-of-the-art (SOTA) methods.
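The abstract does not define Sort Loss; as an illustration of the underlying idea of rank-based supervision, the sketch below uses a generic pairwise ranking (hinge) loss that penalizes pairs ordered differently from their perceptual-quality labels. This is only an assumed stand-in, not the paper's formulation.

```python
def pairwise_ranking_loss(scores, labels, margin=0.5):
    """Generic pairwise ranking loss: for each pair where sample i is
    labeled better than sample j, penalize the model unless its score
    for i exceeds its score for j by at least `margin`."""
    loss, pairs = 0.0, 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                loss += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return loss / max(pairs, 1)

# Correct ordering with a gap larger than the margin incurs no loss.
loss = pairwise_ranking_loss([0.9, 0.1], [1.0, 0.0])
```

Losses of this shape let a generator be trained against an evaluator's ranking even when the evaluator's underlying metric is not differentiable.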


2021, Vol 14 (12), pp. 7729-7747
Author(s): Andrew Geiss, Joseph C. Hardin

Abstract. Missing and low-quality data regions are a frequent problem for weather radars. They stem from a variety of sources: beam blockage, instrument failure, near-ground blind zones, and many others. Filling in missing data regions is often useful for estimating local atmospheric properties and for applying high-level data processing schemes (feature detection and tracking, for instance) without the need for preprocessing and error-handling steps. Interpolation schemes are typically used for this task, though they tend to produce unrealistically smooth results that are not representative of the atmospheric turbulence and variability usually resolved by weather radars. Recently, generative adversarial networks (GANs) have achieved impressive results in the area of photo inpainting. Here, they are demonstrated as a tool for infilling radar missing data regions. These neural networks can extend large-scale cloud and precipitation features that border missing data regions into those regions while hallucinating plausible small-scale variability. In other words, they can inpaint missing data with accurate large-scale features and plausible local small-scale features. This method is demonstrated on a scanning C-band radar and a vertically pointing Ka-band radar that were deployed as part of the Cloud, Aerosol, and Complex Terrain Interactions (CACTI) field campaign. Three missing data scenarios are explored: infilling low-level blind zones and short outage periods for the Ka-band radar, and infilling beam blockage areas for the C-band radar. Two deep-learning-based approaches are tested: a convolutional neural network (CNN) and a GAN that optimize pixel-level error or combined pixel-level error and adversarial loss, respectively. Both deep-learning approaches significantly outperform traditional inpainting schemes under several pixel-level and perceptual quality metrics.
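Inpainting training and evaluation typically score the reconstruction only inside the missing region, since the rest of the scan is already known. A minimal sketch of such a masked pixel-level error follows; the specific loss terms used in the paper (and the GAN's adversarial term) are not reproduced here.

```python
import numpy as np

def masked_mae(pred, target, missing_mask):
    """Pixel-level reconstruction error evaluated only inside the
    missing-data region indicated by `missing_mask`."""
    m = missing_mask.astype(bool)
    return float(np.mean(np.abs(pred[m] - target[m])))

truth = np.ones((4, 4))
pred = np.zeros((4, 4))
pred[:, 2:] = 1.0                              # right half infilled correctly
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True                             # left half is the missing region
err = masked_mae(pred, truth, mask)            # error counted only where data were missing
```

Restricting the loss to the mask keeps the network from being rewarded for merely copying the observed pixels outside the gap.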


2021, Vol 923 (1), pp. 76
Author(s): Junlan Deng, Wei Song, Dan Liu, Qin Li, Ganghua Lin, et al.

Abstract. In recent years, new physics of the Sun has been revealed using advanced data with high spatial and temporal resolution. The Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory has accumulated abundant observation data for the study of solar activity with sufficient cadence, but its spatial resolution (about 1″) is not enough to analyze the subarcsecond structure of the Sun. On the other hand, high-resolution observations from large-aperture ground-based telescopes, such as the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory, can achieve a much higher resolution on the order of 0.″1 (about 70 km). However, these high-resolution data only became available in the past 10 yr, with a limited time period during the day and a very limited field of view. The generative adversarial network (GAN) has greatly improved the perceptual quality of images in image translation tasks, and the self-attention mechanism can retrieve rich information from images. This paper uses HMI and GST images to construct a precisely aligned data set based on the scale-invariant feature transform algorithm and to reconstruct the HMI continuum images with four times better resolution. Neural networks based on the conditional GAN and self-attention mechanism are trained to restore the details of solar active regions and to predict the reconstruction error. The experimental results show that the reconstructed images are in good agreement with GST images, demonstrating the success of resolution improvement using machine learning.
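The self-attention mechanism mentioned above lets every spatial position attend to every other one. A minimal single-head sketch over flattened positions is shown below; the projection matrices here are random placeholders, and the paper's actual architecture (conditional GAN plus attention) is not reproduced.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Minimal single-head self-attention over a sequence of feature
    vectors; image models apply this over flattened spatial positions."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable softmax over each row of attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(2)
x = rng.standard_normal((16, 8))                     # 16 positions, 8-dim features
wq, wk, wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = self_attention(x, wq, wk, wv)                  # same shape as the input
```

Because every output position is a weighted mix of all positions, attention can pull in long-range context, which is the "rich information" the abstract refers to.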

