scene content
Recently Published Documents

Total documents: 93 (five years: 20)
H-index: 12 (five years: 2)
2022 · Vol 54 (9) · pp. 1-36
Author(s): Xiongkuo Min, Ke Gu, Guangtao Zhai, Xiaokang Yang, Wenjun Zhang, ...

Screen content, which is often computer-generated, has many characteristics distinctly different from conventional camera-captured natural scene content. These differences pose major challenges for content quality assessment, which plays a critical role in ensuring and improving the final user-perceived quality of experience (QoE) in various screen content communication and networking systems. Quality assessment of screen content has attracted much attention recently, primarily because such content has grown explosively with the prevalence of cloud and remote computing applications, and because conventional quality assessment methods cannot handle it effectively. As the most technology-oriented part of QoE modeling, image/video content/media quality assessment has drawn wide attention from researchers, and a large amount of work has been carried out to tackle the problem of screen content quality assessment. This article provides a systematic and timely review of this emerging research field, including (1) background of natural scene vs. screen content quality assessment; (2) characteristics of natural scene vs. screen content; (3) an overview of screen content quality assessment methodologies and measures; (4) relevant benchmarks and a comprehensive evaluation of the state of the art; (5) discussion of generalizations from screen content quality assessment to QoE assessment, and of other techniques beyond QoE assessment; and (6) unresolved challenges and promising future research directions. Throughout this article, we focus on the differences and similarities between screen content and conventional natural scene content. We expect this review to provide readers with an overview of the background, history, recent progress, and future of the emerging screen content quality assessment research field.


Author(s): Weijie Yang, Yueting Hui

Image scene analysis examines image scene content through semantic segmentation, which identifies the categories and positions of different objects in an image. However, the loss of spatial detail information often degrades accuracy, producing rough edges in the output of a fully convolutional network (FCN), inconsistent class labels within target regions, and missing small targets. To address these problems, this paper enlarges the receptive field, performs multi-scale fusion, and reweights different sensitive channels, so as to improve feature discrimination and preserve or restore spatial detail information. The FCN is used as the base model for semantic segmentation, and ASPP, data augmentation, SENet, a decoder, and global pooling are added to the baseline to optimize the model structure and improve the segmentation results, yielding more accurate scene analysis.
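The channel-reweighting step borrowed from SENet can be illustrated with a minimal squeeze-and-excitation sketch. This is an illustrative NumPy version, not the paper's implementation: in practice the `w1`/`w2` weights are learned during training, and the surrounding FCN, ASPP, and decoder stages are omitted here.

```python
import numpy as np

def squeeze_excite(features, w1, b1, w2, b2):
    """Reweight the channels of a (C, H, W) feature map, SENet-style.

    Squeeze: global average pool each channel to one statistic.
    Excite: a two-layer bottleneck (ReLU, then sigmoid) turns those
    statistics into per-channel gates in (0, 1).
    Scale: multiply each channel by its gate, so informative channels
    are emphasized and the rest are suppressed.
    """
    squeezed = features.mean(axis=(1, 2))               # (C,) channel statistics
    hidden = np.maximum(0.0, w1 @ squeezed + b1)        # bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # sigmoid gates in (0, 1)
    return features * gates[:, None, None]              # rescale each channel
```

Because the gates are bounded in (0, 1), the block can only attenuate channels relative to one another; the network learns which channels to keep near 1.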


2021
Author(s): Nicola C. Anderson, Oliver Jacobs, Walter F. Bischof, Alan Kingstone

It has long been thought that visual perception is represented in sensorimotor processes that unfold over time. One prominent theory predicts that our memory for a scene consists of both the scene content and the motor commands (i.e., eye movements) used to explore that scene. This Scanpath Theory (Noton & Stark, Science 171 (1971) 308-311) has long been contested, with many studies providing evidence both for and against it. That past work, however, has failed to account for the fact that visual perception is embodied within an active system of effectors: people routinely move both their eyes and head to explore visible space. In the present work we tested Scanpath Theory while observers were free to move within a 360-degree VR environment. Their task was to encode and later recognise panoramic scenes within this fully immersive world. During both encoding and recognition, we recorded their eye and head movements using a VR headset equipped with eye and head tracking. Our results reveal that eye and head movement patterns are diagnostic of memory performance, and that scene recognition improves when certain movements that occurred during encoding are repeated. Finally, including head movement measures enhances performance prediction, strengthening the evidence for Scanpath Theory and reinforcing the fact that the head moves in service of the eyes in allocating attention.


2021 · Vol 11 (1)
Author(s): Kevin Allan, Nir Oren, Jacqui Hutchison, Douglas Martin

Abstract: If artificial intelligence (AI) is to help solve individual, societal and global problems, humans should neither underestimate nor overestimate its trustworthiness. Situated in between these two extremes is an ideal 'Goldilocks' zone of credibility. But what will keep trust in this zone? We hypothesise that this role ultimately falls to the social cognition mechanisms which adaptively regulate conformity between humans. This novel hypothesis predicts that human-like functional biases in conformity should occur during interactions with AI. We examined multiple tests of this prediction using a collaborative remembering paradigm, in which participants viewed household scenes for 30 s vs. 2 min, then saw 2-alternative forced-choice decisions about scene content originating from either AI or human sources. We manipulated the credibility of different sources (Experiment 1) and, from a single source, the estimated likelihood (Experiment 2) and objective accuracy (Experiment 3) of specific decisions. As predicted, each manipulation produced functional biases for AI sources mirroring those found for human sources. Participants conformed more to higher-credibility sources, and to higher-likelihood or more objectively accurate decisions, becoming increasingly sensitive to source accuracy when their own capability was reduced. These findings support the hypothesised role of social cognition in regulating AI's influence, raising important implications and new directions for research on human–AI interaction.


2021 · Vol 145 · pp. 8-15
Author(s): Fabio Bellavia, Marco Fanfani, Carlo Colombo, Alessandro Piva

2021 · pp. 1-13
Author(s): Elissa M. Aminoff, Michael J. Tarr

Abstract: Rapid visual perception is often viewed as a bottom-up process. Category-preferred neural regions are often characterized as automatic, default processing mechanisms for visual inputs of their categorical preference. To explore the sensitivity of such regions to top-down information, we examined three scene-preferring brain regions, the occipital place area (OPA), the parahippocampal place area (PPA), and the retrosplenial complex (RSC), and tested whether the processing of outdoor scenes is influenced by the functional contexts in which they are seen. Context was manipulated by presenting real-world landscape images as if viewed through a window or within a picture frame, manipulations that do not affect scene content but do affect one's functional knowledge regarding the scene. This manipulation influenced neural scene processing (as measured by fMRI): the OPA and the PPA exhibited greater neural activity when participants viewed images as if through a window than within a picture frame, whereas the RSC did not show this difference. In a separate behavioral experiment, functional context affected scene memory in predictable directions (boundary extension). Our interpretation is that the window context denotes three-dimensionality, rendering the perceptual experience of viewing landscapes more realistic, whereas the frame context denotes a 2-D image. As such, the more spatially biased scene representations in the OPA and the PPA are influenced by differences in top-down perceptual expectations generated from context, while the more semantically biased scene representations in the RSC are likely to be less affected by top-down signals that carry information about the physical layout of a scene.


Author(s): Oliver van Zwanenberg, Sophie Triantaphillidou, Alexandra Psarrou, Robin B. Jenkin

The Natural Scene derived Spatial Frequency Response (NS-SFR) framework automatically extracts suitable step edges from natural pictorial scenes and processes these edges via the edge-based ISO12233 (e-SFR) algorithm. Previously, a novel methodology was presented to estimate the standard e-SFR from NS-SFR data. This paper implements that method using diverse natural scene image datasets from three characterized camera systems, both linear and non-linear. Quantitative analysis was carried out on the system e-SFR estimates to validate the accuracy of the method. To investigate how scene content and dataset size affect system e-SFR estimates, analysis was conducted on the entire datasets as well as on subsets of various sizes and scene group types. Results demonstrate that system e-SFR estimates correlate strongly with results from test chart inputs, with accuracy comparable to that of the ISO12233. Further work toward improving and fine-tuning the proposed methodology for practical implementation is discussed.
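The edge-based SFR computation underlying the e-SFR algorithm can be sketched in a few lines. This is an illustrative simplification, not the NS-SFR implementation: the full ISO12233 procedure also estimates the edge angle and supersamples the edge profile across rows to defeat aliasing, steps omitted here.

```python
import numpy as np

def sfr_from_edge(esf):
    """Estimate a spatial frequency response from a 1-D edge profile.

    Differentiate the edge spread function (ESF) to obtain the line
    spread function (LSF), apply a window to suppress truncation
    leakage, then take the FFT magnitude normalized to unity at DC.
    """
    lsf = np.diff(esf)                    # ESF -> LSF
    lsf = lsf * np.hamming(lsf.size)      # taper the ends of the LSF
    spectrum = np.abs(np.fft.rfft(lsf))   # one-sided magnitude spectrum
    return spectrum / spectrum[0]         # normalize so SFR(0) = 1
```

For a perfect step edge the LSF is a single impulse, so the resulting SFR is flat at 1.0 across all frequencies; a blurred edge widens the LSF and rolls the SFR off toward high frequencies.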


2020
Author(s): Yaelan Jung, Dirk B. Walther

Abstract: Natural scenes deliver rich sensory information about the world. Decades of research have shown that the scene-selective network in the visual cortex represents various aspects of scenes. It is, however, unknown how such complex scene information is processed beyond the visual cortex, for instance in the prefrontal cortex. It is also unknown how task context impacts scene perception, modulating which scene content is represented in the brain. In this study, we investigate these questions using scene images from four natural scene categories that also depict two global scene properties: temperature (warm or cold) and sound level (noisy or quiet). A group of healthy human subjects of both sexes participated in an fMRI study in which they viewed the scene images under two task conditions: temperature judgment and sound-level judgment. We analyzed how the different scene attributes (scene categories, temperature, and sound-level information) are represented across the brain under these task conditions. Our findings show that global scene properties are represented in the brain, especially in the prefrontal cortex, only when they are task-relevant. Scene categories, however, are represented in both the parahippocampal place area and the prefrontal cortex regardless of task context. These findings suggest that the prefrontal cortex selectively represents scene content according to task demands, but that this task selectivity depends on the type of scene content: task modulates neural representations of global scene properties but not of scene categories.

