Scene Features
Recently Published Documents


TOTAL DOCUMENTS: 37 (FIVE YEARS: 13)

H-INDEX: 6 (FIVE YEARS: 1)

2021 · Vol 13 (24) · pp. 5076
Author(s): Di Wang, Jinhui Lan

Remote sensing scene classification converts remote sensing images into classification information to support high-level applications, so it is a fundamental problem in the field of remote sensing. In recent years, many convolutional neural network (CNN)-based methods have achieved impressive results in remote sensing scene classification, but they have two problems in extracting remote sensing scene features: (1) fixed-shape convolutional kernels cannot effectively extract features from remote sensing scenes with complex shapes and diverse distributions; (2) the features extracted by CNNs contain a large amount of redundant and irrelevant information. To solve these problems, this paper constructs a deformable convolutional neural network that adapts the convolutional sampling positions to the shapes of objects in the remote sensing scene. Meanwhile, spatial and channel attention mechanisms are used to focus on effective features while suppressing invalid ones. The experimental results indicate that the proposed method is competitive with state-of-the-art methods on three remote sensing scene classification datasets (UCM, NWPU, and AID).
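The abstract does not include code, but the core idea of combining deformable sampling with channel and spatial attention can be sketched as follows. This is a minimal PyTorch illustration assuming torchvision's DeformConv2d; the layer sizes and names are illustrative, not the authors' implementation.

# Minimal sketch (not the authors' code): a deformable conv block with
# channel and spatial attention, in the spirit described above.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformAttnBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, reduction=16):
        super().__init__()
        # Offsets are predicted per position: 2 values (x, y) per kernel sample.
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(in_ch, out_ch, k, padding=k // 2)
        # Channel attention (squeeze-and-excitation style).
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // reduction, out_ch, 1), nn.Sigmoid(),
        )
        # Spatial attention over pooled channel statistics (CBAM style).
        self.sa = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        y = self.deform(x, self.offset(x))   # sampling adapts to object shape
        y = y * self.ca(y)                   # re-weight informative channels
        s = torch.cat([y.mean(1, keepdim=True),
                       y.amax(1, keepdim=True)], dim=1)
        return y * self.sa(s)                # suppress uninformative locations

feats = DeformAttnBlock(64, 128)(torch.randn(2, 64, 32, 32))  # -> (2, 128, 32, 32)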


2021 · Vol 12 (5) · pp. 1-19
Author(s): Yuan Cheng, Yuchao Yang, Hai-Bao Chen, Ngai Wong, Hao Yu

Real-time segmentation and understanding of driving scenes are crucial in autonomous driving. Traditional pixel-wise approaches extract scene information by segmenting all pixels in a frame, and hence are inefficient and slow. Proposal-wise approaches learn only from the proposed object candidates, but still require multiple steps built on expensive proposal methods. Instead, this work presents a fast single-shot segmentation strategy for video scene understanding. The proposed net, called S3-Net, quickly locates and segments target sub-scenes, and meanwhile extracts attention-aware time-series sub-scene features (ats-features) as inputs to an attention-aware spatio-temporal model (ASM). Utilizing tensorization and quantization techniques, S3-Net is designed to be lightweight for edge computing. Experimental results on the CityScapes, UCF11, HMDB51, and MOMENTS datasets demonstrate that the proposed S3-Net achieves an accuracy improvement of 8.1% over the 3D-CNN-based approach on UCF11, a storage reduction of 6.9×, and an inference speed of 22.8 FPS on CityScapes with a GTX1080Ti GPU.
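A minimal sketch of the general pattern described above, assuming PyTorch: attention-weighted pooling over per-frame sub-scene feature vectors, followed by post-training dynamic quantization to shrink the model for edge deployment. The module, dimensions, and dataset sizes are hypothetical, not the released S3-Net.

# Hypothetical sketch, not S3-Net itself: temporal attention over per-frame
# sub-scene features, then int8 dynamic quantization for edge deployment.
import torch
import torch.nn as nn

class TemporalAttnClassifier(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, n_classes=11):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)         # scores each time step
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, feats):                    # feats: (B, T, feat_dim)
        h, _ = self.rnn(feats)                   # (B, T, hidden)
        w = torch.softmax(self.attn(h), dim=1)   # attention over time
        return self.head((w * h).sum(dim=1))     # weighted temporal pooling

model = TemporalAttnClassifier()
logits = model(torch.randn(4, 16, 256))          # 4 clips, 16 frames each
# Dynamic quantization of GRU/Linear weights (int8) reduces model storage.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear, nn.GRU}, dtype=torch.qint8)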


2021 · Vol 11 (1)
Author(s): Taylor R. Hayes, John M. Henderson

Abstract: Deep saliency models represent the current state-of-the-art for predicting where humans look in real-world scenes. However, for deep saliency models to inform cognitive theories of attention, we need to know how deep saliency models prioritize different scene features to predict where people look. Here we open the black box of three prominent deep saliency models (MSI-Net, DeepGaze II, and SAM-ResNet) using an approach that models the association between attention, deep saliency model output, and low-, mid-, and high-level scene features. Specifically, we measured the association between each deep saliency model and low-level image saliency, mid-level contour symmetry and junctions, and high-level meaning by applying a mixed effects modeling approach to a large eye movement dataset. We found that all three deep saliency models were most strongly associated with high-level and low-level features, but exhibited qualitatively different feature weightings and interaction patterns. These findings suggest that prominent deep saliency models are primarily learning image features associated with high-level scene meaning and low-level image saliency and highlight the importance of moving beyond simply benchmarking performance.
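The mixed effects analysis can be illustrated roughly as follows. This is a hypothetical sketch using statsmodels; the long-format table, column names, and file are placeholders, not the authors' dataset or code.

# Hypothetical sketch: a linear mixed-effects model relating attention-map
# values to feature predictors, with a random intercept per scene.
import pandas as pd
import statsmodels.formula.api as smf

# One row per scene grid cell; columns are placeholders.
df = pd.read_csv("scene_feature_maps_long.csv")
m = smf.mixedlm("attention ~ saliency + symmetry + junctions + meaning",
                data=df, groups=df["scene_id"]).fit()
print(m.summary())   # fixed-effect weights per feature type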


2021
Author(s): Taylor R. Hayes, John M. Henderson

Abstract: Deep saliency models represent the current state-of-the-art for predicting where humans look in real-world scenes. However, for deep saliency models to inform cognitive theories of attention, we need to know how deep saliency models predict where people look. Here we open the black box of deep saliency models using an approach that models the association between the output of three prominent deep saliency models (MSI-Net, DeepGaze II, and SAM-ResNet) and low-, mid-, and high-level scene features. Specifically, we measured the association between each deep saliency model and low-level image saliency, mid-level contour symmetry and junctions, and high-level meaning by applying a mixed effects modeling approach to a large eye movement dataset. We found that despite different architectures, training regimens, and loss functions, all three deep saliency models were most strongly associated with high-level meaning. These findings suggest that deep saliency models are primarily learning image features associated with scene meaning.


Author(s): Raksha Ramesh, Vishal Anand, Ziyin Wang, Tianle Zhu, Wenfeng Lyu, ...

2020
Author(s): Nina Kowalewski, Janne Kauttonen, Patricia L. Stan, Brian B. Jeon, Thomas Fuchs, ...

Summary: The development of the visual system is known to be shaped by early-life experience. To identify response properties that contribute to enhanced natural scene representation, we performed calcium imaging of excitatory neurons in the primary visual cortex (V1) of awake mice raised in three different conditions (standard-reared, dark-reared, and delayed visual experience) and compared neuronal responses to natural scene features relative to simpler grating stimuli that varied in orientation and spatial frequency. We assessed population selectivity in V1 using decoding methods and found that natural scene discriminability increased by 75% between 4 and 6 weeks of age. Both natural scene and grating discriminability were higher in standard-reared animals than in those raised in the dark. This increase in discriminability was accompanied by a reduction in the number of neurons that responded to low-spatial-frequency gratings, together with an increase in neuronal preference for natural scenes. Light exposure restricted to a 2-4 week window during adulthood did not improve natural scene or grating stimulus discriminability. Our results demonstrate that experience reduces the number of neurons required to effectively encode grating stimuli and that early visual experience enhances natural scene discriminability by directly increasing responsiveness to natural scene features.
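The decoding-based measure of discriminability can be illustrated in outline with a hypothetical scikit-learn sketch on placeholder data; the actual pipeline, preprocessing, and cross-validation scheme are not specified in the abstract.

# Hypothetical sketch: cross-validated decoding of stimulus identity from a
# trial-by-neuron V1 response matrix, as a simple proxy for discriminability.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 120))       # 400 trials x 120 neurons (placeholder data)
y = rng.integers(0, 20, size=400)     # 20 natural-scene stimuli

decoder = make_pipeline(StandardScaler(), LinearSVC(C=0.1, max_iter=5000))
acc = cross_val_score(decoder, X, y, cv=5).mean()
print(f"decoding accuracy: {acc:.2f}")  # compare across ages / rearing conditions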


Author(s): L. Joachim, T. Storch

Abstract. Cloud detection for night-time panchromatic visible and near-infrared (VNIR) satellite imagery is typically performed based on synchronized observations in the thermal infrared (TIR). To be independent of TIR and to improve existing algorithms, we realize and analyze cloud detection based on VNIR only, here NPP/VIIRS/DNB observations. Using Random Forest to classify cloud vs. clear and focusing on urban areas, we illustrate the importance of features describing (a) the scattering by clouds, especially over urban areas with their inhomogeneous light emissions, and (b) the normalized differences between Earth's surface and cloud albedo, especially in the presence of Moon illumination. The analyses substantiate the influence on the test-site results of (a) the training site and scene selections and (b) the use of single-scene versus multi-temporal scene features. As test sites, diverse urban areas and the challenging land covers ocean, desert, and snow are considered. Accuracies of up to 85% are achieved for urban test sites.
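A rough illustration of the classification step, assuming scikit-learn; the feature names, file, and hyperparameters are placeholders rather than the operational configuration described in the paper.

# Hypothetical sketch: Random Forest cloud vs. clear classification from
# hand-crafted VNIR pixel features (scattering texture, albedo contrast, ...).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("dnb_pixel_features.csv")               # one row per labeled pixel
X = df[["radiance", "local_variance", "albedo_contrast", "moon_illum"]]
y = df["is_cloud"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, rf.predict(X_te)))
print(dict(zip(X.columns, rf.feature_importances_)))      # which features matter most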

