Scene Representation
Recently Published Documents

TOTAL DOCUMENTS: 132 (FIVE YEARS: 32)
H-INDEX: 16 (FIVE YEARS: 4)

2021 ◽  
Vol 13 (17) ◽  
pp. 9681
Author(s):  
Matteo Miani ◽  
Matteo Dunnhofer ◽  
Christian Micheloni ◽  
Andrea Marini ◽  
Nicola Baldo

Improving pedestrian safety at urban intersections requires intelligent systems that should not only understand the actual vehicle–pedestrian (V2P) interaction state but also proactively anticipate the future severity pattern of the event. This paper presents a Gated Recurrent Unit (GRU)-based system that aims to predict, up to 3 s ahead in time, the severity level of V2P encounters from the current scene representation drawn from on-board radar data. A car-driving simulator experiment was designed to collect sequential mobility features from a cohort of 65 licensed university students who faced different V2P conflicts along a planned urban route. To accurately describe the pedestrian safety condition during the encounter process, a combination of surrogate safety indicators, namely TAdv (Time Advantage) and T2 (Nearness of the Encroachment), is considered for modeling. Owing to the nature of these indicators, multiple recurrent neural networks are trained to separately predict T2 continuous values and TAdv categories, and their predictions are then combined to label serious conflict interactions. For comparison, an additional GRU neural network is developed to directly predict the severity level of inner-city encounters. This latter model reaches the best performance on the test set, scoring a recall of 0.899. Based on selected threshold values, the presented models can be used to label pedestrian near-accident events and to enhance existing intelligent driving systems.
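As a hedged illustration of the direct-prediction variant, the following PyTorch sketch shows a GRU that maps a sequence of mobility features to severity-class logits; the feature count, hidden size, sequence length, and class count are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class SeverityGRU(nn.Module):
    """Sketch of a GRU that maps a sequence of radar-derived mobility
    features to a conflict-severity class (dimensions are illustrative)."""

    def __init__(self, n_features=8, hidden_size=64, n_classes=2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, time_steps, n_features), i.e. the mobility features
        # observed before the prediction horizon of a V2P encounter
        _, h_n = self.gru(x)               # h_n: (1, batch, hidden_size)
        return self.head(h_n.squeeze(0))   # logits per severity class

# Example: 30 time steps of 8 features for a batch of 4 encounters
model = SeverityGRU()
logits = model(torch.randn(4, 30, 8))
print(logits.shape)  # torch.Size([4, 2])
```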


2021 ◽  
Vol 13 (16) ◽  
pp. 3113
Author(s):  
Ming Li ◽  
Lin Lei ◽  
Yuqi Tang ◽  
Yuli Sun ◽  
Gangyao Kuang

Remote sensing image scene classification (RSISC) has broad application prospects, but related challenges still exist and urgently need to be addressed. One of the most important is how to learn a strongly discriminative scene representation. Recently, convolutional neural networks (CNNs) have shown great potential in RSISC due to their powerful feature learning ability; however, their performance may be restricted by the complexity of remote sensing images, such as spatial layout, varying scales, complex backgrounds, and category diversity. In this paper, we propose an attention-guided multilayer feature aggregation network (AGMFA-Net) that improves scene classification performance by effectively aggregating features from different layers. Specifically, to reduce the discrepancies between layers, we employ channel–spatial attention on multiple high-level convolutional feature maps to more accurately capture the semantic regions that correspond to the content of the given scene. We then use the learned semantic regions as guidance to aggregate the valuable information from multilayer convolutional features, so as to obtain stronger scene features for classification. Experimental results on three remote sensing scene datasets indicate that our approach achieves competitive classification performance compared to the baselines and other state-of-the-art methods.
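A channel–spatial attention module of this kind can be sketched as follows (a CBAM-style layout in PyTorch; the exact design of the AGMFA-Net module is an assumption here):

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style channel-spatial attention over a convolutional feature map.
    Illustrative sketch; the AGMFA-Net module may differ in detail."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, re-weight channels
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: squeeze channels, highlight semantic regions
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)            # channel re-weighting
        avg = x.mean(dim=1, keepdim=True)      # (B, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)     # (B, 1, H, W)
        return x * self.spatial_conv(torch.cat([avg, mx], dim=1))

# The resulting attention map can guide multilayer feature aggregation
attn = ChannelSpatialAttention(512)
feat = attn(torch.randn(2, 512, 14, 14))
```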


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1446
Author(s):  
Zhouyan He ◽  
Yang Song ◽  
Caiming Zhong ◽  
Li Li

The multi-exposure fusion (MEF) technique offers a new means of natural scene representation, and the related quality assessment issues urgently need to be considered to validate the effectiveness of such techniques. In this paper, a curvature and entropy statistics-based blind MEF image quality assessment (CE-BMIQA) method is proposed to perceive quality degradation objectively. The transformation from multiple images with different exposure levels to the final MEF image leads to a loss of structure and detail information, so curvature statistics features and entropy statistics features are utilized to portray this distortion. The former are extracted from the histogram statistics of a surface-type map computed from the mean curvature and Gaussian curvature of the MEF image, with contrast energy weighting attached to account for the contrast variation of the MEF image. The latter comprise spatial entropy and spectral entropy. All features, extracted under a multi-scale scheme, are aggregated by training a quality regression model via random forest. Since the MEF image and its feature representation are spatially symmetric in physics, the final predicted quality is symmetric to, and representative of, the image distortion. Experimental results on a public MEF image database demonstrate that the proposed CE-BMIQA method outperforms state-of-the-art blind image quality assessment methods.
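The two entropy statistics features lend themselves to a short sketch; the block size, bit depth, and normalization below are illustrative assumptions, not the CE-BMIQA settings:

```python
import numpy as np

def spatial_entropy(block, bins=256):
    """Shannon entropy of the intensity histogram of an image block."""
    hist, _ = np.histogram(block, bins=bins, range=(0, 255))
    p = hist[hist > 0] / hist.sum()
    return -np.sum(p * np.log2(p))

def spectral_entropy(block):
    """Shannon entropy of the normalized DFT power spectrum of a block."""
    power = np.abs(np.fft.fft2(block)) ** 2
    p = power.ravel() / power.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Example on a hypothetical 8-bit grayscale MEF image patch
patch = np.random.randint(0, 256, (64, 64)).astype(np.float64)
print(spatial_entropy(patch), spectral_entropy(patch))
```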


2021 ◽  
Author(s):  
Jieting Luo ◽  
Yan Wo ◽  
Bicheng Wu ◽  
Guoqiang Han

Author(s):  
E. K. Stathopoulou ◽  
S. Rigon ◽  
R. Battisti ◽  
F. Remondino

Abstract. Mesh models generated by multi-view stereo (MVS) algorithms often fail to adequately represent the sharp, natural edge details of the scene. The harsh depth discontinuities of edge regions pose a challenging task for dense reconstruction, while vertex displacement during mesh refinement frequently leads to smoothed edges that do not coincide with the fine details of the scene. Meanwhile, 3D edges have long been used for scene representation, particularly of man-made built environments, which are dominated by regular planar and linear structures. Indeed, 3D edge detection and matching are commonly exploited either to constrain camera pose estimation, to generate an abstract representation of the most salient parts of the scene, or to support mesh reconstruction. In this work, we jointly use 3D edge extraction and MVS mesh generation to promote the preservation of edge detail in the final result. Salient 3D edges of the scene are reconstructed with state-of-the-art algorithms and integrated into the dense point cloud to support the mesh triangulation step. Experiments on benchmark dataset sequences are evaluated with metric and appearance-based measures in order to assess our hypothesis.
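A minimal sketch of this data flow, using Open3D and hypothetical file names (the authors' MVS pipeline and triangulation method are not reproduced here), might look as follows:

```python
import open3d as o3d

# Hypothetical inputs: a dense MVS point cloud and reconstructed 3D edge points
dense = o3d.io.read_point_cloud("dense_mvs.ply")   # placeholder file names
edges = o3d.io.read_point_cloud("edges_3d.ply")

# Integrate the salient edge points into the dense cloud so that the
# surface reconstruction is supported along depth discontinuities
merged = dense + edges
merged.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

# Mesh the edge-augmented cloud; Poisson reconstruction stands in here
# for the paper's triangulation step, purely to illustrate the idea
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    merged, depth=9)
o3d.io.write_triangle_mesh("edge_aware_mesh.ply", mesh)
```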


2021 ◽  
Author(s):  
Kathirvel N ◽  
Thanabal M.S

Abstract. In this paper, a simple yet effective recognition system for indoor scenes is presented. The proposed system has two phases: the creation of mandatory and desirable object lists, and an indoor scene recognition system. In the first phase, a list of probable objects for any generic scene, classified as mandatory or desirable, is created from real-time indoor environments combined with human knowledge of standard datasets. The second phase comprises four stages. In the first stage, the system identifies and recognizes the objects in a given key frame using a simplified version of the YOLOv3 CNN architecture. In the second stage, the identified objects are divided into mandatory and desirable sets with a simple dictionary look-up. In the third stage, the objects are associated with a probable scene; this technique is called scene–object identification. Simple algorithms are proposed to implement these three stages. In the final stage, a novel Binary Scene Representation (BSR) is proposed for each probable scene, and the final scene recognition is obtained from a scene-number computed by converting the binary BSR into the decimal number system. The proposed indoor scene recognition system has been evaluated on standard datasets in terms of standard measures and compared with existing schemes. The results are encouraging.
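The BSR idea can be illustrated with a small sketch; the scene vocabularies below are hypothetical, not the paper's object lists:

```python
# Illustrative sketch of the Binary Scene Representation (BSR) idea:
# each probable scene is encoded as a presence bit-vector over its
# object vocabulary, then converted to a decimal scene-number.

SCENE_OBJECTS = {
    "kitchen": ["stove", "sink", "fridge", "table"],   # mandatory + desirable
    "bedroom": ["bed", "wardrobe", "lamp", "table"],
}

def bsr(scene, detected):
    """Bit-vector of object presence for one probable scene."""
    return [1 if obj in detected else 0 for obj in SCENE_OBJECTS[scene]]

def scene_number(bits):
    """Convert the binary BSR into the decimal scene-number."""
    return int("".join(map(str, bits)), 2)

detected = {"bed", "lamp", "table"}       # e.g. output of the detector stage
for scene in SCENE_OBJECTS:
    bits = bsr(scene, detected)
    print(scene, bits, scene_number(bits))
# The scene whose BSR best matches the detections is recognized
```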


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1243
Author(s):  
Saba Arshad ◽  
Gon-Woo Kim

Loop closure detection is of vital importance in simultaneous localization and mapping (SLAM), as it helps to reduce the cumulative error in the robot's estimated pose and to generate a consistent global map. Many variations of this problem have been considered in the past, and existing methods differ in how query and reference views are acquired, in the choice of scene representation, and in the associated matching strategy. The contributions of this survey are many-fold. It provides a thorough study of the existing literature on loop closure detection algorithms for visual and Lidar SLAM, discussing their insights along with their limitations. It presents a taxonomy of state-of-the-art deep learning-based loop detection algorithms with detailed comparison metrics, and it identifies the major challenges of conventional approaches. Building on those challenges, deep learning-based methods are reviewed, focusing on those that tackle the identified challenges and provide long-term autonomy under varying conditions such as changing weather, light, seasons, viewpoint, and occlusion due to the presence of mobile objects. Finally, open challenges and future directions are discussed.
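As a minimal illustration of the descriptor-matching strategy common to many of the surveyed methods (not any specific algorithm from the survey), the sketch below matches per-frame global descriptors, which could be bag-of-words vectors or CNN embeddings:

```python
import numpy as np

def detect_loop(descriptors, query_idx, sim_threshold=0.9, exclude_recent=50):
    """Cosine-similarity loop closure detection over per-frame global
    descriptors; threshold and exclusion window are illustrative."""
    query = descriptors[query_idx]
    best_idx, best_sim = None, -1.0
    # Skip temporally adjacent frames, which are trivially similar
    for i in range(query_idx - exclude_recent):
        ref = descriptors[i]
        sim = np.dot(query, ref) / (np.linalg.norm(query) * np.linalg.norm(ref))
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return (best_idx, best_sim) if best_sim >= sim_threshold else (None, best_sim)

# Example with random embeddings for 200 frames of dimension 128
descs = np.random.rand(200, 128)
print(detect_loop(descs, query_idx=199))
```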

