Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easy to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.


2021 ◽  
Vol 13 (16) ◽  
pp. 3099
Author(s):  
Stephan Nebiker ◽  
Jonas Meyer ◽  
Stefan Blaser ◽  
Manuela Ammann ◽  
Severin Rhyner

A successful application of low-cost 3D cameras in combination with artificial intelligence (AI)-based 3D object detection algorithms to outdoor mobile mapping would offer great potential for numerous mapping, asset inventory, and change detection tasks in the context of smart cities. This paper presents a mobile mapping system mounted on an electric tricycle and a procedure for creating on-street parking statistics, which allow government agencies and policy makers to verify and adjust parking policies in different city districts. Our method combines georeferenced red-green-blue-depth (RGB-D) imagery from two low-cost 3D cameras with state-of-the-art 3D object detection algorithms for extracting and mapping parked vehicles. Our investigations demonstrate the suitability of the latest generation of low-cost 3D cameras for real-world outdoor applications with respect to supported ranges, depth measurement accuracy, and robustness under varying lighting conditions. In an evaluation of suitable algorithms for detecting vehicles in the noisy and often incomplete 3D point clouds from RGB-D cameras, the 3D object detection network PointRCNN, which extends region-based convolutional neural networks (R-CNNs) to 3D point clouds, clearly outperformed all other candidates. The results of a mapping mission with 313 parking spaces show that our method is capable of reliably detecting parked cars with a precision of 100% and a recall of 97%. It can be applied to unslotted and slotted parking and different parking types including parallel, perpendicular, and angle parking.
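
As a reading aid for the precision and recall figures quoted above, the following is a minimal sketch of how the two metrics are computed from detection counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the paper's data.

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall) computed from detection counts."""
    detected = true_positives + false_positives   # everything the detector reported
    actual = true_positives + false_negatives     # everything that was really there
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts (NOT taken from the paper): 97 parked cars found,
# no false alarms, 3 parked cars missed.
precision, recall = precision_recall(true_positives=97, false_positives=0, false_negatives=3)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A precision of 100% means that no reported detection was a false alarm, while a recall below 100% means that some parked cars were missed.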


2015 ◽  
Vol 66 ◽  
pp. 1-17 ◽  
Author(s):  
Rãzvan-George Mihalyi ◽  
Kaustubh Pathak ◽  
Narunas Vaskevicius ◽  
Tobias Fromm ◽  
Andreas Birk

2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
A. A. M. Muzahid ◽  
Wanggen Wan ◽  
Li Hou

Advances in low-cost RGB-D and LiDAR three-dimensional (3D) sensors have made it easier to obtain 3D models in real time. However, constructing intricate 3D features remains crucial for advancing 3D object classification. Existing volumetric voxel-based CNN approaches have achieved remarkable progress, but they incur a large computational overhead that limits the extraction of global features at higher resolutions of 3D objects. In this paper, a low-cost 3D volumetric deep convolutional neural network is proposed for 3D object classification based on joint multiscale hierarchical and subvolume supervised learning strategies. The proposed network takes as input 3D data preprocessed into a memory-efficient octree representation, and we propose limiting the full-layer octree depth to a certain level, based on the predefined input volume resolution, for storing high-precision contour features. Multiscale features are concatenated from multilevel octree depths inside the network, aiming to adaptively generate high-level global features. The subvolume supervision strategy trains the network on subparts of the 3D object in order to learn local features. Our framework has been evaluated on two publicly available 3D repositories. Experimental results demonstrate the effectiveness of the proposed method: classification accuracy improves in comparison to existing volumetric approaches, and the memory consumption ratio and run time are significantly reduced.


2016 ◽  
Vol 14 (3) ◽  
pp. 585-604 ◽  
Author(s):  
Alberto Garcia-Garcia ◽  
Sergio Orts-Escolano ◽  
Jose Garcia-Rodriguez ◽  
Miguel Cazorla

Author(s):  
Eduardo Filgueiras Damasceno ◽  
Douglas Lopes Farias ◽  
Renan Cleverson Laureano Flor da Rosa ◽  
Rafael de Souza Fernandes
Keyword(s):  
Low Cost ◽  

Author(s):  
R. Shults

This paper considers the use of low-cost photogrammetry in combination with the additional capabilities of modern smartphones. The research was carried out by documenting a World War II historical structure, the Kiev Fortified Region. Brief historical information about the object of research is given. The possibilities of using modern smartphones as measuring instruments are considered. To obtain high-quality results, the smartphone camera was calibrated, and the calibration results were subsequently used to perform 3D modeling of the defense facilities. Three defense structures in different states (destroyed, partially destroyed, and operating) were photographed. Based on the photographs, which were taken using code targets, 3D object models were constructed. To verify the accuracy of the 3D modeling, control measurements of the lines between the code targets on the objects were performed. The obtained results are satisfactory, and the technology considered in the paper can be recommended for use in archaeological and historical studies.
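
The accuracy check described above compares line lengths between code targets in the 3D model against control measurements. One common way to summarize such a comparison is a root-mean-square error (RMSE). The sketch below assumes RMSE as the summary statistic, which the abstract does not state, and the line lengths are hypothetical values used only for illustration.

```python
import math


def rmse(model_lengths, control_lengths):
    """Root-mean-square difference between model-derived and measured line lengths."""
    if len(model_lengths) != len(control_lengths) or not model_lengths:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(m - c) ** 2 for m, c in zip(model_lengths, control_lengths)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical line lengths in metres (NOT taken from the paper).
model = [2.003, 1.498, 3.001]     # distances between code targets in the 3D model
control = [2.000, 1.500, 3.000]   # the same distances measured on the object
error = rmse(model, control)
print(f"RMSE: {error * 1000:.2f} mm")
```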


Author(s):  
Pedro B. Pascoal ◽  
Daniel Mendes ◽  
Diogo Henriques ◽  
Isabel Trancoso ◽  
Alfredo Ferreira

The number of available 3D digital objects has been increasing considerably. As such, searching in large collections has been the subject of extensive research. However, the main focus has been on algorithms and techniques for classification, indexing, and retrieval. While some work has been done on query interfaces and results visualization, it does not explore natural interactions. The authors propose a speech interface for 3D object retrieval in immersive virtual environments. As a proof of concept, they developed the LS3D prototype, using the context of LEGO blocks to understand how people naturally describe such objects. A preliminary study found that participants mainly resorted to verbal descriptions. Considering these descriptions and using a low-cost visualization device, the authors developed their solution. They compared it with a commercial application through a user evaluation. Results suggest that LS3D can outperform its competitor and provides better performance and perception of results than traditional approaches to 3D object retrieval.

