Object Detection with Neural Models, Deep Learning and Common Sense to Aid Smart Mobility

Author(s):  
Abidha Pandey ◽  
Manish Puri ◽  
Aparna Varde

Journal of Imaging ◽  
2021 ◽  
Vol 7 (8) ◽  
pp. 145
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Madjid Haddad ◽  
Rémi Boutteau

For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, in this work we introduce a new real-time deep learning approach for 3D multi-object detection for smart mobility, not only on roads but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation on these two datasets shows good accuracy and, more importantly, that the method can be used under real-time conditions in road and rail traffic environments. Our experimental results also show the importance of accurately predicting the regions of interest (RoIs) used to estimate the 3D bounding box parameters.
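To make the kind of adaptation the abstract describes concrete, the sketch below extends a YOLO-style detection head so that each anchor also regresses 3D quantities (dimensions, orientation, depth) alongside the usual 2D terms. The layer sizes, parameter names, and use of PyTorch are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class YOLO3DHead(nn.Module):
    """Illustrative extension of a YOLO-style detection head.

    Per anchor, in addition to the usual 2D terms (4 box offsets,
    1 objectness score, C class scores), it regresses:
      - 3 values for object dimensions (h, w, l)
      - 2 values for orientation (sin/cos of the yaw angle)
      - 1 value for depth (distance to the object centre)
    """
    def __init__(self, in_channels: int, num_anchors: int, num_classes: int):
        super().__init__()
        per_anchor = 4 + 1 + num_classes + 3 + 2 + 1
        self.conv = nn.Conv2d(in_channels, num_anchors * per_anchor, kernel_size=1)
        self.num_anchors = num_anchors
        self.per_anchor = per_anchor

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        n, _, h, w = features.shape
        out = self.conv(features)
        # reshape to (batch, anchors, per-anchor predictions, H, W)
        return out.view(n, self.num_anchors, self.per_anchor, h, w)

# Example: one 13x13 feature map from a YOLOv3-like backbone
head = YOLO3DHead(in_channels=1024, num_anchors=3, num_classes=8)
preds = head(torch.randn(1, 1024, 13, 13))
print(preds.shape)  # torch.Size([1, 3, 19, 13, 13])
```

In a full detector the extra outputs would be trained with additional regression losses, and the depth and orientation terms combined with the 2D box to recover the 3D bounding box.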


JURTEKSI ◽  
2020 ◽  
Vol 7 (1) ◽  
pp. 67-74
Author(s):  
Reny Medikawati Taufiq ◽  
Sunanto Sunanto ◽  
Yoze Rizki

Abstract: Pekanbaru still uses a conventional traffic light control system. As the capital of Riau Province, Pekanbaru is predicted to see its urban population grow by 54.5% by 2025, so it is important for the city to implement a smart, efficient traffic management system that can resolve congestion quickly. This paper presents a design for smart traffic light management (Smart Traffic Control System), based on object detection technology that uses deep learning to detect the number and type of vehicles. The vehicle count is the basis for setting the green light timer automatically. The Smart Traffic Control System (STCS) is integrated with a web-based geographic information system (smart map) that continuously receives congestion information (current image, vehicle count, congestion level) from each STCS location and displays it on a map of Pekanbaru. This integrated system has been tested on a traffic light prototype using a mini computer and miniature vehicles; it detected 9 out of 12 vehicles and sent data regularly to the smart map.

Keywords: deep learning; smart mobility; smart traffic control system
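The control idea at the core of the abstract, deriving the green light duration from the detected vehicle count, can be sketched as below. The linear rule and the timing constants are hypothetical stand-ins; the paper does not publish its exact formula.

```python
def green_light_duration(vehicle_count: int,
                         base_s: float = 10.0,
                         per_vehicle_s: float = 2.0,
                         max_s: float = 60.0) -> float:
    """Map a detected vehicle count to a green-light timer (seconds).

    Hypothetical linear rule: a fixed base time plus a per-vehicle
    increment, capped so one approach cannot monopolise the junction.
    """
    return min(base_s + per_vehicle_s * max(vehicle_count, 0), max_s)

# Example: 9 vehicles detected at one approach
print(green_light_duration(9))  # 28.0 seconds
```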


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 532 ◽  
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Nicolas Ragot ◽  
Romain Rossi ◽  
...  

In core computer vision tasks, we have witnessed significant advances in object detection, localisation and tracking. However, there are currently no methods that detect, localise and track objects in road environments while also meeting real-time constraints. In this paper, our objective is to develop a deep learning multi-object detection and tracking technique applied to road smart mobility. Firstly, we propose an effective detector based on YOLOv3, which we adapt to our context. Subsequently, to successfully localise the detected objects, we put forward an adaptive method for extracting 3D information, i.e., depth maps. To do so, we carry out a comparative study of two approaches: Monodepth2 for monocular vision and MADNet for stereoscopic vision. These approaches are evaluated over datasets containing depth information in order to discern which solution performs better under real-time conditions. Object tracking is necessary in order to mitigate the risk of collisions. Unlike traditional tracking approaches, which require target initialization beforehand, our approach uses information from object detection and distance estimation to initialize targets and track them afterwards. Specifically, we propose to improve the SORT approach for 3D object tracking: we introduce an extended Kalman filter to better estimate the position of objects. Extensive experiments carried out on the KITTI dataset show that our proposal outperforms state-of-the-art approaches.
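The detection-initialised tracking step can be illustrated with the sketch below: a 3D object centre obtained from detection plus depth estimation initialises a Kalman track, which later frames then update. As a simplification, this uses a linear constant-velocity model rather than the paper's extended Kalman filter; all parameter values are illustrative.

```python
import numpy as np

class KalmanTrack3D:
    """SORT-style track over a 3D object centre (x, y, z).

    Simplified sketch: a linear constant-velocity model stands in for
    the paper's extended Kalman filter. State = [x, y, z, vx, vy, vz].
    """
    def __init__(self, xyz: np.ndarray, dt: float = 0.1):
        self.x = np.hstack([xyz, np.zeros(3)])             # initial state
        self.P = np.eye(6) * 10.0                          # state covariance
        self.F = np.eye(6)
        self.F[:3, 3:] = np.eye(3) * dt                    # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # we observe xyz only
        self.Q = np.eye(6) * 0.01                          # process noise
        self.R = np.eye(3) * 0.5                           # measurement noise

    def predict(self) -> np.ndarray:
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def update(self, xyz: np.ndarray) -> None:
        y = xyz - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

# A detection plus its depth estimate initialises the track;
# subsequent frames alternate predict and update.
track = KalmanTrack3D(np.array([2.0, 0.5, 15.0]))
track.predict()
track.update(np.array([2.1, 0.5, 14.6]))
print(track.x[:3])
```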


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches has been developed since the 1990s. In recent years, saliency detection has become one of the most actively studied topics in the theory of Convolutional Neural Networks (CNNs), and many original solutions using CNNs have been proposed for salient object detection and even event detection.

Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis, conducted via human eye tracking and digital image processing.

Results: The survey reflects recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos has become possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. We also present a short description of public image and video datasets with annotated salient objects or events, as well as the commonly used evaluation metrics.

Practical relevance: This survey contributes to the study of rapidly developing deep learning methods for saliency detection in images and videos.
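As an aside on the evaluation metrics the survey mentions, the two most common ones for salient object detection are easy to state in code. The sketch below shows mean absolute error and the adaptively thresholded F-measure; the threshold rule (twice the mean saliency) and the beta-squared weight of 0.3 are the conventions widely used in the saliency literature, not specifics from this survey.

```python
import numpy as np

def saliency_mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean absolute error between a predicted saliency map and a
    binary ground-truth mask, both scaled to [0, 1]."""
    return float(np.mean(np.abs(pred - gt)))

def saliency_f_measure(pred: np.ndarray, gt: np.ndarray, beta2: float = 0.3) -> float:
    """F-measure with the common adaptive threshold (2x mean saliency).

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta^2 = 0.3.
    """
    thresh = min(2.0 * float(pred.mean()), 1.0)
    binary = pred >= thresh
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max((gt > 0.5).sum(), 1)
    denom = beta2 * precision + recall
    return float((1 + beta2) * precision * recall / denom) if denom > 0 else 0.0
```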


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry, to give a better user experience. Based on the object detection results from the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
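To illustrate the recognition-to-actuation step the abstract describes, the sketch below maps detected scene labels to haptic channel intensities. The label names, channel names, and intensity values are hypothetical stand-ins; the actual system feeds Google Cloud Vision results to a custom hardware module.

```python
# Hypothetical mapping from detected scene elements to haptic channels.
HAPTIC_MAP = {
    "fire":      {"heat": 0.9, "vibration": 0.3},
    "explosion": {"vibration": 1.0, "wind": 0.5},
    "wind":      {"wind": 0.8},
    "rain":      {"water": 0.6, "wind": 0.2},
    "snow":      {"cold": 0.7},
}

def haptic_commands(labels: list[str]) -> dict[str, float]:
    """Merge the haptic intensities triggered by all detected labels,
    keeping the strongest request per channel."""
    commands: dict[str, float] = {}
    for label in labels:
        for channel, intensity in HAPTIC_MAP.get(label, {}).items():
            commands[channel] = max(commands.get(channel, 0.0), intensity)
    return commands

# Example: a scene in which both fire and wind were detected
print(haptic_commands(["fire", "wind"]))  # {'heat': 0.9, 'vibration': 0.3, 'wind': 0.8}
```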


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2611
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Christopher Lawson ◽  
Paul Meek ◽  
Paul Kwan

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time- and resource-intensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single-class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report and other useful statistics, (ii) custom-train deep learning models using publicly available and custom training data, and (iii) achieve supervised auto-annotation of images for further training, with the ability to edit annotations to ensure dataset quality. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and the use of transfer learning mean domain-specific models can be trained rapidly and frequently updated without the need for computer science expertise or data sharing, protecting intellectual property and privacy.
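The auto-annotation workflow the abstract describes (a pretrained detector proposes boxes that a user then reviews and edits) can be sketched as follows. U-Infuse's internals are not detailed in this abstract, so the sketch uses an off-the-shelf torchvision detector and a made-up JSON output format as stand-ins.

```python
import json
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Off-the-shelf detector standing in for U-Infuse's auto-annotator.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def auto_annotate(image_path: str, score_threshold: float = 0.5) -> dict:
    """Propose bounding boxes for one camera-trap image.

    Returns a simple JSON-ready dict; a user would review and edit
    these proposals before they are used as training data.
    """
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        preds = model([to_tensor(image)])[0]
    keep = preds["scores"] >= score_threshold
    return {
        "image": image_path,
        "boxes": preds["boxes"][keep].tolist(),   # [x1, y1, x2, y2]
        "labels": preds["labels"][keep].tolist(),
        "scores": preds["scores"][keep].tolist(),
    }

# Example: write proposals to disk for later annotation editing
# annotations = auto_annotate("camera_trap_001.jpg")
# json.dump(annotations, open("camera_trap_001.json", "w"), indent=2)
```

In practice the proposals would be filtered and corrected by the user, then converted to whatever annotation format the training pipeline expects.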

