An Integrated Approach for Traffic Scene Understanding from Monocular Cameras

2021 ◽  
Author(s):  
Malte Oeljeklaus

This thesis investigates methods for traffic scene perception with monocular cameras, targeting a basic environment model for automated vehicles. The developed approach is designed with special attention to the computational limitations of practical systems. For this purpose, three scene representations are investigated: the prevalent road topology as the global scene context, the drivable road area, and the detection and spatial reconstruction of other road users. An approach is developed that perceives all three environment representations simultaneously with a multi-task convolutional neural network. The obtained results demonstrate the efficiency of the multi-task approach; in particular, sharing image features across the individual scene representations was found to improve computational performance.
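The efficiency argument rests on computing one shared feature map and reusing it for every task head, instead of running a full network per task. Not from the thesis; a minimal numpy sketch of that structure, with all function names and kernel shapes hypothetical:

```python
import numpy as np

def conv3x3(x, k):
    """Valid 3x3 convolution of a single-channel image with kernel k."""
    h, w = x.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * k)
    return out

def multi_task_forward(image, shared_kernel, head_kernels):
    """Compute the shared backbone once, then run each cheap task head on it."""
    shared = np.maximum(conv3x3(image, shared_kernel), 0.0)  # shared features + ReLU
    return [conv3x3(shared, k) for k in head_kernels]        # one output per task
```

The backbone cost is paid once regardless of how many heads (here three, mirroring the three scene representations) consume the shared features.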

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 703
Author(s):  
Jun Zhang ◽  
Jiaze Liu ◽  
Zhizhong Wang

Owing to the increased use of urban rail transit, passenger flow on metro platforms tends to increase sharply during peak periods. Monitoring passenger flow in such areas is important for security-related reasons. In this paper, to solve the problem of metro platform passenger flow detection, we propose a convolutional neural network (CNN)-based network called MP (metro platform)-CNN to accurately count people on metro platforms. The proposed method is composed of three major components: a group of convolutional neural networks on the front end extracts image features, a multiscale feature extraction module enhances multiscale features, and transposed convolution performs upsampling to generate a high-quality density map. Existing crowd-counting datasets do not adequately cover the challenging situations considered in this study, so we collected images from surveillance videos of a metro platform to form a dataset containing 627 images with 9243 annotated heads. Extensive experiments showed that our method performed well on the self-built dataset, achieving the lowest estimation error, and that it is competitive with other methods on four standard crowd-counting datasets.
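In density-map crowd counting, each annotated head contributes unit mass to a ground-truth map, so the predicted count is simply the map's sum. Not from the paper; a minimal numpy sketch of the ground-truth construction, with the function name and `sigma` value hypothetical:

```python
import numpy as np

def density_map(head_points, shape, sigma=2.0):
    """Place a unit-mass Gaussian at each annotated head position (row, col).
    Summing the resulting map recovers the person count."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape)
    for (py, px) in head_points:
        g = np.exp(-((yy - py) ** 2 + (xx - px) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()   # normalize so each head contributes exactly 1
    return dmap
```

The network regresses such a map from the image; integrating (summing) the output yields the estimated count.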


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1999 ◽  
Author(s):  
Donghang Yu ◽  
Qing Xu ◽  
Haitao Guo ◽  
Chuan Zhao ◽  
Yuzhun Lin ◽  
...  

Classifying remote sensing images is vital for interpreting image content. Presently, remote sensing image scene classification methods using convolutional neural networks have drawbacks, including excessive parameters and heavy calculation costs. More efficient and lightweight CNNs have fewer parameters and calculations, but their classification performance is generally weaker. We propose a more efficient and lightweight convolutional neural network method to improve classification accuracy with a small training dataset. Inspired by fine-grained visual recognition, this study introduces a bilinear convolutional neural network model for scene classification. First, the lightweight convolutional neural network MobileNetv2 is used to extract deep and abstract image features. Each feature is then transformed into two features with two different convolutional layers. The transformed features are subjected to a Hadamard product operation to obtain an enhanced bilinear feature. Finally, the bilinear feature, after pooling and normalization, is used for classification. Experiments are performed on three widely used datasets: UC Merced, AID, and NWPU-RESISC45. Compared with other state-of-the-art methods, the proposed method has fewer parameters and lower computational cost while achieving higher accuracy. By including feature fusion with bilinear pooling, performance and accuracy of remote sensing scene classification can improve greatly, and the approach can be applied to other remote sensing image classification tasks.
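The pipeline described (two 1x1-conv transforms, Hadamard product, pooling, normalization) maps directly onto a few matrix operations. Not from the paper; a minimal numpy sketch with hypothetical shapes, where 1x1 convolutions over a flattened feature map reduce to matrix products and the usual signed-sqrt + L2 normalization of bilinear pooling is assumed:

```python
import numpy as np

def bilinear_pool(feat, w1, w2, eps=1e-12):
    """feat: (C, N) flattened feature map; w1, w2: (D, C) 1x1-conv weights."""
    a = w1 @ feat                          # first transformed feature
    b = w2 @ feat                          # second transformed feature
    z = (a * b).mean(axis=1)               # Hadamard product + global average pooling
    z = np.sign(z) * np.sqrt(np.abs(z))    # signed square-root normalization
    return z / (np.linalg.norm(z) + eps)   # L2 normalization
```

The Hadamard (element-wise) product keeps the pooled descriptor at dimension D, avoiding the D×D outer-product blow-up of classical bilinear pooling.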


Author(s):  
Attila Zoltán Jenei ◽  
Gábor Kiss

In the present study, we attempt to estimate the severity of depression using a convolutional neural network (CNN). The method is special in that an auto- and cross-correlation structure is crafted as the input of the network rather than an actual image. Investigating this possibility matters because depression has become one of the leading mental disorders in the world: it can significantly reduce an individual's quality of life even at an early stage and, in severe cases, may lead to suicide. It is therefore important that the disorder be recognized as early as possible, and that its severity be determined so that a treatment order can be established. During the examination, speech acoustic features were obtained from recordings; among them, MFCC coefficients and formant frequencies were used based on preliminary studies. From subsets of these features, correlation structures were created, and this square structure was applied to the input of a convolutional network. Two models were crafted: single- and double-input versions. The lowest RMSE value (10.797) was achieved using the two features together, with a moderate correlation of 0.61 between estimated and original severity scores.
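The key idea is turning a variable-length feature sequence into a fixed-size square "image" of pairwise correlations that a CNN can consume. Not from the paper; a minimal numpy sketch, with the function name hypothetical and the exact correlation construction assumed to be the standard Pearson matrix:

```python
import numpy as np

def correlation_image(features):
    """features: (T, F) sequence of frame-level features (e.g. MFCCs).
    Returns the (F, F) correlation matrix used as a CNN input 'image'."""
    x = features - features.mean(axis=0)   # center each feature dimension
    cov = x.T @ x / (len(x) - 1)           # sample covariance
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)        # normalize to correlations in [-1, 1]
```

Off-diagonal entries capture cross-correlations between feature dimensions; the diagonal is identically 1, and the matrix size is fixed by F regardless of recording length.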


Author(s):  
Pramod Sekharan Nair ◽  
Tsrity Asefa Berihu ◽  
Varun Kumar

Gangrene is one of the deadliest diseases on the globe, caused by a lack of blood supply to body parts or by infection. It often affects body parts such as the fingers, limbs, and toes, but there are also many cases involving muscles and organs. In this paper, gangrene disease classification is performed on high-resolution images. A convolutional neural network (CNN) is used for feature extraction from the disease images: the first layer captures elementary image features such as dots, edges, and blobs, while the intermediate (hidden) layers extract detailed features such as shapes, brightness, contrast, and color. Finally, the CNN-extracted features are given to a support vector machine (SVM) to classify the gangrene disease. The experimental results show that the adopted approach performs well and yields acceptable accuracy.
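The hybrid pipeline hands fixed CNN features to a separately trained SVM. Not from the paper; a minimal sketch of the classifier half, a linear SVM trained with Pegasos-style subgradient descent on toy 2-D "features" (all names and hyperparameters hypothetical):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Subgradient descent on the hinge loss; labels y must be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) < 1:              # inside the margin: hinge active
                w += lr * (yi * xi - lam * w)
            else:                              # outside: only regularization
                w -= lr * lam * w
    return w
```

In the full system, `X` would hold the CNN-extracted feature vectors rather than raw coordinates; in practice a library SVM (e.g. a kernel SVM) would replace this toy trainer.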


Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 348
Author(s):  
Francisco de Melo ◽  
Horácio C. Neto ◽  
Hugo Plácido da Silva

Biometric identification systems are a fundamental building block of modern security. However, conventional biometric methods cannot easily cope with their intrinsic security liabilities: they can be affected by environmental factors and can be easily “fooled” by artificial replicas, among other caveats. This has led researchers to explore other modalities, in particular those based on physiological signals. Electrocardiography (ECG) has seen growing interest, and many ECG-enabled identification devices have been proposed in recent years, as ECG signals are a very appealing solution for today’s demanding security systems, mainly due to their intrinsic aliveness-detection advantages. These ECG-enabled devices often need to meet small size, low throughput, and power constraints (e.g., battery-powered), and thus need to be both resource- and energy-efficient. To date, however, little attention has been given to computational performance, in particular for deployment with edge processing on resource-limited devices. This work therefore proposes an Artificial Intelligence (AI)-enabled ECG-based identification embedded system built around a RISC-V based System-on-a-Chip (SoC). A Binary Convolutional Neural Network (BCNN) was implemented in the SoC’s hardware accelerator that, compared to a software implementation of the conventional, non-binarized Convolutional Neural Network (CNN) version of our network, achieves a 176,270× speedup, arguably outperforming all current state-of-the-art CNN-based ECG identification methods.
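The speedup of a BCNN comes from sign-quantizing activations and weights so that each multiply-accumulate collapses into an XNOR plus popcount, which is why a hardware accelerator wins so dramatically. Not from the paper; a minimal numpy sketch of the binarized dot product underlying every BCNN layer (function name hypothetical):

```python
import numpy as np

def binary_dot(x, w):
    """Binarized dot product: sign-quantize inputs and weights; in hardware the
    multiply-accumulate then reduces to XNOR + popcount."""
    xb = np.where(x >= 0, 1, -1)               # binarize activations
    wb = np.where(w >= 0, 1, -1)               # binarize weights
    agree = np.count_nonzero(xb == wb)         # popcount of the XNOR result
    return 2 * agree - len(x)                  # map agreement count to a +/-1 sum
```

The returned value equals `np.dot(xb, wb)` exactly, but the `agree` formulation shows why a single-cycle popcount unit replaces an entire row of multipliers.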


2018 ◽  
Vol 8 (10) ◽  
pp. 1715 ◽  
Author(s):  
Sivaramakrishnan Rajaraman ◽  
Sema Candemir ◽  
Incheol Kim ◽  
George Thoma ◽  
Sameer Antani

Pneumonia affects 7% of the global population, resulting in 2 million pediatric deaths every year. Chest X-ray (CXR) analysis is routinely performed to diagnose the disease. Computer-aided diagnostic (CADx) tools aim to supplement decision-making. These tools process handcrafted and/or convolutional neural network (CNN)-extracted image features for visual recognition. However, CNNs are perceived as black boxes since their predictions lack explanation. This is a serious bottleneck in applications involving medical screening/diagnosis, since poorly interpreted model behavior could adversely affect clinical decisions. In this study, we evaluate, visualize, and explain the performance of customized CNNs that detect pneumonia and further differentiate between bacterial and viral types in pediatric CXRs. We present a novel visualization strategy to localize the region of interest (ROI) considered relevant for model predictions across all inputs belonging to an expected class. We statistically validate the models’ performance on the underlying tasks. We observe that the customized VGG16 model achieves 96.2% and 93.6% accuracy in detecting the disease and distinguishing between bacterial and viral pneumonia, respectively. The model outperforms the state-of-the-art in all performance metrics and demonstrates reduced bias and improved generalization.
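The abstract does not specify how its ROI visualization is computed; a common baseline for this kind of localization is a class activation map, which weights the last convolutional feature maps by the target class's final-layer weights. Purely as an illustrative sketch of that baseline (not the authors' strategy), in numpy:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """feature_maps: (C, H, W) last-conv activations; class_weights: (C,)
    final-layer weights of the target class. Returns an (H, W) heat map
    highlighting regions that drive the class score."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                                 # keep positive evidence only
    return cam / cam.max() if cam.max() > 0 else cam         # scale to [0, 1]
```

Upsampling the heat map to the CXR resolution and overlaying it gives the kind of per-class ROI localization the study discusses.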


Author(s):  
MUHAMMAD EFAN ABDULFATTAH ◽  
LEDYA NOVAMIZANTI ◽  
SYAMSUL RIZAL

Disasters in Indonesia are dominated by hydrometeorological disasters, which cause large-scale damage. Through mapping, comprehensive handling can be done to support analysis and subsequent action. An Unmanned Aerial Vehicle (UAV) can be used as an aerial mapping tool. However, when the camera or the image-processing device does not meet specifications, the results are less informative. This research proposes Super Resolution for aerial imagery based on a Convolutional Neural Network (CNN) with the DCSCN model. The model consists of a Feature Extraction Network for extracting image features and a Reconstruction Network for reconstructing images. DCSCN's performance is compared with the Super Resolution CNN (SRCNN). Experiments were carried out on the Set5 dataset with scale factors 2, 3, and 4. SRCNN produced PSNR/SSIM values of 36.66 dB / 0.9542, 32.75 dB / 0.9090, and 30.49 dB / 0.8628, respectively. DCSCN improved these to 37.614 dB / 0.9588, 33.86 dB / 0.9225, and 31.48 dB / 0.8851.
Keywords: aerial imagery, deep learning, super resolution
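The PSNR figures used to compare SRCNN and DCSCN follow the standard definition from the mean squared error against an 8-bit peak value. A minimal numpy sketch of that metric (the function name is ours, not from the paper):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```

Higher is better: DCSCN's roughly 1 dB gain at scale factor 2 corresponds to about a 20% reduction in MSE against the Set5 ground truth.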

