How Many Features is an Image Worth? Multi-Channel CNN for Steering Angle Prediction in Autonomous Vehicles

2021 ◽  
Author(s):  
Jason Munger ◽  
Carlos W. Morato

This project explores how raw image data obtained from autonomous vehicle (AV) cameras can provide a model with more spatial information than can be learned from simple RGB images alone. This paper leverages advances in deep neural networks to demonstrate steering angle prediction for autonomous vehicles through an end-to-end multi-channel CNN model using only the image data provided by an onboard camera. The image data is processed through existing neural networks to produce pixel segmentation and depth estimates, which are fed, along with the raw input image, into a new neural network to provide enhanced feature signals from the environment. Various input combinations of multi-channel CNNs are evaluated, and their effectiveness is compared to single CNN networks using the individual data inputs. The model with the most accurate steering predictions is identified, and its performance is compared to previous neural networks.
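The multi-channel idea can be pictured as simple channel stacking. Below is a minimal PyTorch sketch, not the authors' exact architecture: RGB, a segmentation map, and a monocular depth estimate are concatenated along the channel axis, and a small CNN regresses a single steering angle. All layer sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiChannelSteeringNet(nn.Module):
    def __init__(self, in_channels: int = 5):  # 3 RGB + 1 segmentation + 1 depth
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(48, 1))  # steering angle

    def forward(self, rgb, seg, depth):
        # Stack the modalities along the channel dimension: (B, 3+1+1, H, W).
        x = torch.cat([rgb, seg, depth], dim=1)
        return self.head(self.features(x))

model = MultiChannelSteeringNet()
rgb = torch.rand(2, 3, 66, 200)    # raw camera image
seg = torch.rand(2, 1, 66, 200)    # per-pixel segmentation from a pretrained net
depth = torch.rand(2, 1, 66, 200)  # monocular depth estimate
angle = model(rgb, seg, depth)     # shape (2, 1)
```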

Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 229
Author(s):  
Xianzhong Tian ◽  
Juan Zhu ◽  
Ting Xu ◽  
Yanjun Li

The latest results in Deep Neural Networks (DNNs) have greatly improved the accuracy and performance of a variety of intelligent applications. However, running such computation-intensive DNN-based applications on resource-constrained mobile devices leads to long latency and high energy consumption. The traditional approach is to run DNNs in the central cloud, but this requires significant amounts of data to be transferred over the wireless network and also results in long latency. To solve this problem, offloading part of the DNN computation to edge clouds has been proposed, realizing collaborative execution between mobile devices and edge clouds. In addition, the mobility of mobile devices can easily cause computation offloading to fail. In this paper, we develop a mobility-included DNN partition offloading algorithm (MDPO) that adapts to user mobility. The objective of MDPO is to minimize the total latency of completing a DNN job while the mobile user is moving. The MDPO algorithm is suitable for DNNs with both chain and graph topologies. We evaluate the performance of the proposed MDPO against local-only and edge-only execution; experiments show that MDPO significantly reduces total latency, improves DNN performance, and adjusts well to different network conditions.
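For the chain-topology case, the core decision can be sketched as choosing the single split point that minimizes device compute plus wireless transfer plus edge compute time. The Python sketch below illustrates that idea under made-up latency and bandwidth numbers; it is not the paper's MDPO algorithm, which additionally accounts for user mobility.

```python
def best_split(local_ms, edge_ms, boundary_bytes, bandwidth_bps):
    """boundary_bytes[k] = bytes crossing the link when splitting before layer k
    (k = 0: the raw input; k = i: activations of layer i-1). Layers [0, k) run
    on the device, layers [k, n) on the edge; k = n means fully local."""
    n = len(local_ms)
    best_k, best_t = n, sum(local_ms)  # baseline: local-only execution
    for k in range(n):
        transfer = boundary_bytes[k] * 8 / bandwidth_bps * 1000.0  # ms
        t = sum(local_ms[:k]) + transfer + sum(edge_ms[k:])
        if t < best_t:
            best_k, best_t = k, t
    return best_k, best_t

local_ms = [12.0, 30.0, 28.0, 9.0]                  # per-layer latency on device
edge_ms = [1.5, 4.0, 3.8, 1.2]                      # per-layer latency on edge
boundary_bytes = [1_000_000, 600_000, 150_000, 40_000]  # data at each split point
print(best_split(local_ms, edge_ms, boundary_bytes, bandwidth_bps=100e6))
```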


2021 ◽  
Vol 10 (6) ◽  
pp. 377
Author(s):  
Chiao-Ling Kuo ◽  
Ming-Hua Tsai

The importance of road characteristics has been highlighted, as road characteristics are fundamental structures that support many transportation-relevant services. However, there is still considerable room for improvement in both the types of road characteristics detected and the performance of their detection. Taking advantage of geographically tiled maps with high update rates, remarkable accessibility, and increasing availability, this paper proposes a simple, novel deep-learning-based approach, namely joint convolutional neural networks (CNNs) adopting adaptive squares with combination rules, to detect road characteristics from roadmap tiles. The proposed joint CNNs are responsible for foreground/background image classification and for classifying the types of road characteristics in the foreground images, raising detection accuracy. The adaptive squares with combination rules help focus efficiently on road characteristics, augmenting the ability to detect them and providing optimal detection results. Five types of road characteristics—crossroads, T-junctions, Y-junctions, corners, and curves—are exploited, and experimental results demonstrate successful outcomes with outstanding performance on real data. The location and type of the exploited road characteristics are thus converted from human-readable to machine-readable form; the results will benefit many applications such as feature point reminders, road condition reports, or alert detection for users, drivers, and even autonomous vehicles. We believe this approach will also open a new path for object detection and geospatial information extraction from valuable map tiles.
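The two-stage classification described above can be sketched as a pair of CNNs applied to each adaptive square: the first separates foreground from background, and the second assigns a road-characteristic type only to foreground squares. The PyTorch sketch below uses assumed minimal networks; `fg_net`, `type_net`, and the layer sizes are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

TYPES = ["crossroad", "T-junction", "Y-junction", "corner", "curve"]

def tiny_cnn(num_classes):
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, num_classes),
    )

fg_net = tiny_cnn(2)             # stage 1: background vs. foreground
type_net = tiny_cnn(len(TYPES))  # stage 2: which road characteristic

def classify_square(square):     # square: (1, 3, H, W) crop from a map tile
    if fg_net(square).argmax(1).item() == 1:           # foreground?
        return TYPES[type_net(square).argmax(1).item()]
    return None                                        # background: no detection
```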


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Fuyong Xing ◽  
Yuanpu Xie ◽  
Xiaoshuang Shi ◽  
Pingjun Chen ◽  
Zizhao Zhang ◽  
...  

Abstract

Background: Nucleus or cell detection is a fundamental task in microscopy image analysis and supports many other quantitative studies such as object counting, segmentation, and tracking. Deep neural networks are emerging as a powerful tool for biomedical image computing; in particular, convolutional neural networks have been widely applied to nucleus/cell detection in microscopy images. However, almost all models are tailored to specific datasets, and their applicability to other microscopy image data remains unknown. Some existing studies casually learn and evaluate deep neural networks on multiple microscopy datasets, but several critical, open questions remain to be addressed.

Results: We analyze the applicability of deep models specifically for nucleus detection across a wide variety of microscopy image data. More specifically, we present a fully convolutional network-based regression model and extensively evaluate it on large-scale digital pathology and microscopy image datasets, which cover 23 organs (or cancer diseases) and come from multiple institutions. We demonstrate that, for a specific target dataset, training with images from the same types of organs is usually necessary for nucleus detection. Although images can be visually similar due to the same staining technique and imaging protocol, deep models learned on images from different organs may not deliver desirable results and may require fine-tuning to be on a par with those trained on target data. We also observe that training with a mixture of target and other/non-target data does not always yield higher nucleus detection accuracy and may require proper data manipulation during model training to achieve good performance.

Conclusions: We conduct a systematic case study on deep models for nucleus detection in a wide variety of microscopy images, aiming to address several important but previously understudied questions. We present and extensively evaluate an end-to-end, pixel-to-pixel fully convolutional regression network and report several significant findings, some of which may not have been reported in previous studies. The performance analysis and observations should be helpful for nucleus detection in microscopy images.
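For readers unfamiliar with regression-based detection, the setup can be sketched as follows: a fully convolutional network regresses a per-pixel proximity map that peaks at nucleus centers, and detections are the local maxima above a threshold. The PyTorch sketch below follows that general recipe with assumed layer sizes; it is not the paper's exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FCNRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.up = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # one regression value per pixel
        )

    def forward(self, x):
        return self.up(self.down(x))  # (B, 1, H, W) proximity map

def detect(prox, thresh=0.5):
    # Peak picking via max pooling: keep pixels that equal the local
    # maximum in a 7x7 window and exceed the threshold.
    peaks = (prox == F.max_pool2d(prox, 7, stride=1, padding=3)) & (prox > thresh)
    return peaks.nonzero()  # indices of detected nucleus centers
```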


2018 ◽  
Vol 12 (04) ◽  
pp. 481-500 ◽  
Author(s):  
Naifan Zhuang ◽  
The Duc Kieu ◽  
Jun Ye ◽  
Kien A. Hua

With the growth of crowd phenomena in the real world, crowd scene understanding is becoming an important task in anomaly detection and public security. Visual ambiguities and occlusions, high density, low mobility, and scene semantics, however, make this problem a great challenge. In this paper, we propose an end-to-end deep architecture, convolutional nonlinear differential recurrent neural networks (CNDRNNs), for crowd scene understanding. CNDRNNs consist of GoogLeNet Inception V3 convolutional neural networks (CNNs) and nonlinear differential recurrent neural networks (RNNs). Unlike traditional non-end-to-end solutions, which separate feature extraction from parameter learning, CNDRNN uses a unified deep model to optimize the CNN and RNN parameters jointly, and thus has the potential to yield a more harmonious model. The proposed architecture takes sequential raw image data as input and does not rely on tracklet or trajectory detection; it therefore has clear advantages over traditional flow-based and trajectory-based methods, especially in challenging crowd scenarios of high density and low mobility. Taking advantage of both CNN and RNN, CNDRNN can effectively analyze crowd semantics: the CNN models the semantic crowd scene information, while the nonlinear differential RNN models the motion information. The increasing orders of derivative of states (DoS) in the differential RNN progressively build up the ability of the long short-term memory (LSTM) gates to detect different levels of salient dynamical patterns, with deeper stacked layers modeling higher orders of DoS. Finally, whereas existing LSTM-based crowd scene solutions explore deep temporal information and are claimed to be “deep in time,” our proposed CNDRNN models spatial and temporal information in a unified architecture and is thus “deep in space and time.” Extensive performance studies on the Violent-Flows, CUHK Crowd, and NUS-HGA datasets show that the proposed technique significantly outperforms state-of-the-art methods.
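As a rough sketch of the pipeline's shape (not the exact dRNN formulation), one can embed each frame with a CNN and feed the recurrent part both the features and their first-order temporal differences, a crude stand-in for the derivative of states (DoS); the backbone and sizes below are assumptions.

```python
import torch
import torch.nn as nn

class CNDRNNSketch(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, num_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(  # stand-in for the Inception V3 backbone
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(16, feat_dim),
        )
        self.rnn = nn.LSTM(2 * feat_dim, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, num_classes)

    def forward(self, clip):                 # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        f = self.cnn(clip.flatten(0, 1)).view(b, t, -1)   # per-frame features
        dos = torch.diff(f, dim=1, prepend=f[:, :1])      # 1st-order differences
        out, _ = self.rnn(torch.cat([f, dos], dim=-1))    # appearance + motion
        return self.cls(out[:, -1])          # classify from the final state
```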


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Shazia Akbar ◽  
Mohammad Peikari ◽  
Sherine Salama ◽  
Azadeh Yazdan Panah ◽  
Sharon Nofech-Mozes ◽  
...  

Abstract The residual cancer burden index is an important quantitative measure used for assessing treatment response following neoadjuvant therapy for breast cancer. It has been shown to be predictive of overall survival and is composed of two key metrics: a qualitative assessment of lymph nodes and the percentage of invasive or in situ tumour cellularity (TC) in the tumour bed (TB). Currently, TC is assessed by eyeballing routine histopathology slides to estimate the proportion of tumour cells within the TB. With advances in the production of digitized slides and the increasing availability of slide scanners in pathology laboratories, there is potential to measure TC using automated algorithms with greater precision and accuracy. We describe two methods for automated TC scoring: (1) a traditional approach to image analysis development, whereby we mimic the pathologists' workflow, and (2) a recent development in artificial intelligence in which features are learned automatically by deep neural networks using image data alone. We show strong agreement between automated and manual analysis of digital slides. Agreement between our trained deep neural networks and experts in this study (0.82) approaches the inter-rater agreement between pathologists (0.89). We also reveal properties that are captured when we apply deep neural networks to whole-slide images, and discuss the potential of using such visualisations to improve TC assessment in the future.
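One way to picture the deep approach, purely as an assumed sketch rather than the authors' pipeline, is patch-wise cellularity regression over the tumour bed, averaged into a slide-level TC score; the stand-in model below is illustrative only.

```python
import torch
import torch.nn as nn

patch_model = nn.Sequential(  # stand-in CNN regressing per-patch cellularity
    nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 1), nn.Sigmoid(),  # cellularity in [0, 1]
)

def slide_tc(patches):                  # patches: (N, 3, H, W) from the TB
    with torch.no_grad():
        return patch_model(patches).mean().item()  # slide-level TC estimate
```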


2018 ◽  
Vol 15 (9) ◽  
pp. 1451-1455 ◽  
Author(s):  
Grant J. Scott ◽  
Kyle C. Hagan ◽  
Richard A. Marcum ◽  
James Alex Hurt ◽  
Derek T. Anderson ◽  
...  

2020 ◽  
Vol 34 (04) ◽  
pp. 5216-5223 ◽  
Author(s):  
Sina Mohseni ◽  
Mandar Pitale ◽  
JBS Yadawa ◽  
Zhangyang Wang

The real-world deployment of Deep Neural Networks (DNNs) in safety-critical applications such as autonomous vehicles needs to address a variety of DNN vulnerabilities, one of which is detecting and rejecting out-of-distribution outliers that might result in unpredictable fatal errors. We propose a new technique that relies on self-supervision to learn generalizable out-of-distribution (OOD) features and reject such samples at inference time. Our technique requires no prior knowledge of the distribution of targeted OOD samples and incurs no extra overhead compared to other methods. We perform multiple image classification experiments and observe that our technique performs favorably against state-of-the-art OOD detection methods. Interestingly, we find that our method also reduces in-distribution classification risk by rejecting samples near the boundaries of the training set distribution.
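At inference time, the rejection step can be sketched as thresholding the probability mass that a classifier assigns to auxiliary "reject" outputs. The snippet below is a generic illustration of that decision rule, with the split between in-distribution and reject logits assumed; it is not the paper's exact training or scoring procedure.

```python
import torch
import torch.nn.functional as F

def predict_or_reject(logits, k_in, thresh=0.5):
    """logits: (B, k_in + k_reject); returns class index, or -1 for rejected."""
    probs = F.softmax(logits, dim=1)
    ood_score = probs[:, k_in:].sum(dim=1)   # mass on the reject classes
    preds = probs[:, :k_in].argmax(dim=1)    # best in-distribution class
    return torch.where(ood_score > thresh, torch.full_like(preds, -1), preds)
```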


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-25
Author(s):  
Elbruz Ozen ◽  
Alex Orailoglu

As deep learning algorithms are widely adopted, an increasing number of them are deployed in embedded application domains with strict reliability constraints. The expenditure of significant resources to satisfy performance requirements in deep neural network accelerators has thinned out the margins for delivering safety in embedded deep learning applications, precluding the adoption of conventional fault tolerance methods. The potential of exploiting the inherent resilience characteristics of deep neural networks remains unexplored, though, offering a promising low-cost path towards safety in embedded deep learning applications. This work demonstrates such exploitation by reducing the vulnerability surface through the proper design of quantization schemes and by shaping the parameter distributions at each layer through the guidance of appropriate training methods, thus delivering deep neural networks of high resilience through algorithmic modifications alone. Unequaled error resilience characteristics can thus be injected into safety-critical deep learning applications to tolerate substantial bit error rates at absolutely zero hardware, energy, and performance cost while improving the error-free model accuracy even further.
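The role of the quantization scheme can be illustrated with a toy fault-injection experiment: under symmetric fixed-point quantization, the damage a single bit flip can do to a stored weight is bounded by the quantization range, so tighter per-layer parameter distributions directly shrink the worst-case error. The NumPy sketch below uses assumed values and only illustrates this bound; it is not the paper's method.

```python
import numpy as np

def quantize(w, scale):                      # symmetric int8 quantization
    return np.clip(np.round(w / scale), -128, 127).astype(np.int8)

def flip_bit(q, bit):                        # fault injection on stored codes
    return (q.view(np.uint8) ^ np.uint8(1 << bit)).view(np.int8)

w = np.array([0.05], dtype=np.float32)       # one example weight
scale = 0.01                                 # assumed scale from a tight range
q = quantize(w, scale)
worst = max(abs(float(flip_bit(q, b)[0]) * scale - 0.05) for b in range(8))
print(f"worst-case deviation under one bit flip: {worst:.3f}")  # <= 128 * scale
```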


2021 ◽  
Vol 13 (19) ◽  
pp. 4017
Author(s):  
Winthrop Harvey ◽  
Chase Rainwater ◽  
Jackson Cothren

Unmanned aerial vehicles (UAVs) must keep track of their location in order to maintain flight plans. Currently, this task is performed almost entirely by a combination of Inertial Measurement Units (IMUs) and reference to GNSS (Global Navigation Satellite System). Navigation by GNSS, however, is not always reliable, due to various causes both natural (reflection and blockage by objects, technical faults, inclement weather) and artificial (GPS spoofing and denial). In such GPS-denied situations, it is desirable to have additional methods for aerial geolocalization. One such method is visual geolocalization, where aircraft use their ground-facing cameras to localize and navigate. The state of the art in many ground-level image processing tasks involves Convolutional Neural Networks (CNNs). We present a study of how effectively a modern CNN designed for visual classification can be applied to the problem of Absolute Visual Geolocalization (AVL, localization without a prior location estimate). An Xception-based architecture is trained from scratch over a >1000 km² section of Washington County, Arkansas, to directly regress latitude and longitude from images taken on different orthorectified high-altitude survey flights. On unseen image sets over the same region from different years and seasons, it achieves an average localization error as low as 115 meters, which localizes to 0.004% of the training area, or about 8% of the width of the 1.5 × 1.5 km input image. This demonstrates that CNNs are expressive enough to encode robust landscape information for geolocalization over large geographic areas. Furthermore, we discuss methods of providing uncertainty for CNN regression outputs and future areas of potential improvement for the use of deep neural networks in visual geolocalization.
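The regression setup can be sketched as a CNN backbone with a two-unit head that maps an aerial image directly to normalized (latitude, longitude) within the training region; the stand-in backbone below is an assumption, not the paper's Xception variant.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(            # placeholder for an Xception-style CNN
    nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(64, 2)              # regress (lat, lon) within the region
model = nn.Sequential(backbone, head)

img = torch.rand(1, 3, 299, 299)     # one orthorectified aerial image
lat_lon = torch.sigmoid(model(img))  # coordinates normalized to [0, 1]
loss = nn.functional.mse_loss(lat_lon, torch.tensor([[0.42, 0.77]]))
```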

