Improving object recognition in aerial image and ambulatory assessment analysis by deep learning

2019 ◽  
Author(s):  
Peng Sun

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the widespread use of many different types of sensors in recent years, large amounts of diverse and complex sensor data have been generated and analyzed to extract useful information. This dissertation focuses on two types of data: aerial images and physiological sensor data. Several new methods based on deep learning techniques have been proposed to advance the state of the art in analyzing these data. For aerial images, a new method for designing effective loss functions for training deep neural networks for object detection, called adaptive salience biased loss (ASBL), has been proposed. In addition, several state-of-the-art deep neural network models for object detection, including RetinaNet, U-Net, and YOLO, have been adapted and modified to achieve improved performance on a new set of real-world aerial images for bird detection. For physiological sensor data, a deep learning method for alcohol usage detection, called Deep ADA, has been proposed to improve the automatic detection of alcohol usage (ADA) system, which is a statistical data analysis pipeline that detects drinking episodes based on wearable physiological sensor data collected from real subjects. Object detection in aerial images remains a challenging problem due to low image resolution, complex backgrounds, and variations in the sizes and orientations of objects. The new ASBL method has been designed for training deep neural network object detectors to achieve improved performance. ASBL can be implemented at the image level, which is called image-based ASBL, or at the anchor level, which is called anchor-based ASBL. The method computes saliency information for input images and for the anchors generated by deep neural network object detectors, and weights training examples and anchors differently based on their corresponding saliency measurements.
It gives complex images and difficult targets more weight during training. In our experiments using two of the largest public benchmark data sets of aerial images, DOTA and NWPU VHR-10, the existing RetinaNet was trained using ASBL to generate a one-stage detector, ASBL-RetinaNet. ASBL-RetinaNet significantly outperformed the original RetinaNet by 3.61 mAP and 12.5 mAP on the two data sets, respectively. In addition, ASBL-RetinaNet outperformed 10 other state-of-the-art object detection methods. To improve bird detection in aerial images, the Little Birds in Aerial Imagery (LBAI) dataset has been created from real-life aerial imagery data. LBAI contains various flocks and species of birds that are small in size, ranging from 10 by 10 pixels to 40 by 40 pixels. The dataset was labeled and further divided into two subsets, Easy and Hard, based on the complexity of the background. We have applied some of the best deep learning models to LBAI images and improved them, including object detection techniques, such as YOLOv3, SSD, and RetinaNet, and semantic segmentation techniques, such as U-Net and Mask R-CNN. Experimental results show that RetinaNet performed the best overall, outperforming the other models by 1.4 and 4.9 in F1 score on the Easy and Hard LBAI subsets, respectively. For physiological sensor data analysis, Deep ADA has been developed to extract features from physiological signals and predict alcohol usage of real subjects in their daily lives. The features are extracted using convolutional neural networks without any human intervention. A large amount of unlabeled data has been used in an unsupervised learning manner to improve the quality of the learned features. The method outperformed traditional feature extraction methods by up to 19% in accuracy.
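
The weighting idea in this abstract can be illustrated with a rough sketch: a per-anchor focal loss (the loss RetinaNet is normally trained with) is scaled by a factor that grows with a saliency score. The abstract does not give the ASBL formula, so the function `salience_weight`, its `strength` parameter, and the linear form of the weight are assumptions made only for illustration, not the author's definition.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one anchor; p is the predicted foreground probability."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def salience_weight(saliency, strength=1.0):
    """Hypothetical weight: 1 for a plain example, larger for a salient one.

    The saliency score is assumed to lie in [0, 1].
    """
    return 1.0 + strength * saliency


def salience_biased_loss(probs, labels, saliencies, strength=1.0):
    """Mean of per-anchor focal losses, each scaled by its salience weight."""
    total = 0.0
    for p, y, s in zip(probs, labels, saliencies):
        total += salience_weight(s, strength) * focal_loss(p, y)
    return total / len(probs)
```

With this assumed form, an anchor with saliency 1.0 contributes twice the loss of an otherwise identical anchor with saliency 0.0, which is one simple way to give complex images and difficult targets more weight.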

Author(s):  
W. Yuan ◽  
Z. Fan ◽  
X. Yuan ◽  
J. Gong ◽  
R. Shibasaki

Abstract. Dense image matching is essential to photogrammetry applications, including Digital Surface Model (DSM) generation, three-dimensional (3D) reconstruction, and object detection and recognition. Developing an efficient and robust method for dense image matching has been a technical challenge because of the high variation in illumination and ground features in aerial images of large areas. With the development of deep learning technology, deep neural network-based algorithms now outperform traditional methods on a variety of tasks such as object detection, semantic segmentation, and stereo matching. The proposed network includes cost-volume computation, cost-volume aggregation, and disparity prediction. It starts with a pre-trained VGG-16 network as a backend and uses the U-net architecture with nine layers for feature map extraction and a correlation layer for cost-volume calculation. A guided filter-based cost aggregation is then adopted for cost-volume filtering, and finally the soft argmax function is used for disparity prediction. Experiments conducted on a UAV dataset demonstrated that the proposed method achieved a root mean square error (RMSE) of the reprojection error better than 1 pixel in image coordinates and a ground positioning accuracy within 2.5 ground sample distances. Comparison experiments on the KITTI 2015 dataset show that the proposed unsupervised method performs comparably with supervised methods.
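
The final step named in this abstract, soft argmax disparity prediction, can be sketched for a single pixel as follows. This is a minimal illustration of the general technique, not the authors' network code: it assumes the cost volume stores matching costs where lower is better (for a correlation score, where higher is better, the sign would be flipped), and the `temperature` parameter is an assumption.

```python
import math


def soft_argmax_disparity(costs, temperature=1.0):
    """Expected disparity under a softmax over negated matching costs.

    costs[d] is the matching cost of integer disparity d for one pixel.
    """
    scaled = [-c / temperature for c in costs]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return sum(d * e / total for d, e in enumerate(exps))
```

Unlike a hard argmax, this weighted average is differentiable and can return sub-pixel disparities, which is why it is commonly used as the last layer of learned stereo matching networks.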


2019 ◽  
Vol 11 (13) ◽  
pp. 1584 ◽  
Author(s):  
Yang Chen ◽  
Won Suk Lee ◽  
Hao Gan ◽  
Natalia Peres ◽  
Clyde Fraisse ◽  
...  

Strawberry growers in Florida suffer from a lack of efficient and accurate yield forecasts for strawberries, which would allow them to allocate optimal labor and equipment, as well as other resources for harvesting, transportation, and marketing. Accurate estimation of the number of strawberry flowers and their distribution in a strawberry field is, therefore, imperative for predicting the coming strawberry yield. Usually, the number of flowers and their distribution are estimated manually, which is time-consuming, labor-intensive, and subjective. In this paper, we develop an automatic strawberry flower detection system for yield prediction with minimal labor and time costs. The system used a small unmanned aerial vehicle (UAV) (DJI Technology Co., Ltd., Shenzhen, China) equipped with an RGB (red, green, blue) camera to capture near-ground images of two varieties (Sensation and Radiance) at two different heights (2 m and 3 m) and built orthoimages of a 402 m² strawberry field. The orthoimages were automatically processed using the Pix4D software and split into sequential pieces for deep learning detection. Faster R-CNN, a state-of-the-art region-based convolutional neural network, was chosen for detecting and counting flowers, mature strawberries, and immature strawberries. The mean average precision (mAP) was 0.83 for all detected objects at a height of 2 m and 0.72 for all detected objects at a height of 3 m. We adopted this model to count strawberry flowers in November and December from 2 m aerial images and compared the results with a manual count. The average deep learning counting accuracy was 84.1% with an average occlusion of 13.5%. Using this system could provide accurate counts of strawberry flowers, which can be used to forecast future yields and build distribution maps to help farmers observe the growth cycle of strawberry fields.
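
The step of splitting an orthoimage into pieces before detection can be sketched as a simple tiling routine. The abstract does not state the tile size or whether tiles overlap, so the `tile` and `overlap` parameters and the function name `tile_boxes` are hypothetical; the sketch only computes crop boxes and does not read any image.

```python
def tile_boxes(width, height, tile, overlap=0):
    """Return (left, top, right, bottom) crop boxes covering a width x height image.

    Tiles are produced row by row; tiles on the right and bottom edges are
    clipped to the image bounds.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and 0 <= overlap < tile")
    step = tile - overlap
    boxes = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes
```

Each box could then be cropped from the orthoimage and passed to the detector, with detections shifted back by the tile's `left` and `top` offsets to recover field coordinates.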


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform-inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, which is motivated by the Fourier transform. The decomposition is performed by a Long Short-Term Memory (LSTM) network, which captures the temporal dependencies in the signal and produces encoded sequences. The sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model matches or outperforms previous work. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy is over 95% in some cases. We also analyze the effect of the parameters on the performance.
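
The three parameters mentioned for the second data set (window size, sliding size, and labeling scheme) can be made concrete with a small sketch. The majority-vote labeling used here is an assumption chosen for illustration; the abstract does not say which labeling schemes were compared.

```python
from collections import Counter


def sliding_windows(signal, labels, window_size, slide_size):
    """Cut a 1D signal into fixed-size windows and label each by majority vote.

    Returns a list of (window_samples, window_label) pairs. Trailing samples
    that do not fill a whole window are dropped.
    """
    if len(signal) != len(labels):
        raise ValueError("signal and labels must have the same length")
    windows = []
    for start in range(0, len(signal) - window_size + 1, slide_size):
        end = start + window_size
        majority = Counter(labels[start:end]).most_common(1)[0][0]
        windows.append((signal[start:end], majority))
    return windows
```

A smaller sliding size yields more, heavily overlapping training windows, while a larger window gives each example more temporal context but makes mixed-activity windows more likely.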


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3813
Author(s):  
Athanasios Anagnostis ◽  
Aristotelis C. Tagarakis ◽  
Dimitrios Kateris ◽  
Vasileios Moysiadis ◽  
Claus Grøn Sørensen ◽  
...  

This study aimed to propose an approach for orchard tree segmentation using aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The dataset was composed of images from three different walnut orchards. The variability of the dataset resulted in images that fell under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images of orchards based on two methods (oversampling and undersampling) in order to tackle issues with out-of-the-field boundary transparent pixels from the image. Even though the training dataset did not contain orthomosaic images, the model achieved performance levels of up to 99%, demonstrating the robustness of the proposed approach.


2021 ◽  
pp. 1063293X2110251
Author(s):  
K Vijayakumar ◽  
Vinod J Kadam ◽  
Sudhir Kumar Sharma

A Deep Neural Network (DNN) is a multilayered Neural Network (NN) that is capable of progressively learning more abstract and composite representations of the raw features of the input data, with no need for any feature engineering. DNNs are advanced NNs with multiple hidden layers between the input layer and the final layer. The working principle of such a standard deep classifier is based on a hierarchy formed by the composition of linear functions and a defined nonlinear Activation Function (AF). It remains unclear how the DNN classifier can function so well, but it is clear from many studies that within a DNN, the choice of AF has a notable impact on the training dynamics and on task success. In the past few years, different AFs have been formulated, and the choice of AF is still an area of active study. Hence, in this study, a novel deep feed-forward NN model with four AFs has been proposed for breast cancer classification: Swish in hidden layer 1, LeakyReLU in hidden layer 2, ReLU in hidden layer 3, and Sigmoid in the final output layer. The purpose of the study is twofold. Firstly, this study is a step toward a more profound understanding of DNNs with layer-wise different AFs. Secondly, the research also aims to explore better DNN-based systems for building predictive models for breast cancer data with improved accuracy. Therefore, the benchmark UCI dataset WDBC was used to validate the framework, which was evaluated using ten-fold cross-validation (CV) and various performance indicators. Multiple simulations and experimental outcomes have shown that the proposed solution performs better than the Sigmoid, ReLU, LeakyReLU, and Swish activation DNNs in terms of different parameters. This analysis contributes to producing an expert and precise clinical dataset classification method for breast cancer.
Furthermore, the model also achieved improved performance compared to many established state-of-the-art algorithms/models.
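
The layer-wise choice of activation functions described in this abstract can be sketched as a plain forward pass. The four activation definitions are the standard ones; the layer sizes, the weights, and the helper names `dense` and `forward` are assumptions for illustration, since the abstract does not give the architecture's dimensions or training details.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x):
    return x * sigmoid(x)


def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x


def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b), element by element."""
    return [
        activation(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Apply a list of (weights, biases, activation) layers in order."""
    out = features
    for weights, biases, activation in layers:
        out = dense(out, weights, biases, activation)
    return out
```

A network in the spirit of the abstract would pass `swish`, `leaky_relu`, `relu`, and `sigmoid` as the activations of its four layers, so that the single sigmoid output can be read as a class probability.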


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are being widely utilized for various missions in both civilian and military sectors. Many of these missions demand that UAVs acquire artificial intelligence about the environments they are navigating in. This perception can be realized by training a computing machine to classify objects in the environment. One of the well-known machine training approaches is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning comes at a high cost in terms of time and computational resources. Collecting large amounts of input data, pre-training processes such as labeling training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these challenges, this study proposes mission-specific input data augmentation techniques and the design of a lightweight deep neural network architecture that is capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes of 10,000 different images each were used as input data, of which 80% were used for training the network and the remaining 20% for network validation. For the optimization of the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
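
The data split described in this abstract works out to 10 × 10,000 = 100,000 images, of which 80,000 are used for training and 20,000 for validation. A minimal sketch of such a split is shown below; splitting within each class (so that every class keeps the same 80/20 ratio), the fixed random seed, and the function name `split_per_class` are assumptions, since the abstract does not say how the split was drawn.

```python
import random


def split_per_class(samples_by_class, train_fraction=0.8, seed=0):
    """Shuffle each class separately and split it into training and validation parts.

    samples_by_class maps a class label to a list of samples.
    Returns two lists of (sample, label) pairs.
    """
    rng = random.Random(seed)
    train, validation = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(sample, label) for sample in shuffled[:cut]]
        validation += [(sample, label) for sample in shuffled[cut:]]
    return train, validation
```

Splitting per class rather than over the pooled data keeps the validation set balanced, which matters when accuracy is reported as a single number across all ten classes.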


2021 ◽  
Vol 11 (15) ◽  
pp. 7050
Author(s):  
Zeeshan Ahmad ◽  
Adnan Shahid Khan ◽  
Kashif Nisar ◽  
Iram Haider ◽  
Rosilah Hassan ◽  
...  

The revolutionary idea of the internet of things (IoT) architecture has gained enormous popularity over the last decade, resulting in exponential growth in IoT networks, connected devices, and the data processed therein. Since IoT devices generate and exchange sensitive data over the traditional internet, security has become a prime concern due to the emergence of zero-day cyberattacks. A network-based intrusion detection system (NIDS) can provide a much-needed, efficient security solution for the IoT network by protecting the network entry points through constant network traffic monitoring. Recent NIDS have a high false alarm rate (FAR) in detecting anomalies, including novel and zero-day anomalies. This paper proposes an efficient anomaly detection mechanism using mutual information (MI), considering a deep neural network (DNN) for an IoT network. A comparative analysis of different deep learning models, such as DNN, Convolutional Neural Network, Recurrent Neural Network, and its variants such as Gated Recurrent Unit and Long Short-term Memory, is performed on the IoT-Botnet 2020 dataset. Experimental results show an improvement of 0.57–2.6% in the model’s accuracy and a reduction of the FAR by 0.23–7.98%, which shows the effectiveness of the DNN-based NIDS model compared to the well-known deep learning models. It was also observed that using only the 16–35 best numerical features selected using MI, instead of the 80 features of the dataset, resulted in almost negligible degradation in the model’s performance but helped decrease the overall model’s complexity. In addition, the detection accuracy of the DL-based models is further improved by almost 0.99–3.45% when considering only the top five categorical and numerical features.
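
Feature selection by mutual information, as used in this abstract, can be sketched for discrete features as follows. This is a generic illustration of the technique rather than the authors' pipeline: the helper names are hypothetical, MI is estimated from simple co-occurrence counts, and continuous traffic features would first need to be binned (or scored with a different estimator).

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in nats, between two equally long discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


def top_k_features(columns, labels, k):
    """Names of the k columns with the highest mutual information with the labels."""
    scores = {name: mutual_information(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A feature that is statistically independent of the label scores zero and is dropped first, which is how a ranking of this kind can reduce the input from 80 features to a few dozen with little loss in accuracy.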


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 949
Author(s):  
Jiangyi Wang ◽  
Min Liu ◽  
Xinwu Zeng ◽  
Xiaoqiang Hua

Convolutional neural networks perform well in many visual tasks because of their hierarchical structures and powerful feature extraction capabilities. The symmetric positive definite (SPD) matrix has attracted attention in visual classification because it has an excellent ability to learn proper statistical representations and to distinguish samples with different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted from a convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for the SPD matrix is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. With this method, signal detection becomes a binary classification problem on the signals in samples. To demonstrate the effectiveness and superiority of this method, simulated and semi-physical simulated data sets are used. The results show that, under low signal-to-clutter ratio (SCR), compared with the spectral signal detection method based on the deep neural network, this method can obtain a gain of 0.5–2 dB on simulated data sets and semi-physical simulated data sets.
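
One common way to build an SPD matrix from a set of local feature vectors is a regularized covariance matrix, sketched below. The abstract does not say how its SPD matrix is constructed, so the covariance construction, the ridge term `eps`, and both function names are assumptions made for illustration.

```python
def spd_from_features(features, eps=1e-6):
    """Regularized covariance of equally long feature vectors.

    The covariance is symmetric positive semidefinite; adding eps to the
    diagonal makes it strictly positive definite.
    """
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j]) / n
    for i in range(d):
        cov[i][i] += eps
    return cov


def is_spd(matrix, tol=1e-12):
    """Check symmetry, then attempt a Cholesky factorization."""
    d = len(matrix)
    if any(abs(matrix[i][j] - matrix[j][i]) > tol for i in range(d) for j in range(d)):
        return False
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a non-positive pivot means not positive definite
                lower[i][j] = s ** 0.5
            else:
                lower[i][j] = s / lower[j][j]
    return True
```

The resulting matrix summarizes how the feature channels vary together, which is the kind of second-order statistic that SPD-matrix learning methods take as input.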

