Evaluating the Learning Procedure of CNNs through a Sequence of Prognostic Tests Utilising Information Theoretical Measures

Entropy ◽  
2021 ◽  
Vol 24 (1) ◽  
pp. 67
Author(s):  
Xiyu Shi ◽  
Varuna De-Silva ◽  
Yusuf Aslan ◽  
Erhan Ekmekcioglu ◽  
Ahmet Kondoz

Deep learning has proven to be an important element of modern data processing technology, which has found its application in many areas such as multimodal sensor data processing and understanding, data generation and anomaly detection. While the use of deep learning is booming in many real-world tasks, the internal processes by which it arrives at its results are still uncertain. Understanding the data processing pathways within a deep neural network is important for transparency and better resource utilisation. In this paper, a method utilising information theoretic measures is used to reveal the typical learning patterns of convolutional neural networks, which are commonly used for image processing tasks. For this purpose, training samples, true labels and estimated labels are considered to be random variables. The mutual information and conditional entropy between these variables are then studied using information theoretical measures. This paper shows that more convolutional layers in the network improve its learning, but that unnecessarily high numbers of convolutional layers do not improve the learning any further. The number of convolutional layers that need to be added to a neural network to gain the desired learning level can be determined with the help of information theoretic quantities including entropy, inequality and mutual information among the inputs to the network. The kernel size of convolutional layers only affects the learning speed of the network. This study also shows that the placement of the dropout layer has no significant effect on the learning of networks with a lower dropout rate, and that with higher dropout rates the dropout layer is better placed immediately after the last convolutional layer.
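
The quantities named in this abstract can be illustrated with a minimal sketch. It assumes discrete true labels and estimated labels and uses simple plug-in (frequency-count) estimates; the abstract does not say how the authors estimate these quantities, so the function names and the estimator here are illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def entropy(values):
    """Plug-in Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(y, y_hat):
    """H(Y | Y_hat) = H(Y, Y_hat) - H(Y_hat), estimated from paired samples."""
    return entropy(list(zip(y, y_hat))) - entropy(y_hat)


def mutual_information(y, y_hat):
    """I(Y; Y_hat) = H(Y) - H(Y | Y_hat), estimated from paired samples."""
    return entropy(y) - conditional_entropy(y, y_hat)


# Hypothetical toy data: true labels and two sets of estimated labels.
true_labels = [0, 0, 1, 1]
perfect_estimates = [0, 0, 1, 1]        # estimates identical to the labels
uninformative_estimates = [0, 1, 0, 1]  # estimates independent of the labels

print(mutual_information(true_labels, perfect_estimates))
print(conditional_entropy(true_labels, perfect_estimates))
print(mutual_information(true_labels, uninformative_estimates))
print(conditional_entropy(true_labels, uninformative_estimates))
```

On this toy data, estimates that match the labels carry the full one bit of label information and leave no residual uncertainty, while estimates that are independent of the labels carry none.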

Micromachines ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1504
Author(s):  
Mingming Shen ◽  
Jing Yang ◽  
Shaobo Li ◽  
Ansi Zhang ◽  
Qiang Bai

Deep neural networks are widely used in the field of image processing for micromachines, such as in 3D shape detection in microelectronic high-speed dispensing and object detection in microrobots. It is already known that hyperparameters and their interactions impact neural network model performance. Taking advantage of the mathematical correlations between hyperparameters and the corresponding deep learning model to adjust hyperparameters intelligently is the key to obtaining an optimal solution from a deep neural network model. Leveraging these correlations is also significant for unlocking the “black box” of deep learning by revealing the mechanism of its mathematical principle. However, there is no complete system that combines mathematical derivation with experimental verification to quantify the impacts of hyperparameters on the performance of deep learning models. Therefore, in this paper, the authors analyzed the mathematical relationships among four hyperparameters: the learning rate, batch size, dropout rate, and convolution kernel size. A generalized multiparameter mathematical correlation model was also established, which showed that the interaction between these hyperparameters played an important role in the neural network’s performance. The proposal was validated through experiments running convolutional neural network algorithms on the MNIST dataset. Notably, this research can help establish a universal multiparameter mathematical correlation model to guide the deep learning parameter adjustment process.


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3782 ◽  
Author(s):  
Julius Venskus ◽  
Povilas Treigys ◽  
Jolita Bernatavičienė ◽  
Gintautas Tamulevičius ◽  
Viktor Medvedev

The automated identification system of vessel movements receives a huge amount of multivariate, heterogeneous sensor data, which should be analyzed to make a proper and timely decision on vessel movements. The large number of vessels makes it difficult and time-consuming to detect abnormalities, thus rapid response algorithms should be developed for a decision support system to identify abnormal movements of vessels in areas of heavy traffic. This paper extends the previous study on a self-organizing map application for processing of sensor stream data received by the maritime automated identification system. The more data about the vessel’s movement is registered and submitted to the algorithm, the higher the accuracy of the algorithm should be. However, the task cannot be guaranteed without using an effective retraining strategy with respect to precision and data processing time. In addition, retraining ensures the integration of the latest vessel movement data, which reflects the actual conditions and context. With a view to maintaining the quality of the results of the algorithm, data batching strategies for the neural network retraining to detect anomalies in streaming maritime traffic data were investigated. The effectiveness of strategies in terms of modeling precision and the data processing time were estimated on real sensor data. The obtained results show that the neural network retraining time can be shortened by half while the sensitivity and precision only change slightly.


Author(s):  
Thomas T. Lu ◽  
Kevin Payumo ◽  
Landan Seguin ◽  
Alexander Huyen ◽  
Edward Chow ◽  
...  

Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 714 ◽  
Author(s):  
Andrea Soro ◽  
Gino Brunner ◽  
Simon Tanner ◽  
Roger Wattenhofer

Activity recognition using off-the-shelf smartwatches is an important problem in human activity recognition. In this paper, we present an end-to-end deep learning approach, able to provide probability distributions over activities from raw sensor data. We apply our methods to 10 complex full-body exercises typical in CrossFit, and achieve a classification accuracy of 99.96%. We additionally show that the same neural network used for exercise recognition can also be used in repetition counting. To the best of our knowledge, our approach to repetition counting is novel and performs well, counting correctly within an error of 1 repetition in 91% of the performed sets.
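
The repetition-counting figure in this abstract is a share of sets counted within a tolerance. A minimal sketch of that kind of metric follows, assuming one true count and one predicted count per set; the function name and the sample counts are hypothetical and are not taken from the paper.

```python
def fraction_within_tolerance(true_counts, predicted_counts, tolerance=1):
    """Share of sets whose predicted repetition count is within `tolerance` of the true count."""
    if len(true_counts) != len(predicted_counts):
        raise ValueError("true_counts and predicted_counts must have the same length")
    if not true_counts:
        raise ValueError("at least one set is required")
    hits = sum(
        1
        for true, predicted in zip(true_counts, predicted_counts)
        if abs(true - predicted) <= tolerance
    )
    return hits / len(true_counts)


# Hypothetical data: four sets; the third prediction is off by three repetitions.
true_counts = [10, 12, 15, 8]
predicted_counts = [10, 13, 12, 8]
share = fraction_within_tolerance(true_counts, predicted_counts)
print(share)
```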


2019 ◽  
Author(s):  
Peng Sun

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the widespread usage of many different types of sensors in recent years, large amounts of diverse and complex sensor data have been generated and analyzed to extract useful information. This dissertation focuses on two types of data: aerial images and physiological sensor data. Several new methods have been proposed based on deep learning techniques to advance the state-of-the-art in analyzing these data. For aerial images, a new method for designing effective loss functions for training deep neural networks for object detection, called adaptive salience biased loss (ASBL), has been proposed. In addition, several state-of-the-art deep neural network models for object detection, including RetinaNet, UNet, Yolo, etc., have been adapted and modified to achieve improved performance on a new set of real-world aerial images for bird detection. For physiological sensor data, a deep learning method for alcohol usage detection, called Deep ADA, has been proposed to improve the automatic detection of alcohol usage (ADA) system, which is a statistical data analysis pipeline that detects drinking episodes based on wearable physiological sensor data collected from real subjects. Object detection in aerial images remains a challenging problem due to low image resolutions, complex backgrounds, and variations in the sizes and orientations of objects in images. The new ASBL method has been designed for training deep neural network object detectors to achieve improved performance. ASBL can be implemented at the image level, which is called image-based ASBL, or at the anchor level, which is called anchor-based ASBL. The method computes saliency information of input images and anchors generated by deep neural network object detectors, and weights different training examples and anchors differently based on their corresponding saliency measurements. 
It gives complex images and difficult targets more weight during training. In our experiments using two of the largest public benchmark data sets of aerial images, DOTA and NWPU VHR-10, the existing RetinaNet was trained using ASBL to generate a one-stage detector, ASBL-RetinaNet. ASBL-RetinaNet significantly outperformed the original RetinaNet by 3.61 mAP and 12.5 mAP on the two data sets, respectively. In addition, ASBL-RetinaNet outperformed 10 other state-of-the-art object detection methods. To improve bird detection in aerial images, the Little Birds in Aerial Imagery (LBAI) dataset has been created from real-life aerial imagery data. LBAI contains various flocks and species of birds that are small in size, ranging from 10 by 10 pixels to 40 by 40 pixels. The dataset was labeled and further divided into two subsets, Easy and Hard, based on the complexity of the background. We have applied some of the best deep learning models to LBAI images and improved them, including object detection techniques, such as YOLOv3, SSD, and RetinaNet, and semantic segmentation techniques, such as U-Net and Mask R-CNN. Experimental results show that RetinaNet performed the best overall, outperforming other models by 1.4 and 4.9 in F1 score on the Easy and Hard LBAI subsets, respectively. For physiological sensor data analysis, Deep ADA has been developed to extract features from physiological signals and predict alcohol usage of real subjects in their daily lives. The features are extracted using convolutional neural networks without any human intervention. A large amount of unlabeled data has been used in an unsupervised learning manner to improve the quality of learned features. The method outperformed traditional feature extraction methods, with up to 19% higher accuracy.


Author(s):  
Thangavel M. ◽  
Abiramie Shree T. G. R. ◽  
Priyadharshini P. ◽  
Saranya T.

In today's world, everyone generates a large amount of data. With this much data being generated, there is a chance that the security of our data will be compromised. This extends security needs beyond the traditional approach and has given rise to the field of cyber security. Cyber security's core function is to protect all types of information, including hardware and software, from cyber threats. The number of threats and attacks is increasing each year with a high difference between them. Machine learning and deep learning can be applied to these attacks, reducing the complexity of the problem and helping us to recover more easily. The algorithms used by both approaches are support vector machine (SVM), Bayesian algorithm, deep belief network (DBN), and deep random neural network (Deep RNN). These techniques provide better results than the traditional approach. The companies that use this approach in real-time scenarios are also covered in this chapter.


Symmetry ◽  
2020 ◽  
Vol 12 (9) ◽  
pp. 1570
Author(s):  
Sakorn Mekruksavanich ◽  
Anuchit Jitpattanakul ◽  
Phichai Youplao ◽  
Preecha Yupapin

The creation of the Internet of Things (IoT), along with the latest developments in wearable technology, has provided new opportunities in human activity recognition (HAR). The modern smartwatch offers the potential for data from sensors to be relayed to novel IoT platforms, which allow the constant tracking and monitoring of human movement and behavior. Traditional activity recognition research has relied on machine learning methods such as artificial neural networks, decision trees, support vector machines, and naive Bayes. Nonetheless, these conventional machine learning techniques depend inevitably on heuristically handcrafted feature extraction, in which human domain knowledge is normally limited. This work proposes a hybrid deep learning model called CNN-LSTM that combines Long Short-Term Memory (LSTM) networks with a Convolutional Neural Network (CNN) for activity recognition. The study makes use of HAR involving smartwatches to categorize hand movements. The recognition abilities of the deep learning model are assessed on the Wireless Sensor Data Mining (WISDM) public benchmark dataset. Accuracy, precision, recall, and F-measure are used as the evaluation metrics to assess the recognition abilities of the proposed LSTM models. The findings indicate that this hybrid deep learning model offers better performance than its rivals, achieving 96.2% accuracy and an F-measure of 96.3%. The results show that the proposed CNN-LSTM can improve the performance of activity recognition.
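
The four evaluation metrics named in this abstract can be sketched for the simplest case. The example below assumes a binary task with one positive class; the study itself classifies several activities, where such metrics are typically computed per class and then averaged, and the abstract does not state which averaging was used. The function name and the sample labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```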


Sensors ◽  
2019 ◽  
Vol 19 (4) ◽  
pp. 887 ◽  
Author(s):  
Zheng-An Zhu ◽  
Yun-Chung Lu ◽  
Chih-Hsiang You ◽  
Chen-Kuo Chiang

In this paper, a multipath convolutional neural network (MP-CNN) is proposed for rehabilitation exercise recognition using sensor data. It consists of two novel components: a dynamic convolutional neural network (D-CNN) and a state transition probability CNN (S-CNN). In the D-CNN, Gaussian mixture models (GMMs) are exploited to capture the distribution of sensor data for the body movements of the physical rehabilitation exercises. Then, the input signals and the GMMs are screened into different segments. These form multiple paths in the CNN. The S-CNN uses a modified Lempel–Ziv–Welch (LZW) algorithm to extract the transition probabilities of hidden states as discriminative features of different movements. Then, the D-CNN and the S-CNN are combined to build the MP-CNN. To evaluate the rehabilitation exercise, a special evaluation matrix is proposed along with the deep learning classifier to learn the general feature representation for each class of rehabilitation exercise at different levels. Any rehabilitation exercise can then be classified by the deep learning model and compared to the learned best features. The distance to the best feature is used as the score for the evaluation. We demonstrate our method with our collected dataset and several activity recognition datasets. The classification results are superior when compared to those obtained using other deep learning models, and the evaluation scores are effective for practical applications.

