Early Prediction of DNN Activation Using Hierarchical Computations

Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3130
Author(s):  
Bharathwaj Suresh ◽  
Kamlesh Pillai ◽  
Gurpreet Singh Kalsi ◽  
Avishaii Abuhatzera ◽  
Sreenivas Subramoney

Deep Neural Networks (DNNs) have set state-of-the-art performance numbers in diverse fields of electronics (computer vision, voice recognition), biology, bioinformatics, etc. However, both learning from the data (training) and applying the learnt information (inference) require huge computational resources. Approximate computing is a common method to reduce computation cost, but it introduces a loss in task accuracy, which limits its application. Using an inherent property of the Rectified Linear Unit (ReLU), a popular activation function, we propose a mathematical model that performs multiply-accumulate (MAC) operations at reduced precision to predict negative values early. We also propose a hierarchical computation method that achieves the same results as IEEE754 full-precision compute. Applying this method to ResNet50 and VGG16 shows that up to 80% of ReLU zeros (which is 50% of all ReLU outputs) can be predicted and detected early by using just 3 out of 23 mantissa bits. This method is equally applicable to other floating-point representations.
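The idea of predicting ReLU zeros from a few mantissa bits can be illustrated with a minimal sketch. It is not the authors' mathematical model: the conservative bound used here, the two-level fallback, and all function names are assumptions made for illustration, and the bound assumes normal (non-subnormal) float32 inputs. Truncating a mantissa to k bits shrinks a value's magnitude by a relative factor of less than 2^-k, so an upper bound on the exact dot product can be formed from the truncated products alone. If that bound is negative, ReLU must output zero and the full-precision pass is skipped.

```python
import struct

MANTISSA_BITS = 23  # explicit mantissa bits in IEEE754 single precision


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Keep only the top `keep_bits` mantissa bits of x (as float32)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = 0xFFFFFFFF & ~((1 << (MANTISSA_BITS - keep_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def relu_dot_hierarchical(weights, inputs, keep_bits=3):
    """Return (relu_output, predicted_early) for one neuron.

    Level 1 (hypothetical rule): bound the exact sum from above using
    truncated operands. Positive products may be larger in full precision
    by at most `slack`; negative products can only become more negative.
    Level 2: fall back to the float32 operands when level 1 is inconclusive.
    """
    slack = 1.0 / (1.0 - 2.0 ** -keep_bits) ** 2
    upper = 0.0
    for w, x in zip(weights, inputs):
        p = truncate_mantissa(w, keep_bits) * truncate_mantissa(x, keep_bits)
        upper += p * slack if p > 0 else p
    if upper < 0.0:
        return 0.0, True  # the exact sum is negative, so ReLU gives 0
    full = sum(to_f32(w) * to_f32(x) for w, x in zip(weights, inputs))
    return max(full, 0.0), False


print(relu_dot_hierarchical([-2.0, -1.5, 0.5], [1.0, 1.0, 1.0]))
print(relu_dot_hierarchical([1.0, 2.0], [3.0, 0.5]))
```

Because the fallback recomputes with the untruncated operands, the output in this sketch matches a plain float32 ReLU dot product whenever the early test is inconclusive. The early test only ever saves work on clearly negative sums.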

2020 ◽  
Author(s):  
Zhe Yang ◽  
Dejan Gjorgjevikj ◽  
Jian-Yu Long ◽  
Yan-Yang Zi ◽  
Shao-Hui Zhang ◽  
...  

Novelty detection is a challenging task in machinery fault diagnosis. A novel fault diagnostic method is developed that not only diagnoses known types of defects, but also detects novelties, i.e. the occurrence of new types of defects which have never been recorded. To this end, a sparse autoencoder-based multi-head Deep Neural Network (DNN) is presented to jointly learn a shared encoding representation for both unsupervised reconstruction and supervised classification of the monitoring data. The detection of novelties is based on the reconstruction error. Moreover, the computational burden is reduced by directly training the multi-head DNN with the rectified linear unit activation function, instead of performing the pre-training and fine-tuning phases required for classical DNNs. The proposed method is applied to a benchmark bearing case study and to experimental data acquired from a delta 3D printer. The results show that it is able to accurately diagnose known types of defects, as well as to detect unknown defects, outperforming other state-of-the-art methods.
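Detection based on reconstruction error can be sketched as a simple thresholding step. The abstract does not say how the threshold is chosen, so the mean-plus-k-standard-deviations rule below, the value of k, and all function names are assumptions for illustration. The reconstructions are passed in directly rather than produced by a trained autoencoder.

```python
import statistics


def reconstruction_error(x, x_hat):
    """Mean squared error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(train_errors, k=3.0):
    """Hypothetical rule: mean + k * std of errors on known-class data."""
    return statistics.mean(train_errors) + k * statistics.pstdev(train_errors)


def is_novel(x, x_hat, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, x_hat) > threshold


# Errors measured on known fault types (made-up numbers).
threshold = fit_threshold([0.01, 0.02, 0.03])
print(is_novel([1.0, 1.0], [1.0, 1.1], threshold))  # well reconstructed
print(is_novel([1.0, 1.0], [0.0, 0.0], threshold))  # poorly reconstructed
```

A sample judged not novel would then be handed to the classification head for diagnosis among the known defect types.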


2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Fernando Mattioli ◽  
Daniel Caetano ◽  
Alexandre Cardoso ◽  
Eduardo Naves ◽  
Edgard Lamounier

The choice of a good topology for a deep neural network is a complex task, essential for any deep learning project. This task normally demands knowledge from previous experience, as the large amount of required computational resources makes trial-and-error approaches prohibitive. Evolutionary computation algorithms have shown success in many domains by guiding the exploration of complex solution spaces in the direction of the best solutions, with minimal human intervention. Accordingly, this work presents the use of genetic algorithms in deep neural network topology selection. The evaluated algorithms were able to find competitive topologies while spending fewer computational resources than state-of-the-art methods.
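A genetic search over topologies can be sketched as follows. This is a generic genetic algorithm, not the specific algorithms evaluated in the paper. The encoding (a list of hidden-layer widths), the operators, and in particular `toy_fitness` are assumptions for illustration. In practice the fitness would be obtained by training each candidate network and measuring validation performance.

```python
import random

WIDTHS = [16, 32, 64, 128]  # hypothetical allowed layer widths
MAX_LAYERS = 4


def toy_fitness(topology):
    """Stand-in for validation accuracy: prefers ~160 units, few layers."""
    return -abs(sum(topology) - 160) - 4 * len(topology)


def random_topology(rng):
    return [rng.choice(WIDTHS) for _ in range(rng.randint(1, MAX_LAYERS))]


def crossover(a, b, rng):
    """Join a prefix of one parent with a suffix of the other."""
    child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_LAYERS] or [rng.choice(WIDTHS)]


def mutate(topology, rng, rate=0.2):
    """Resample each layer width with probability `rate`."""
    return [rng.choice(WIDTHS) if rng.random() < rate else w for w in topology]


def evolve(fitness, generations=20, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [random_topology(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]  # survivors are kept unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness), history


best, history = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Keeping the elite unchanged means the best fitness never decreases between generations, which is one common design choice among several.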


Author(s):  
Hengjie Chen ◽  
Zhong Li

By applying fundamental mathematical knowledge, this paper proves that the function [Formula: see text] is an integer no less than [Formula: see text] has the property that the difference between the function value at the midpoint of any two adjacent equidistant nodes on [Formula: see text] and the mean of the function values at these two nodes is a constant depending only on the number of nodes if and only if [Formula: see text]. From this, we establish an important result about deep neural networks: the function [Formula: see text] can be interpolated by a deep Rectified Linear Unit (ReLU) network with depth [Formula: see text] on the equidistant nodes in the interval [Formula: see text], and the error of approximation is [Formula: see text]. Then, based on this main result and the Chebyshev orthogonal polynomials, we construct a deep network and give error estimates for the approximation of polynomials and continuous functions, respectively. In addition, this paper constructs a deep network with local sparse connections, shared weights and activation function [Formula: see text] and discusses its density and complexity.


2019 ◽  
Author(s):  
W. F. Magalhães ◽  
H. M. Gomes ◽  
L. B. Marinho ◽  
G. S. Aguiar ◽  
P. Silveira

With the advent of smart IoT applications empowered with AI, together with the democratization of mobile devices, moving the computation from cloud to edge is a natural trend in both academia and industry. A major challenge in this direction is enabling the deployment of Deep Neural Networks (DNNs), which usually demand lots of computational resources (i.e. memory, disk, CPU/GPU, and power), in resource limited edge devices. Among the possible strategies to tackle this challenge are: (i) running the entire DNN on the edge device (sometimes not feasible), (ii) distributing the computation between edge and cloud or (iii) running the entire DNN on the cloud. All these strategies involve trade-offs in terms of latency, communication, and financial costs. In this article we investigate such trade-offs in a real-world scenario involving object detection from video surveillance feeds. We conduct several experiments on two different versions of YOLO (You Only Look Once), a state-of-the-art DNN designed for fast and accurate object detection and location. Our experimental setup for DNN model partitioning includes a Raspberry PI 3 B+ and a cloud server equipped with a GPU. Experiments using different network bandwidths are performed. Our results provide useful insights about the aforementioned trade-offs.
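The latency side of the edge/cloud trade-off can be made concrete with a small model. This sketch is not taken from the article's experiments: the additive latency model, the function name `best_split`, and every number are hypothetical. A split point k means the first k layers run on the edge device, the intermediate tensor is sent over the network, and the remaining layers run in the cloud.

```python
def best_split(edge_ms, cloud_ms, out_bytes, input_bytes, bytes_per_ms):
    """Return (k, latency_ms), where k layers run on the edge.

    edge_ms[i] / cloud_ms[i]: per-layer compute time on each side.
    out_bytes[i]: size of layer i's output; input_bytes: size of the input.
    k == 0 is cloud-only, k == len(edge_ms) is edge-only (no upload).
    """
    n = len(edge_ms)
    best = None
    for k in range(n + 1):
        sent = input_bytes if k == 0 else out_bytes[k - 1]
        transfer = 0.0 if k == n else sent / bytes_per_ms
        latency = sum(edge_ms[:k]) + transfer + sum(cloud_ms[k:])
        if best is None or latency < best[1]:
            best = (k, latency)
    return best


# Made-up profile of a three-layer network.
edge = [10, 20, 30]
cloud = [1, 2, 3]
outputs = [4000, 500, 10]
for bandwidth in (1, 100, 10000):  # bytes per millisecond
    print(bandwidth, best_split(edge, cloud, outputs, 8000, bandwidth))
```

With these numbers, low bandwidth favours running everything on the edge, high bandwidth favours the cloud, and intermediate bandwidth favours a split. The model ignores communication and financial costs, which the article also considers.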


2019 ◽  
Vol 12 (3) ◽  
pp. 156-161 ◽  
Author(s):  
Aman Dureja ◽  
Payal Pahwa

Background: Activation functions play an important role in building deep neural networks. The choice of activation function affects both how well the network optimizes and the quality of its results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been identified. Objective: The primary objective of this analysis was to determine which activation function should be used at hidden layers for deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured on a two-class dataset (Cat/Dog). The network used 3 convolutional layers, each followed by a pooling layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images were used for testing it. Results: The experimental comparison was done by evaluating the network with different activation functions on each layer of the CNN. The validation error and accuracy on the Cat/Dog dataset were analyzed for the activation functions ReLU, Tanh, SELU, PReLU, and ELU at the hidden layers. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU hidden layers (3 hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU at the hidden layers of a CNN are helpful for effective and fast retrieval of images from databases.
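For reference, the five activation functions compared above can be written out as scalar functions. This is a sketch of their standard definitions, not the code used in the study. The PReLU slope is a learned parameter in practice, so the default of 0.25 here is only an assumed initial value, and the ELU alpha of 1.0 is the common default rather than a value stated in the abstract.

```python
import math

# Standard SELU constants from the self-normalizing networks formulation.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    return max(x, 0.0)


def tanh(x):
    return math.tanh(x)


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def prelu(x, slope=0.25):
    """`slope` is learned during training; 0.25 is an assumed start value."""
    return x if x > 0 else slope * x


for f in (relu, tanh, elu, selu, prelu):
    print(f.__name__, [round(f(v), 4) for v in (-2.0, 0.0, 2.0)])
```

ReLU is the only one of the five that outputs exactly zero for every negative input, which also makes it the cheapest to evaluate.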


2021 ◽  
Vol 11 (15) ◽  
pp. 6704
Author(s):  
Jingyong Cai ◽  
Masashi Takemoto ◽  
Yuming Qiu ◽  
Hironori Nakajo

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous work has shown the benefit of computing activation functions, such as the sigmoid, with shift-and-add operations, but it fails to remove multiplications from training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product-to-sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple add-and-shift operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.
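The product-to-sum step can be checked numerically. For values near zero, w * e is close to sin(w) * sin(e), and the identity sin(a) sin(b) = (cos(a - b) - cos(a + b)) / 2 replaces the product with one addition, two subtractions, and a halving (a one-bit shift in fixed point). The sketch below only verifies that arithmetic. It uses `math.cos` for convenience and does not model how the paper evaluates the cosine terms in hardware. The function name is an assumption.

```python
import math


def sine_product(w, e):
    """Approximate w * e via sin(w) * sin(e), using product-to-sum."""
    return 0.5 * (math.cos(w - e) - math.cos(w + e))


w, e = 0.05, -0.02  # small values, as weights and errors tend to be
approx = sine_product(w, e)
exact = w * e
print(approx, exact, abs(approx - exact))
```

The approximation error grows as the operands move away from zero, which is why the clustering of weights and errors around zero matters for this approach.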


2021 ◽  
Vol 34 (1) ◽  
Author(s):  
Zhe Yang ◽  
Dejan Gjorgjevikj ◽  
Jianyu Long ◽  
Yanyang Zi ◽  
Shaohui Zhang ◽  
...  

Supervised fault diagnosis typically assumes that all the types of machinery failures are known. However, in practice unknown types of defect, i.e., novelties, may occur, whose detection is a challenging task. In this paper, a novel fault diagnostic method is developed for both diagnostics and detection of novelties. To this end, a sparse autoencoder-based multi-head Deep Neural Network (DNN) is presented to jointly learn a shared encoding representation for both unsupervised reconstruction and supervised classification of the monitoring data. The detection of novelties is based on the reconstruction error. Moreover, the computational burden is reduced by directly training the multi-head DNN with rectified linear unit activation function, instead of performing the pre-training and fine-tuning phases required for classical DNNs. The addressed method is applied to a benchmark bearing case study and to experimental data acquired from a delta 3D printer. The results show that its performance is satisfactory both in detection of novelties and fault diagnosis, outperforming other state-of-the-art methods. This research proposes a novel fault diagnostics method which can not only diagnose the known type of defect, but also detect unknown types of defects.


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1511
Author(s):  
Taylor Simons ◽  
Dah-Jye Lee

There has been a recent surge in publications related to binarized neural networks (BNNs), which use binary values to represent both the weights and activations in deep neural networks (DNNs). Due to the bitwise nature of BNNs, there have been many efforts to implement BNNs on ASICs and FPGAs. While BNNs are excellent candidates for these kinds of resource-limited systems, most implementations still require very large FPGAs or CPU-FPGA co-processing systems. Our work focuses on reducing the computational cost of BNNs even further, making them more efficient to implement on FPGAs. We target embedded visual inspection tasks, like quality inspection sorting on manufactured parts and agricultural produce sorting. We propose a new binarized convolutional layer, called the neural jet features layer, that learns well-known classic computer vision kernels that are efficient to calculate as a group. We show that on visual inspection tasks, neural jet features perform comparably to standard BNN convolutional layers while using less computational resources. We also show that neural jet features tend to be more stable than BNN convolution layers when training small models.
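The "bitwise nature" of BNNs refers to the fact that a dot product of vectors with entries in {-1, +1} reduces to an XNOR followed by a population count. The sketch below shows that standard trick on packed integers. It does not implement the neural jet features layer proposed in the paper, and the function names and bit encoding (bit 1 for +1, bit 0 for -1) are assumptions for illustration.

```python
def pack(values):
    """Pack a sequence of +1/-1 values into an int (bit i set means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree. Each agreement adds +1
    and each disagreement adds -1, so the result is 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agreements = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * agreements - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack(a), pack(b), len(a)))
print(sum(x * y for x, y in zip(a, b)))  # reference result
```

On an FPGA or ASIC the same computation maps to XNOR gates and a popcount tree, with no multipliers involved.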


Author(s):  
Manjunath K. E. ◽  
Srinivasa Raghavan K. M. ◽  
K. Sreenivasa Rao ◽  
Dinesh Babu Jayagopi ◽  
V. Ramasubramanian

In this study, we evaluate and compare two different approaches for multilingual phone recognition in code-switched and non-code-switched scenarios. The first approach uses a front-end Language Identification (LID) system to switch to a monolingual phone recognizer (LID-Mono), trained individually on each of the languages present in the multilingual dataset. In the second approach, a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) transcription of the multilingual dataset is used to develop a Multilingual Phone Recognition System (Multi-PRS). The bilingual code-switching experiments are conducted using the Kannada and Urdu languages. In the first approach, LID is performed using state-of-the-art i-vectors. Both monolingual and multilingual phone recognition systems are trained using Deep Neural Networks. The performance of the LID-Mono and Multi-PRS approaches is compared and analysed in detail. It is found that the Multi-PRS approach outperforms the more conventional LID-Mono approach in both code-switched and non-code-switched scenarios. For code-switched speech, the effect of the length of the segments used to perform LID on the performance of the LID-Mono system is studied by varying the window size from 500 ms to 5.0 s, and by using the full utterance. The LID-Mono approach depends heavily on the accuracy of the LID system, and LID errors cannot be recovered. The Multi-PRS system, however, requires no front-end LID switching and is designed around the common multilingual phone-set derived from several languages, so it is not constrained by the accuracy of the LID system. It therefore performs effectively on code-switched and non-code-switched speech, offering lower Phone Error Rates than the LID-Mono system.


2021 ◽  
Vol 47 (1) ◽  
Author(s):  
Fabian Laakmann ◽  
Philipp Petersen

We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth. Therefore, approximation of these functions suffers from a curse of dimension. We demonstrate that through their inherent compositionality deep neural networks can resolve the characteristic flow underlying the transport equations and thereby allow approximation rates independent of the parameter dimension.

