Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing

Author(s):  
Hanzhang Hu ◽  
Debadeepta Dey ◽  
Martial Hebert ◽  
J. Andrew Bagnell

This work considers the trade-off between accuracy and test-time computational cost of deep neural networks (DNNs) via anytime predictions from auxiliary predictions. Specifically, we optimize auxiliary losses jointly in an adaptive weighted sum, where the weights are inversely proportional to the average of each loss. Intuitively, this balances the losses to have the same scale. We present theoretical considerations that motivate this approach from multiple viewpoints, including connecting it to optimizing the geometric mean of the expectation of each loss, an objective that ignores the scale of losses. Experimentally, the adaptive weights induce more competitive anytime predictions on multiple recognition datasets and models than non-adaptive approaches, including weighting all losses equally. In particular, anytime neural networks (ANNs) can achieve the same accuracy faster using adaptive weights on a small network than using static constant weights on a large one. For problems with high performance saturation, we also show that a sequence of exponentially deepening ANNs can achieve near-optimal anytime results at any budget, at the cost of a constant fraction of extra computation.
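The weighting rule in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `AdaptiveLossBalancer`, the exponential moving average, and the `momentum` and `eps` parameters are assumptions made for illustration, since the abstract only says the weights are inversely proportional to the average of each loss. In a real training loop the weights would be treated as constants when gradients are taken.

```python
class AdaptiveLossBalancer:
    """Hypothetical helper: weights each loss by the inverse of its running average."""

    def __init__(self, num_losses, momentum=0.9, eps=1e-8):
        self.momentum = momentum          # assumed smoothing factor for the running average
        self.eps = eps                    # guards against division by zero
        self.averages = [None] * num_losses

    def update(self, losses):
        """Fold the latest loss values into the running averages."""
        for i, loss in enumerate(losses):
            if self.averages[i] is None:
                self.averages[i] = loss
            else:
                self.averages[i] = (self.momentum * self.averages[i]
                                    + (1.0 - self.momentum) * loss)

    def weights(self):
        """Weights inversely proportional to the average of each loss."""
        return [1.0 / (avg + self.eps) for avg in self.averages]

    def combined(self, losses):
        """Update the averages, then return the adaptively weighted sum."""
        self.update(losses)
        return sum(w * l for w, l in zip(self.weights(), losses))


balancer = AdaptiveLossBalancer(num_losses=2)
# Two auxiliary losses on very different scales contribute about equally.
print(balancer.combined([10.0, 0.1]))
```

Because each term is divided by its own average, a loss that is a hundred times larger than another no longer dominates the sum, which is the "same scale" intuition the abstract describes.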

Author(s):  
Jingyuan Wang ◽  
Kai Feng ◽  
Junjie Wu

The deep network model, with the majority built on neural networks, has proved to be a powerful framework for representing complex data in high-performance machine learning. In recent years, more and more studies have turned to non-neural-network approaches to build diverse deep structures, and the Deep Stacking Network (DSN) model is one such approach, using stacked easy-to-learn blocks to build a deep network whose parameter training can be parallelized. In this paper, we propose a novel SVM-based Deep Stacking Network (SVM-DSN), which uses the DSN architecture to organize linear SVM classifiers for deep learning. A BP-like layer tuning scheme is also proposed to ensure holistic and local optimization of the stacked SVMs simultaneously. Some useful mathematical properties of SVMs, such as convex optimization, are introduced into the DSN framework by our model. From a global view, SVM-DSN can iteratively extract data representations layer by layer as a deep neural network does, but with parallelizability; from a local view, each stacked SVM can converge to its optimal solution and obtain the support vectors, which, compared with neural networks, could lead to interesting improvements in anti-saturation and interpretability. Experimental results on both image and text data sets demonstrate the excellent performance of SVM-DSN compared with some competitive benchmark models.


2020 ◽  
Vol 20 (11) ◽  
pp. 6603-6608 ◽  
Author(s):  
Sung-Tae Lee ◽  
Suhwan Lim ◽  
Jong-Ho Bae ◽  
Dongseok Kwon ◽  
Hyeong-Su Kim ◽  
...  

Deep learning achieves state-of-the-art results in various machine learning tasks, but for applications that require real-time inference, the high computational cost of deep neural networks becomes an efficiency bottleneck. To overcome this cost, spiking neural networks (SNNs) have been proposed. Herein, we propose a hardware implementation of an SNN with gated Schottky diodes as synaptic devices. In addition, we apply L1 regularization for connection pruning of the deep spiking neural networks using gated Schottky diodes as synaptic devices. Applying L1 regularization eliminates the need for a re-training procedure because it prunes the weights based on the cost function. The compressed hardware-based SNN is energy efficient while achieving a classification accuracy of 97.85%, which is comparable to the 98.13% of the software deep neural network (DNN).
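How an L1 penalty can prune connections during training, rather than in a separate prune-then-retrain pass, can be sketched as follows. This uses a standard proximal (soft-thresholding) update as a stand-in; the abstract does not say which optimizer the authors used, and the function names, learning rate, and penalty strength below are hypothetical.

```python
def soft_threshold(w, threshold):
    """Shrink a weight toward zero; weights within the threshold become exactly zero."""
    if w > threshold:
        return w - threshold
    if w < -threshold:
        return w + threshold
    return 0.0


def l1_proximal_step(weights, grads, lr, l1_strength):
    """One gradient step on the data loss followed by the L1 shrinkage step."""
    stepped = [w - lr * g for w, g in zip(weights, grads)]
    return [soft_threshold(w, lr * l1_strength) for w in stepped]


def sparsity(weights):
    """Fraction of connections that have been pruned (set to zero)."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Illustrative values only: small weights are removed, large ones shrink slightly.
weights = [0.5, -0.02, 0.01, -0.8]
grads = [0.0, 0.0, 0.0, 0.0]
pruned = l1_proximal_step(weights, grads, lr=0.1, l1_strength=0.5)
print(pruned, sparsity(pruned))
```

Since the shrinkage is part of every training step, the network that comes out of training is already sparse, which is one way to read the abstract's claim that no re-training procedure is needed.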


Geophysics ◽  
2021 ◽  
pp. 1-77
Author(s):  
Hanchen Wang ◽  
Tariq Alkhalifah

The large size of time-lapse data often requires significant event detection and source location efforts, especially in areas such as shale gas exploration regions, where a large number of micro-seismic events are often recorded. In many cases, the real-time monitoring and locating of these events are essential to production decisions. Conventional methods face considerable drawbacks. For example, traveltime-based methods require traveltime picking of often noisy data, while migration and waveform inversion methods require expensive wavefield solutions and event detection. Both tasks require some human intervention, and this becomes a serious problem when too many sources need to be located, which is common in micro-seismic monitoring. Machine learning has recently been used to identify micro-seismic events or to locate their sources once they are identified and picked. We propose a novel artificial neural network framework that directly maps seismic data, without any event picking or detection, to their potential source locations. We train two convolutional neural networks on labeled synthetic acoustic data containing simulated micro-seismic events to fulfill such requirements. One convolutional neural network, which has a global average pooling layer to reduce the computational cost while maintaining high performance, aims to classify the number of events in the data. The other network predicts the source locations and other source features, such as the source peak frequencies and amplitudes. To reduce the size of the input data to the network, we correlate the recorded traces with a central reference trace to allow the network to focus on the curvature of the input data near the zero-lag region. We train the networks to handle single-event, multi-event, and no-event segments extracted from the data. Tests on a simple vertically varying model and a more realistic Otway field model demonstrate the approach's versatility and potential.
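The input-reduction step, correlating each recorded trace with a central reference trace and keeping only lags near zero, can be sketched as below. This is an illustrative sketch, not the authors' preprocessing code: the function names, the `max_lag` window, and the toy spike traces are assumptions.

```python
def cross_correlate(trace, reference, max_lag):
    """Correlation of trace with reference for lags -max_lag..max_lag (inclusive)."""
    n = len(trace)
    values = []
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(len(reference)):
            s = t + lag
            if 0 <= s < n:
                total += trace[s] * reference[t]
        values.append(total)
    return values


def correlate_gather(traces, max_lag):
    """Correlate every trace with the central trace of the gather."""
    reference = traces[len(traces) // 2]
    return [cross_correlate(trace, reference, max_lag) for trace in traces]


# Toy gather: one spike per trace, arriving earlier or later than on the central trace.
gather = [
    [0, 1, 0, 0, 0, 0, 0],  # arrives one sample early
    [0, 0, 1, 0, 0, 0, 0],  # central reference trace
    [0, 0, 0, 0, 1, 0, 0],  # arrives two samples late
]
for row in correlate_gather(gather, max_lag=3):
    print(row)
```

Each output row has only `2 * max_lag + 1` samples regardless of the recording length, and the position of its peak encodes the arrival-time difference relative to the reference, so the moveout curvature is preserved in a much smaller input.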


Convolutional neural networks (CNNs) are widely used for hyperspectral image (HSI) classification because they deliver high performance. For stronger prediction, this paper presents a new structure that benefits from both the MS-MA BT (multi-scale multi-angle breaking ties) algorithm and the CNN algorithm. The combined MS-MA BT and CNN architecture takes the raw image as input and extracts multiple characteristics from it. The algorithm generates relevant feature maps, which are fed into a concatenating layer to form a combined feature map. The combined feature map is then passed to the subsequent stages to estimate the final result for each hyperspectral pixel. The suggested technique not only benefits from the improved feature extraction of CNNs and MS-MA BT, but also allows full combined use of visual and temporal data. Its performance is evaluated on SAR data sets, and the results indicate that the MS-MA BT-based multi-functional training algorithm considerably increases identification precision. Convolutional neural networks have recently proved highly effective on multiple visual tasks, including the classification of common two-dimensional pictures. In this paper, the MS-MA BT multi-scale multi-angle CNN algorithm is used to identify hyperspectral images explicitly in the visual domain. Experimental outcomes on several SAR image data sets show that the suggested technique can attain greater classification efficiency than some traditional techniques, such as support vector machines and conventional deep learning techniques.


2019 ◽  
Vol 4 (4) ◽  

Detection of skin cancer involves several examination steps: a visual diagnosis, followed by dermoscopic analysis, a biopsy, and histopathological examination. The classification of skin lesions in the first step is critical and challenging because the classes differ only by minute variations in lesion appearance. Deep convolutional neural networks (CNNs) have great potential in multicategory image-based classification because they consider coarse-to-fine image features. This study aims to demonstrate how to classify skin lesions, in particular melanoma, using a CNN trained on data sets with disease labels. We developed and trained our own CNN model using a subset of the images from the International Skin Imaging Collaboration (ISIC) Dermoscopic Archive. To test the performance of the proposed model, we used a different subset of images from the same archive as the test set. Our model is trained to classify images into two categories, malignant melanoma and nevus, and is shown to achieve excellent classification results with high test accuracy (91.16%) and high performance as measured by various metrics. Our study demonstrates the potential of using deep neural networks to assist early detection of melanoma and thereby improve the patient survival rate for this aggressive skin cancer.


Author(s):  
Yang Yi ◽  
Feng Ni ◽  
Yuexin Ma ◽  
Xinge Zhu ◽  
Yuankai Qi ◽  
...  

State-of-the-art hand gesture recognition methods have investigated spatiotemporal features based on 3D convolutional neural networks (3DCNNs) or convolutional long short-term memory (ConvLSTM). However, they often suffer from inefficiency due to the high computational complexity of their network structures. In this paper, we focus instead on 1D convolutional neural networks and propose a simple and efficient architectural unit, the Multi-Kernel Temporal Block (MKTB), which models multi-scale temporal responses by explicitly applying different temporal kernels. Then, we present a Global Refinement Block (GRB), an attention module for shaping the global temporal features based on cross-channel similarity. By incorporating the MKTB and GRB, our architecture can effectively explore spatiotemporal features at a tolerable computational cost. Extensive experiments conducted on public datasets demonstrate that our proposed model achieves state-of-the-art results with higher efficiency. Moreover, the proposed MKTB and GRB are plug-and-play modules, and experiments on other tasks, such as video understanding and video-based person re-identification, also show good efficiency and generalization capability.


2020 ◽  
Vol 10 (12) ◽  
pp. 4125
Author(s):  
David Zabala-Blanco ◽  
Marco Mora ◽  
Ricardo J. Barrientos ◽  
Ruber Hernández-García ◽  
José Naranjo-Torres

Fingerprint classification is a stage of biometric identification systems that aims to group fingerprints and reduce search times and computational complexity in fingerprint databases. The most recent works on this problem propose methods based on deep convolutional neural networks (CNNs) that take fingerprint images as inputs. These networks have achieved high classification performance, but at a high computational cost in the network training process, even when using high-performance computing techniques. In this paper, we introduce a novel fingerprint classification approach based on feature extractor models and on basic and modified extreme learning machines (ELMs); this is the first time that this approach has been adopted. The weighted ELMs naturally address the problem of unbalanced data, such as fingerprint databases. Some of the best and most recent extractors (Capelli02, Hong08, and Liu10), which are based on the most relevant visual characteristics of the fingerprint image, are considered. Considering the unbalanced classes in fingerprint identification schemes, we optimize the ELMs (standard, original weighted, and decay weighted) in terms of the geometric mean by estimating their hyper-parameters (regularization parameter, number of hidden neurons, and decay parameter). At the same time, the classic accuracy and penetration-rate metrics are computed for comparison with the superior CNN-based methods reported in the literature. The experimental results show that the weighted ELM with the golden ratio in the weighted matrix (W-ELM2) overall outperforms the rest of the ELMs. The combination of the Hong08 extractor and W-ELM2 competes with CNNs in terms of fingerprint classification efficacy, while the ELM-based methods have demonstrated extremely fast training speeds in every setting tested.
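Why the geometric mean is a more informative target than plain accuracy on unbalanced classes can be seen in a short sketch. It assumes the common definition of the geometric mean as the n-th root of the product of per-class recalls; the abstract does not spell out the formula, and the class labels and predictions below are made-up illustrative data.

```python
import math


def per_class_recall(y_true, y_pred):
    """Recall of each class: correct predictions divided by samples of that class."""
    totals, hits = {}, {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == pred:
            hits[truth] = hits.get(truth, 0) + 1
    return {label: hits.get(label, 0) / totals[label] for label in totals}


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Unbalanced toy data: 8 samples of class "A", 2 of class "B".
y_true = ["A"] * 8 + ["B"] * 2
always_a = ["A"] * 10  # a classifier that ignores the minority class
print(accuracy(y_true, always_a), geometric_mean_score(y_true, always_a))
```

A classifier that never predicts the minority class still reaches 80% accuracy here, but its geometric mean is zero, so tuning hyper-parameters against the geometric mean penalizes neglect of rare classes.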


2020 ◽  
Vol 34 (04) ◽  
pp. 4569-4576
Author(s):  
Sangho Lee ◽  
Simyung Chang ◽  
Nojun Kwak

Convolutional Neural Networks are widely used to process spatial scenes, but their computational cost is fixed and depends on the structure of the network used. There are methods that reduce the cost by compressing networks or by varying their computational path dynamically according to the input image. However, since a user cannot control the size of the learned model, it is difficult to respond dynamically if the number of service requests suddenly increases. We propose User-Resizable Residual Networks (URNet), which allow users to adjust the computational cost of the network as needed during evaluation. URNet includes a Conditional Gating Module (CGM) that determines the use of each residual block according to the input image and the desired cost. The CGM is trained in a supervised manner using the newly proposed scale (cost) loss and its corresponding training methods. URNet can control the amount of computation and its inference path according to the user's demand without degrading the accuracy significantly. In experiments on ImageNet, URNet based on ResNet-101 maintains the accuracy of the baseline even when resized to approximately 80% of the original network, and shows only about 1% accuracy degradation when using about 65% of the computation.


2021 ◽  
Vol 13 (1) ◽  
pp. 01-08
Author(s):  
Allana dos Santos Campos ◽  
César Alberto Bravo Pariente

Initially, neural networks were developed with the objective of creating a computational system that models the functioning of the human brain; however, they came to be used to solve specific tasks. Adaline and Perceptron are two neural networks that calculate a function of the input using a set of adaptive weights and a bias. Despite their similarities, it is known that the Adaline neural network converges to a result more quickly than the Perceptron neural network. This work was designed as a didactic exercise to present how such conclusions are obtained, using the IRIS database as data for classification and training. Throughout the work, the programming language Processing was used to develop the neural networks, and Python was used for the visual presentation of results. The results show the high performance of the Adaline neural network relative to the Perceptron and indicate which database classes can be linearly separated and which cannot. The metric used to compare the performance of the neural networks is the percentage of correct answers in the data classifications. Among all the classifications, Adaline showed the best performance when classifying the Iris-setosa and Iris-virginica classes by petal length and width.
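The difference between the two networks comes down to which error drives the weight update, which the following sketch makes concrete. It uses the textbook Perceptron rule (error from the thresholded output) and the textbook Adaline delta rule (error from the linear output). It is written in Python rather than Processing, and the four two-feature samples, the learning rate, and the epoch count are made-up values, not the IRIS experiments from the paper.

```python
def net_input(weights, bias, x):
    """Weighted sum of the inputs plus the bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    """Threshold the net input to a class label in {-1, +1}."""
    return 1 if net_input(weights, bias, x) >= 0.0 else -1


def train(samples, labels, rule, lr=0.1, epochs=100):
    """Train with the 'perceptron' or 'adaline' update rule, one sample at a time."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if rule == "perceptron":
                # Error uses the thresholded output: it is 0 once the sample is correct.
                error = y - predict(weights, bias, x)
            else:
                # Adaline: error uses the raw linear output (delta rule).
                error = y - net_input(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical, linearly separable toy data with two features per sample.
samples = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
labels = [-1, -1, 1, 1]
for rule in ("perceptron", "adaline"):
    w, b = train(samples, labels, rule)
    print(rule, [predict(w, b, x) for x in samples])
```

The Perceptron stops changing its weights as soon as every sample falls on the correct side of the boundary, whereas Adaline keeps reducing the squared error of the linear output, so the two rules generally end at different boundaries even on the same data.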


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Qiang Lan ◽  
Zelong Wang ◽  
Mei Wen ◽  
Chunyuan Zhang ◽  
Yijie Wang

Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which involves 3D input and requires far larger amounts of memory and much more computation. FFT-based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus speed up the computation without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks, apply it to a popular 3D convolutional neural network used to classify videos, and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version.
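The operation savings of Winograd minimal filtering can be seen in its smallest one-dimensional case, F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. The sketch below shows only this 1D building block and checks it against direct computation. It is not the authors' 3D GPU implementation, and the function names are hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter computed directly (6 multiplications)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return y0, y1


def winograd_f23(d, g):
    """The same two outputs via Winograd F(2,3) (4 multiplications on the data)."""
    # Filter-side terms; in a CNN these can be precomputed once per filter.
    g_sum = (g[0] + g[1] + g[2]) / 2.0
    g_diff = (g[0] - g[1] + g[2]) / 2.0
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_diff
    m4 = (d[1] - d[3]) * g[2]
    return m1 + m2 + m3, m2 - m3 - m4


d = [1.0, 2.0, 3.0, 4.0]   # four consecutive input samples
g = [1.0, 0.0, -1.0]       # 3-tap filter
print(direct_f23(d, g), winograd_f23(d, g))
```

Higher-dimensional variants are typically built by nesting this 1D transform along each axis, which is where the reduction in multiplications compounds for 2D and 3D convolution layers.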

