Soft Threshold Ternary Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/318 ◽

2020 ◽

Author(s):

Weixiang Xu ◽

Xiangyu He ◽

Tianli Zhao ◽

Qinghao Hu ◽

Peisong Wang ◽

...

Keyword(s):

Neural Networks ◽

Mobile Devices ◽

State Of The Art ◽

Performance Gap ◽

Training Time ◽

Current State ◽

The Arts ◽

Soft Threshold ◽

And Storage ◽

Selection Of

Large neural networks are difficult to deploy on mobile devices because of their intensive computation and storage requirements. To alleviate this, we study ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. In previous ternarized neural networks, a hard threshold Δ is introduced to determine quantization intervals. Although the selection of Δ greatly affects the training results, previous works estimate Δ via an approximation or treat it as a hyper-parameter, which is suboptimal. In this paper, we present Soft Threshold Ternary Networks (STTN), which enable the model to automatically determine quantization intervals instead of depending on a hard threshold. Concretely, we replace the original ternary kernel with the addition of two binary kernels at training time, where ternary values are determined by the combination of two corresponding binary values. At inference time, we add up the two binary kernels to obtain a single ternary kernel. Our method dramatically outperforms the current state of the art, narrowing the performance gap between full-precision networks and extremely low-bit networks. Experiments on ImageNet with AlexNet (Top-1 55.6%) and ResNet-18 (Top-1 66.2%) achieve a new state of the art.
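The kernel decomposition described in this abstract can be illustrated with a minimal sketch. It assumes unit-magnitude binary kernels, a sign convention that maps zero to +1, and a halving step to bring the sum back into {-1, 0, +1}; the function names and the latent weight values are hypothetical and not taken from the paper.

```python
def sign(x):
    # Assumed convention: zero maps to +1.
    return 1 if x >= 0 else -1

def binarize(kernel):
    # Binary kernel with entries in {-1, +1}.
    return [sign(w) for w in kernel]

def merge_to_ternary(b1, b2):
    # The sum of two {-1, +1} values lies in {-2, 0, +2};
    # halving (an assumed normalisation) gives {-1, 0, +1}.
    return [(a + b) // 2 for a, b in zip(b1, b2)]

# Two hypothetical latent kernels kept separately at training time.
latent_a = [0.7, -0.2, 0.4, -0.9]
latent_b = [0.1, 0.3, -0.5, -0.6]

# At inference time the two binary kernels collapse into one ternary kernel.
ternary = merge_to_ternary(binarize(latent_a), binarize(latent_b))
print(ternary)
```

A position becomes zero exactly when the two binary kernels disagree, so no explicit threshold Δ appears anywhere in the sketch.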

Sparsity-Inducing Binarized Neural Networks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6900 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12192-12199 ◽

Cited By ~ 1

Author(s):

Peisong Wang ◽

Xiangyu He ◽

Gang Li ◽

Tianli Zhao ◽

Jian Cheng

Keyword(s):

Neural Network ◽

Neural Networks ◽

High Efficiency ◽

State Of The Art ◽

Feature Representation ◽

Binary Representation ◽

Performance Gap ◽

Sign Function ◽

Current State ◽

The Arts

Binarization of feature representation is critical for Binarized Neural Networks (BNNs). Currently, the sign function is the most commonly used method for feature binarization. Although it works well on small datasets, its performance on ImageNet remains unsatisfactory. Previous methods mainly focus on minimizing quantization error, improving the training strategies, and decomposing each convolution layer into several binary convolution modules. However, whether sign is the only option for binarization has been largely overlooked. In this work, we propose the Sparsity-inducing Binarized Neural Network (Si-BNN), which quantizes the activations to be either 0 or +1 and thereby introduces sparsity into the binary representation. We further introduce trainable thresholds into the backward function of binarization to guide the gradient propagation. Our method dramatically outperforms the current state of the art, narrowing the performance gap between full-precision networks and BNNs on mainstream architectures and achieving a new state of the art on binarized AlexNet (Top-1 50.5%), ResNet-18 (Top-1 59.7%), and VGG-Net (Top-1 63.2%). At inference time, Si-BNN still enjoys the high efficiency of exclusive-not-or (xnor) operations.
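As a rough illustration of 0/+1 activation binarization, the sketch below pairs a step-function forward pass with a windowed gradient in the backward pass. The abstract does not give the exact functions, so the forward cut-off `theta`, the window half-width `delta` (standing in for a trainable threshold), and both function names are assumptions made for illustration only.

```python
def binarize_01(x, theta=0.0):
    # Forward pass: map an activation to 0 or +1 (theta is an assumed cut-off).
    return 1 if x > theta else 0

def backward_window(x, grad_out, delta=1.0):
    # Backward pass: let the gradient through only when |x| <= delta.
    # delta is a hypothetical stand-in for a trainable threshold.
    return grad_out if abs(x) <= delta else 0.0

activations = [-1.5, -0.2, 0.3, 2.0]

bits = [binarize_01(a) for a in activations]
sparsity = bits.count(0) / len(bits)          # fraction of zero activations
grads = [backward_window(a, 1.0) for a in activations]

print(bits, sparsity, grads)
```

With a sign function every activation would be -1 or +1, so none of them would be zero; the 0/+1 encoding is what makes the representation sparse.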

AI-driven deep CNN approach for multi-label pathology classification using chest X-Rays

PeerJ Computer Science ◽

10.7717/peerj-cs.495 ◽

2021 ◽

Vol 7 ◽

pp. e495

Author(s):

Saleh Albahli ◽

Hafiz Tayyab Rauf ◽

Abdulelah Algosaibi ◽

Valentina Emilia Balas

Keyword(s):

Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Synthetic Data ◽

X Rays ◽

Deep Convolutional Neural Networks ◽

Current State ◽

Pathology Classification ◽

Wide Range ◽

Multi Class Classification

Artificial intelligence (AI) has played a significant role in image analysis and feature extraction, applied to detect and diagnose a wide range of chest-related diseases. Although several researchers have used current state-of-the-art approaches and have produced impressive chest-related clinical outcomes, specific techniques may not contribute many advantages if one type of disease is detected without the rest being identified. Attempts to identify multiple chest-related diseases have been ineffective because the available data were insufficient and unbalanced. This research contributes to the healthcare industry and the research community by proposing synthetic data augmentation in three deep Convolutional Neural Network (CNN) architectures for the detection of 14 chest-related diseases. The employed models are DenseNet121, InceptionResNetV2, and ResNet152V2; after training and validation, an average ROC-AUC score of 0.80 was obtained, which is competitive with previous models that were trained for multi-class classification to detect anomalies in x-ray images. This research illustrates how the proposed model applies state-of-the-art deep neural networks to classify 14 chest-related diseases with better accuracy.

Gaussian Transformer: A Lightweight Approach for Natural Language Inference

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016489 ◽

2019 ◽

Vol 33 ◽

pp. 6489-6496 ◽

Cited By ~ 2

Author(s):

Maosheng Guo ◽

Yu Zhang ◽

Ting Liu

Keyword(s):

Neural Networks ◽

Natural Language ◽

State Of The Art ◽

Research Area ◽

High Order Interaction ◽

Training Time ◽

Attention Networks ◽

Local Dependency ◽

Active Research ◽

Active Research Area

Natural Language Inference (NLI) is an active research area, where numerous approaches based on recurrent neural networks (RNNs), convolutional neural networks (CNNs), and self-attention networks (SANs) have been proposed. Although they obtain impressive performance, previous recurrent approaches are hard to train in parallel, convolutional models tend to require more parameters, and self-attention networks are not good at capturing the local dependency of texts. To address this problem, we introduce a Gaussian prior into the self-attention mechanism to better model the local structure of sentences. Then we propose an efficient RNN/CNN-free architecture named Gaussian Transformer for NLI, which consists of encoding blocks modeling both local and global dependency, high-order interaction blocks collecting the evidence of multi-step inference, and a lightweight comparison block that saves many parameters. Experiments show that our model achieves new state-of-the-art performance on both the SNLI and MultiNLI benchmarks with significantly fewer parameters and considerably less training time. In addition, evaluation on the Hard NLI datasets demonstrates that our approach is less affected by undesirable annotation artifacts.
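One common way to impose a Gaussian locality prior on self-attention is to subtract a squared-distance penalty from the attention scores before the softmax, which is equivalent to multiplying the unnormalised weights by a Gaussian in token distance. The sketch below shows that idea only; the exact form used in the paper may differ, and `sigma` and the function names are assumptions.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_prior_attention(scores, sigma=1.0):
    # scores[i][j] is the raw attention score of query i for key j.
    # Subtracting (i - j)^2 / (2 * sigma^2) favours nearby positions.
    n = len(scores)
    weights = []
    for i in range(n):
        biased = [scores[i][j] - (i - j) ** 2 / (2 * sigma ** 2)
                  for j in range(n)]
        weights.append(softmax(biased))
    return weights

# With uniform raw scores, only the prior shapes the attention weights.
scores = [[0.0] * 3 for _ in range(3)]
weights = gaussian_prior_attention(scores, sigma=1.0)
for row in weights:
    print([round(w, 3) for w in row])
```

A larger `sigma` flattens the prior and recovers ordinary self-attention in the limit, while a smaller one concentrates each token's attention on its immediate neighbours.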

Complexity of Deep Convolutional Neural Networks in Mobile Computing

Complexity ◽

10.1155/2020/3853780 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Saad Naeem ◽

Noreen Jamil ◽

Habib Ullah Khan ◽

Shah Nazir

Keyword(s):

Neural Networks ◽

Mobile Devices ◽

Training Data ◽

Deep Convolutional Neural Networks ◽

Compression Technique ◽

Training Time ◽

Network Pruning ◽

Highly Nonlinear ◽

Process Work ◽

Time Required

Neural networks employ massive interconnection of simple computing units called neurons to solve problems that are highly nonlinear and cannot be hard-coded into a program. These neural networks are computation-intensive, and training them requires a lot of training data. Each training example requires heavy computation. We look at different ways in which we can reduce the heavy computation requirement and possibly make them work on mobile devices. In this paper, we survey various techniques that can be matched and combined in order to improve the training time of neural networks. Additionally, we review some extra recommendations to make the process work for mobile devices as well. Finally, we survey the deep compression technique, which tries to solve the problem by network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required for training the network by first pruning the irrelevant connections (the pruning stage), and then quantizing the network weights by choosing centroids for each layer. Finally, at the third stage, it employs the Huffman encoding algorithm to deal with the storage issue of the remaining weights.
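The three stages named in this abstract (pruning, quantization to centroids, Huffman encoding) can be sketched on a toy weight vector. This is an illustrative sketch, not the surveyed implementation: the pruning threshold, the fixed centroid list (real pipelines typically learn centroids per layer), and all function names are assumptions.

```python
import heapq
from collections import Counter

def prune(weights, threshold):
    # Stage 1: zero out connections with small magnitude.
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize(weights, centroids):
    # Stage 2: replace each weight by the index of its nearest centroid.
    return [min(range(len(centroids)), key=lambda k: abs(w - centroids[k]))
            for w in weights]

def huffman_code_lengths(symbols):
    # Stage 3: Huffman code length (in bits) for each distinct symbol.
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    heap = [(c, i, [s]) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in counts}
    tie = len(heap)  # unique tie-breaker so lists are never compared
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

weights = [0.02, -0.51, 0.48, 0.01, -0.03, 0.95, 0.0, 0.52]
centroids = [-0.5, 0.0, 0.5, 1.0]  # hypothetical shared codebook

pruned = prune(weights, threshold=0.1)
indices = quantize(pruned, centroids)
lengths = huffman_code_lengths(indices)
total_bits = sum(lengths[i] for i in indices)

print(indices, lengths, total_bits)
```

Because pruning makes the zero centroid the most frequent symbol, Huffman coding gives it the shortest code, so this toy vector needs fewer bits than a fixed 2-bit index per weight.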

Detecting Emotions in English and Arabic Tweets

Information ◽

10.3390/info10030098 ◽

2019 ◽

Vol 10 (3) ◽

pp. 98 ◽

Cited By ~ 4

Author(s):

Tariq Ahmad ◽

Allan Ramsay ◽

Hanady Ahmed

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Algorithms ◽

General Purpose ◽

Machine Learning Algorithms ◽

Current State ◽

Optimal Thresholds ◽

Alternative Approach

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general-purpose machine learning algorithms.

PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979–2018

Scientific Data ◽

10.1038/s41597-020-00631-x ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Hylke E. Beck ◽

Seth Westra ◽

Jackson Tan ◽

Florian Pappenberger ◽

George J. Huffman ◽

...

Keyword(s):

Neural Networks ◽

Probability Distribution ◽

State Of The Art ◽

Intertropical Convergence Zone ◽

Coefficient Of Determination ◽

Current State ◽

Peak Intensity ◽

Global Land ◽

The Neural Networks ◽

Better Than

We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979–2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained with daily P observations from 93,138 gauges and hourly P observations (resampled to 3-hourly) from 11,881 gauges worldwide. Mean validation coefficient of determination (R²) values ranged from 0.76 to 0.80 for the daily P occurrence indices, and from 0.44 to 0.84 for the daily peak P intensity indices. The neural networks performed significantly better than current state-of-the-art reanalysis (ERA5) and satellite (IMERG) products for all P indices. Using a 0.1 mm 3 h⁻¹ threshold, P was estimated to occur 12.2%, 7.4%, and 14.3% of the time, on average, over the global, land, and ocean domains, respectively. The highest P intensities were found over parts of Central America, India, and Southeast Asia, along the western equatorial coast of Africa, and in the intertropical convergence zone. The PPDIST dataset is available via www.gloh2o.org/ppdist.

The secret is at the crossways: Hodotopic organization and nonlinear dynamics of brain neural networks

Behavioral and Brain Sciences ◽

10.1017/s0140525x13001386 ◽

2013 ◽

Vol 36 (6) ◽

pp. 623-624 ◽

Cited By ~ 2

Author(s):

Tobias A. Mattei

Keyword(s):

Nonlinear Dynamics ◽

Neural Networks ◽

Cognitive Neuroscience ◽

State Of The Art ◽

Current State

By integrating the classic psychological principles of the ancient art of memory (AAOM) with the most recent paradigms in cognitive neuroscience (i.e., the concepts of hodotopic organization and nonlinear dynamics of brain neural networks), Llewellyn provides an up-to-date model of the complex psychological relationships between memory, imagination, and dreams in accordance with current state-of-the-art principles in neuroscience.

OPWUM: Opportunistic MAC Protocol Leveraging Wake-Up Receivers in WSNs

Journal of Sensors ◽

10.1155/2016/6263719 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 20

Author(s):

Fayçal Ait Aoudia ◽

Matthieu Gautier ◽

Olivier Berder

Keyword(s):

Power Consumption ◽

Analytical Study ◽

State Of The Art ◽

Mac Protocol ◽

Promising Technique ◽

Improve Energy Efficiency ◽

Current State ◽

Unreliable Links ◽

Careful Design ◽

Selection Of

Opportunistic forwarding has emerged as a promising technique to address the problem of unreliable links typical in wireless sensor networks and improve energy efficiency by exploiting multiuser diversity. Timer-based solutions, such as timer-based contention, form promising schemes to allow opportunistic next-hop relay selection. However, they can incur significant idle listening and thus reduce the lifetime of the network. To tackle this problem, we propose to exploit emerging wake-up receiver technologies that have the potential to considerably reduce the power consumption of wireless communications. A careful design of MAC protocols is required to efficiently employ these new devices. In this work, we propose Opportunistic Wake-Up MAC (OPWUM), a novel multihop MAC protocol using timer-based contention. It enables a sender to opportunistically select the best receiver among its neighboring nodes according to a given metric (e.g., the remaining energy), without requiring any knowledge about them. Moreover, OPWUM exploits emerging wake-up receivers to drastically reduce nodes' power consumption. Through analytical study and exhaustive network simulations, we show the effectiveness of OPWUM compared to the current state-of-the-art protocols using timer-based contention.
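Timer-based contention can be sketched as follows: each candidate receiver derives a backoff timer from its own metric, and the node whose timer expires first becomes the relay, so the sender needs no prior knowledge of its neighbours. The linear mapping from remaining energy to backoff time, the parameter values, and the node names are assumptions for illustration and are not taken from the OPWUM paper.

```python
def backoff(metric, metric_max, t_max):
    # Assumed linear mapping: a better metric gives a shorter timer.
    return t_max * (1.0 - metric / metric_max)

def elect_relay(candidates, metric_max=100.0, t_max=10.0):
    # candidates maps node id -> metric (here: remaining energy in percent).
    # Each node computes its timer locally; the earliest expiry wins.
    timers = {node: backoff(m, metric_max, t_max)
              for node, m in candidates.items()}
    winner = min(timers, key=timers.get)
    return winner, timers

# Hypothetical neighbourhood of three candidate receivers.
candidates = {"A": 40.0, "B": 85.0, "C": 60.0}
winner, timers = elect_relay(candidates)
print(winner, timers)
```

In a real protocol the other candidates would cancel their timers once they overhear the winner's reply; that step, and the wake-up receiver signalling, are omitted here.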

Training a neural network to learn other dimensionality reduction removes data size restrictions in bioinformatics and provides a new route to exploring data representations

10.1101/2020.09.03.269555 ◽

2020 ◽

Cited By ~ 1

Author(s):

Alex Dexter ◽

Spencer A. Thomas ◽

Rory T. Steven ◽

Kenneth N. Robinson ◽

Adam J. Taylor ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Dimensionality Reduction ◽

Computational Analysis ◽

New Technologies ◽

State Of The Art ◽

Current State ◽

Data Representations ◽

Non Linear ◽

Linear Dimensionality Reduction

High-dimensionality omics and hyperspectral imaging datasets present difficult challenges for feature extraction and data mining due to huge numbers of features that cannot be simultaneously examined. The sample numbers and variables of these methods are constantly growing as new technologies are developed, and computational analysis needs to evolve to keep up with growing demand. Current state-of-the-art algorithms can handle some routine datasets but struggle when datasets grow above a certain size. We present an approach that trains deep neural networks on non-linear dimensionality reduction, in particular t-distributed stochastic neighbour embedding (t-SNE), to overcome prior limitations of these methods.

One Sentence Summary: Analysis of prohibitively large datasets by combining deep learning via neural networks with non-linear dimensionality reduction.

How to Model Tendon-Driven Continuum Robots and Benchmark Modelling Performance

Frontiers in Robotics and AI ◽

10.3389/frobt.2020.630245 ◽

2021 ◽

Vol 7 ◽

Author(s):

Priyanka Rao ◽

Quentin Peyron ◽

Sven Lilge ◽

Jessica Burgner-Kahrs

Keyword(s):

Case Studies ◽

State Of The Art ◽

Computation Time ◽

Comprehensive Overview ◽

Continuum Robots ◽

Current State ◽

Selection Of ◽

Modelling Approaches

Tendon actuation is one of the most prominent actuation principles for continuum robots. To date, a wide variety of modelling approaches has been derived to describe the deformations of tendon-driven continuum robots. Motivated by the need for a comprehensive overview of existing methodologies, this work summarizes and outlines state-of-the-art modelling approaches. In particular, the most relevant models are classified based on backbone representations and kinematic as well as static assumptions. Numerical case studies are conducted to compare the performance of representative modelling approaches from the current state-of-the-art, considering varying robot parameters and scenarios. The approaches show different performances in terms of accuracy and computation time. Guidelines for the selection of the most suitable approach for given designs of tendon-driven continuum robots and applications are deduced from these results.