Hardware-Aware Neural Architecture Search: Survey and Taxonomy

Author(s):  
Hadjer Benmeziane ◽  
Kaoutar El Maghraoui ◽  
Hamza Ouarnoughi ◽  
Smail Niar ◽  
Martin Wistuba ◽  
...  

There is no doubt that making AI mainstream by bringing powerful yet power-hungry deep neural networks (DNNs) to resource-constrained devices requires an efficient co-design of algorithms, hardware, and software. The increased popularity of DNN applications deployed on a wide variety of platforms, from tiny microcontrollers to data centers, has raised multiple questions and challenges related to the constraints introduced by the hardware. In this survey on hardware-aware neural architecture search (HW-NAS), we present some of the answers proposed in the literature for the following questions: "Is it possible to build an efficient DL model that meets the latency and energy constraints of tiny edge devices?" and "How can we reduce the trade-off between the accuracy of a DL model and its ability to be deployed on a variety of platforms?" The survey provides a new taxonomy of HW-NAS and assesses hardware cost estimation strategies. We also highlight the challenges and limitations of existing approaches as well as potential future directions. We hope that this survey will help fuel research towards efficient deep learning.
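To make the HW-NAS setting concrete, the following is a minimal sketch of a hardware-aware search objective: candidate architectures are scored by predicted accuracy subject to a latency budget. The search space, the accuracy proxy, the analytical latency model, and the random-search loop are all illustrative assumptions, not methods taken from the surveyed papers; real HW-NAS systems typically use learned predictors, lookup tables, or on-device measurements for the hardware cost.

# Minimal sketch of a hardware-aware NAS objective (illustrative only).
# The accuracy and latency estimators are hypothetical stand-ins for a
# trained predictor or on-device measurement.
import random

SEARCH_SPACE = {
    "depth": [8, 12, 16],      # number of layers
    "width": [16, 32, 64],     # channels per layer
    "kernel": [3, 5],          # convolution kernel size
}

LATENCY_BUDGET_MS = 20.0       # assumed edge-device constraint


def estimate_accuracy(arch):
    # Hypothetical proxy: bigger models score higher, with saturation.
    size = arch["depth"] * arch["width"] * arch["kernel"]
    return 1.0 - 1.0 / (1.0 + size / 500.0)


def estimate_latency_ms(arch):
    # Hypothetical analytical cost model.
    return 0.01 * arch["depth"] * arch["width"] * arch["kernel"]


def hw_aware_score(arch):
    if estimate_latency_ms(arch) > LATENCY_BUDGET_MS:
        return float("-inf")   # hard constraint: reject over-budget models
    return estimate_accuracy(arch)


def random_search(n_samples=200, seed=0):
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_samples):
        arch = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = hw_aware_score(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score


if __name__ == "__main__":
    arch, score = random_search()
    print("best architecture:", arch, "score:", round(score, 3))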

Security in resource-constrained devices has drawn great attention from researchers in recent years. To enable secure transmission of critical information on such devices, lightweight cryptography algorithms have come to the fore. KLEIN is a popular lightweight block cipher used to address these issues. In this paper, different architectures of the KLEIN block cipher are presented. One of the designs improves throughput efficiency at the expense of a larger area. To realize these designs, pipeline registers are placed at different positions in the datapath. The proposed design transforms the input data into protected output at 2414.13 Mbps on the xc5vlx50t-3ff1136 device. In addition, the second design completes one or more rounds in a single clock cycle, yielding energy-efficient and high-throughput implementations. This allows the trade-off between area and speed to be analyzed for high-speed applications. Moreover, the proposed designs show that increasing the area of the cipher implementation results in a higher rate of plaintext-to-ciphertext transformation. All results are simulated and verified for various device families with the Xilinx ISE design suite.
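The area/speed trade-off described above follows from simple throughput arithmetic: an iterated (round-serial) datapath needs one clock per round, while an unrolled, pipelined datapath accepts a new block every cycle once the pipeline is full. The sketch below illustrates this relationship; the block size and round count follow the KLEIN specification, but the clock frequency is a placeholder assumption, not a figure reported in the paper.

# Back-of-the-envelope throughput model for an iterated vs. pipelined
# block-cipher datapath. The clock frequency is an assumed placeholder.
BLOCK_BITS = 64          # KLEIN operates on 64-bit blocks
ROUNDS = 12              # KLEIN-64 uses 12 rounds


def throughput_mbps(f_clk_mhz, cycles_per_block):
    """Throughput in Mbit/s for a datapath that accepts a new block
    every `cycles_per_block` clock cycles."""
    return BLOCK_BITS * f_clk_mhz / cycles_per_block


# Iterated design: one round per clock, one block every ROUNDS cycles
# (small area, lower throughput).
print("iterated :", throughput_mbps(f_clk_mhz=150.0, cycles_per_block=ROUNDS))

# Fully unrolled, pipelined design: a new block enters every cycle once
# the pipeline is full (larger area, higher throughput).
print("pipelined:", throughput_mbps(f_clk_mhz=150.0, cycles_per_block=1))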


Informatica ◽  
2017 ◽  
Vol 28 (1) ◽  
pp. 193-214 ◽  
Author(s):  
Tung-Tso Tsai ◽  
Sen-Shan Huang ◽  
Yuh-Min Tseng

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4496
Author(s):  
Vlad Pandelea ◽  
Edoardo Ragusa ◽  
Tommaso Apicella ◽  
Paolo Gastaldo ◽  
Erik Cambria

Emotion recognition, among other natural language processing tasks, has greatly benefited from the use of large transformer models. Deploying these models on resource-constrained devices, however, is a major challenge due to their computational cost. In this paper, we show that combining large transformers, used as high-quality feature extractors, with simple hardware-friendly classifiers based on linear separators can achieve competitive performance while allowing real-time inference and fast training. Various solutions, including batch and online sequential learning, are analyzed. Additionally, our experiments show that latency and performance can be further improved via dimensionality reduction and pre-training, respectively. The resulting system is implemented on two types of edge devices, namely an edge accelerator and two smartphones.
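A minimal sketch of the "frozen transformer as feature extractor plus hardware-friendly linear classifier" recipe follows. The backbone name, the toy sentences, and the use of scikit-learn's LogisticRegression as the linear separator (in place of the paper's online sequential learner) are illustrative assumptions; PCA stands in for the dimensionality-reduction step mentioned above.

# Sketch: frozen transformer features + cheap linear classifier.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "distilbert-base-uncased"   # assumed backbone

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
backbone = AutoModel.from_pretrained(MODEL_NAME)
backbone.eval()                          # frozen: no fine-tuning on device


@torch.no_grad()
def extract_features(texts):
    """Mean-pooled last-layer hidden states as fixed-size sentence features."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = backbone(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()  # (B, H)


# Toy emotion-labelled sentences (placeholders for a real dataset).
train_texts = ["I am so happy today!", "This is terrible news.",
               "What a wonderful surprise.", "I feel awful about this."]
train_labels = [1, 0, 1, 0]   # 1 = positive emotion, 0 = negative

X_train = extract_features(train_texts)

# Optional dimensionality reduction to cut classifier latency on-device.
pca = PCA(n_components=2).fit(X_train)
X_small = pca.transform(X_train)

# Simple linear separator: cheap to train and to run on edge hardware.
clf = LogisticRegression(max_iter=1000).fit(X_small, train_labels)

test = extract_features(["I love this!"])
print("prediction:", clf.predict(pca.transform(test)))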


2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of the CSI data. The state machine learns temporal dependency information from past classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations and orientations, and multiple persons. It achieves 97% average accuracy when the testing devices and persons are not seen during training, and accuracies of 80% and 83% on two public datasets. The design requires very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
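The two-stage pipeline described above (a 2D CNN classifying a CSI window, followed by a 1D CNN "state machine" that smooths the recent history of class probabilities) can be sketched as below. All layer sizes, the number of activity classes, the history length, and the input shapes are illustrative assumptions, and the RL-based architecture-search agent is omitted.

# Sketch of the recognition CNN + state-machine CNN pipeline.
import torch
import torch.nn as nn

NUM_CLASSES = 6     # assumed number of activities
HISTORY_LEN = 10    # assumed number of past predictions fed to the state machine


class CSIRecognizer(nn.Module):
    """2D CNN over a (time x subcarrier) CSI window."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                 # x: (B, 1, T, S)
        z = self.features(x).flatten(1)   # (B, 32)
        return self.classifier(z)         # (B, num_classes)


class StateMachine(nn.Module):
    """1D CNN over the last HISTORY_LEN per-class probability vectors."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(num_classes, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )

    def forward(self, history):           # history: (B, num_classes, HISTORY_LEN)
        return self.net(history)


if __name__ == "__main__":
    csi = torch.randn(4, 1, 128, 64)                          # fake CSI windows
    probs = CSIRecognizer()(csi).softmax(dim=-1)              # per-window predictions
    history = probs.unsqueeze(-1).repeat(1, 1, HISTORY_LEN)   # fake history buffer
    print(StateMachine()(history).shape)                      # (4, NUM_CLASSES)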


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jeonghyuk Park ◽  
Yul Ri Chung ◽  
Seo Taek Kong ◽  
Yeong Won Kim ◽  
Hyunho Park ◽  
...  

There have been substantial efforts in using deep learning (DL) to diagnose cancer from digital images of pathology slides. Existing algorithms typically operate by training deep neural networks either specialized in specific cohorts or on an aggregate of all cohorts when only a few images are available for the target cohort. A trade-off between decreasing the number of models and their cancer detection performance was evident in our experiments with The Cancer Genome Atlas dataset, with the former approach achieving higher performance at the cost of having to acquire large datasets from the cohort of interest. Constructing annotated datasets for individual cohorts is extremely time-consuming, and the acquisition cost of such datasets grows linearly with the number of cohorts. Another issue associated with developing cohort-specific models is the difficulty of maintenance: all cohort-specific models may need to be adjusted when a new DL algorithm is to be used, where training even a single model may require a non-negligible amount of computation, or when more data is added to some cohorts. To resolve the sub-optimal behavior of a universal cancer detection model trained on an aggregate of cohorts, we investigated how cohorts can be grouped to augment a dataset without increasing the number of models linearly with the number of cohorts. This study introduces several metrics that measure the morphological similarities between cohort pairs and demonstrates how these metrics can be used to control the trade-off between performance and the number of models.
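The cohort-grouping idea can be illustrated with a small sketch: summarize each cohort by a feature representation, compute a pairwise similarity matrix, and cluster cohorts so that one detection model can serve each group. The centroid-distance metric, the random features, and the TCGA cohort codes used as labels below are placeholders; the paper's actual morphological similarity metrics are defined in the source work.

# Sketch: pairwise cohort similarity and hierarchical grouping.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Fake per-cohort slide features: cohort_name -> (n_slides, feature_dim)
cohorts = {name: rng.normal(loc=i % 3, size=(50, 16))
           for i, name in enumerate(["BRCA", "LUAD", "COAD", "STAD", "PRAD", "KIRC"])}

# Summarize each cohort by its feature centroid (a simple proxy metric).
names = list(cohorts)
centroids = np.stack([cohorts[n].mean(axis=0) for n in names])

# Pairwise distances between cohort summaries.
dist = squareform(pdist(centroids, metric="euclidean"))
print("pairwise cohort distance matrix:\n", np.round(dist, 2))

# Group similar cohorts; each group would share one detection model,
# keeping the number of models well below the number of cohorts.
groups = fcluster(linkage(pdist(centroids), method="average"),
                  t=3, criterion="maxclust")
for g in sorted(set(groups)):
    print(f"group {g}:", [n for n, gi in zip(names, groups) if gi == g])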

