Energy Efficiency of Machine Learning in Embedded Systems Using Neuromorphic Hardware

Electronics ◽  
2020 ◽  
Vol 9 (7) ◽  
pp. 1069
Author(s):  
Minseon Kang ◽  
Yongseok Lee ◽  
Moonju Park

Recently, the application of machine learning on embedded systems has drawn interest in both the research community and industry, because embedded systems located at the edge can produce a faster response and reduce network load. However, software implementation of neural networks on Central Processing Units (CPUs) is considered infeasible in embedded systems due to the limited power supply. To accelerate AI processing, the many-core Graphics Processing Unit (GPU) has been preferred over the CPU. However, its energy efficiency is still not considered good enough for embedded systems. Among other approaches for machine learning on embedded systems, neuromorphic processing chips are expected to consume less power and to overcome the memory bottleneck. In this work, we implemented a pedestrian image detection system on an embedded device using a commercially available neuromorphic chip, NM500, which is based on NeuroMem technology. The NM500 processing time and power consumption were measured as the number of chips was increased from one to seven, and they were compared to those of a multicore CPU system and a GPU-accelerated embedded system. The results show that the NM500 is more efficient, in terms of the energy required to process data for both learning and classification, than the GPU-accelerated system or the multicore CPU system. Additionally, limits and possible improvements of the current NM500 are identified based on the experimental results.
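As a rough illustration of how such comparisons are made, the sketch below computes energy per processed image from measured average power and processing time. The platform figures are hypothetical placeholders, not the measurements reported in the paper.

```python
# Illustrative energy-per-sample comparison from measured power and latency.
# The numbers below are placeholders, not measurements from the paper.

def energy_per_sample_mj(power_w: float, time_s: float, n_samples: int) -> float:
    """Energy (millijoules) spent per processed sample."""
    return power_w * time_s * 1000.0 / n_samples

platforms = {
    "NM500 x1 (hypothetical)": {"power_w": 0.6, "time_s": 4.0},
    "GPU-accelerated board (hypothetical)": {"power_w": 10.0, "time_s": 0.5},
    "Multicore CPU (hypothetical)": {"power_w": 15.0, "time_s": 1.2},
}
N_IMAGES = 1000  # assumed batch of pedestrian images

for name, m in platforms.items():
    e = energy_per_sample_mj(m["power_w"], m["time_s"], N_IMAGES)
    print(f"{name}: {e:.3f} mJ/image")
```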

2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Mohammad J. M. Zedan ◽  
Ali I. Abduljabbar ◽  
Fahad Layth Malallah ◽  
Mustafa Ghanem Saeed

Nowadays, much research attention is focused on human–computer interaction (HCI), specifically in terms of biosignals, which have recently been used for remote control, offering benefits especially for disabled people or for protection against contagions such as coronavirus. In this paper, a biosignal type, namely, the facial emotional signal, is proposed to control electronic devices remotely via emotional vision recognition. The objective is to convert only two facial emotions, a smiling or nonsmiling vision signal captured by the camera, into a remote-control signal. The methodology combines the machine learning (for smiling recognition) and embedded systems (for remote-control IoT) fields. For smiling recognition, the GENKI-4K database is exploited to train a model, which is built through the following sequence of steps: real-time video, snapshot image, preprocessing, face detection, feature extraction using HOG, and finally SVM classification. The achieved recognition rate is up to 89% for training and testing with 10-fold validation of the SVM. In terms of IoT, Arduino and MCU (Tx and Rx) nodes are exploited for transferring the resulting biosignal remotely, acting as server and client via the HTTP protocol. Promising experimental results are achieved in experiments on 40 individuals who participated in controlling several devices with their emotional biosignals, such as closing and opening a door and turning an alarm on or off, over Wi-Fi. The system implementing this research is developed in Matlab. It connects a webcam to an Arduino and an MCU node as the embedded system.
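A minimal sketch of the HOG + SVM stage described above is given below, using scikit-image and scikit-learn with synthetic grayscale face crops standing in for the GENKI-4K images; it mirrors the feature-extraction and 10-fold validation steps rather than the paper's Matlab implementation.

```python
# Minimal sketch of a HOG + SVM smile classifier with 10-fold validation,
# using synthetic "face crops" in place of the GENKI-4K images.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
images = rng.random((40, 64, 64))      # stand-ins for detected face regions
labels = np.tile([0, 1], 20)           # 1 = smiling, 0 = non-smiling (synthetic)

# HOG feature extraction per face crop
features = np.array([
    hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    for img in images
])

# SVM with 10-fold cross-validation, as in the evaluation protocol above
clf = SVC(kernel="linear")
scores = cross_val_score(clf, features, labels, cv=10)
print("10-fold accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```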


Atmosphere ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 870 ◽  
Author(s):  
Chih-Chiang Wei ◽  
Tzu-Hao Chou

Situated in the main tracks of typhoons in the Northwestern Pacific Ocean, Taiwan frequently encounters disasters from heavy rainfall during typhoons. Accurate and timely typhoon rainfall prediction is therefore an imperative topic. The purpose of this study was to develop a Hadoop Spark distributed framework based on big-data technology to accelerate the computation of typhoon rainfall prediction models. This study used deep neural networks (DNNs) and multiple linear regressions (MLRs) in machine learning to establish rainfall prediction models and evaluate rainfall prediction accuracy. The Hadoop Spark distributed cluster-computing framework was the big-data technology used; it consisted of the Hadoop Distributed File System, the MapReduce framework, and Spark, which was used as a new-generation technology to improve the efficiency of distributed computing. The research area was Northern Taiwan, which contains four surface observation stations used as the experimental sites. This study collected 271 typhoon events (from 1961 to 2017). The following results were obtained: (1) in machine-learning computation, prediction errors increased with prediction duration in the DNN and MLR models; and (2) the Hadoop Spark framework was faster than the standalone systems (single i7 central processing unit (CPU) and single E3 CPU). When complex computation is required in a model (e.g., DNN model parameter calibration), the big-data-based Hadoop Spark framework can be used to establish highly efficient computation environments. In summary, this study successfully used the big-data Hadoop Spark framework with machine learning to develop rainfall prediction models with effectively improved computing efficiency. Therefore, the proposed system can solve problems regarding real-time typhoon rainfall prediction with high timeliness and accuracy.
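As an illustration of the Spark side of such a workflow, the sketch below fits a multiple linear regression with PySpark's ML library on a few hypothetical hourly records; the column names and values are assumptions, not the paper's predictors or data.

```python
# Sketch of fitting a multiple linear regression on Spark, in the spirit of the
# MLR rainfall models described above.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("typhoon-rainfall-mlr").getOrCreate()

# Hypothetical hourly records: (pressure_hPa, wind_ms, humidity_pct, rainfall_mm)
rows = [(985.0, 22.0, 88.0, 14.5),
        (992.0, 18.0, 80.0, 8.0),
        (978.0, 30.0, 93.0, 27.0),
        (999.0, 12.0, 70.0, 2.5)]
df = spark.createDataFrame(rows, ["pressure", "wind", "humidity", "rainfall"])

# Assemble predictor columns into a single feature vector
assembler = VectorAssembler(inputCols=["pressure", "wind", "humidity"],
                            outputCol="features")
train = assembler.transform(df)

mlr = LinearRegression(featuresCol="features", labelCol="rainfall")
model = mlr.fit(train)
print("coefficients:", model.coefficients, "intercept:", model.intercept)

spark.stop()
```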


2021 ◽  
Author(s):  
Na Li ◽  
Yuan Yuan Gao ◽  
Kui Xu

This paper studies a cell-free (CF) massive multi-input multi-output (MIMO) simultaneous wireless information and power transfer (SWIPT) system and proposes a user-centric (UC) access point (AP) selection method together with a trade-off optimization scheme for spectral efficiency and energy efficiency. In this system, users perform both energy harvesting and information transmission, and a flexible AP selection scheme is designed according to the differing requirements of energy harvesting and information transmission. This paper analyses the trade-off between energy efficiency and spectral efficiency, proposes an evaluation index that takes both into account, and jointly optimizes the AP selection scheme and the uplink (UL) and downlink (DL) time-switching ratio to maximize the trade-off performance. The non-convex problem is then converted into a geometric programming (GP) problem and solved. The simulation results show that, with a suitable AP selection scheme and UL/DL time allocation, the information processing scheme on the AP side incurs only a slight loss in spectral efficiency, while its energy efficiency is close to the performance of global processing at the central processing unit (CPU).
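To make the trade-off concrete, the toy sweep below evaluates a combined spectral-efficiency/energy-efficiency index over a DL/UL time-switching ratio under a made-up single-user channel model; it illustrates the kind of optimization described, not the paper's system model or GP formulation.

```python
# Toy sweep of a time-switching ratio tau against a combined SE/EE index.
# The channel and power model is made up purely for illustration.
import numpy as np

tau = np.linspace(0.05, 0.95, 19)   # fraction of the frame used for DL energy transfer
p_dl = 1.0                          # DL transmit power (W), assumed
eta = 0.5                           # energy-harvesting efficiency, assumed
gain = 1e-3                         # aggregate channel gain, assumed
noise = 1e-9                        # noise power (W), assumed

harvested = eta * p_dl * gain * tau                    # energy harvested in the DL phase
ul_power = harvested / (1.0 - tau)                     # average UL transmit power
se = (1.0 - tau) * np.log2(1.0 + ul_power * gain / noise)   # UL spectral efficiency
ee = se / (p_dl * tau + ul_power * (1.0 - tau))             # energy efficiency (toy units)

tradeoff = se * ee / (se + ee)      # one possible combined index (harmonic-mean style)
best = tau[np.argmax(tradeoff)]
print(f"best tau under this toy model: {best:.2f}")
```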


2019 ◽  
Vol 141 (5) ◽  
Author(s):  
Michael Joly ◽  
Soumalya Sarkar ◽  
Dhagash Mehta

In aerodynamic design, accurate and robust surrogate models are important to accelerate computationally expensive computational fluid dynamics (CFD)-based optimization. In this paper, a machine learning framework is presented to speed-up the design optimization of a highly loaded transonic compressor rotor. The approach is threefold: (1) dynamic selection and self-tuning among several surrogate models; (2) classification to anticipate failure of the performance evaluation; and (3) adaptive selection of new candidates to perform CFD evaluation for updating the surrogate, which facilitates design space exploration and reduces surrogate uncertainty. The framework is demonstrated with a multipoint optimization of the transonic NASA rotor 37, yielding increased compressor efficiency in less than 48 h on 100 central processing unit cores. The optimized rotor geometry features precompression that relocates and attenuates the shock, without the stability penalty or undesired reacceleration usually observed in the literature.
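The skeleton below sketches the general shape of such a surrogate-assisted loop with a scikit-learn Gaussian process: fit the surrogate, adaptively pick a new candidate, evaluate it, and refit. A cheap analytic function stands in for the CFD solver, and the greedy infill rule is a stand-in for the paper's selection strategy.

```python
# Skeleton of a surrogate-assisted optimization loop: fit a surrogate,
# pick a new candidate, "evaluate" it, and refit.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_eval(x):
    """Placeholder for a CFD evaluation (here a cheap analytic function)."""
    return float(np.sum((x - 0.3) ** 2))

rng = np.random.default_rng(1)
X = rng.random((8, 3))                     # initial design of experiments
y = np.array([expensive_eval(x) for x in X])

for it in range(5):
    surrogate = GaussianProcessRegressor().fit(X, y)
    # Adaptive selection: among random candidates, pick the one the surrogate
    # scores best, with a crude exploration bonus from the predictive std.
    cand = rng.random((256, 3))
    pred, std = surrogate.predict(cand, return_std=True)
    x_new = cand[np.argmin(pred - std)]
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_eval(x_new))
    print(f"iter {it}: best so far = {y.min():.4f}")
```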


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 544
Author(s):  
Kristian Diaz ◽  
Ying-Khai Teh

An embedded system composed of commercial off-the-shelf (COTS) peripherals and a microcontroller is presented. The system collects environmental data for the Salton Sea, Imperial Valley, California, in order to understand the development of environmental and health hazards. The power consumption of each system feature (i.e., the Central Processing Unit (CPU) core, Input/Output (I/O) buses, and peripherals (temperature, humidity, and optical dust sensors)) is studied. Software-based power optimization utilizes this power information with hardware-assisted power gating to control the system features. The control of these features extends system uptime in a field-deployed, finite-energy scenario. The proposed power optimization algorithm can collect more data by increasing system uptime compared to a Low Power Energy Aware Processing (LEAP) approach. Lastly, the 128-bit Advanced Encryption Standard (AES) algorithm is applied to the collected data with various parameters. A hidden peripheral requirement that must be considered during design is also noted to impact the efficacy of this method.
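The back-of-the-envelope sketch below shows how per-feature power figures and duty cycles translate into an uptime estimate for a power-gated node with a finite energy budget. All numbers are assumptions for illustration, not measurements from this work.

```python
# Back-of-the-envelope uptime estimate for a power-gated sensor node with a
# finite energy budget. All power and timing figures are assumptions.
BATTERY_WH = 10.0                      # assumed battery capacity (watt-hours)

# feature name -> (active power in watts, active seconds per sampling cycle)
features = {
    "cpu_core":    (0.050, 2.0),
    "i2c_bus":     (0.005, 1.0),
    "temp_rh":     (0.003, 1.0),
    "dust_sensor": (0.090, 1.5),
}
sleep_power_w = 0.0005                 # everything gated off between samples
cycle_period_s = 60.0                  # one measurement per minute

active_energy_j = sum(p * t for p, t in features.values())
active_time_s = max(t for _, t in features.values())
sleep_energy_j = sleep_power_w * (cycle_period_s - active_time_s)
energy_per_cycle_j = active_energy_j + sleep_energy_j

cycles = BATTERY_WH * 3600.0 / energy_per_cycle_j
print(f"~{cycles:.0f} sampling cycles, ~{cycles * cycle_period_s / 86400:.1f} days of uptime")
```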


Author(s):  
Manoj Kollam ◽  
Ajay Joshi

An earthquake is a devastating natural hazard capable of wiping out thousands of lives and causing economic loss to the affected region. Seismic stations continuously gather data regardless of whether an event occurs, and the gathered data are processed by the model to forecast the occurrence of earthquakes. This paper presents a model to forecast earthquakes using parallel processing. Machine learning is rapidly taking over a variety of aspects of our daily lives. Even though machine learning methods can be used for analyzing data, in the scenario of event forecasts such as earthquakes, the performance of machine learning is limited as the data grow day by day, so using ML alone is not a complete solution. To increase the model performance and accuracy, a new ML model is designed using parallel processing. The drawbacks of ML on a central processing unit (CPU) can be overcome by a Graphics Processing Unit (GPU) implementation, since parallelism is naturally provided by the framework for developing GPU-based computational algorithms known as the Compute Unified Device Architecture (CUDA). The implementation of a hybrid support vector machine (H-SVM) algorithm using parallel processing through CUDA is used to forecast earthquakes. Our experiments show that the GPU-based implementation achieved typical speedup values in the range of 3-70 times compared to a conventional central processing unit (CPU). Results of different experiments are discussed along with their consequences.
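As a rough illustration of where such GPU speedups come from, the sketch below times a dense RBF kernel matrix (the core of SVM training) in NumPy on the CPU and CuPy on the GPU. It requires a CUDA-capable GPU with CuPy installed and is not the paper's H-SVM implementation.

```python
# CPU-vs-GPU timing of an RBF kernel matrix, a dense computation that
# dominates SVM training. Illustrative only; requires CuPy and a CUDA GPU.
import time
import numpy as np
import cupy as cp

def rbf_kernel_np(X, gamma=0.1):
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def rbf_kernel_cp(X, gamma=0.1):
    sq = cp.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return cp.exp(-gamma * d2)

X = np.random.rand(4000, 64).astype(np.float32)

t0 = time.perf_counter()
K_cpu = rbf_kernel_np(X)
t_cpu = time.perf_counter() - t0

Xg = cp.asarray(X)
t0 = time.perf_counter()
K_gpu = rbf_kernel_cp(Xg)
cp.cuda.Stream.null.synchronize()   # wait for the GPU before stopping the clock
t_gpu = time.perf_counter() - t0

print(f"CPU {t_cpu:.3f}s  GPU {t_gpu:.3f}s  speedup x{t_cpu / t_gpu:.1f}")
```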


Author(s):  
Vishal Saxena ◽  
Xinyu Wu ◽  
Ira Srivastava ◽  
Kehan Zhu

The ongoing revolution in Deep Learning is redefining the nature of computing, driven by the increasing amount of pattern classification and cognitive tasks. Specialized digital hardware for deep learning still holds its predominance due to the flexibility offered by software implementation and the maturity of the algorithms. However, it is increasingly desired that cognitive computing occur at the edge, i.e., on hand-held devices that are energy constrained, which is energy prohibitive when employing digital von Neumann architectures. Recent explorations in digital neuromorphic hardware have shown promise, but they offer a low neurosynaptic density that falls short of what is needed for scaling to applications such as intelligent cognitive assistants (ICA). Large-scale integration of nanoscale emerging memory devices with Complementary Metal Oxide Semiconductor (CMOS) mixed-signal integrated circuits can herald a new generation of Neuromorphic computers that will transcend the von Neumann bottleneck for cognitive computing tasks. Such hybrid Neuromorphic System-on-a-Chip (NeuSoC) architectures promise machine learning capability at chip-scale form factor, and several orders of magnitude improvement in energy efficiency. Practical demonstration of such architectures has been limited, as the performance of emerging memory devices falls short of the behavior expected from idealized memristor-based analog synapses, or weights, and novel machine learning algorithms are needed to take advantage of the actual device behavior. In this work, we review the challenges involved and present a pathway to realize ultra-low-power mixed-signal NeuSoCs, from device arrays and circuits to spike-based deep learning algorithms, with ‘brain-like’ energy efficiency.
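As one concrete example of the spike-based learning rules such hardware targets, the sketch below applies a toy pair-based STDP update to a single synaptic weight clipped to a normalized conductance range. The time constants, amplitudes, and spike times are illustrative assumptions, not values from this work.

```python
# Toy pair-based STDP update, sketching the kind of spike-driven weight change
# an analog memristive synapse would need to approximate.
import numpy as np

A_PLUS, A_MINUS = 0.01, 0.012     # potentiation / depression amplitudes (assumed)
TAU_MS = 20.0                     # STDP time constant (assumed)

def stdp_dw(t_pre_ms: float, t_post_ms: float) -> float:
    """Weight change for one pre/post spike pair (positive dt => potentiation)."""
    dt = t_post_ms - t_pre_ms
    if dt >= 0:
        return A_PLUS * np.exp(-dt / TAU_MS)
    return -A_MINUS * np.exp(dt / TAU_MS)

w = 0.5
for t_pre, t_post in [(10, 14), (30, 28), (50, 55)]:   # hypothetical spike pairs
    w = np.clip(w + stdp_dw(t_pre, t_post), 0.0, 1.0)  # clip to device conductance range
print(f"final weight: {w:.3f}")
```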


2021 ◽  
Vol 3 (1) ◽  
pp. 21
Author(s):  
Rio Widodo ◽  
Imam Riadi

The openness of access to information raises various problems, including maintaining the validity and integrity of data, so a network security system is needed that can deal with potential threats quickly and accurately by utilizing an IDS (intrusion detection system). One of the IDS tools that is often used is Snort, which works in real time to monitor the network and detect ongoing threats, providing warnings and information on potential threats in the form of DoS attacks. DoS attacks exhaust the packet path by sending large and continuous streams of requests to a target, which results in increased usage of the CPU (central processing unit), memory, and Ethernet or Wi-Fi networks. The Snort IDS implementation can help provide accurate information on the security of the monitored network, because every communication that takes place in the network, every event that occurs, and potential attacks that could paralyze the Internet connection are monitored by Snort.
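Alongside a signature-based IDS such as Snort, the resource-exhaustion symptoms described above (sustained CPU, memory, and network pressure) can also be watched directly. The sketch below polls those resources with psutil and raises a simple alert; the thresholds are arbitrary assumptions, and this is a complement to, not a substitute for, Snort.

```python
# Simple monitor for the resource-exhaustion symptoms of a DoS attack
# (sustained CPU/memory/network pressure). Thresholds are arbitrary.
import psutil

CPU_LIMIT, MEM_LIMIT = 90.0, 90.0      # percent thresholds, assumed

for _ in range(10):                    # ten one-second samples
    cpu = psutil.cpu_percent(interval=1)
    mem = psutil.virtual_memory().percent
    net = psutil.net_io_counters()
    if cpu > CPU_LIMIT or mem > MEM_LIMIT:
        print(f"ALERT: cpu={cpu:.0f}% mem={mem:.0f}% "
              f"rx_bytes={net.bytes_recv} tx_bytes={net.bytes_sent}")
    else:
        print(f"ok: cpu={cpu:.0f}% mem={mem:.0f}%")
```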


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Sung-Woong Jo ◽  
Jong-Moon Chung

Video streaming is one of the most popular applications for mobile users. However, mobile video streaming services consume a lot of energy, resulting in reduced battery life, a critical problem that degrades the user's quality of experience (QoE). Therefore, in this paper, a joint optimization scheme that controls both the central processing unit (CPU) and the wireless networking of the video streaming process is proposed to improve energy efficiency on mobile devices. For this purpose, the energy consumption of the network interface and the CPU is analyzed, and based on the energy consumption profile a joint optimization problem is formulated to maximize the energy efficiency of the mobile device. The proposed algorithm adaptively adjusts the number of chunks to be downloaded and decoded in each packet. Simulation results show that the proposed algorithm can effectively improve energy efficiency when compared with existing algorithms.
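A toy version of the underlying energy trade-off is sketched below: downloading several chunks per radio wake-up amortizes the fixed network tail energy against CPU decoding cost. All power and timing figures are assumptions for illustration, not the paper's model.

```python
# Toy energy model for batching video chunks: more chunks per radio wake-up
# amortizes the fixed tail energy. All figures are assumptions.
CHUNK_BITS = 2_000_000          # 2 Mb per chunk, assumed
THROUGHPUT_BPS = 20_000_000     # 20 Mb/s link, assumed
P_RADIO_ACTIVE = 1.2            # W while transferring, assumed
E_TAIL_J = 0.8                  # fixed radio tail energy per wake-up (J), assumed
P_CPU_DECODE = 0.6              # W while decoding, assumed
DECODE_S_PER_CHUNK = 0.05       # decode time per chunk (s), assumed

def energy_per_chunk(batch_size: int) -> float:
    transfer_s = batch_size * CHUNK_BITS / THROUGHPUT_BPS
    e_net = P_RADIO_ACTIVE * transfer_s + E_TAIL_J          # one tail per batch
    e_cpu = P_CPU_DECODE * DECODE_S_PER_CHUNK * batch_size
    return (e_net + e_cpu) / batch_size

for n in (1, 2, 4, 8, 16):
    print(f"batch of {n:2d} chunks: {energy_per_chunk(n):.3f} J/chunk")
```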


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1378
Author(s):  
Syed M. Raza ◽  
Jaeyeop Jeong ◽  
Moonseong Kim ◽  
Byungseok Kang ◽  
Hyunseung Choo

Containers virtually package a piece of software and share the host Operating System (OS) upon deployment. This makes them notably lightweight and suitable for dynamic service deployment at the network edge and on Internet of Things (IoT) devices for reduced latency and energy consumption. Data collection, computation, and now intelligence are included in a variety of IoT devices that have very tight latency and energy consumption constraints. Recent studies satisfy the latency constraint through containerized service deployment on IoT devices and gateways, but they fail to account for the limited energy and computing resources of these devices, which limit scalability and concurrent service deployment. This paper aims to establish guidelines and identify critical factors for containerized service deployment on resource-constrained IoT devices. For this purpose, two container orchestration tools (i.e., Docker Swarm and Kubernetes) are tested and compared on a baseline IoT gateway testbed. Experiments use Deep Learning driven data analytics and Intrusion Detection System services, and evaluate the time it takes to prepare and deploy a container (creation time), Central Processing Unit (CPU) utilization for concurrent container deployment, memory usage under different traffic loads, and energy consumption. The results indicate that container creation time and memory usage are decisive factors for a containerized microservice architecture.
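For a rough sense of how a creation-time metric like this can be measured, the sketch below times short-lived containers via the Docker CLI. It assumes Docker is installed with the alpine image available, and is a simplified stand-in for the paper's testbed, not its actual harness.

```python
# Rough measurement of container creation time using the Docker CLI.
# Assumes Docker is installed and the alpine image is already pulled.
import subprocess
import time

def time_container_start(image: str = "alpine", n: int = 5) -> float:
    """Average seconds to create, run, and remove a short-lived container."""
    total = 0.0
    for _ in range(n):
        t0 = time.perf_counter()
        subprocess.run(
            ["docker", "run", "--rm", image, "true"],
            check=True, capture_output=True,
        )
        total += time.perf_counter() - t0
    return total / n

if __name__ == "__main__":
    print(f"avg creation+run time: {time_container_start():.2f} s")
```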

