Parallelized Online Regularized Least-Squares for Adaptive Embedded Systems

Tapio Pahikkala; Antti Airola; Thomas Canhao Xu; Pasi Liljeberg; Hannu Tenhunen; Tapio Salakoski

doi:10.4018/jertcs.2012040104

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/jertcs.2012040104 ◽

2012 ◽

Vol 3 (2) ◽

pp. 73-91 ◽

Cited By ~ 1

Author(s):

Tapio Pahikkala ◽

Antti Airola ◽

Thomas Canhao Xu ◽

Pasi Liljeberg ◽

Hannu Tenhunen ◽

...

Keyword(s):

Machine Learning ◽

Least Squares ◽

Real Time ◽

Adaptive Systems ◽

Learning Algorithm ◽

Real Life ◽

Recognition Task ◽

Learning System ◽

Regularized Least Squares ◽

On Chip

The authors introduce a machine learning approach based on a parallel online regularized least-squares learning algorithm for parallel embedded hardware platforms. The system is suitable for use in real-time adaptive systems. First, the system can learn in an online fashion, a property required in real-life applications of embedded machine learning systems. Second, to guarantee real-time response on embedded multi-core computer architectures, the learning system is parallelized and able to operate with a limited amount of computational and memory resources. Third, the system can predict several labels simultaneously. The authors evaluate the performance of the algorithm from three perspectives. The prediction performance is evaluated on a hand-written digit recognition task. The computational speed is measured with 1 to 4 threads on a quad-core platform. A Network-on-Chip (NoC) platform is also studied for the algorithm as a promising unconventional multi-core architecture: the authors construct a NoC consisting of a 4x4 mesh and implement the machine learning algorithm on this platform with up to 16 threads. It is shown that the memory consumption and cache efficiency can be considerably improved by optimizing the cache behavior of the system. The authors’ results provide a guideline for designing future embedded multi-core machine learning devices.

On Parallel Online Learning for Adaptive Embedded Systems

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Advancing Embedded Systems and Real-Time Communications with Emerging Technologies ◽

10.4018/978-1-4666-6034-2.ch011 ◽

2014 ◽

pp. 262-281

Author(s):

Tapio Pahikkala ◽

Antti Airola ◽

Thomas Canhao Xu ◽

Pasi Liljeberg ◽

Hannu Tenhunen ◽

...

Keyword(s):

Machine Learning ◽

Adaptive Systems ◽

Learning Algorithm ◽

Parallel Implementation ◽

Recognition Task ◽

Accurate Solution ◽

Training Data ◽

Machine Learning Algorithm ◽

Systems Learning ◽

On Chip

This chapter considers a parallel implementation of the online multi-label regularized least-squares machine learning algorithm for embedded hardware platforms. The authors focus on the following properties required in real-time adaptive systems: the system learns in an online fashion, that is, the model improves with new data but does not require storing it; the method can fully utilize the computational abilities of modern embedded multi-core computer architectures; and the system efficiently learns to predict several labels simultaneously. They demonstrate on a hand-written digit recognition task that the online algorithm converges to an accurate solution faster, with respect to the amount of training data processed, than a baseline based on stochastic gradient descent. Further, the authors show that their parallelization of the method scales well on a quad-core platform. Moreover, since Network-on-Chip (NoC) has been proposed as a promising candidate for future multi-core architectures, they implement a NoC system consisting of 16 cores and evaluate the proposed machine learning algorithm on this platform. Experimental results show that optimizing the cache behaviour of the program can significantly improve cache and memory efficiency. Results from the chapter provide a guideline for designing future embedded multi-core machine learning devices.
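A minimal sketch of how an online multi-label regularized least-squares learner can improve with each new example without storing past data. The abstract does not give the update equations, so this assumes the standard recursive least-squares update derived from the Sherman-Morrison formula. The class name `OnlineRLS`, its method names, and the `regparam` parameter are hypothetical, and the code is single-threaded rather than a reproduction of the authors' parallel implementation.

```python
class OnlineRLS:
    """Illustrative online multi-label regularized least-squares learner.

    Keeps only P = (X^T X + regparam * I)^-1 and the weight matrix W,
    so past training examples never need to be stored.
    """

    def __init__(self, n_features, n_labels, regparam=1.0):
        # Before any data is seen, P = I / regparam and W = 0.
        self.P = [[(1.0 / regparam if i == j else 0.0)
                   for j in range(n_features)]
                  for i in range(n_features)]
        self.W = [[0.0] * n_labels for _ in range(n_features)]

    def predict(self, x):
        # One prediction per label: W^T x.
        n_labels = len(self.W[0])
        return [sum(self.W[i][k] * x[i] for i in range(len(x)))
                for k in range(n_labels)]

    def update(self, x, y):
        d = len(x)
        # Px = P x (P is symmetric, so P x x^T P = Px Px^T).
        Px = [sum(self.P[i][j] * x[j] for j in range(d)) for i in range(d)]
        denom = 1.0 + sum(x[i] * Px[i] for i in range(d))
        # Prediction error for every label, computed before W changes.
        err = [yk - pk for yk, pk in zip(y, self.predict(x))]
        # W <- W + (Px / denom) err^T, shared across all labels.
        for i in range(d):
            gain = Px[i] / denom
            for k in range(len(err)):
                self.W[i][k] += gain * err[k]
        # Sherman-Morrison rank-one downdate of P.
        for i in range(d):
            for j in range(d):
                self.P[i][j] -= Px[i] * Px[j] / denom
```

Under these assumptions each update costs time proportional to d² + dk for d features and k labels, and memory use does not grow with the number of examples seen. The row-wise loops in `update` are independent of one another, which is one place where work could be divided among threads; the abstract does not state how the authors partition the computation.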

A PARALLEL ONLINE REGULARIZED LEAST-SQUARES MACHINE LEARNING ALGORITHM FOR FUTURE MULTI-CORE PROCESSORS

Proceedings of the 1st International Conference on Pervasive and Embedded Computing and Communication Systems ◽

10.5220/0003411405900599 ◽

2011 ◽

Keyword(s):

Machine Learning ◽

Least Squares ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Regularized Least Squares

Application of a Rough Set-Based Inductive Learning System

Fundamenta Informaticae ◽

10.3233/fi-1993-182-409 ◽

1993 ◽

Vol 18 (2-4) ◽

pp. 209-220

Author(s):

Michael Hadjimichael ◽

Anita Wasilewska

Keyword(s):

Machine Learning ◽

Rough Set ◽

Presidential Election ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Inductive Learning ◽

Real Data ◽

Semantic Content ◽

Learning System ◽

Voter Preferences

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.

On-Device Deep Learning Inference for System-on-Chip (SoC) Architectures

Electronics ◽

10.3390/electronics10060689 ◽

2021 ◽

Vol 10 (6) ◽

pp. 689

Author(s):

Tom Springer ◽

Elia Eiroa-Lledo ◽

Elizabeth Stevens ◽

Erik Linstead

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Real Time ◽

Operating Systems ◽

System On Chip ◽

Low Latency ◽

Management Framework ◽

On Chip ◽

Specialized Hardware ◽

Deterministic Behavior

As machine learning becomes ubiquitous, the need to deploy models on real-time, embedded systems will become increasingly critical. This is especially true for deep learning solutions, whose large models pose interesting challenges for target architectures at the “edge” that are resource-constrained. The realization of machine learning, and deep learning, is being driven by the availability of specialized hardware, such as system-on-chip solutions, which provide some alleviation of constraints. Equally important, however, are the operating systems that run on this hardware, and specifically the ability to leverage commercial real-time operating systems which, unlike general purpose operating systems such as Linux, can provide the low-latency, deterministic execution required for embedded, and potentially safety-critical, applications at the edge. Despite this, studies considering the integration of real-time operating systems, specialized hardware, and machine learning/deep learning algorithms remain limited. In particular, better mechanisms for real-time scheduling in the context of machine learning applications will prove to be critical as these technologies move to the edge. In order to address some of these challenges, we present a resource management framework designed to provide a dynamic on-device approach to the allocation and scheduling of limited resources in a real-time processing environment. These types of mechanisms are necessary to support the deterministic behavior required by the control components contained in the edge nodes. To validate the effectiveness of our approach, we applied rigorous schedulability analysis to a large set of randomly generated simulated task sets and then verified that the most time-critical applications, such as the control tasks, maintained low-latency deterministic behavior even during off-nominal conditions. The practicality of our scheduling framework was demonstrated by integrating it into a commercial real-time operating system (VxWorks) and then running a typical deep learning image processing application to perform simple object detection. The results indicate that our proposed resource management framework can be leveraged to facilitate integration of machine learning algorithms with real-time operating systems and embedded platforms, including widely-used, industry-standard real-time operating systems.

Low-cost real-time non-intrusive appliance identification and controlling through machine learning algorithm

2018 International Symposium on Consumer Technologies (ISCT) ◽

10.1109/isce.2018.8408911 ◽

2018 ◽

Cited By ~ 3

Author(s):

Sheharyar Khan ◽

Ahmad Farhan Latif ◽

Sarmad Sohaib

Keyword(s):

Machine Learning ◽

Real Time ◽

Learning Algorithm ◽

Low Cost ◽

Machine Learning Algorithm

Implementation of modified SARSA learning technique in EMCAP

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.5.9161 ◽

2017 ◽

Vol 7 (1.5) ◽

pp. 274

Author(s):

D. Ganesha ◽

Vijayakumar Maragal Venkatamuni

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Decision Process ◽

Learning Algorithm ◽

Research Work ◽

Learning System ◽

State Action ◽

Learning Technique ◽

Markov Decision ◽

Experiment Analysis

This research work presents an analysis of a Modified SARSA learning algorithm. State-Action-Reward-State-Action (SARSA) is a technique for learning a Markov decision process (MDP) strategy, used for reinforcement learning in the field of artificial intelligence (AI) and machine learning (ML). The Modified SARSA algorithm selects better actions to obtain better rewards. Experiments are conducted to evaluate the performance of each agent individually. To compare results among different agents, the same statistics were collected for each. This work considered various kinds of agents at different levels of the architecture for experiment analysis. The Fungus world testbed, implemented using SWI-Prolog 5.4.6, was used for the experiments. The fixed obstacles tend to be more versatile, creating locations that are specific to the Fungus world testbed environment. Various parameters are introduced in the environment to test an agent’s performance. This modified SARSA learning algorithm can be more suitable in the EMCAP architecture. The experiments show that the modified SARSA learning system gets more rewards compared to the existing SARSA algorithm.
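For orientation, the baseline SARSA update that the abstract compares against can be sketched as follows. This is the textbook on-policy rule, not the paper's modified variant, whose details the abstract does not describe. The function names, the `(state, action)` dictionary layout for `Q`, and the default values of `alpha` and `gamma` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Standard SARSA step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)).

    a_next is the action the agent will actually take in s_next,
    which is what makes the rule on-policy.
    """
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical usage: a table of action values that defaults to zero.
Q = defaultdict(float)
rng = random.Random(0)
actions = ["left", "right"]
a = epsilon_greedy(Q, "s0", actions, epsilon=0.1, rng=rng)
a_next = epsilon_greedy(Q, "s1", actions, epsilon=0.1, rng=rng)
sarsa_update(Q, "s0", a, r=1.0, s_next="s1", a_next=a_next)
```

The name SARSA comes from the five quantities the update consumes: state, action, reward, next state, and next action.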

Applied On-Chip Machine Learning for Dynamic Resource Control in Multithreaded Processors

Parallel Processing Letters ◽

10.1142/s0129626419500130 ◽

2019 ◽

Vol 29 (03) ◽

pp. 1950013

Author(s):

Shane Carroll ◽

Wei-Ming Lin

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Clock Cycle ◽

Shared Resources ◽

Multithreaded Processors ◽

Multiple Threads ◽

On Chip ◽

Control Instruction ◽

Cache Miss ◽

Fetch Bandwidth

In this paper, we propose a machine learning algorithm to control instruction fetch bandwidth in a simultaneous multithreaded CPU. In a simultaneous multithreaded CPU, multiple threads occupy pools of hardware resources in the same clock cycle. Under some conditions, one or more threads may undergo a period of inefficiency, e.g., a cache miss, thereby inefficiently using shared resources and degrading the performance of other threads. If these inefficiencies can be identified at runtime, the offending thread can be temporarily blocked from fetching new instructions into the pipeline and given time to recover from its inefficiency, which prevents shared system resources from being wasted on a stalled thread. We propose a machine learning approach to determine when a thread should be blocked from fetching new instructions. The model is trained offline and its parameters are embedded in the CPU, where it can be queried with runtime statistics to determine whether a thread is running inefficiently and should be temporarily blocked from fetching. We propose two models: a simple linear model and a higher-capacity neural network. We test each model in a simulation environment and show that system performance can increase by up to 19% on average with a feasible implementation of the proposed algorithm.

Modulation Recognition of Communication Signal Based on Convolutional Neural Network

Symmetry ◽

10.3390/sym13122302 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2302

Author(s):

Kaiyuan Jiang ◽

Xvan Qin ◽

Jiawei Zhang ◽

Aili Wang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Communication Systems ◽

Learning Algorithm ◽

Recognition Task ◽

Machine Learning Algorithms ◽

Convolution Neural Network ◽

Modulation Recognition ◽

Signal Modulation ◽

Communication Signal

In the noncooperation communication scenario, digital signal modulation recognition will help people to identify the communication targets and have better management over them. To solve problems such as high complexity, low accuracy and cumbersome manual extraction of features by traditional machine learning algorithms, a kind of communication signal modulation recognition model based on convolution neural network (CNN) is proposed. In this paper, a convolution neural network combines bidirectional long short-term memory (BiLSTM) with a symmetrical structure to successively extract the frequency domain features and timing features of signals and then assigns importance weights based on the attention mechanism to complete the recognition task. Seven typical digital modulation schemes including 2ASK, 4ASK, 4FSK, BPSK, QPSK, 8PSK and 64QAM are used in the simulation test, and the results show that, compared with the classical machine learning algorithm, the proposed algorithm has higher recognition accuracy at low SNR, which confirmed that the proposed modulation recognition method is effective in noncooperation communication systems.

A REAL TIME CLOUD BASED MACHINE LEARNING SYSTEM WITH BIG DATA ANALYTICS FOR DIABETES DETECTION AND CLASSIFICATION

International Journal of Research in Engineering and Technology ◽

10.15623/ijret.2017.0605020 ◽

2017 ◽

Vol 06 (05) ◽

pp. 120-124

Author(s):

Shashikant .

Keyword(s):

Machine Learning ◽

Big Data ◽

Real Time ◽

Data Analytics ◽

Big Data Analytics ◽

Learning System

Machine learning algorithms can predict tail biting outbreaks in pigs using feeding behaviour records

10.1101/2021.05.11.443554 ◽

2021 ◽

Author(s):

Catherine Ollagnier ◽

Claudia Kasper ◽

Anna Wallenbeck ◽

Linda Keeling ◽

Siavash A Bigdeli

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

Feeding Behaviour ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

Tail Biting ◽

Testing Set

Tail biting is a detrimental behaviour that impacts the welfare and health of pigs. Early detection of tail biting precursor signs allows for preventive measures to be taken, thus avoiding the occurrence of the tail biting event. This study aimed to build a machine-learning algorithm for real time detection of upcoming tail biting outbreaks, using feeding behaviour data recorded by an electronic feeder. Prediction capacities of seven machine learning algorithms (e.g., random forest, neural networks) were evaluated from daily feeding data collected from 65 pens originating from 2 herds of grower-finisher pigs (25-100kg), in which 27 tail biting events occurred. Data were divided into training and testing data, either by randomly splitting data into 75% (training set) and 25% (testing set), or by randomly selecting pens to constitute the testing set. The random forest algorithm was able to predict 70% of the upcoming events with an accuracy of 94%, when predicting events in pens for which it had previous data. The detection of events for unknown pens was less sensitive, and the neural network model was able to detect 14% of the upcoming events with an accuracy of 63%. A machine-learning algorithm based on ongoing data collection should be considered for implementation into automatic feeder systems for real time prediction of tail biting events.
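The two data-splitting strategies described above can be sketched as follows. The record layout (a dict with a `"pen"` key), the function names, and the `seed` parameter are hypothetical. The 75%/25% proportion comes from the abstract, but applying the same fraction to the pen-level split is an assumption, since the abstract does not say how many pens were held out.

```python
import random


def split_by_record(records, test_fraction=0.25, seed=0):
    """Random split over individual records, so the same pen can
    contribute data to both the training and the testing set."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_pen(records, test_fraction=0.25, seed=0):
    """Hold out whole pens, so every test record comes from a pen
    the model has never seen during training."""
    rng = random.Random(seed)
    pens = sorted({r["pen"] for r in records})
    rng.shuffle(pens)
    n_test = max(1, round(len(pens) * test_fraction))
    test_pens = set(pens[:n_test])
    train = [r for r in records if r["pen"] not in test_pens]
    test = [r for r in records if r["pen"] in test_pens]
    return train, test
```

The pen-level split corresponds to the "unknown pens" setting in the abstract, where detection was less sensitive than when the model had previous data from the same pen.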
