Load Classification: A Case Study for Applying Neural Networks in Hyper-Constrained Embedded Devices

2021 ◽  
Vol 11 (24) ◽  
pp. 11957
Author(s):  
Andrea Agiollo ◽  
Andrea Omicini

The application of Artificial Intelligence to the industrial world and its appliances has recently grown in popularity. Indeed, AI techniques are now becoming the de facto technology for the resolution of complex tasks in computer vision, natural language processing and many other areas. In recent years, most of the research community's efforts have focused on increasing the performance of the most common AI techniques—e.g., Neural Networks—at the expense of their complexity. Indeed, many works in the AI field identify and propose hyper-efficient techniques targeting high-end devices. However, the application of such AI techniques to devices and appliances characterised by limited computational capabilities remains an open research issue. In the industrial world, this problem heavily affects low-end appliances, which are developed with a focus on cost savings and rely on computationally constrained components. While some efforts have been made in this area through the proposal of AI-simplification and AI-compression techniques, it is still relevant to study which available AI techniques can be used in modern constrained devices. Therefore, in this paper we propose a load classification task as a case study to analyse which state-of-the-art NN solutions can be embedded successfully into constrained industrial devices. The presented case study is tested on a simple microcontroller characterised by very poor computational performance—i.e., low FLOPS—to faithfully mirror the design process of low-end appliances. A handful of NN models are tested, showing positive outcomes and possible limitations, and highlighting the complexity of AI embedding.
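
A concrete illustration of the kind of embedding workflow such a case study involves might look like the sketch below: a deliberately tiny Keras classifier is trained on load-signature features and converted to a quantized TensorFlow Lite model that could be compiled into microcontroller firmware. The feature size, class count, toolchain and random stand-in data are our own assumptions, not details taken from the paper.

# Minimal sketch (our assumption, not the authors' code): train a tiny load
# classifier and export a quantized TFLite model small enough for a
# microcontroller-class target.
import numpy as np
import tensorflow as tf

NUM_FEATURES, NUM_CLASSES = 32, 4                         # hypothetical sizes
x = np.random.rand(512, NUM_FEATURES).astype("float32")   # stand-in training data
y = np.random.randint(0, NUM_CLASSES, 512)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(16, activation="relu"),         # keep the network tiny
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, y, epochs=3, verbose=0)

# Post-training quantization shrinks weights/activations for flash- and
# RAM-limited devices; the resulting .tflite blob is embedded in the firmware.
def representative_data():
    for sample in x[:100]:
        yield [sample.reshape(1, -1)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
tflite_model = converter.convert()
print(f"model size: {len(tflite_model)} bytes")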

Energies ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 1862
Author(s):  
Alexandros-Georgios Chronis ◽  
Foivos Palaiogiannis ◽  
Iasonas Kouveliotis-Lysikatos ◽  
Panos Kotsampopoulos ◽  
Nikos Hatziargyriou

In this paper, we investigate the economic benefits of an energy community investing in small-scale photovoltaics (PVs) when local energy trading is operated amongst the community members. The motivation stems from the open research question of whether a community-operated local energy market can enhance the investment feasibility of behind-the-meter small-scale PVs installed by energy community members. First, we review the models, mechanisms and concepts required to frame the problem and clarify the nuances of important terms. Next, we develop a tool for investigating the economic benefits of operating a local energy market in the context of an energy community. We design the local energy market using state-of-the-art formulations, modified according to the requirements of the case study. The model is applied to an energy community that is currently under formation in a Greek municipality. From the various simulations that were conducted, we extract a series of generalizable conclusions.
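
To make the mechanism concrete, the toy calculation below settles intra-community trades at a mid-market rate between the retail price and the feed-in tariff, one of the simplest local-energy-market designs; the tariffs, member names and net positions are invented for illustration and do not reflect the paper's formulation.

# Toy illustration (our assumption, not the paper's market design): settle
# intra-community trades at a mid-market rate between retail price and
# feed-in tariff, one of the simplest local-energy-market mechanisms.
RETAIL, FEED_IN = 0.20, 0.05                  # EUR/kWh, placeholder tariffs
MID = (RETAIL + FEED_IN) / 2

# Hourly net positions of members (kWh): positive = PV surplus, negative = demand.
net = {"house_A": 3.0, "house_B": -2.0, "house_C": -4.0, "school": 2.5}

surplus = sum(v for v in net.values() if v > 0)
deficit = -sum(v for v in net.values() if v < 0)
traded = min(surplus, deficit)                # energy matched inside the community

# Residual energy is exchanged with the grid at the regulated tariffs.
cost_without_lem = deficit * RETAIL - surplus * FEED_IN
cost_with_lem = ((deficit - traded) * RETAIL      # internal trades at MID cancel
                 - (surplus - traded) * FEED_IN)  # out community-wide
print(f"community saving this hour: {cost_without_lem - cost_with_lem:.2f} EUR")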


2019 ◽  
Vol 277 ◽  
pp. 02034
Author(s):  
Sophie Aubry ◽  
Sohaib Laraba ◽  
Joëlle Tilmanne ◽  
Thierry Dutoit

In this paper a methodology to recognize actions from RGB videos is proposed, taking advantage of recent breakthroughs in deep learning. Following the development of Convolutional Neural Networks (CNNs), research was conducted on the transformation of skeletal motion data into 2D images. In this work, a solution is proposed that requires only RGB videos instead of RGB-D videos; it builds on multiple works studying the conversion of RGB-D data into 2D images. From a video stream (RGB images), a two-dimensional skeleton of 18 joints is extracted for each detected body with a DNN-based human pose estimator called OpenPose. The skeleton data are encoded into the Red, Green and Blue channels of images. Different ways of encoding motion data into images were studied. We successfully use state-of-the-art deep neural networks designed for image classification to recognize actions. Based on a study of the related works, we chose the image classification models SqueezeNet, AlexNet, DenseNet, ResNet, Inception and VGG, and retrained them to perform action recognition. All tests use the NTU RGB+D database. The highest accuracy is obtained with ResNet: 83.317% cross-subject and 88.780% cross-view, which outperforms most state-of-the-art results.
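
The sketch below shows one plausible joint-to-pixel encoding of this kind: frames become image columns, joints become rows, and the normalised x, y and confidence values fill the R, G and B channels. It illustrates the general idea rather than the exact scheme used in the paper.

# One possible skeleton-to-image encoding (a sketch of the general idea, not
# necessarily the paper's exact scheme).
import numpy as np

def skeleton_to_image(seq):
    """seq: array of shape (T, 18, 3) holding (x, y, confidence) from OpenPose."""
    xy = seq[..., :2]
    conf = seq[..., 2]
    # Normalise each coordinate axis to [0, 255] over the whole clip.
    mins = xy.reshape(-1, 2).min(axis=0)
    maxs = xy.reshape(-1, 2).max(axis=0)
    norm = (xy - mins) / np.maximum(maxs - mins, 1e-6) * 255.0
    img = np.zeros((seq.shape[1], seq.shape[0], 3), dtype=np.uint8)  # (joints, frames, 3)
    img[..., 0] = norm[..., 0].T.astype(np.uint8)    # R channel <- x
    img[..., 1] = norm[..., 1].T.astype(np.uint8)    # G channel <- y
    img[..., 2] = (conf.T * 255).astype(np.uint8)    # B channel <- confidence
    return img

clip = np.random.rand(64, 18, 3)              # fake 64-frame clip for illustration
print(skeleton_to_image(clip).shape)          # (18, 64, 3): ready for a CNN classifier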


2021 ◽  
Author(s):  
Weihao Zhuang ◽  
Tristan Hascoet ◽  
Xunquan Chen ◽  
Ryoichi Takashima ◽  
Tetsuya Takiguchi ◽  
...  

Abstract Currently, deep learning plays an indispensable role in many fields, including computer vision, natural language processing, and speech recognition. Convolutional Neural Networks (CNNs) have demonstrated excellent performance in computer vision tasks thanks to their powerful feature extraction capability. However, since larger models tend to achieve higher accuracy, recent developments have led to state-of-the-art CNN models with ever-increasing resource consumption. This paper investigates a conceptual approach to reducing the memory consumption of CNN inference. Our method processes the input image as a sequence of carefully designed tiles within the lower subnetwork of the CNN, so as to minimize its peak memory consumption while keeping the end-to-end computation unchanged. This introduces a trade-off between memory consumption and computation, which is particularly suitable for high-resolution inputs. Our experimental results show that the memory consumption of MobileNetV2 can be reduced by up to 5.3 times with our proposed method. For ResNet50, one of the most commonly used CNN models in computer vision tasks, memory consumption can be reduced by up to 2.3 times.
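
A rough sketch of the idea follows: the high-resolution lower stage runs tile by tile before the downsampled feature maps are stitched back together for the upper stage. For brevity the tiles below do not overlap, so activations at tile borders differ slightly from whole-image inference; the paper designs the tiles so that the computation stays exactly equivalent. The toy network, tile count and input size are our own placeholders.

# Tile-by-tile inference through the lower part of a CNN (simplified sketch:
# non-overlapping tiles, no halo handling).
import torch
import torch.nn as nn

lower = nn.Sequential(                        # memory-hungry high-resolution stage
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
upper = nn.Sequential(                        # runs once on the downsampled map
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

def tiled_forward(x, num_tiles=4):
    # Split the input along the height, push each tile through `lower` on its
    # own (bounding peak activation memory), then stitch the feature maps back
    # together before `upper`.
    feats = [lower(tile) for tile in torch.chunk(x, num_tiles, dim=2)]
    return upper(torch.cat(feats, dim=2))

with torch.no_grad():
    image = torch.randn(1, 3, 1024, 1024)     # high-resolution input
    print(tiled_forward(image).shape)         # torch.Size([1, 10])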


2021 ◽  
Vol 2042 (1) ◽  
pp. 012002
Author(s):  
Roberto Castello ◽  
Alina Walch ◽  
Raphaël Attias ◽  
Riccardo Cadei ◽  
Shasha Jiang ◽  
...  

Abstract The integration of solar technology in the built environment is realized mainly through rooftop-installed panels. In this paper, we leverage state-of-the-art Machine Learning and computer vision techniques applied to overhead images to geo-localize the rooftop surfaces available for solar panel installation. We further exploit a 3D building database to associate them with the corresponding roof geometries by means of a geospatial post-processing approach. The stand-alone Convolutional Neural Network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using the building database improves the rejection of false positives. The model is applied to a case study area in the canton of Geneva and the results are compared with another recent method from the literature for deriving the realistically available area.
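
For reference, the snippet below computes the two reported segmentation metrics, intersection over union and pixel accuracy, for binary rooftop masks; these are the standard definitions rather than code from the paper, and the masks are random placeholders.

# Standard definitions of the two reported metrics for binary masks.
import numpy as np

def iou_and_accuracy(pred, target):
    """pred, target: boolean arrays where True marks suitable rooftop pixels."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / union if union else 1.0     # guard against empty masks
    accuracy = (pred == target).mean()
    return iou, accuracy

pred = np.random.rand(256, 256) > 0.5         # placeholder prediction mask
target = np.random.rand(256, 256) > 0.5       # placeholder ground-truth mask
print(iou_and_accuracy(pred, target))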


2021 ◽  
pp. 42-55
Author(s):  
Shitiz Gupta ◽  
...  

Image caption generation is a stimulating multimodal task. Substantial advancements have been made in the field of deep learning, notably in computer vision and natural language processing. Yet, human-generated captions are still considered better, which makes it a challenging application for interactive machine learning. In this paper, we aim to compare different transfer learning techniques and develop a novel architecture to improve image captioning accuracy. We compute image feature vectors using different state-of-the-art transfer learning models, which are fed into an Encoder-Decoder network based on Stacked LSTMs with soft attention, along with embedded text to generate high-accuracy captions. We have compared these models on several benchmark datasets based on different evaluation metrics like BLEU and METEOR.
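
The stripped-down decoder step below illustrates the kind of architecture described: the embedding of the previously generated word is combined with a soft-attention context over CNN image features and fed to a two-layer LSTM that predicts the next word. All layer sizes, names and the vocabulary size are placeholders of ours, not the paper's configuration.

# Minimal captioning decoder step: stacked LSTM with soft attention over CNN features.
import torch
import torch.nn as nn

class AttnCaptionDecoder(nn.Module):
    def __init__(self, vocab=5000, feat_dim=2048, embed=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)
        self.attn_score = nn.Linear(feat_dim + hidden, 1)      # additive-style scorer
        self.lstm = nn.LSTM(embed + feat_dim, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, feats, prev_word, state):
        # feats: (B, N, feat_dim) spatial CNN features; prev_word: (B,) token ids.
        h_top = state[0][-1]                                   # (B, hidden)
        scores = self.attn_score(
            torch.cat([feats, h_top.unsqueeze(1).expand(-1, feats.size(1), -1)], dim=-1))
        context = (torch.softmax(scores, dim=1) * feats).sum(dim=1)   # soft attention
        x = torch.cat([self.embed(prev_word), context], dim=-1).unsqueeze(1)
        y, state = self.lstm(x, state)
        return self.out(y.squeeze(1)), state                   # next-word logits

B, N = 2, 49
decoder = AttnCaptionDecoder()
state = (torch.zeros(2, B, 512), torch.zeros(2, B, 512))
logits, state = decoder(torch.randn(B, N, 2048), torch.randint(0, 5000, (B,)), state)
print(logits.shape)                                            # torch.Size([2, 5000])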


2021 ◽  
Vol 9 ◽  
pp. 1061-1080
Author(s):  
Prakhar Ganesh ◽  
Yao Chen ◽  
Xin Lou ◽  
Mohammad Ali Khan ◽  
Yin Yang ◽  
...  

Abstract Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.
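
As a concrete taste of one family of methods such surveys cover, the snippet below implements a standard knowledge-distillation objective, where a small student is trained to match the temperature-softened predictions of a large teacher; the temperature and loss weighting are illustrative choices, and the snippet is not tied to any particular method discussed in the survey.

# Standard knowledge-distillation loss: soft-target KL term plus hard-label
# cross-entropy (illustrative hyperparameters).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 2), torch.randn(8, 2), torch.randint(0, 2, (8,)))
print(loss.item())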


2019 ◽  
Vol 9 (6) ◽  
pp. 1143 ◽  
Author(s):  
Sevinj Yolchuyeva ◽  
Géza Németh ◽  
Bálint Gyires-Tóth

Grapheme-to-phoneme (G2P) conversion is the process of generating the pronunciation of words from their written form. It plays an essential role in natural language processing, text-to-speech synthesis and automatic speech recognition systems. In this paper, we investigate convolutional neural networks (CNNs) for G2P conversion and propose a novel CNN-based sequence-to-sequence (seq2seq) architecture. Our approach includes an end-to-end CNN G2P converter with residual connections and, in addition, a model that uses a convolutional neural network (with and without residual connections) as the encoder and a Bi-LSTM as the decoder. We compare our approach with state-of-the-art methods, including Encoder-Decoder LSTM and Encoder-Decoder Bi-LSTM. Training and inference times, as well as phoneme and word error rates, were evaluated on the public CMUDict dataset for US English, and the best-performing convolutional architecture was also evaluated on the NetTalk dataset. Our method approaches the accuracy of previous state-of-the-art results in terms of phoneme error rate.
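
The fragment below sketches the convolutional encoder side of such a model: grapheme embeddings pass through a stack of 1D convolutions with residual connections, producing states that a recurrent decoder could consume. Depth, width and the grapheme inventory size are our own guesses, and the decoder is omitted.

# Convolutional G2P encoder with residual connections (decoder omitted).
import torch
import torch.nn as nn

class ConvG2PEncoder(nn.Module):
    def __init__(self, num_graphemes=30, dim=128, layers=4):
        super().__init__()
        self.embed = nn.Embedding(num_graphemes, dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1) for _ in range(layers))

    def forward(self, graphemes):
        x = self.embed(graphemes).transpose(1, 2)      # (B, dim, L) for Conv1d
        for conv in self.convs:
            x = x + torch.relu(conv(x))                # residual connection per block
        return x.transpose(1, 2)                       # (B, L, dim) encoder states

word = torch.randint(0, 30, (1, 7))                    # e.g. a 7-letter word as ids
print(ConvG2PEncoder()(word).shape)                    # torch.Size([1, 7, 128])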


2019 ◽  
Vol 25 (4) ◽  
pp. 467-482 ◽  
Author(s):  
Aarne Talman ◽  
Anssi Yli-Jyrä ◽  
Jörg Tiedemann

Abstract Sentence-level representations are necessary for various natural language processing tasks. Recurrent neural networks have proven to be very effective in learning distributed representations and can be trained efficiently on natural language inference tasks. We build on top of one such model and propose a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy and yields state-of-the-art results on the SciTail dataset as well as strong results for Stanford Natural Language Inference and Multi-Genre Natural Language Inference. We show that the sentence embeddings learned in this way can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings' ability to capture some of the important linguistic properties of sentences.
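
The basic building block of such an encoder, a bidirectional LSTM whose hidden states are max-pooled over time into a fixed-size sentence vector, can be sketched as follows; the paper stacks several such layers with iterative refinement, whereas only a single layer with placeholder sizes is shown here.

# Single BiLSTM + max-pooling sentence encoder (one layer of the hierarchy).
import torch
import torch.nn as nn

class BiLSTMMaxPool(nn.Module):
    def __init__(self, vocab=20000, embed=300, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)
        self.lstm = nn.LSTM(embed, hidden, bidirectional=True, batch_first=True)

    def forward(self, tokens):
        states, _ = self.lstm(self.embed(tokens))   # (B, L, 2*hidden)
        return states.max(dim=1).values             # max pool over the time axis

sentences = torch.randint(0, 20000, (4, 12))        # batch of 4 token-id sequences
print(BiLSTMMaxPool()(sentences).shape)             # torch.Size([4, 1024])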


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3464
Author(s):  
Huabin Diao ◽  
Yuexing Hao ◽  
Shaoyun Xu ◽  
Gongyan Li

Convolutional neural networks (CNNs) have achieved significant breakthroughs in various domains, such as natural language processing (NLP) and computer vision. However, performance improvements often come with large model sizes and computation costs, which make CNNs unsuitable for resource-constrained devices. Consequently, there is an urgent need to compress CNNs so as to reduce model size and computation costs. This paper proposes a layer-wise differentiable compression (LWDC) algorithm for compressing CNNs structurally. A differentiable selection operator OS is embedded in the model so that compression and training proceed simultaneously by gradient descent in one go. In contrast to most existing methods, which prune parameters from redundant operators, our method directly replaces the original bulky operators with more lightweight ones; it only requires specifying the set of lightweight operators and the regularization factor in advance, rather than a compression rate for each layer. The compressed model produced by our method is generic and does not need any special hardware/software support. Experimental results on CIFAR-10, CIFAR-100 and ImageNet demonstrate the effectiveness of our method. LWDC obtains more significant compression than state-of-the-art methods in most cases, while incurring lower performance degradation. The impact of the lightweight operators and the regularization factor on compression rate and accuracy is also evaluated.
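
A rough sketch of what such a differentiable selection can look like is given below: each block holds a bulky operator and a lightweight alternative, a softmax over learnable weights mixes their outputs, and a cost regularizer added to the task loss pushes the selection toward the cheaper choice. The candidate operators, the cost model and the training loop are our own simplifications and may differ from the paper's exact selection operator and schedule.

# Softmax-weighted selection between a bulky and a lightweight operator,
# trained jointly with a cost regularizer (simplified illustration).
import torch
import torch.nn as nn

class SelectableBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),                   # original
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),  # depthwise
        ])
        self.cost = torch.tensor([1.0, 1.0 / channels])   # relative FLOPs cost
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        out = sum(wi * op(x) for wi, op in zip(w, self.candidates))
        # Regularisation term pushing the selection toward cheaper operators.
        self.reg = (w * self.cost).sum()
        return out

block = SelectableBlock()
y = block(torch.randn(2, 64, 32, 32))
loss = y.pow(2).mean() + 0.1 * block.reg      # task loss + lambda * cost penalty
loss.backward()                               # alpha is trained jointly by SGD
print(block.alpha.grad)                       # selection weights receive gradients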


Author(s):  
O. H. Skurzhanskyi ◽  
A. A. Marchenko

The article is devoted to a review of conditional text generation, one of the most promising fields of natural language processing and artificial intelligence. Specifically, we explore monolingual local sequence transduction tasks: paraphrase generation, grammatical and spelling error correction, and text simplification. To give a better understanding of the considered tasks, we show examples of good rewrites. We then take a deep look at key aspects such as publicly available datasets with their splits (training, validation, and testing), quality metrics for proper evaluation, and modern solutions based primarily on neural networks. For each task, we analyze its main characteristics and how they influence the state-of-the-art models. Finally, we investigate the most significant features shared by the whole group of tasks in general and by the approaches that provide solutions for them.

