Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction

Reinforcement learning from demonstration (RLfD) is considered a promising approach to improving reinforcement learning (RL) by leveraging expert demonstrations as additional decision-making guidance. However, most existing RLfD methods regard demonstrations only as low-level knowledge instances under a certain task. Demonstrations are generally used either to provide additional rewards or to pretrain the neural network-based RL policy in a supervised manner, which usually results in poor generalization capability and weak robustness. Considering that human knowledge is not only interpretable but also suitable for generalization, we propose to exploit the potential of demonstrations by extracting knowledge from them via Bayesian networks, and we develop a novel RLfD method called Reinforcement Learning from demonstration via Bayesian Network-based Knowledge (RLBNK). The proposed RLBNK method uses the node influence with the Wasserstein distance metric (NIW) algorithm to obtain abstract concepts from demonstrations. A Bayesian network then conducts knowledge learning and inference on the abstract data set, yielding a coarse policy with a corresponding confidence. When the coarse policy’s confidence is low, an RL-based refine module further optimizes and fine-tunes the policy to form a (near) optimal hybrid policy. Experimental results show that the proposed RLBNK method improves the learning efficiency of the corresponding baseline RL algorithms under both normal and sparse reward settings. Furthermore, we demonstrate that RLBNK delivers better generalization capability and robustness than the baseline methods.
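The confidence-gated hybrid policy described in this abstract can be illustrated with a minimal Python sketch. It only shows the control flow (use the coarse action when its confidence is high, otherwise defer to a refine policy). The threshold of 0.8, the function names and the toy policies are illustrative assumptions and are not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

State = Sequence[float]
CoarsePolicy = Callable[[State], Tuple[int, float]]  # returns (action, confidence)
RefinePolicy = Callable[[State], int]                # returns action


def hybrid_policy(
    state: State,
    coarse_policy: CoarsePolicy,
    refine_policy: RefinePolicy,
    confidence_threshold: float = 0.8,  # assumed value, for illustration only
) -> int:
    """Use the coarse action if it is confident enough, else the refined one."""
    action, confidence = coarse_policy(state)
    if confidence >= confidence_threshold:
        return action
    return refine_policy(state)


def toy_coarse(state: State) -> Tuple[int, float]:
    # Hypothetical stand-in for Bayesian-network inference:
    # confident only when the first feature is far from zero.
    x = state[0]
    return (1 if x > 0 else 0), min(1.0, abs(x))


def toy_refine(state: State) -> int:
    # Hypothetical stand-in for the RL-based refine module.
    return 2


if __name__ == "__main__":
    for s in ([0.9], [0.1], [-1.5]):
        print(s, hybrid_policy(s, toy_coarse, toy_refine))
```

In this toy setup the refine policy is consulted only for the low-confidence state `[0.1]`.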

On the use of neural regulators

Transaction of Scientific Papers of the Novosibirsk State Technical University ◽

10.17212/2307-6879-2021-1-53-63 ◽

2021 ◽

pp. 53-63

Author(s):

Alexsander Voevoda ◽

Dmitry Romannikov ◽

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Control Systems ◽

State Vector ◽

Neural Controller ◽

Discrete Form ◽

Learning Methods ◽

Data Set ◽

The Neural Network ◽

Input Error

The application of neural networks to the synthesis of control systems is considered. Examples are given of control system synthesis using reinforcement learning methods in which the state vector is used. The synthesis of a neural controller for objects with an inaccessible state vector is then discussed in two variants: 1) a neural network with recurrent feedback connections; 2) a neural network fed with the input error vector, where each error (except the first) reaches the network input through a delay element. A disadvantage of the first variant is that existing reinforcement learning methods cannot be applied to such a network structure, so training requires a data set obtained, for example, from a previously calculated linear controller. The network structure used in the second variant does allow reinforcement learning methods, but the article states and proves that a neural network without recurrent connections cannot be used to synthesize a control system for objects with three or more integrators. The use of these structures is illustrated by synthesizing control systems for the objects 1/s^2 and 1/s^3 in discrete form.
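The second variant, in which delayed copies of the error are fed to the network input, can be sketched as a tapped delay line. This is a minimal illustration under assumptions: the class name, the number of taps and the zero initial state are hypothetical, and the neural network itself is omitted.

```python
from collections import deque
from typing import List


class ErrorDelayLine:
    """Keeps the current error and its delayed copies, newest first."""

    def __init__(self, n_taps: int) -> None:
        # Assumption: the delay elements start at zero.
        self._taps = deque([0.0] * n_taps, maxlen=n_taps)

    def push(self, error: float) -> List[float]:
        # The newest error enters directly; older errors shift by one delay
        # element and the oldest one is dropped.
        self._taps.appendleft(error)
        return list(self._taps)


if __name__ == "__main__":
    line = ErrorDelayLine(n_taps=3)
    for e in (1.0, 2.0, 3.0, 4.0):
        print(line.push(e))  # vector that would be fed to the controller network
```

Each returned list plays the role of the input error vector: the first element is undelayed and every following element has passed through one more delay element.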

Bayesian Network Structure and Predictability of Autistic Traits

Psychological Reports ◽

10.1177/0033294120978159 ◽

2020 ◽

pp. 003329412097815

Author(s):

Giovanni Briganti ◽

Donald R. Williams ◽

Joris Mulder ◽

Paul Linkowski

Keyword(s):

Sex Differences ◽

Network Analysis ◽

Bayesian Network ◽

Network Structure ◽

Network Connectivity ◽

Autistic Traits ◽

Data Set ◽

Conditional Dependence ◽

Bayesian Network Structure ◽

Male Subjects

The aim of this work is to explore the construct of autistic traits through the lens of network analysis with recently introduced Bayesian methods. A conditional dependence network structure was estimated from a data set of 649 university students who completed an autistic traits questionnaire. The connectedness of the network is also explored, as well as sex differences between female and male subjects with regard to network connectivity. The strongest connections in the network are found between items that measure similar autistic traits. Traits related to social skills are the most interconnected items in the network. Sex differences are found between female and male subjects. The Bayesian network analysis offers new insight into the connectivity of autistic traits and confirms several findings in the autism literature.

Watermelon Sorting Process by Frequency Identification and Artificial Neural Network

Applied Science and Engineering Progress ◽

10.14416/j.asep.2021.12.004 ◽

2021 ◽

Author(s):

Komsan Wongkalasin ◽

Teerapon Upachaban ◽

Wacharawish Daosawang ◽

Nattadon Pannucharoenwong ◽

Phadungsak Ratanadecho

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Selection Process ◽

Network Processor ◽

Data Set ◽

The Neural Network ◽

Frequency Identification ◽

Sorting Process ◽

Artificial Neural ◽

Reasonable Prediction

This research aims to enhance the watermelon quality selection process, which was traditionally conducted by knocking on the fruit and sorting by the character of the sound. The proposed method passes a sound spectrum through the watermelon and then analyzes the frequency and amplitude of the response signal with the Fast Fourier Transform (FFT). The obtained data were then used to train and verify the neural network processor. The results show that the frequencies of 129 and 172 Hz were suitable for the comparison. Thirty watermelons, randomly selected from the orchard, were used to create a data set and were then cut open to check manually and match against the fruits’ quality. At 129 Hz, the response was 13.57 or above in three quality groups: not fully ripened, fully ripened, and close-to-rotten watermelons. At 172 Hz, the response was 11.11–12.72 in not fully ripened watermelons and 13.00 or more in close-to-rotten and hollow watermelons. The response was then used as a training condition for the artificial neural network processor of the sorting machine prototype. The verification results provided a reasonable prediction of the ripeness level of the watermelons, and the prototype can serve as a pilot for a modern watermelon quality selection tool that could enhance the competitiveness of local farmers in product quality control.
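Reading the amplitude of a response signal at the two frequencies named above can be sketched with NumPy's real FFT. This is only an illustration: the sample rate, the synthetic test signal and the function name are assumptions, and the amplitude scaling shown here is not necessarily the one behind the response values reported in the abstract.

```python
import numpy as np


def amplitude_at(signal: np.ndarray, sample_rate: float, freqs_hz) -> dict:
    """Return the single-sided FFT amplitude at the bin nearest each frequency."""
    n = len(signal)
    # Scale so that a pure sine of amplitude A shows up as roughly A.
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n
    bins = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return {f: float(spectrum[np.argmin(np.abs(bins - f))]) for f in freqs_hz}


if __name__ == "__main__":
    sample_rate = 1000.0                  # assumed sampling rate in Hz
    t = np.arange(1000) / sample_rate     # one second of samples
    # Synthetic stand-in for a measured response signal.
    response = 3.0 * np.sin(2 * np.pi * 129 * t) + 1.5 * np.sin(2 * np.pi * 172 * t)
    print(amplitude_at(response, sample_rate, (129, 172)))
```

With one second of data the FFT bins fall on whole hertz, so 129 Hz and 172 Hz each land exactly on a bin; a real recording would generally need windowing to limit spectral leakage.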

A Model on the Correlation between Composition and Mechanical Properties of Mg-Al-Zn Alloys by Using Artificial Neural Network

Materials Science Forum ◽

10.4028/www.scientific.net/msf.488-489.793 ◽

2005 ◽

Vol 488-489 ◽

pp. 793-796 ◽

Cited By ~ 1

Author(s):

Hai Ding Liu ◽

Ai Tao Tang ◽

Fu Sheng Pan ◽

Ru Lin Zuo ◽

Ling Yun Wang

Keyword(s):

Neural Network ◽

Mechanical Properties ◽

Artificial Neural Network ◽

Magnesium Alloys ◽

Tensile Yield Strength ◽

Data Set ◽

The Neural Network ◽

Artificial Neural ◽

Prediction Of Mechanical Properties ◽

Artificial Neural Network Ann

A model was developed for the analysis and prediction of the correlation between composition and mechanical properties of Mg-Al-Zn (AZ) magnesium alloys by applying an artificial neural network (ANN). The input parameters of the neural network (NN) are the alloy composition. The outputs of the NN model are important mechanical properties, including ultimate tensile strength, tensile yield strength and elongation. The model is based on a multilayer feedforward neural network. The NN was trained with a comprehensive data set collected from domestic and foreign literature. A very good performance of the neural network was achieved. The model can be used for the simulation and prediction of mechanical properties of AZ system magnesium alloys as functions of composition.

Key Parts of Transmission Line Detection Using Improved YOLO v3

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/6/1 ◽

2021 ◽

Vol 18 (6) ◽

Author(s):

Tu Renwei ◽

Zhu Zhongjie ◽

Bai Yongqiang ◽

Gao Ming ◽

Ge Zhifeng

Keyword(s):

Transmission Line ◽

Detection Accuracy ◽

Data Sets ◽

Feature Maps ◽

Data Set ◽

Detection Model ◽

The Neural Network ◽

Low Efficiency ◽

The One ◽

Detection Speed

Unmanned Aerial Vehicle (UAV) inspection has become one of the main methods for transmission line inspection, but it still has shortcomings such as slow detection speed, low efficiency, and poor suitability for low-light environments. To address these issues, this paper proposes a deep learning detection model based on You Only Look Once (YOLO) v3. On the one hand, the neural network structure is simplified: the three feature maps of YOLO v3 are pruned to two to meet the specific detection requirements. Meanwhile, the K-means++ clustering method is used to calculate the anchor values for the data set to improve detection accuracy. On the other hand, 1000 sets of power tower and insulator data are collected, inverted and scaled to expand the data set, and further optimized by adding different illumination conditions and viewing angles. The experimental results show that the model using the improved YOLO v3 improves detection accuracy by 6.0%, flops by 8.4%, and detection speed by about 6.0%.
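Anchor computation with K-means++ can be sketched as follows. The abstract names only the clustering method, so the details here are assumptions: boxes are (width, height) pairs, the distance is 1 - IoU (a common choice for YOLO anchors), and the function names and sample boxes are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Box = Tuple[float, float]  # (width, height)


def iou_wh(a: Box, b: Box) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_anchors(boxes: Sequence[Box], k: int, iters: int = 20, seed: int = 0) -> List[Box]:
    rng = random.Random(seed)
    # K-means++ seeding: each new centre is drawn with probability
    # proportional to the squared distance to the nearest existing centre.
    # (Assumes the boxes are not all identical, so some weight is non-zero.)
    centers = [rng.choice(list(boxes))]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(list(boxes), weights=d2, k=1)[0])
    # Standard Lloyd iterations: assign to the best-overlapping centre, then average.
    for _ in range(iters):
        clusters: List[List[Box]] = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            clusters[best].append(b)
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)


if __name__ == "__main__":
    sample = [(10, 10), (11, 10), (10, 11), (100, 50), (102, 51), (98, 49)]
    print(kmeanspp_anchors(sample, k=2))
```

On the six sample boxes the two anchors settle on the mean of the small boxes and the mean of the large boxes.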

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/478 ◽

2018 ◽

Cited By ~ 9

Author(s):

Guiliang Liu ◽

Oliver Schulte

Keyword(s):

Reinforcement Learning ◽

Empirical Evaluation ◽

Professional Sports ◽

Ice Hockey ◽

New Approach ◽

The Neural Network ◽

Overall Performance ◽

Game Context ◽

Player Performance ◽

Q Function

A variety of machine learning models have been proposed to assess the performance of players in professional sports. However, they have only a limited ability to model how player performance depends on the game context. This paper proposes a new approach to capturing game context: we apply Deep Reinforcement Learning (DRL) to learn an action-value Q function from 3M play-by-play events in the National Hockey League (NHL). The neural network representation integrates both continuous context signals and game history, using a possession-based LSTM. The learned Q-function is used to value players' actions under different game contexts. To assess a player's overall performance, we introduce a novel Game Impact Metric (GIM) that aggregates the values of the player's actions. Empirical evaluation shows that GIM is consistent throughout a play season and correlates highly with standard success measures and future salary.

Bootstrapping a Neural Morphological Generator from Morphological Analyzer Output for Inuktitut

10.33011/computel.v2i.455 ◽

2019 ◽

Vol 2 (1) ◽

Author(s):

Jeffrey Micher

Keyword(s):

Neural Network ◽

Training Data ◽

Data Set ◽

Set Size ◽

The Neural Network ◽

Surface Character ◽

Finite State ◽

Character Sequences ◽

Finite State Transducer

We present a method for building a morphological generator from the output of an existing analyzer for Inuktitut, in the absence of a two-way finite state transducer which would normally provide this functionality. We make use of a sequence to sequence neural network which “translates” underlying Inuktitut morpheme sequences into surface character sequences. The neural network uses only the previous and the following morphemes as context. We report a morpheme accuracy of approximately 86%. We are able to increase this accuracy slightly by passing deep morphemes directly to output for unknown morphemes. We do not see significant improvement when increasing training data set size, and postulate possible causes for this.

NEURAL NETWORK ANALYSIS FOR TUMOR INVESTIGATION AND CANCER PREDICTION

Journal of Electronics and Informatics - September 2019 ◽

10.36548/jei.2019.2.004 ◽

2019 ◽

Vol 2019 (02) ◽

pp. 89-98

Author(s):

Vijayakumar T

Keyword(s):

Neural Network ◽

Neural Networks ◽

Early Stage ◽

Original Data ◽

Data Set ◽

Cancer Prediction ◽

The Neural Network ◽

Feed Forward Neural Networks ◽

The Neural Networks ◽

Human Nervous System

Predicting the category of a tumor and the type of cancer at an early stage remains essential for identifying the depth of the disease and the treatment available for it. Neural networks, which function similarly to the human nervous system, are widely used in tumor investigation and cancer prediction. The paper analyzes the performance of neural networks such as the FNN (Feed Forward Neural Networks), RNN (Recurrent Neural Networks) and CNN (Convolutional Neural Network) in investigating tumors and predicting cancer. The results obtained by evaluating the neural networks on the breast cancer Wisconsin original data set show that the CNN provides 43% better prediction than the FNN and 25% better prediction than the RNN.

CARINA data synthesis project: pH data scale unification and cruise adjustments

Earth System Science Data Discussions ◽

10.5194/essdd-2-421-2009 ◽

2009 ◽

Vol 2 (1) ◽

pp. 421-475 ◽

Cited By ~ 10

Author(s):

A. Velo ◽

F. F. Pérez ◽

X. Lin ◽

R. M. Key ◽

T. Tanhua ◽

...

Keyword(s):

Quality Control ◽

Southern Ocean ◽

Data Sets ◽

Data Set ◽

Crossover Analysis ◽

Abstract Data ◽

Data Files ◽

Systematic Biases ◽

Oceanic Carbon ◽

Inversion Analysis

Abstract. Data on carbon and carbon-relevant hydrographic and hydrochemical parameters from previously non-publicly available cruise data sets in the Arctic Mediterranean Seas (AMS), Atlantic and Southern Ocean have been retrieved and merged into a new database: CARINA (CARbon IN the Atlantic). These data have gone through rigorous quality control (QC) procedures to assure the highest possible quality and consistency. The data for most of the measured parameters in the CARINA database were objectively examined in order to quantify systematic differences in the reported values, i.e. secondary quality control. Systematic biases found in the data have been corrected in the data products, i.e. three merged data files with measured, calculated and interpolated data for each of the three CARINA regions: AMS, Atlantic and Southern Ocean. Out of a total of 188 cruise entries in the CARINA database, 59 reported measured pH values. Here we present details of the secondary QC on pH for the CARINA database. Procedures of quality control, including crossover analysis between cruises and inversion analysis of all crossover data, are briefly described. Adjustments were applied to the pH values for 21 of the cruises in the CARINA data set. With these adjustments the CARINA database is consistent both internally and with GLODAP data, an oceanographic data set based on the World Hydrographic Program in the 1990s. Based on our analysis we estimate the internal accuracy of the CARINA pH data to be 0.005 pH units. The CARINA data are now suitable for accurate assessments of, for example, oceanic carbon inventories and uptake rates, and for model validation.

Tunicate swarm algorithm-trained multi-layered perceptron for data centre energy demand forecasting and relative percentage contribution analysis of input parameters

Journal of Engineering Design and Technology ◽

10.1108/jedt-10-2020-0436 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Oluwafemi Ajayi ◽

Reolyn Heymann

Keyword(s):

Neural Network ◽

Energy Management ◽

Energy Demand ◽

Mean Squared Error ◽

Data Set ◽

Content Type ◽

Demand Pattern ◽

The Neural Network ◽

Input Parameters ◽

Demand Profile

Purpose: Energy management is critical to data centres (DCs), chiefly because they are high energy-consuming facilities and demand for their services continues to rise with rapidly increasing global demand for cloud and other technological services. This projected sectoral growth is expected to translate into increased energy demand from a sector that is already considered a major energy consumer, unless innovative steps are taken to drive effective energy management systems. The purpose of this study is to provide insights into the expected energy demand of the DC and the impact each measured parameter has on the building's energy demand profile. This serves as a basis for the design of an effective energy management system.

Design/methodology/approach: This study proposes a novel tunicate swarm algorithm (TSA) for training an artificial neural network model used for predicting the energy demand of a DC. The objective is to find the optimal weights and biases of the model while avoiding challenges commonly faced when using the backpropagation algorithm. The model implementation is based on historical energy consumption data of an anonymous DC operator in Cape Town, South Africa. The data set provided consists of variables such as ambient temperature, ambient relative humidity, chiller output temperature and computer room air conditioning air supply temperature, which serve as inputs to the neural network that is designed to predict the DC’s hourly energy consumption for July 2020. Upon preprocessing of the data set, total sample number for each represented variable was 464. The 80:20 splitting ratio was used to divide the data set into training and testing sets respectively, making 452 samples for the training set and 112 samples for the testing set. A weights-based approach has also been used to analyze the relative impact of the model’s input parameters on the DC’s energy demand pattern.

Findings: The performance of the proposed model has been compared with that of neural network models trained using state-of-the-art algorithms such as moth flame optimization, whale optimization algorithm and ant lion optimizer. The analysis found that the proposed TSA outperformed the other methods in training the model based on their mean squared error, root mean squared error, mean absolute error, mean absolute percentage error and prediction accuracy. Analyzing the relative percentage contribution of the model's input parameters based on the weights of the neural network also shows that the ambient temperature of the DC has the highest impact on the building’s energy demand pattern.

Research limitations/implications: The proposed novel model can be applied to solving other complex engineering problems such as regression and classification. The methodology for optimizing the multi-layered perceptron neural network can also be applied to other forms of neural networks for improved performance.

Practical implications: Based on the forecasted energy demand of the DC and an understanding of how the input parameters impact the building's energy demand pattern, neural networks can be deployed to optimize the cooling systems of the DC for reduced energy cost.

Originality/value: The use of TSA for optimizing the weights and biases of a neural network is a novel study. The application context of this study, DCs, is largely untapped in the literature, leaving many gaps for further research. The proposed prediction model can be further applied to other regression and classification tasks. Another contribution of this study is the analysis of the neural network's input parameters, which provides insight into the degree to which each parameter influences the DC’s energy demand profile.
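The four error measures named in the findings can be computed as in the short Python sketch below. The function name and the sample values are illustrative assumptions and are not data from the study. The percentage error assumes that no actual value is zero.

```python
import math
from typing import Dict, Sequence


def regression_errors(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Mean squared, root mean squared, mean absolute and mean absolute percentage error."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is expressed in percent and is undefined when an actual value is 0.
    mape = 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape}


if __name__ == "__main__":
    # Hypothetical hourly energy readings and forecasts.
    print(regression_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0]))
```

For the three sample points the residuals are -10, 10 and 0, which gives an MSE of 200/3, an MAE of 20/3 and a MAPE of 5%.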
