Categorization of material quality using a model-free reinforcement learning algorithm

2021 ◽  
Vol 2107 (1) ◽  
pp. 012027
Author(s):  
Annapoorni Mani ◽  
Shahriman Abu Bakar ◽  
Pranesh Krishnan ◽  
Sazali Yaacob

Abstract Reinforcement learning is among the most preferred algorithms for optimization problems in industrial automation. Model-free reinforcement learning algorithms optimize for rewards without knowledge of the environment dynamics and require less computation. Regulating the quality of the raw materials in the inbound inventory can improve the manufacturing process. In this paper, the raw materials arriving at the incoming inspection process are categorized and labeled based on their quality, as reflected in the path they travel. A model-free temporal difference learning approach is used to predict the acceptance and rejection paths of raw materials in the incoming inspection process. The algorithm identified eight paths that the raw materials could travel: four correspond to material acceptance, while the rest lead to material rejection. The materials are annotated using the total scores acquired in the incoming inspection process. Materials traveling on the ideal path (path A) receive the highest total score; among the remaining accepted materials, path B scores 7.37% lower, while paths C and D score 37.28% and 42.44% lower than the ideal path, respectively.
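
The abstract does not give implementation details, but a tabular TD(0) value prediction of the kind it names can be sketched as follows; the inspection stages, terminal rewards, and episode generator below are illustrative assumptions, not the authors' actual setup:

```python
import random

# Hypothetical inspection stages; the real paper's state space is not specified.
# Each episode walks a raw material through checks until it is accepted or rejected.
STATES = ["arrival", "visual_check", "dimension_check", "accept", "reject"]
TERMINAL = {"accept": 1.0, "reject": -1.0}   # assumed terminal rewards

ALPHA, GAMMA = 0.1, 0.9
V = {s: 0.0 for s in STATES}

def sample_episode():
    """Illustrative random walk through the inspection process."""
    path = ["arrival", "visual_check", "dimension_check"]
    path.append("accept" if random.random() < 0.7 else "reject")
    return path

for _ in range(5000):
    episode = sample_episode()
    for s, s_next in zip(episode, episode[1:]):
        r = TERMINAL.get(s_next, 0.0)                  # reward on reaching a terminal state
        v_next = 0.0 if s_next in TERMINAL else V[s_next]
        # TD(0) update: move V(s) toward the bootstrapped target r + gamma * V(s')
        V[s] += ALPHA * (r + GAMMA * v_next - V[s])

print({s: round(v, 3) for s, v in V.items()})
```

The learned state values then rank the stages by expected outcome, which is the sense in which a TD predictor can separate acceptance paths from rejection paths.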

2021 ◽  
Vol 2107 (1) ◽  
pp. 012026
Author(s):  
Annapoorni Mani ◽  
Shahriman Abu Bakar ◽  
Pranesh Krishnan ◽  
Sazali Yaacob

Abstract Reinforcement learning is one of the promising approaches for operations research problems. The incoming inspection process in any manufacturing plant aims to control quality, reduce manufacturing costs, eliminate scrap, and avoid process failure downtimes due to non-conforming raw materials. Predicting the raw material acceptance rate can guide raw material supplier selection and improve the manufacturing process by filtering out non-conformities. This paper presents a Markov model developed to estimate the probability of a raw material being accepted or rejected in an incoming inspection environment. The proposed forecasting model is further optimized for efficiency using two reinforcement learning algorithms (dynamic programming and temporal differencing). The results of the two optimized models are compared, and the findings are discussed.
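
As a rough illustration of how the two optimization routes differ, the sketch below estimates an acceptance probability on a toy absorbing Markov chain, first by a dynamic-programming fixed-point sweep and then by temporal-difference sampling; the chain and its transition probabilities are assumptions, not the paper's fitted model:

```python
import random

# Illustrative inspection chain with absorbing accept/reject states;
# the transition probabilities below are assumed, not the paper's values.
P = {
    "inspect": {"rework": 0.2, "accept": 0.6, "reject": 0.2},
    "rework":  {"inspect": 0.5, "reject": 0.5},
}

# --- Dynamic programming: iterate p(s) = sum_s' P(s, s') * p(s') to a fixed point
p = {"inspect": 0.0, "rework": 0.0, "accept": 1.0, "reject": 0.0}
for _ in range(100):
    for s, trans in P.items():
        p[s] = sum(prob * p[s2] for s2, prob in trans.items())
print("DP acceptance probability from 'inspect':", round(p["inspect"], 4))

# --- Temporal differencing: sample transitions and bootstrap the same quantity
V = {"inspect": 0.0, "rework": 0.0, "accept": 1.0, "reject": 0.0}
ALPHA = 0.05
for _ in range(20000):
    s = "inspect"
    while s in P:
        s2 = random.choices(list(P[s]), weights=list(P[s].values()))[0]
        V[s] += ALPHA * (V[s2] - V[s])   # reward 0, gamma 1: value = P(accept)
        s = s2
print("TD estimate from 'inspect':", round(V["inspect"], 4))
```

Both estimates converge to the same probability (2/3 here); DP needs the transition model, while TD needs only sampled transitions.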


2005 ◽  
Vol 24 ◽  
pp. 81-108 ◽  
Author(s):  
P. Geibel ◽  
F. Wysotzki

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are states that are undesirable or dangerous to enter. We define the risk with respect to a policy as the probability of entering such a state when the policy is pursued. We consider the problem of finding good policies whose risk is smaller than some user-specified threshold, and formalize it as a constrained MDP with two criteria. The first criterion corresponds to the value function originally given. We show that the risk can be formulated as a second criterion function based on a cumulative return, whose definition is independent of the original value function. We present a model-free, heuristic reinforcement learning algorithm that aims at finding good deterministic policies. It is based on weighting the original value function and the risk. The weight parameter is adapted in order to find a feasible solution for the constrained problem that performs well with respect to the value function. The algorithm was successfully applied to the control of a feed tank with stochastic inflows that lies upstream of a distillation column. This control task was originally formulated as an optimal control problem with chance constraints, and it was solved under certain assumptions on the model to obtain an optimal solution. The power of our learning algorithm is that it can be used even when some of these restrictive assumptions are relaxed.
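
The abstract describes the algorithm only at a high level; one loose reading of "weighting the original value function and the risk" is sketched below, with separate value and risk estimates and a crude adaptation of the weight. The toy dynamics, threshold, and schedules are placeholders, not the authors' feed-tank task:

```python
import random
from collections import defaultdict

ACTIONS = ["a0", "a1"]
Q = defaultdict(float)          # expected return estimate
R = defaultdict(float)          # risk estimate: cumulative 0/1 "error entered" signal
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
xi = 1.0                        # weight on risk, adapted toward the risk threshold
RISK_THRESHOLD = 0.08           # assumed user-specified bound

def greedy(s):
    # Act greedily on the weighted combination Q(s,a) - xi * R(s,a)
    return max(ACTIONS, key=lambda a: Q[(s, a)] - xi * R[(s, a)])

def step(s, a):
    """Placeholder dynamics: (next_state, reward, entered_error, done)."""
    if random.random() < (0.05 if a == "a0" else 0.15):
        return s, 0.0, True, True            # entered an error state
    if random.random() < 0.2:
        return s, 1.0, False, True           # safe termination with reward
    return s, 0.1, False, False

for _ in range(10000):
    s, done = "s0", False
    while not done:
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r, err, done = step(s, a)
        q_next = 0.0 if done else max(Q[(s2, b)] for b in ACTIONS)
        r_next = 0.0 if done else R[(s2, greedy(s2))]
        Q[(s, a)] += ALPHA * (r + GAMMA * q_next - Q[(s, a)])
        R[(s, a)] += ALPHA * ((1.0 if err else 0.0) + r_next - R[(s, a)])
        s = s2
    # Crude weight adaptation: raise xi while the learned policy's risk
    # exceeds the threshold, lower it otherwise to recover value performance.
    rho = R[("s0", greedy("s0"))]
    xi = max(0.0, xi + (0.01 if rho > RISK_THRESHOLD else -0.01))
```

Note that the risk is learned as its own cumulative return (undiscounted 0/1 signal), independent of the reward, matching the paper's two-criteria formulation.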


Author(s):  
T.S. Morozova

A study into the failure causes of mixing and charging equipment confirms that the main factor affecting the probability of accidents is the use of raw materials that do not meet specifications and have unstable properties. The raw materials used to prepare explosives for mechanized charging of boreholes include components such as ammonium nitrate, emulsion phase, diesel fuel, and emulsifier, among others. The paper describes the application of various formulations with these components in specific types of mixing and charging machines manufactured by AZOTTECH LLC. The main properties that affect the quality of raw materials are summarised, and the incoming inspection of explosive components is described as part of the acceptance procedure at temporary storage sites at a hazardous production facility. The paper describes common types of equipment failures and maintenance procedures when substandard raw materials are used. The conclusion highlights the key practices for improving equipment uptime, as well as recommendations for incoming inspection and the use of high-quality explosive components.


2015 ◽  
Vol 12 (03) ◽  
pp. 1550028 ◽  
Author(s):  
Rok Vuga ◽  
Bojan Nemec ◽  
Aleš Ude

In this paper, we propose an integrated policy learning framework that fuses iterative learning control (ILC) and reinforcement learning. Integration is accomplished at the exploration level of the reinforcement learning algorithm. The proposed algorithm combines the fast convergence of iterative learning control with the robustness of reinforcement learning. This way, the advantages of both approaches are retained while their respective limitations are overcome. The proposed approach was verified in simulation and in real robot experiments on three challenging motion optimization problems.
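
The fusion mechanism is only summarized in the abstract, so the following is a speculative sketch of one way ILC and episodic RL exploration could be combined: the ILC correction proposes the next command, and reward-weighted exploration perturbs around it. The toy plant, gains, and cost are illustrative assumptions, not the paper's robot task:

```python
import numpy as np

T = 50
target = np.sin(np.linspace(0, np.pi, T))   # desired trajectory
u = np.zeros(T)                             # feedforward command (the policy)
L_GAIN, SIGMA, N_ROLLOUTS = 0.5, 0.05, 8

def plant(u):
    """Toy linear plant with a fixed disturbance (stand-in for the real system)."""
    return 0.9 * u + 0.05

for _ in range(100):
    error = target - plant(u)
    u_ilc = u + L_GAIN * error              # classic ILC update: u_{k+1} = u_k + L * e_k
    # RL-style exploration *around the ILC proposal*: sample perturbed rollouts
    candidates = [u_ilc + SIGMA * np.random.randn(T) for _ in range(N_ROLLOUTS)]
    costs = [np.sum((target - plant(c)) ** 2) for c in candidates]
    # Reward-weighted averaging over the perturbed rollouts (PI2-flavoured)
    w = np.exp(-np.array(costs) / (np.mean(costs) + 1e-9))
    w /= w.sum()
    u = sum(wi * ci for wi, ci in zip(w, candidates))

print("final tracking cost:", np.sum((target - plant(u)) ** 2))
```

The ILC term drives fast convergence when the error signal is informative, while the stochastic rollouts keep the search robust when it is not, which is the trade-off the abstract describes.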


Radiocarbon ◽  
2020 ◽  
pp. 1-8
Author(s):  
Mariaelena Fedi ◽  
Serena Barone ◽  
Luca Carraresi ◽  
Simona Dominici ◽  
Lucia Liccioli

ABSTRACT When dating documents by radiocarbon (14C), what we typically measure is the 14C concentration of the support (e.g., paper, parchment, or papyrus). This can, however, lead to a possible misinterpretation of results, because the support may be older than the writing itself. To minimize such ambiguity, the ideal approach would be direct dating of the ink or color (if organic). Here we propose a feasibility study to date carbon-based black inks deposited on papyrus, one of the most widespread writing supports used in the past. We prepared test samples using a commercial papyrus and a homemade black ink obtained by combining modern charcoal fragments and gum Arabic. Even though the ink binder might have been the best candidate for dating, we verified by FTIR that the molecular composition of its soluble fraction is very similar to that of the papyrus extractives, thus identifying the residual charcoal recovered after extraction as the most suitable material for the measurement. Enough charcoal material was extracted from the test samples and processed using our new setup optimized for microgram-size samples. The overall experimental procedure was found to be reproducible, and the measured 14C concentrations were consistent with the data obtained from larger samples and raw materials.


Author(s):  
László Orgován ◽  
Tamás Bécsi ◽  
Szilárd Aradi

Autonomous vehicles, or self-driving cars, are prevalent nowadays; many vehicle manufacturers and other tech companies are trying to develop them. One major goal of self-driving algorithms is to perform manoeuvres safely, even when some anomaly arises. To solve such complex issues, Artificial Intelligence and Machine Learning methods are used. One such motion planning problem occurs when the tires lose their grip on the road; an autonomous vehicle should be able to handle this situation. This paper therefore presents an autonomous drifting algorithm using Reinforcement Learning. The algorithm is based on a model-free learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3). The model is trained on six different tracks in CARLA, a simulator developed specifically for autonomous driving systems.
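
TD3 itself is a published algorithm, so its core update step can be sketched; everything environment-specific below (network sizes, observation/action dimensions, action bounds, the replay batch) is an illustrative assumption rather than the paper's CARLA setup:

```python
import torch
import torch.nn as nn

obs_dim, act_dim, act_limit = 8, 2, 1.0     # assumed dimensions

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actor, actor_targ = mlp(obs_dim, act_dim), mlp(obs_dim, act_dim)
q1, q2 = mlp(obs_dim + act_dim, 1), mlp(obs_dim + act_dim, 1)
q1_targ, q2_targ = mlp(obs_dim + act_dim, 1), mlp(obs_dim + act_dim, 1)
for targ, src in [(actor_targ, actor), (q1_targ, q1), (q2_targ, q2)]:
    targ.load_state_dict(src.state_dict())

pi_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
q_opt = torch.optim.Adam(list(q1.parameters()) + list(q2.parameters()), lr=1e-3)
GAMMA, TAU, NOISE, CLIP, DELAY = 0.99, 0.005, 0.2, 0.5, 2

def td3_update(batch, step):
    o, a, r, o2, done = batch               # tensors sampled from a replay buffer
    with torch.no_grad():
        # Target policy smoothing: clipped noise on the target action
        eps = (torch.randn_like(a) * NOISE).clamp(-CLIP, CLIP)
        a2 = (torch.tanh(actor_targ(o2)) * act_limit + eps).clamp(-act_limit, act_limit)
        # Clipped double-Q target: minimum of the twin target critics
        q_in = torch.cat([o2, a2], 1)
        q_targ = torch.min(q1_targ(q_in), q2_targ(q_in)).squeeze(-1)
        backup = r + GAMMA * (1 - done) * q_targ
    q_loss = ((q1(torch.cat([o, a], 1)).squeeze(-1) - backup) ** 2).mean() + \
             ((q2(torch.cat([o, a], 1)).squeeze(-1) - backup) ** 2).mean()
    q_opt.zero_grad(); q_loss.backward(); q_opt.step()
    if step % DELAY == 0:                   # delayed actor and target-network updates
        pi = torch.tanh(actor(o)) * act_limit
        pi_loss = -q1(torch.cat([o, pi], 1)).mean()
        pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()
        with torch.no_grad():
            for targ, src in [(actor_targ, actor), (q1_targ, q1), (q2_targ, q2)]:
                for p_t, p in zip(targ.parameters(), src.parameters()):
                    p_t.mul_(1 - TAU).add_(TAU * p)     # Polyak averaging
```

The three TD3 ingredients (twin critics, target smoothing, delayed updates) all counteract Q-value overestimation, which matters in a high-speed drifting task where optimistic values would push the policy past the grip limit.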


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Qi Yongqiang ◽  
Yang Hailan ◽  
Rong Dan ◽  
Ke Yi ◽  
Lu Dongchen ◽  
...  

This paper proposes a goal-directed locomotion method for a snake-shaped robot in a 3D complex environment based on path-integral reinforcement learning. The method uses a model-free online Q-learning algorithm to evaluate action strategies and optimizes decision-making through repeated "exploration-learning-utilization" cycles, enabling the snake-shaped robot to complete goal-directed locomotion in the 3D complex environment. The appropriate locomotion control parameters, such as joint angles and screw-drive velocities, can be learned by path-integral reinforcement learning, and the learned parameters are successfully transferred to the snake-shaped robot. Simulation results show that the planned path avoids all obstacles and reaches the destination smoothly and swiftly.
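
The abstract names model-free online Q-learning and an "exploration-learning-utilization" cycle; the sketch below shows that cycle on a toy 2D grid with obstacles. The grid, rewards, and epsilon-greedy schedule are assumptions standing in for the snake robot's actual 3D control parameters:

```python
import random

SIZE, GOAL = 5, (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}        # assumed obstacle layout
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
Q = {((x, y), a): 0.0 for x in range(SIZE) for y in range(SIZE) for a in ACTIONS}
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

def step(s, a):
    dx, dy = ACTIONS[a]
    s2 = (min(max(s[0] + dx, 0), SIZE - 1), min(max(s[1] + dy, 0), SIZE - 1))
    if s2 in OBSTACLES:
        return s, -1.0, False               # bump into obstacle: stay put, penalty
    return s2, (10.0 if s2 == GOAL else -0.1), s2 == GOAL

for _ in range(3000):
    s, done = (0, 0), False
    while not done:
        # exploration vs. utilization: epsilon-greedy action selection
        a = random.choice(list(ACTIONS)) if random.random() < EPS \
            else max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        # learning: off-policy Q-learning update toward the greedy target
        target = r + GAMMA * (0.0 if done else max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2
```

After training, following the greedy action from each cell traces an obstacle-free path to the goal, the discrete analogue of the smooth planned path reported in the simulation results.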

