Classification and Prediction with Neural Networks

Author(s):  
Arnošt Veselý

This chapter deals with applications of artificial neural networks to classification and regression problems. Based on theoretical analysis, it demonstrates that in classification problems one should use the cross-entropy error function rather than the usual sum-of-squares error function. Minimizing the cross-entropy error function by gradient descent leads to the well-known backpropagation-of-error scheme for gradient calculation, provided that neurons with logistic or softmax output functions are used in the output layer of the network. The author believes that understanding the underlying theory presented in this chapter will help researchers in medical informatics choose more suitable network architectures for medical applications and carry out network training more effectively.
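The link between the cross-entropy error and softmax outputs can be checked numerically. The following is a minimal sketch, not code from the chapter: it assumes a single training example with a one-hot target, and the function names and the example logits are illustrative. It compares the closed-form output-layer gradient (softmax output minus target) with a finite-difference estimate.

```python
import math

def softmax(z):
    # Subtract the maximum logit for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(z, y):
    # Cross-entropy between the one-hot target y and softmax(z).
    p = softmax(z)
    return -sum(t * math.log(q) for t, q in zip(y, p))

def grad_logits(z, y):
    # Closed-form gradient with respect to the logits: p - y.
    p = softmax(z)
    return [q - t for q, t in zip(p, y)]

def numeric_grad(z, y, eps=1e-6):
    # Central finite differences, used only as an independent check.
    g = []
    for i in range(len(z)):
        zp = list(z)
        zm = list(z)
        zp[i] += eps
        zm[i] -= eps
        g.append((cross_entropy(zp, y) - cross_entropy(zm, y)) / (2 * eps))
    return g

z = [2.0, -1.0, 0.5]   # illustrative logits (assumption)
y = [0.0, 0.0, 1.0]    # one-hot target (assumption)
analytic = grad_logits(z, y)
numeric = numeric_grad(z, y)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
```

The simple form of this output-layer error term is what backpropagation then propagates through the earlier layers.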

2016 ◽  
Vol 8 (2) ◽  
pp. 85-98
Author(s):  
Yanqing Wen ◽  
Jian Wang ◽  
Bingjia Huang ◽  
Jacek M. Zurada

The iterative inversion of neural networks has been used to solve adaptive control problems because of its good information-processing performance. In this paper, an iterative inversion neural network with an L2 penalty term is presented and trained with the classical gradient descent method. We focus mainly on the theoretical analysis of the proposed algorithm: monotonicity of the error function, boundedness of the input sequences, and weak and strong convergence. Regarding boundedness, we rigorously prove that the feasible input solutions are restricted to a measurable field. Weak convergence means that the gradient of the error function with respect to the input tends to zero as the number of iterations goes to infinity, while strong convergence means that the iterative sequence of input vectors converges to a fixed optimal point.
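As a rough illustration of iterative inversion with an L2 penalty, the sketch below keeps the weights of a tiny network fixed and runs gradient descent on the input instead. It is not the authors' algorithm: the single logistic neuron, its weights, the target output, the penalty coefficient and the step size are all assumptions chosen for the example.

```python
import math

# Hypothetical fixed "network": one logistic neuron with frozen weights.
W = [1.5, -2.0]
B = 0.0
TARGET = 0.8   # desired network output (assumption)
LAM = 0.01     # L2 penalty coefficient on the input (assumption)
ETA = 0.5      # step size (assumption)

def forward(x):
    u = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-u))

def error(x):
    # Squared output error plus L2 penalty on the input vector.
    penalty = 0.5 * LAM * sum(xi * xi for xi in x)
    return 0.5 * (forward(x) - TARGET) ** 2 + penalty

def grad_input(x):
    # Gradient of the penalized error with respect to the input.
    f = forward(x)
    delta = (f - TARGET) * f * (1.0 - f)
    return [delta * w + LAM * xi for w, xi in zip(W, x)]

x = [0.0, 0.0]
errors = [error(x)]
for _ in range(500):
    g = grad_input(x)
    x = [xi - ETA * gi for xi, gi in zip(x, g)]
    errors.append(error(x))

grad_norm = math.sqrt(sum(gi * gi for gi in grad_input(x)))
```

In this toy setting the recorded `errors` never increase and `grad_norm` shrinks toward zero, which mirrors the monotonicity and weak-convergence properties described in the abstract.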


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Qionglin Fang

To address the difficulty of estimating the drift of navigation marks, a fractional-order gradient descent with momentum RBF neural network (FOGDM-RBF) is designed. Its convergence is proved, and it is used to estimate the drifting trajectories of navigation marks at different geographical locations. First, the weights of the neural network are set. The navigation mark's meteorological, hydrological, and initial position data are taken as the input of the neural network. The neural network is trained and used to estimate the mark's position. The navigation mark's position at a later time is taken as the output of the neural network. The difference between the later position and the position estimated by the neural network is the error function of the neural network. The influence of sea conditions and of the month is analyzed. The experimental results and error analysis show that FOGDM-RBF is better than other algorithms at trajectory estimation and interpolation, has better accuracy and generalization, and does not easily fall into a local optimum. It is effective at accelerating convergence and improving the performance of the gradient descent method.


2018 ◽  
Vol 30 (7) ◽  
pp. 2005-2023
Author(s):  
Tomoumi Takase ◽  
Satoshi Oyama ◽  
Masahito Kurihara

We present a comprehensive framework of search methods, such as simulated annealing and batch training, for solving nonconvex optimization problems. These methods search a wider range by gradually decreasing the randomness added to the standard gradient descent method. The formulation that we define on the basis of this framework can be directly applied to neural network training. This produces an effective approach that gradually increases batch size during training. We also explain why large batch training degrades generalization performance, which previous studies have not clarified.
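The idea of reducing gradient noise by gradually increasing the batch size can be sketched on a toy problem. The code below is an illustration only, not the authors' formulation: the one-parameter linear model, the synthetic data, the doubling schedule in `batch_size_for`, and the learning rate are all assumptions.

```python
import random

random.seed(0)
N = 256
# Synthetic data (assumption): y = 3x plus small Gaussian noise.
xs = [random.uniform(-1.0, 1.0) for _ in range(N)]
ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]

def mse(w):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / N

def batch_size_for(epoch, start=4, every=5, cap=N):
    # Hypothetical schedule: double the batch size every `every` epochs.
    return min(cap, start * 2 ** (epoch // every))

w = 0.0
lr = 0.1
initial_loss = mse(w)
schedule = []
for epoch in range(30):
    bs = batch_size_for(epoch)
    schedule.append(bs)
    order = list(range(N))
    random.shuffle(order)
    for lo in range(0, N, bs):
        idx = order[lo:lo + bs]
        # Mini-batch gradient of the mean squared error with respect to w.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in idx) / len(idx)
        w -= lr * grad
final_loss = mse(w)
```

Small batches early in training give noisy updates that explore more widely, and the larger batches later on average out that noise, so the updates approach plain gradient descent.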


2014 ◽  
pp. 99-106
Author(s):  
Leonid Makhnist ◽  
Nikolaj Maniakov

Two new techniques for training multilayer neural networks are proposed. Both are based on the gradient descent method. For each technique, formulas for calculating the adaptive training steps are given. The matrix formulations presented for these techniques are helpful for their software implementation.


Author(s):  
Stefan Balluff ◽  
Jörg Bendfeld ◽  
Stefan Krauter

Knowledge of both the current and the upcoming wind speed is becoming increasingly important as experience in operating and maintaining wind turbines grows. This matters not only for operation and maintenance tasks such as gearbox and generator checks, but even more because energy providers have to sell the right amount of their converted energy on the European energy markets, so knowing the wind, and hence the electrical power, of the next day is of key importance. Selling more energy than was offered is penalized, as is delivering less energy than contractually promised. In addition, the price per offered kWh decreases when there is a surplus of energy. Several methods from computer science can be used to produce a forecast: fuzzy logic, linear prediction, or neural networks. This paper presents current results of wind speed forecasts using recurrent neural networks (RNN) and the gradient descent method together with a backpropagation learning algorithm. The data were extracted from NASA's Modern-Era Retrospective analysis for Research and Applications (MERRA), which is calculated by a GEOS-5 Earth System Modeling and Data Assimilation system. The presented results show that wind speed can be forecast using historical data for training the RNN. Nevertheless, the current system setup lacks robustness and its accuracy can be improved further.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Onesimo Meza-Cruz ◽  
Isaac Pilatowsky ◽  
Agustín Pérez-Ramírez ◽  
Carlos Rivera-Blanco ◽  
Youness El Hamzaoui ◽  
...  

The aim of this work is to present a model for the heat transfer, desorbed refrigerant, and pressure of an intermittent solar cooling system's thermochemical reactor, based on backpropagation neural networks and mathematical symmetry groups. To achieve this, a reactor was designed and built based on the BaCl2-NH3 reaction. Experimental data were collected from this reactor, in which barium chloride was used as the solid absorbent and ammonia as the refrigerant. The neural network was trained using the Levenberg–Marquardt algorithm. The correlation coefficient between the experimental data and the data simulated by the neural network was r = 0.9957. The neural network's sensitivity analysis showed that two inputs, the reactor's heating temperature and the sorption time, influence the neural network's learning by 35% and 20%, respectively. It was also found that applying permutations to the experimental data and using multibase mathematical symmetry groups makes the neural network training algorithm converge faster.


Author(s):  
NOJUN KWAK

In many pattern recognition problems, it is desirable to reduce the number of input features by extracting the important features related to the problem. By focusing only on the problem-relevant features, the feature dimension can be greatly reduced, which can result in better generalization performance with less computational complexity. In this paper, we propose a feature extraction method for classification problems. The proposed algorithm searches for a set of linear combinations of the original features whose mutual information with the output class is maximized. The mutual information between the extracted features and the output class is calculated using probability density estimation based on the Parzen window method. A greedy algorithm using the gradient descent method is used to determine the new features. The computational load is proportional to the square of the number of samples. The proposed method was applied to several classification problems, where it achieved performance better than or comparable to that of conventional feature extraction methods.


Author(s):  
WANG XIANGDONG ◽  
WANG SHOUJUE

In this paper, we present a neural-network-based manufacturing process control system for semiconductor factories to improve the die yield. A model based on neural networks is proposed to simulate the Very Large-Scale Integrated (VLSI) manufacturing process. By learning from historical processing lists with a Radial Basis Function (RBF) network, we model the functional relationship between the wafer probing parameters and the die yield. We then use a gradient-descent method to search for a set of 'optimal' parameters that lead to the maximum yield of the model. Finally, we adjust the specification in the practical semiconductor manufacturing process. The average die yield increased from 51.7% to 57.5% after the system was applied at Huajing Corporation.
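The parameter search step can be pictured as gradient ascent on a fitted RBF model. The sketch below is purely illustrative and uses no data from the paper: the one-dimensional parameter, the two Gaussian centers, their weights, the width and the starting point are hypothetical stand-ins for a trained yield model.

```python
import math

# Hypothetical "trained" RBF yield model with one process parameter.
CENTERS = [0.0, 2.0]
WEIGHTS = [0.4, 0.6]
WIDTH = 0.7

def model_yield(x):
    # Weighted sum of Gaussian basis functions.
    return sum(
        w * math.exp(-(x - c) ** 2 / (2 * WIDTH ** 2))
        for w, c in zip(WEIGHTS, CENTERS)
    )

def model_grad(x):
    # Derivative of the model output with respect to the parameter x.
    return sum(
        w * math.exp(-(x - c) ** 2 / (2 * WIDTH ** 2)) * (-(x - c) / WIDTH ** 2)
        for w, c in zip(WEIGHTS, CENTERS)
    )

x = 1.2       # starting parameter value (assumption)
step = 0.1    # step size (assumption)
history = [model_yield(x)]
for _ in range(500):
    # Ascent on the yield is descent on its negative.
    x += step * model_grad(x)
    history.append(model_yield(x))
```

A search of this kind finds only a local maximum of the model, which may be one reason the abstract puts 'optimal' in quotation marks; a different starting point could end at the smaller bump near the other center.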

