Field Inversion and Machine Learning With Embedded Neural Networks: Physics-Consistent Neural Network Training

Author(s):  
Jonathan R. Holland ◽  
James D. Baeder ◽  
Karthikeyan Duraisamy


2014 ◽
Vol 10 (S306) ◽  
pp. 279-287 ◽  
Author(s):  
Michael Hobson ◽  
Philip Graff ◽  
Farhan Feroz ◽  
Anthony Lasenby

Abstract Machine-learning methods may be used to perform many tasks required in the analysis of astronomical data, including: data description and interpretation, pattern recognition, prediction, classification, compression, inference and many more. An intuitive and well-established approach to machine learning is the use of artificial neural networks (NNs), which consist of a group of interconnected nodes, each of which processes information that it receives and then passes this product on to other nodes via weighted connections. In particular, I discuss the first public release of the generic neural network training algorithm, called SkyNet, and demonstrate its application to astronomical problems, focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology and the identification of gamma-ray bursters. The SkyNet and BAMBI packages, which are fully parallelised using MPI, are available at http://www.mrao.cam.ac.uk/software/.
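The node-and-weighted-connection picture described in this abstract can be made concrete with a tiny forward pass. The Python sketch below is purely illustrative: the layer sizes, sigmoid activation, and random weights are assumptions, and it is not taken from SkyNet or BAMBI.

```python
import numpy as np

# Minimal sketch of the node/weighted-connection picture: each node sums its
# weighted inputs, applies an activation, and passes the result to the next
# layer. Sizes and the sigmoid choice are assumptions, not SkyNet internals.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Propagate an input vector through a fully connected network."""
    activation = x
    for W, b in zip(weights, biases):
        activation = sigmoid(W @ activation + b)
    return activation

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 3]  # input, hidden, output widths (assumed)
weights = [rng.normal(size=(m, n)) for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m) for m in layer_sizes[1:]]

print(forward(rng.normal(size=4), weights, biases))
```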


2017 ◽  
Vol 109 (1) ◽  
pp. 29-38 ◽  
Author(s):  
Valentin Deyringer ◽  
Alexander Fraser ◽  
Helmut Schmid ◽  
Tsuyoshi Okita

Abstract Neural networks are prevalent in today's NLP research. Despite their success on different tasks, training time is relatively long. We use Hogwild! to counteract this phenomenon and show that it is a suitable method to speed up the training of neural networks of different architectures and complexity. For POS tagging and translation we report considerable training speedups, especially for the latter. We show that Hogwild! can be an important tool for training complex NLP architectures.
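The core idea of Hogwild!, lock-free parallel SGD on a shared parameter vector, can be sketched as below. The least-squares objective, data sizes, and thread count are illustrative assumptions rather than the paper's tagging or translation setup, and a real implementation relies on genuinely parallel shared-memory updates rather than CPython threads.

```python
import threading

import numpy as np

# Minimal sketch of Hogwild!-style training: several workers run SGD on one
# shared parameter vector without any locks, accepting that updates may
# occasionally overwrite each other. CPython's GIL serializes most of this toy
# example; it only mimics the access pattern of the real algorithm.

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
true_w = rng.normal(size=20)
y = X @ true_w + 0.01 * rng.normal(size=10_000)

w = np.zeros(20)  # shared parameters, mutated lock-free by all workers

def worker(indices, lr=0.01, epochs=5):
    global w
    for _ in range(epochs):
        for i in indices:
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of the squared error
            w -= lr * grad                   # unsynchronized in-place update

chunks = np.array_split(np.arange(10_000), 4)
threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("parameter error:", np.linalg.norm(w - true_w))
```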


2022 ◽  
pp. 202-226
Author(s):  
Leema N. ◽  
Khanna H. Nehemiah ◽  
Elgin Christo V. R. ◽  
Kannan A.

Artificial neural networks (ANN) are widely used for classification, and the training algorithm most commonly used is the backpropagation (BP) algorithm. The major bottleneck in backpropagation neural network training is fixing appropriate values for the network parameters: the initial weights, biases, activation function, number of hidden layers, number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms and the impact of variations in network parameter values on neural network training. The algorithms were evaluated with different training and testing samples taken from three benchmark clinical datasets, namely the Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) datasets, obtained from the University of California Irvine (UCI) machine learning repository.
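As a rough illustration of the network parameters listed above, the following sketch exposes them as explicit settings of a backpropagation classifier. scikit-learn's MLPClassifier, the chosen values, and the load_breast_cancer convenience dataset are stand-ins and assumptions; they are not the authors' 12 BP variants, their configurations, or their exact UCI data splits.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative sketch: one way to expose the parameters named in the abstract
# (hidden layers, neurons per layer, activation, epochs, learning rate,
# minimum error, momentum) when training a BP classifier on a WBC-style task.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(10,),  # one hidden layer with 10 neurons (assumed)
    activation="logistic",     # sigmoid activation
    solver="sgd",              # plain gradient-descent backpropagation
    learning_rate_init=0.01,   # learning rate
    momentum=0.9,              # momentum term
    max_iter=500,              # number of training epochs
    tol=1e-4,                  # minimum error (stopping tolerance)
    random_state=0,            # reproducible initial weights
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```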


2021 ◽  
Author(s):  
Martin Mundt

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of the previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but can rather focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets and a wave of applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and introduce methods to address the respective immediate weaknesses. In the author's humble opinion, these prevalent shortcomings can be tied to the fact that the steps involved in the machine learning workflow are frequently decoupled. Success is predominantly measured with accuracy metrics designed for evaluation on static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application. Correspondingly, this dissertation identifies three key challenges: 1. choice and flexibility of a neural network architecture; 2. identification and rejection of unseen unknown data to avoid false predictions; 3. continual learning without forgetting of already learned information. These challenges have long been crucial topics in older literature, yet they seem to require a renaissance in modern deep learning literature. Initially, it may appear that they pose independent research questions; however, the thesis posits that these aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context and which ones originate from potentially other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. The central emphasis of this dissertation is thus to build on existing deep learning strengths while acknowledging the mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form, and the respective publications can be grouped according to the three challenges outlined above. Chapter 1 focuses on the choice and extendability of neural network architectures, analyzed in the context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 comprises the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning.
A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, Chapter 3 culminates in an overarching view, where the developed parts are connected. Here, an extensive survey further serves to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the thesis' overall contribution to advancing neural network based machine learning towards a unified solution that ties together the choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.


Author(s):  
Sheng-Uei Guan ◽  
Ji Hua Ang ◽  
Kay Chen Tan ◽  
Abdullah Al Mamun

This chapter proposes a novel method of incremental interference-free neural network training (IIFNNT) for medical datasets, which takes into consideration the interference each attribute has on the others. A specially designed network is used to determine whether two attributes interfere with each other, after which the attributes are partitioned using partitioning algorithms. These algorithms ensure that attributes beneficial to each other are trained in the same batch, thus sharing the same subnetwork, while interfering attributes are separated to reduce interference. Several incremental neural networks are available in the literature (Guan & Li, 2001; Su, Guan & Yeo, 2001). The IIFNNT architecture employs the incremental algorithms ILIA1 and ILIA2 (incremental learning with respect to new incoming attributes) (Guan & Li, 2001).
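One way to picture the partitioning step is sketched below: given a pairwise interference matrix (as would be produced by the specially designed network), attributes are greedily grouped so that interfering pairs never share a batch, and hence never share a subnetwork. The greedy rule and the toy matrix are assumptions for illustration only; they are not the chapter's partitioning algorithms or the ILIA1/ILIA2 procedures.

```python
import numpy as np

# Illustrative greedy grouping: place each attribute into the first batch that
# contains no attribute it interferes with, otherwise open a new batch.

def partition_attributes(interference: np.ndarray) -> list[list[int]]:
    """interference[i, j] is True if attributes i and j interfere."""
    batches: list[list[int]] = []
    for attr in range(interference.shape[0]):
        for batch in batches:
            if not any(interference[attr, other] for other in batch):
                batch.append(attr)   # compatible with every attribute here
                break
        else:
            batches.append([attr])   # start a new batch (new subnetwork)
    return batches

# Toy interference matrix for 4 attributes: 0 interferes with 2, 1 with 3.
I = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=bool)
print(partition_attributes(I))       # -> [[0, 1], [2, 3]]
```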


2012 ◽  
Vol 500 ◽  
pp. 198-203
Author(s):  
Chang Lin Xiao ◽  
Yan Chen ◽  
Lina Liu ◽  
Ling Tong ◽  
Ming Quan Jia

Genetic algorithms can further optimize neural networks, and such optimized networks have been used in many fields with better results; however, they have not yet been applied to the inversion of parameters. This paper uses backscattering coefficients from ASAR, together with data calculated by the AIEM model, as neural network training data and retrieves soil moisture through a Genetic Algorithm Neural Network. Finally, comparison with practical tests shows the validity and superiority of the Genetic Algorithm Neural Network.
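The following sketch illustrates the general idea of a genetic algorithm evolving neural network weights for an inversion-style regression. The toy target function, network size, and GA settings are assumptions for illustration and do not reproduce the ASAR/AIEM soil moisture setup.

```python
import numpy as np

# Sketch: a genetic algorithm evolves the weight vector of a small network.
# The toy mapping below stands in for the backscatter-to-soil-moisture task.

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))          # e.g. backscattering features
y = np.tanh(X @ np.array([1.5, -0.7])) + 0.3   # toy "soil moisture" target

def predict(w, X, hidden=8):
    W1 = w[:2 * hidden].reshape(2, hidden)
    b1 = w[2 * hidden:3 * hidden]
    W2 = w[3 * hidden:4 * hidden]
    b2 = w[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def fitness(w):
    return -np.mean((predict(w, X) - y) ** 2)  # negative MSE: higher is better

n_params = 2 * 8 + 8 + 8 + 1
pop = rng.normal(size=(50, n_params))          # initial population of weight vectors
for generation in range(100):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]    # selection: keep the 10 fittest
    children = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(10, size=2)]
        mask = rng.random(n_params) < 0.5      # uniform crossover
        child = np.where(mask, a, b) + 0.05 * rng.normal(size=n_params)  # mutation
        children.append(child)
    pop = np.stack(children)

best = pop[np.argmax([fitness(w) for w in pop])]
print("best MSE:", -fitness(best))
```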


2012 ◽  
Vol 263-266 ◽  
pp. 2102-2108 ◽  
Author(s):  
Yana Mazwin Mohmad Hassim ◽  
Rozaida Ghazali

Artificial Neural Networks have emerged as an important tool for classification and have been widely used to classify non-linearly separable patterns. The most popular artificial neural network model is the Multilayer Perceptron (MLP), which is able to perform classification tasks with significant success. However, the complexity of the MLP structure, together with problems such as local minima trapping, overfitting, and weight interference, has made neural network training difficult. One way to avoid these problems is to remove the hidden layers. This paper presents the ability of the Functional Link Neural Network (FLNN) to overcome the structural complexity of the MLP by using its single-layer architecture, and proposes Artificial Bee Colony (ABC) optimization for training the FLNN. The proposed technique is expected to provide a better learning scheme for a classifier and thereby more accurate classification results.
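A minimal sketch of the FLNN idea follows: the input is expanded with fixed nonlinear basis functions and mapped to the output by a single layer of trainable weights, so no hidden layer is required. The trigonometric expansion order and sigmoid output are assumptions, and the ABC training of the weights is not reproduced here.

```python
import numpy as np

# Illustrative FLNN forward pass: a fixed functional expansion replaces the
# hidden layer; only the single output layer has trainable weights.

def functional_expansion(X: np.ndarray, order: int = 2) -> np.ndarray:
    """Expand each feature x into [x, sin(k*pi*x), cos(k*pi*x)] for k = 1..order."""
    parts = [X]
    for k in range(1, order + 1):
        parts.append(np.sin(k * np.pi * X))
        parts.append(np.cos(k * np.pi * X))
    return np.hstack(parts)

def flnn_forward(X, w, b):
    """Single-layer FLNN classifier: sigmoid of a weighted sum of expanded features."""
    phi = functional_expansion(X)
    return 1.0 / (1.0 + np.exp(-(phi @ w + b)))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5, 3))                   # 5 samples, 3 features
w = rng.normal(size=functional_expansion(X).shape[1]) # weights an ABC optimizer would tune
print(flnn_forward(X, w, b=0.0))                      # class-1 probabilities
```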


Author(s):  
Yasufumi Sakai ◽  
Yutaka Tamiya

Abstract Recent advances in deep neural networks have achieved higher accuracy with more complex models. Nevertheless, they require much longer training time. To reduce the training time, training methods using quantized weights, activations, and gradients have been proposed. Neural network calculation in integer format improves the energy efficiency of hardware for deep learning models, and therefore training methods for deep neural networks in fixed point format have been proposed. However, the narrow data representation range of the fixed point format degrades neural network accuracy. In this work, we propose a new fixed point format named shifted dynamic fixed point (S-DFP) to prevent accuracy degradation in quantized neural network training. S-DFP can change the data representation range of the dynamic fixed point format by adding a bias to the exponent. We evaluated the effectiveness of S-DFP for quantized neural network training on the ImageNet task using ResNet-34, ResNet-50, ResNet-101 and ResNet-152. For example, the accuracy of quantized ResNet-152 improved from 76.6% with conventional 8-bit DFP to 77.6% with 8-bit S-DFP.
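Under assumed format details, the sketch below shows how a dynamic fixed point quantizer picks a per-tensor power-of-two exponent and how adding a bias to that exponent shifts the representable range, which is the mechanism the abstract attributes to S-DFP. The exact bit layout and bias selection in the paper may differ.

```python
import numpy as np

# Assumed-format sketch: DFP stores a tensor as 8-bit integers scaled by one
# per-tensor power-of-two exponent; S-DFP adds a bias to that exponent.

def quantize_dfp(x: np.ndarray, bits: int = 8, exp_bias: int = 0):
    """Return (int8 values, exponent) such that x ~= q * 2**exponent."""
    max_int = 2 ** (bits - 1) - 1                       # 127 for 8 bits
    # Smallest power-of-two step that covers max|x| with the integer range.
    exponent = int(np.ceil(np.log2(np.max(np.abs(x)) / max_int)))
    exponent += exp_bias                                # S-DFP: shifted exponent
    q = np.clip(np.round(x / 2.0 ** exponent), -max_int - 1, max_int)
    return q.astype(np.int8), exponent

def dequantize(q: np.ndarray, exponent: int) -> np.ndarray:
    return q.astype(np.float32) * np.float32(2.0 ** exponent)

rng = np.random.default_rng(0)
tensor = rng.normal(scale=1e-3, size=1000).astype(np.float32)  # toy gradient tensor

for bias in (0, 2):  # bias = 0 is plain DFP; a positive bias widens the range
    q, e = quantize_dfp(tensor, exp_bias=bias)
    err = np.mean(np.abs(dequantize(q, e) - tensor))
    print(f"exponent bias {bias:+d}: step 2**{e}, "
          f"max representable {127 * 2.0 ** e:.2e}, mean abs error {err:.2e}")
```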

