Combine and conquer: event reconstruction with Bayesian Ensemble Neural Networks

2021 ◽  
Vol 2021 (4) ◽  
Author(s):  
Jack Y. Araz ◽  
Michael Spannowsky

Abstract Ensemble learning is a technique in which multiple component learners are combined through a protocol. We propose an Ensemble Neural Network (ENN) that uses the combined latent-feature space of multiple neural network classifiers to improve the representation of the network hypothesis. We apply this approach to construct an ENN from Convolutional and Recurrent Neural Networks to discriminate top-quark jets from QCD jets. Such an ENN provides the flexibility to improve the classification beyond simple prediction-combining methods by linking different sources of error correlations, hence improving the correspondence between data and hypothesis. In combination with Bayesian techniques, we show that it can reduce epistemic uncertainties and the entropy of the hypothesis by simultaneously exploiting various kinematic correlations of the system, which also makes the network less susceptible to limited training sample size.
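As a rough illustration of the latent-space combination idea, a minimal NumPy sketch (not the authors' implementation; the array sizes and the untrained linear head are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent features from two pre-trained component networks
# (e.g. a CNN over jet images and an RNN over particle sequences).
latent_cnn = rng.normal(size=(4, 8))   # 4 jets, 8-dim CNN latent space
latent_rnn = rng.normal(size=(4, 6))   # 4 jets, 6-dim RNN latent space

# Instead of averaging the two networks' predictions, the ENN idea is to
# concatenate their latent spaces and learn a joint classification head,
# so correlations between the two representations can be exploited.
joint_latent = np.concatenate([latent_cnn, latent_rnn], axis=1)  # (4, 14)

# A linear head with illustrative (untrained) weights.
w = rng.normal(size=(14,))
logits = joint_latent @ w
probs = 1.0 / (1.0 + np.exp(-logits))  # P(top-quark jet) per jet

print(joint_latent.shape)  # (4, 14)
print(probs.shape)         # (4,)
```

In the Bayesian variant described in the abstract, the joint head would additionally carry weight distributions rather than point estimates, so that epistemic uncertainty can be read off the spread of predictions.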

Author(s):  
TATIANA ESCOVEDO ◽  
ANDRÉ V. ABS DA CRUZ ◽  
MARLEY M. B. R. VELLASCO ◽  
ADRIANO S. KOSHIYAMA

This work describes the use of a weighted ensemble of neural network classifiers for adaptive learning. We train the neural networks by means of a quantum-inspired evolutionary algorithm (QIEA). The QIEA is also used to determine the best weight for each classifier belonging to the ensemble when a new block of data arrives. After running several simulations on two different datasets and performing two different analyses of the results, we show that the proposed algorithm, named neuro-evolutionary ensemble (NEVE), was able to learn the dataset and to quickly respond to any drift in the underlying data, indicating that our model can be a good alternative for addressing concept drift problems. We also compare the results obtained by our model with those of an existing algorithm, Learn++.NSE, in two different nonstationary scenarios.
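A minimal sketch of the block-wise weighted-vote idea, assuming weights proportional to each classifier's accuracy on the newest block (the QIEA weight search itself is not reproduced here; the predictions and labels are made up):

```python
import numpy as np

def update_weights(classifier_preds, labels):
    """Weight each classifier by its accuracy on the newest data block."""
    acc = np.array([(p == labels).mean() for p in classifier_preds])
    if acc.sum() == 0:
        return np.full(len(acc), 1.0 / len(acc))
    return acc / acc.sum()

def weighted_vote(classifier_preds, weights):
    """Binary weighted majority vote: sum the weights of classifiers voting 1."""
    score = sum(w * p for w, p in zip(weights, np.asarray(classifier_preds)))
    return (score > 0.5).astype(int)

# Three classifiers' binary predictions on a block of 6 samples
preds = [np.array([1, 1, 0, 0, 1, 0]),
         np.array([1, 0, 0, 1, 1, 0]),
         np.array([0, 0, 0, 0, 0, 0])]
labels = np.array([1, 1, 0, 0, 1, 0])

w = update_weights(preds, labels)  # first classifier is perfect, so it gets the largest weight
print(weighted_vote(preds, w))
```

When a new block arrives with concept drift, re-running `update_weights` on that block shifts influence toward the classifiers that still track the data.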


Author(s):  
Viktor Pimenov ◽  
Ilia Pimenov

Introduction: Artificial intelligence development strategy involves the use of deep machine learning algorithms in order to solve various problems. Neural network models trained on specific data sets are difficult to interpret, which is due to the “black box” approach when knowledge is formed as a set of interneuronal connection weights. Purpose: Development of a discrete knowledge model which explicitly represents information processing patterns encoded by connections between neurons. Methods: Adaptive quantization of a feature space using a genetic algorithm, and construction of a discrete model for a multidimensional OLAP cube with binary measures. Results: A genetic algorithm extracts a discrete knowledge carrier from a trained neural network. An individual's chromosome encodes a combination of values of all quantization levels for the measurable object properties. The head gene group defines the feature space structure, while the other genes are responsible for setting up the quantization of a multidimensional space, each gene being responsible for one quantization threshold of a given variable. A discrete model of a multidimensional OLAP cube with binary measures explicitly represents the relationships between combinations of object feature values and classes. Practical relevance: For neural network prediction models, genetic algorithms make it possible to find an effective feature-space volume covering combinations of input feature values not represented in the training sample, whose size is usually limited. The proposed discrete model builds unique images of each class based on rectangular maps which use a mesh structure of gradations. The maps reflect the most significant integral indicators of classes that determine the location and size of a class in a multidimensional space. Based on a convolution of the constructed class images, a complete system of production decision rules is recorded for the preset feature gradations.
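The quantization step encoded by the chromosome's threshold genes can be sketched as follows (a hypothetical two-feature example; the genetic search and the OLAP-cube construction are omitted):

```python
import numpy as np

# Hypothetical decoded chromosome: one sorted threshold list per feature,
# defining the adaptive quantization (gradations) of a 2-D feature space.
thresholds = [
    np.array([0.3, 0.7]),   # feature 0: two thresholds -> 3 gradations
    np.array([0.5]),        # feature 1: one threshold  -> 2 gradations
]

def quantize(samples, thresholds):
    """Map continuous samples to discrete cell coordinates (OLAP-cube cells)."""
    return np.stack(
        [np.digitize(samples[:, j], t) for j, t in enumerate(thresholds)],
        axis=1,
    )

samples = np.array([[0.1, 0.9],
                    [0.8, 0.2],
                    [0.5, 0.6]])
cells = quantize(samples, thresholds)
print(cells)  # each row is a cell coordinate, e.g. [0, 1] for the first sample
```

Each discrete cell coordinate would then index a binary measure in the cube, recording whether objects of a given class fall into that cell.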


2019 ◽  
Vol 7 (3) ◽  
Author(s):  
Liam Moore ◽  
Karl Nordström ◽  
Sreedevi Varma ◽  
Malcolm Fairbairn

We compare the performance of a convolutional neural network (CNN) trained on jet images with dense neural networks (DNNs) trained on n-subjettiness variables to study the distinguishing power of these two separate techniques applied to top quark decays. We find that they perform almost identically and are highly correlated once jet mass information is included, which suggests they are accessing the same underlying information which can be intuitively understood as being contained in 4-, 5-, 6-, and 8-body kinematic phase spaces depending on the sample. This suggests both of these methods are highly useful for heavy object tagging and provides a tentative answer to the question of what the image network is actually learning.


2020 ◽  
Vol 14 (1) ◽  
pp. 5
Author(s):  
Adam Adli ◽  
Pascal Tyrrell

Introduction: Advances in computing have allowed for the practical application of increasingly advanced machine learning models to aid healthcare providers with diagnosis and inspection of medical images. Often, a lack of training data and computation time can be a limiting factor in the development of an accurate machine learning model in the domain of medical imaging. As a possible solution, this study investigated whether L2 regularization moderates the overfitting that occurs as a result of small training sample sizes. Methods: This study employed transfer learning experiments on a dental x-ray binary classification model to explore L2 regularization with respect to training sample size in five common convolutional neural network architectures. Model testing performance was investigated, and technical implementation details including computation times, hardware considerations, performance factors and practical feasibility were described. Results: The experimental results showed a trend that smaller training sample sizes benefitted more from regularization than larger training sample sizes. Further, the results showed that applying L2 regularization did not add significant computational overhead and that the extra rounds of training required by L2 regularization were feasible when training sample sizes were relatively small. Conclusion: Overall, this study found that there is a window of opportunity in which the benefits of employing regularization can be most cost-effective relative to training sample size. It is recommended that training sample size be carefully considered when forming expectations of the achievable generalizability improvements that result from investing computational resources into model regularization.
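The effect of an L2 penalty can be sketched on a toy model (plain logistic regression in NumPy, not the study's CNN transfer-learning setup; the data and hyperparameters are illustrative):

```python
import numpy as np

def l2_logistic_loss(w, X, y, lam):
    """Cross-entropy loss with an L2 penalty lam * ||w||^2 on the weights."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    eps = 1e-12
    ce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return ce + lam * np.sum(w ** 2)

def grad(w, X, y, lam):
    """Gradient of the regularized loss; the 2*lam*w term shrinks the weights."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y) + 2 * lam * w

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))        # a deliberately small training sample
y = (X[:, 0] > 0).astype(float)

w = np.zeros(3)
for _ in range(200):                # plain gradient descent
    w -= 0.5 * grad(w, X, y, lam=0.1)

# The penalty keeps the weights small, restraining overfitting on small
# samples; setting lam = 0 recovers the unregularized model.
print(np.round(w, 3))
```

On this separable toy sample the unregularized weights grow without bound, while the penalized weights stay bounded, mirroring the regularization effect the study probes at CNN scale.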


Author(s):  
ANDREW SOHN ◽  
JEAN-LUC GAUDIOT

Much effort has been expended on developing special architectures dedicated to the efficient execution of problems in artificial intelligence (AI), especially production systems. While artificial neural networks (ANNs) offer the promise of solving various problems in pattern recognition and classification, we demonstrate here that the ANN approach can also be applied to the AI production system paradigm. Among various types of neural networks, the three-layer ring-structured feedback network is considered in this paper to suit the problem domain under investigation. Characteristics of the production system paradigm are identified. Various aspects of the use of feedback neural networks in mapping production systems are discussed. Two types of representation techniques are studied: local and hierarchical representations. A hierarchical representation derives features from patterns in production systems and constructs a 3-dimensional space, called the feature space, where a pattern can be uniquely defined by a vector. To demonstrate the efficient use of the neural network approach, a mapping of a generic production system is detailed throughout the paper. The results of a deterministic simulation demonstrate that the three-layer ring-structured feedback neural network architecture can be an efficient processing mechanism for the AI production system paradigm.


2012 ◽  
Vol 12 (2) ◽  
pp. 98-108 ◽  
Author(s):  
Petar Halachev

Abstract A model for predicting the outcome indicators of e-Learning, based on the Balanced ScoreCard (BSC) and Neural Networks (NN), is proposed. In the development of NN models the problem of a small data sample size arises. In order to reduce the number of variables and increase the number of examples in the training sample, the data are preprocessed using interpolation and Principal Component Analysis (PCA). A method for optimizing the structure of the neural network is applied to linear and nonlinear neural network architectures. The highest prediction accuracy is obtained by applying the method of Optimal Brain Damage (OBD) to the nonlinear neural network. The efficiency and applicability of the suggested method are demonstrated by numerical experiments on real data.
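The PCA preprocessing step, which reduces the number of variables before NN training, can be sketched in NumPy (the data here are synthetic stand-ins for BSC indicators, not the study's dataset):

```python
import numpy as np

def pca_reduce(X, k):
    """Project centered data onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T          # component scores: (n_samples, k)

rng = np.random.default_rng(2)
# 10 observations of 6 correlated indicators; the data are (nearly) rank-2,
# so two components capture almost all of the variance.
base = rng.normal(size=(10, 2))
X = base @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(10, 6))

Z = pca_reduce(X, k=2)
print(Z.shape)  # (10, 2): six indicators compressed to two components
```

The reduced matrix `Z` would then serve as the NN input, easing the small-sample problem by cutting the number of weights to fit.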


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Wenyi Lin ◽  
Kyle Hasenstab ◽  
Guilherme Moura Cunha ◽  
Armin Schwartzman

Abstract We propose a random forest classifier for identifying adequacy of liver MR images using handcrafted (HC) features and deep convolutional neural networks (CNNs), and analyze the relative role of these two components in relation to the training sample size. The HC features, specifically developed for this application, include Gaussian mixture models, Euler characteristic curves and texture analysis. Using HC features outperforms the CNN for smaller sample sizes and with increased interpretability. On the other hand, with enough training data, the combined classifier outperforms the models trained with HC features or CNN features alone. These results illustrate the added value of HC features with respect to CNNs, especially when insufficient data is available, as is often found in clinical studies.


Author(s):  
Ruohan Gong ◽  
Zuqi Tang

Purpose This paper aims to investigate an approach combining deep learning (DL) and the finite element method (FEM) for the magneto-thermal coupled problem. Design/methodology/approach To achieve DL of electrical devices under the hypothesis of a small dataset, with ground-truth data obtained from FEM analysis, U-net, a highly efficient convolutional neural network (CNN), is used to extract hidden features and is trained in a supervised manner to predict the magneto-thermal coupled analysis results for different topologies. Using part of the FEM results as training samples, the DL model obtained from effective off-line training can be used to predict the distribution of the magnetic field and temperature field of other cases. Findings The possibility and feasibility of the proposed approach are investigated by discussing the influence of various network parameters; in particular, the four most important factors are training sample size, learning rate, batch size and optimization algorithm. It is shown that DL based on U-net can be used as an efficient tool in multi-physics analysis and achieves good performance with only small datasets. Originality/value It is shown that DL based on U-net can be used as an efficient tool in multi-physics analysis and achieves good performance with only small datasets.


Author(s):  
M. Vasylenko ◽  
D. Dobrycheva ◽  
V. Khramtsov ◽  
I. Vavilova

We present a deep learning approach for the determination of morphological types of galaxies. We demonstrate the method's performance on a redshift-limited (z < 0.1) training sample of 6163 galaxies from the SDSS DR9. We exploited deep convolutional neural network classifiers such as InceptionV3, DenseNet121, and MobileNetV2 to process images of SDSS galaxies (100×100 pixels, 25 arcsec on each axis), using the g, r, i filters as R–G–B channels to create the images. We applied random data augmentation (horizontal and vertical flips, random shifts of ±10 pixels, and rotations) to the set of images during learning, which helped increase the classifiers' generalization ability. Also, two different loss functions, MAE and Lovasz-Softmax, were applied to each classifier. The target sample galaxies were classified into two morphological types (late and early) by classifiers trained on the images of galaxies from the sample. It turned out that the deep convolutional neural networks InceptionV3 and DenseNet121 with the MAE loss function show the best result, attaining 93.3% accuracy.
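The described augmentation pipeline (flips, ±10-pixel shifts, rotations) can be sketched in NumPy on a stand-in image; this is an illustration under assumed conventions, not the authors' code:

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, shift by up to ±10 pixels, and rotate a galaxy image."""
    if rng.random() < 0.5:
        img = np.flip(img, axis=0)              # vertical flip
    if rng.random() < 0.5:
        img = np.flip(img, axis=1)              # horizontal flip
    shift = rng.integers(-10, 11, size=2)
    img = np.roll(img, tuple(shift), axis=(0, 1))  # crude shift (wraps at edges)
    img = np.rot90(img, k=rng.integers(0, 4))   # rotation by a multiple of 90°
    return img

rng = np.random.default_rng(4)
# A 100x100 three-channel stand-in for a g/r/i composite SDSS cutout
img = rng.normal(size=(100, 100, 3))
aug = augment(img, rng)
print(aug.shape)  # (100, 100, 3)
```

Applying a fresh random transform each epoch effectively enlarges the limited training sample, which is the generalization benefit the abstract reports.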

