On the use of neural networks for energy reconstruction in high-granularity calorimeters

2021 ◽  
Vol 16 (12) ◽  
pp. P12036
Author(s):  
N. Akchurin ◽  
C. Cowden ◽  
J. Damgov ◽  
A. Hussain ◽  
S. Kunori

Abstract We contrasted the performance of deep neural networks, a Convolutional Neural Network (CNN) and a Graph Neural Network (GNN), with current state-of-the-art energy regression methods in a finely 3D-segmented calorimeter simulated with GEANT4. This comparative benchmark provides insight into the particular latent signals neural-network methods exploit to achieve superior resolution. A CNN trained solely on a pure sample of pions achieved substantial improvement in energy resolution over the conventional approaches for both single pions and jets, while maintaining good performance for electron and photon reconstruction. We also used a GNN with edge convolution to assess the importance of timing information in the shower development for improved energy reconstruction. In addition, we implemented a simple simulation-based correction to the energy sum, derived from the fraction of energy deposited in the electromagnetic shower component, which serves as an approximate dual-readout analogue for our benchmark comparison. Although this study does not include the simulation of detector effects such as electronic noise, the margin of improvement appears robust enough to suggest these benefits will endure in real-world applications. We also find evidence that the CNN/GNN methods leverage latent features that concur with our current understanding of the physics of calorimeter measurement.
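As a toy illustration of the kind of electromagnetic-fraction correction the abstract describes (not the paper's actual calibration), a non-compensating calorimeter under-responds to the hadronic shower component, so the raw energy sum can be rescaled by the expected response for a given electromagnetic fraction. The response ratio `H_OVER_E = 0.7` below is an assumed value for the sketch:

```python
import numpy as np

# Assumed hadronic-to-electromagnetic response ratio (h/e < 1 for a
# non-compensating calorimeter); illustrative value, not from the paper.
H_OVER_E = 0.7

def correct_energy(e_raw, f_em, h_over_e=H_OVER_E):
    """Scale the raw energy sum by the expected detector response,
    which depends on the fraction f_em deposited electromagnetically."""
    response = f_em + (1.0 - f_em) * h_over_e
    return e_raw / response
```

For a purely electromagnetic shower (f_em = 1) the correction is the identity; the smaller f_em is, the larger the upward correction, which mimics the event-by-event compensation a dual-readout measurement provides.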

2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Georges Aad ◽  
Anne-Sophie Berthold ◽  
Thomas Calvet ◽  
Nemer Chiedde ◽  
Etienne Marie Fortin ◽  
...  

Abstract The ATLAS experiment at the Large Hadron Collider (LHC) is operated at CERN and measures proton–proton collisions at multi-TeV energies with a repetition frequency of 40 MHz. Within the phase-II upgrade of the LHC, the readout electronics of the liquid-argon (LAr) calorimeters of ATLAS are being prepared for high-luminosity operation, with an expected pileup of up to 200 simultaneous proton–proton interactions. Moreover, the calorimeter signals of up to 25 subsequent collisions overlap, which increases the difficulty of energy reconstruction. Real-time processing of digitized pulses sampled at 40 MHz is performed using field-programmable gate arrays (FPGAs). To cope with the signal pileup, new machine learning approaches are explored: convolutional and recurrent neural networks outperform the optimal signal filter currently used, both in assigning the reconstructed energy to the correct proton bunch crossing and in energy resolution. The improvements concern in particular energies derived from overlapping pulses. Since the implementation of the neural networks targets an FPGA, the number of parameters and the mathematical operations need to be well controlled. The trained neural-network structures are converted into FPGA firmware using automated implementations in hardware description language and high-level synthesis tools. Very good agreement between the neural-network implementations in FPGA and software-based calculations is observed. The prototype implementations on an Intel Stratix-10 FPGA reach maximum operating frequencies of 344–640 MHz. Applying time-division multiplexing allows one FPGA to process 390–576 calorimeter channels for the most resource-efficient networks. Moreover, the achieved latency is about 200 ns.
These performance parameters show that a neural-network based energy reconstruction can be considered for the processing of the ATLAS LAr calorimeter signals during the high-luminosity phase of the LHC.
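For context, the "optimal signal filter" baseline the networks are compared against can be sketched as a linear combination of the digitized samples whose weights minimize the noise variance subject to unit response to the pulse shape and zero response to its time derivative. The 5-sample pulse shape and white-noise covariance below are illustrative assumptions, not the detector's actual values:

```python
import numpy as np

# Illustrative normalized pulse shape over 5 samples and its derivative.
g = np.array([0.0, 0.6, 1.0, 0.7, 0.3])
dg = np.gradient(g)
V = np.eye(len(g))  # noise autocorrelation matrix (assumed white here)

# Lagrange-multiplier solution: minimize a^T V a subject to a.g = 1
# (unbiased amplitude) and a.dg = 0 (insensitive to small time shifts).
Vi = np.linalg.inv(V)
q1, q2, q3 = g @ Vi @ g, dg @ Vi @ dg, g @ Vi @ dg
delta = q1 * q2 - q3 ** 2
a = (q2 * (Vi @ g) - q3 * (Vi @ dg)) / delta  # optimal filter coefficients

def reconstruct_amplitude(samples):
    """Energy estimate: dot product of the filter coefficients with the
    digitized samples."""
    return float(a @ samples)
```

By construction `a @ g == 1` and `a @ dg == 0`, so a noiseless pulse of amplitude A is reconstructed exactly as A; the networks in the abstract improve on this linear estimator when pulses from neighbouring bunch crossings overlap.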


2020 ◽  
Vol 34 (04) ◽  
pp. 5306-5314
Author(s):  
Takamasa Okudono ◽  
Masaki Waga ◽  
Taro Sekiyama ◽  
Ichiro Hasuo

We present a method to extract a weighted finite automaton (WFA) from a recurrent neural network (RNN). Our method is based on the WFA learning algorithm by Balle and Mohri, which is in turn an extension of Angluin's classic L* algorithm. Our technical novelty is in the use of regression methods for the so-called equivalence queries, thus exploiting the internal state space of an RNN to prioritize counterexample candidates. This way we achieve a quantitative/weighted extension of the recent work by Weiss, Goldberg and Yahav that extracts DFAs. We experimentally evaluate the accuracy, expressivity and efficiency of the extracted WFAs.
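To ground the terminology, a WFA computes a real-valued function f(w) = alpha^T A_w1 ... A_wn beta, and the equivalence query asks whether a hypothesis WFA agrees with the target network. The sketch below shows a sampled stand-in for that query, with a small WFA playing the role of the RNN oracle; the regression over internal states that the paper uses to prioritize counterexamples is not reproduced here, and all automata are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

class WFA:
    """Weighted finite automaton: f(w) = alpha^T A_w1 ... A_wn beta."""
    def __init__(self, alpha, transitions, beta):
        self.alpha, self.transitions, self.beta = alpha, transitions, beta
    def weight(self, word):
        v = self.alpha
        for sym in word:
            v = v @ self.transitions[sym]
        return float(v @ self.beta)

# Target oracle (stand-in for the RNN): f(w) = (#a's in w) * 0.5^|w|.
target = WFA(np.array([1.0, 0.0]),
             {"a": np.array([[0.5, 0.5], [0.0, 0.5]]),
              "b": np.array([[0.5, 0.0], [0.0, 0.5]])},
             np.array([0.0, 1.0]))

def equivalence_query(hypothesis, oracle, alphabet="ab",
                      n_samples=200, tol=1e-6):
    """Sampled approximation of the equivalence query: return a word on
    which hypothesis and oracle disagree, or None if none is found."""
    for _ in range(n_samples):
        w = "".join(rng.choice(list(alphabet), size=rng.integers(0, 8)))
        if abs(hypothesis.weight(w) - oracle.weight(w)) > tol:
            return w
    return None

# A deliberately wrong hypothesis (different final weights) is refuted.
wrong = WFA(target.alpha, target.transitions, np.array([1.0, 1.0]))
cex = equivalence_query(wrong, target)
```

In the actual algorithm the counterexample returned here would be fed back into the Balle–Mohri learner to refine the hypothesis WFA.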


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Bernard Lapeyre ◽  
Jérôme Lelong

Abstract The pricing of Bermudan options amounts to solving a dynamic programming principle, in which the main difficulty, especially in high dimension, comes from the conditional expectation involved in the computation of the continuation value. These conditional expectations are classically computed by regression techniques on a finite-dimensional vector space. In this work, we study neural network approximations of conditional expectations. We prove the convergence of the well-known Longstaff–Schwartz algorithm when the standard least-squares regression is replaced by a neural network approximation, assuming an efficient algorithm to compute this approximation. We illustrate the numerical efficiency of neural networks as an alternative to standard regression methods for approximating conditional expectations on several numerical examples.
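To make the algorithmic setting concrete, here is a minimal sketch of the Longstaff–Schwartz backward induction for a Bermudan put, with the regression step isolated so that a neural network could be substituted for the polynomial least-squares fit as the paper proposes. All market parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # illustrative market data
n_steps, n_paths = 10, 20000
dt = T / n_steps

# Simulate geometric Brownian motion paths of the underlying.
z = rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1))
S = np.hstack([np.full((n_paths, 1), S0), S])

def fit_continuation(x, y):
    """Regress discounted future cash flows on the current state.
    Polynomial least squares here; a neural network would replace this."""
    coeffs = np.polyfit(x, y, deg=3)
    return np.polyval(coeffs, x)

payoff = lambda s: np.maximum(K - s, 0.0)
cash = payoff(S[:, -1])                    # exercise value at maturity
for t in range(n_steps - 1, 0, -1):
    cash *= np.exp(-r * dt)                # discount one step back
    exercise = payoff(S[:, t])
    itm = exercise > 0                     # regress on in-the-money paths
    cont = np.full_like(cash, np.inf)
    cont[itm] = fit_continuation(S[itm, t], cash[itm])
    do_ex = exercise > cont                # exercise beats continuing
    cash[do_ex] = exercise[do_ex]

price = float(np.exp(-r * dt) * cash.mean())
```

The continuation value at each exercise date is exactly the conditional expectation discussed in the abstract; the paper's convergence result covers replacing `fit_continuation` by a trained neural network.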


Author(s):  
Qingyu Tian ◽  
Mao Ding ◽  
Hui Yang ◽  
Caibin Yue ◽  
Yue Zhong ◽  
...  

Background: Drug development is costly and time-consuming, and its outcome is uncertain, so researchers urgently need approaches that reduce costs. The identification of drug-target interactions (DTIs) has therefore become a critical step in the early stages of drug discovery. Computational methods aim to narrow the search space for novel DTIs and to elucidate the functional background of drugs. Most methods developed so far use binary classification to predict the presence or absence of an interaction between a drug and a target. However, it is more informative, but also more challenging, to predict the strength of the binding between a drug and its target: if the binding is not strong enough, the DTI may not be useful. Hence, the development of methods to predict drug-target affinity (DTA) is of significant importance.
Method: We improved the Graph DTA model from a dual-channel model to a triple-channel model. We interpret the target/protein sequence as a time series and extract its features using an LSTM network. For the drug, we consider both the molecular structure and the local chemical background, retaining the four variant networks used in Graph DTA to extract the topological features of the drug and capturing the local chemical background of its atoms with a BiGRU. We thus obtain one latent feature vector for the target and two for the drug. The concatenation of these three feature vectors is fed into a 2-layer fully connected (FC) network, which outputs the binding affinity.
Result: We use the Davis and Kiba datasets, with 80% of the data for training and 20% for validation. Comparison with the published Graph DTA results shows that our model performs better.
Conclusion: In this paper, we altered the Graph DTA model to predict drug-target affinity.
The model represents the drug as a graph and extracts two-dimensional drug information using a graph convolutional neural network. At the same time, the drug and the protein target are represented as word vectors, and recurrent networks are used to extract their sequence information. We demonstrate that our improved method performs better than the original method, in particular in the evaluation on the benchmark databases.
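As a shape-level illustration (not the trained model), the triple-channel fusion described above concatenates one protein latent vector and two drug latent vectors and passes them through a 2-layer FC head that outputs a scalar affinity. The feature sizes and random weights below are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed latent sizes for the three channels and the hidden FC layer.
d_protein, d_graph, d_bigru, d_hidden = 128, 128, 128, 256

W1 = rng.standard_normal((d_protein + d_graph + d_bigru, d_hidden)) * 0.01
b1 = np.zeros(d_hidden)
W2 = rng.standard_normal((d_hidden, 1)) * 0.01
b2 = np.zeros(1)

def predict_affinity(h_protein, h_graph, h_bigru):
    """Concatenate the LSTM (protein), graph, and BiGRU (drug) latent
    vectors and apply the 2-layer FC head to get a scalar affinity."""
    h = np.concatenate([h_protein, h_graph, h_bigru])
    h = np.maximum(W1.T @ h + b1, 0.0)     # ReLU hidden layer
    return float((W2.T @ h + b2)[0])

affinity = predict_affinity(rng.standard_normal(d_protein),
                            rng.standard_normal(d_graph),
                            rng.standard_normal(d_bigru))
```

In the real model the three input vectors would come from the trained LSTM, graph, and BiGRU encoders respectively; only the fusion arithmetic is shown here.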


2020 ◽  
Vol 17 (9) ◽  
pp. 3867-3872
Author(s):  
Aniv Chakravarty ◽  
Jagadish S. Kallimani

Text summarization is an active field of research that aims to provide short, meaningful gists of large collections of text documents. Extractive text summarization methods, which build summaries from text extracted from the documents, have been studied extensively. Multi-document inputs vary widely in format, domain, and topic. With recent advances in neural text generation, interest in abstractive text summarization research has increased significantly, and graph-based methods that handle semantic information have shown significant results. Given a set of English text documents, we use an abstractive method with predicate-argument structures to retrieve the necessary text information and pass it through a neural network for text generation. Recurrent neural networks are a subtype of recursive neural networks that predict the next element of a sequence from the current state and information carried over from previous states; their use allows summaries to be generated for long sentences as well. This paper implements a semantics-based filtering approach using a similarity matrix while keeping all stop-words. Similarity is computed from semantic concepts using the Jiang–Conrath measure, and a recurrent neural network with an attention mechanism generates the summary. ROUGE scores are used to measure accuracy, precision, and recall.
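The similarity-based filtering step can be sketched as scoring sentences pairwise and dropping near-duplicates before generation. The paper uses the WordNet-based Jiang–Conrath measure; the bag-of-words overlap below is a stand-in placeholder so the sketch stays self-contained, and the threshold is an assumed value:

```python
def sentence_similarity(s1, s2):
    """Placeholder for the semantic (Jiang–Conrath) similarity score:
    Jaccard overlap of the word sets, in [0, 1]."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / max(len(w1 | w2), 1)

def filter_redundant(sentences, threshold=0.8):
    """Keep each sentence unless it is too similar to one already kept,
    i.e. walk the similarity matrix row by row and prune duplicates."""
    kept = []
    for s in sentences:
        if all(sentence_similarity(s, k) < threshold for k in kept):
            kept.append(s)
    return kept

docs = ["the cat sat on the mat",
        "the cat sat on the mat",          # exact duplicate, dropped
        "neural networks generate summaries"]
filtered = filter_redundant(docs)
```

The surviving sentences would then be passed to the attention-equipped recurrent network for abstractive generation.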


2020 ◽  
Author(s):  
Benjamin A. Toms ◽  
Karthik Kashinath ◽  
Da Yang ◽  

Abstract. We test the reliability of two neural network interpretation techniques, backward optimization and layerwise relevance propagation, within geoscientific applications by applying them to a commonly studied geophysical phenomenon, the Madden-Julian Oscillation. The Madden-Julian Oscillation is a multi-scale pattern within the tropical atmosphere that has been extensively studied over the past decades, which makes it an ideal test case to ensure the interpretability methods can recover the current state of knowledge regarding its spatial structure. The neural networks can, indeed, reproduce the current state of knowledge and can also provide new insights into the seasonality of the Madden-Julian Oscillation and its relationships with atmospheric state variables. The neural network identifies the phase of the Madden-Julian Oscillation twice as accurately as linear regression, which means that nonlinearities used by the neural network are important to the structure of the Madden-Julian Oscillation. Interpretations of the neural network show that it accurately captures the spatial structures of the Madden-Julian Oscillation, suggest that the nonlinearities of the Madden-Julian Oscillation are manifested through the uniqueness of each event, and offer physically meaningful insights into its relationship with atmospheric state variables. We also use the interpretations to identify the seasonality of the MJO, and find that the conventionally defined extended seasons should be shifted later by one month. More generally, this study suggests that neural networks can be reliably interpreted for geoscientific applications and may thereby serve as a dependable method for testing geoscientific hypotheses.
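As an illustration of one of the two interpretation techniques named above, layerwise relevance propagation can be sketched for a small fully connected ReLU network: the output score is redistributed backwards, layer by layer, in proportion to each unit's contribution (the epsilon rule). The network, weights, and input below are illustrative assumptions, not the study's trained model; the key property the sketch demonstrates is conservation of relevance:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [8, 6, 1]                 # input -> hidden -> scalar output
Ws = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

# Forward pass (ReLU on the hidden layer, linear output).
x = rng.standard_normal(sizes[0])
acts = [x]
for i, W in enumerate(Ws):
    z = acts[-1] @ W
    acts.append(np.maximum(z, 0.0) if i < len(Ws) - 1 else z)

# Backward relevance propagation (LRP epsilon rule).
eps = 1e-9
R = acts[-1].copy()               # start from the output relevance
for W, a in zip(reversed(Ws), reversed(acts[:-1])):
    z = a @ W                     # pre-activations of the upper layer
    s = R / (z + eps * np.sign(z))
    R = a * (W @ s)               # redistribute to the lower layer

input_relevance = R               # per-input relevance scores
```

Because each redistribution step preserves the total (up to the epsilon stabilizer), the input relevances sum to the network's output score, which is what makes the resulting maps interpretable as attributions.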


Text summarization is an area of research that aims to produce short texts from large text documents. Extractive text summarization methods have been studied extensively by many researchers. Multi-document inputs vary in format, domain, and topic. With the application of neural networks to text generation, interest in abstractive text summarization research has increased significantly; this article attempts the approach for the English and Telugu languages. Recurrent neural networks are a subtype of recursive neural networks that predict the next element of a sequence from the current state and information carried over from previous states; their use allows summaries to be generated for long sentences as well. The work implements semantics-based filtering using a similarity matrix while keeping all stop-words. Similarity is computed from semantic concepts using the Jiang similarity measure, and a Recurrent Neural Network (RNN) with an attention mechanism generates the summary. ROUGE scores are used to measure the performance of the method on Telugu and English.


2020 ◽  
Vol 34 (07) ◽  
pp. 12192-12199 ◽  
Author(s):  
Peisong Wang ◽  
Xiangyu He ◽  
Gang Li ◽  
Tianli Zhao ◽  
Jian Cheng

Binarization of the feature representation is critical for Binarized Neural Networks (BNNs). Currently, the sign function is the commonly used method for feature binarization. Although it works well on small datasets, performance on ImageNet remains unsatisfactory. Previous methods have mainly focused on minimizing quantization error, improving training strategies, and decomposing each convolution layer into several binary convolution modules; whether sign is the only option for binarization has been largely overlooked. In this work, we propose the Sparsity-inducing Binarized Neural Network (Si-BNN), which quantizes activations to either 0 or +1, introducing sparsity into the binary representation. We further introduce trainable thresholds into the backward function of binarization to guide gradient propagation. Our method dramatically outperforms the current state of the art, narrowing the performance gap between full-precision networks and BNNs on mainstream architectures and setting new state-of-the-art results for binarized AlexNet (Top-1 50.5%), ResNet-18 (Top-1 59.7%), and VGG-Net (Top-1 63.2%). At inference time, Si-BNN still enjoys the high efficiency of exclusive-NOR (XNOR) operations.
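The core idea can be sketched in a few lines: the forward pass quantizes activations to {0, +1} via a threshold (instead of sign's {-1, +1}), and the backward pass uses a straight-through-style estimator that passes gradients only inside a window around the threshold. The threshold and window values below are illustrative assumptions, not the paper's trained parameters:

```python
import numpy as np

def binarize_forward(x, threshold=0.0):
    """Sparsity-inducing binarization: activations become 0 or +1,
    so sub-threshold units contribute nothing (sparsity)."""
    return (x > threshold).astype(x.dtype)

def binarize_backward(x, grad_out, threshold=0.0, window=1.0):
    """Straight-through gradient, gated to a window around the
    threshold; in Si-BNN this threshold is trainable."""
    gate = (np.abs(x - threshold) < window).astype(x.dtype)
    return grad_out * gate

x = np.array([-1.5, -0.2, 0.3, 2.0])
b = binarize_forward(x)
g = binarize_backward(x, np.ones_like(x))
```

Activations far from the threshold are saturated and receive no gradient, while those near it keep learning; the {0, +1} codes still admit XNOR-style inference kernels.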


2020 ◽  
Vol 1 (6) ◽  
Author(s):  
Pablo Barros ◽  
Nikhil Churamani ◽  
Alessandra Sciutti

Abstract Current state-of-the-art models for automatic facial expression recognition (FER) are based on very deep neural networks that are effective but expensive to train. Given the dynamic conditions of FER, this characteristic prevents such models from being used for general affect recognition. In this paper, we address this problem by formalizing the FaceChannel, a light-weight neural network with far fewer parameters than common deep neural networks. We introduce an inhibitory layer that helps to shape the learning of facial features in the last layer of the network, improving performance while reducing the number of trainable parameters. To evaluate our model, we perform a series of experiments on different benchmark datasets and demonstrate that the FaceChannel achieves performance comparable, if not superior, to the current state of the art in FER. Our experiments include cross-dataset analysis to estimate how our model behaves under different affective recognition conditions. We conclude with an analysis of how the FaceChannel learns and adapts the learned facial features to the different datasets.


1990 ◽  
Vol 01 (02n03) ◽  
pp. 259-277 ◽  
Author(s):  
G. A. KOHRING

The current state of large-scale numerical simulations of neural networks is reviewed. Hardware and software improvements make it likely that biological-size networks, i.e., networks with more than 10^10 couplings, can be simulated in the near future. Sample programs for the efficient simulation of a few simple models are presented as an aid to researchers just entering the field.

