On the use of neural networks for energy reconstruction in high-granularity calorimeters

2021 ◽  
Vol 16 (12) ◽  
pp. P12036
Author(s):  
N. Akchurin ◽  
C. Cowden ◽  
J. Damgov ◽  
A. Hussain ◽  
S. Kunori

Abstract We contrasted the performance of deep neural networks, a Convolutional Neural Network (CNN) and a Graph Neural Network (GNN), with current state-of-the-art energy regression methods in a finely 3D-segmented calorimeter simulated with GEANT4. This comparative benchmark provides insight into the particular latent signals neural-network methods exploit to achieve superior resolution. A CNN trained solely on a pure sample of pions achieved substantial improvement in energy resolution over the conventional approaches for both single pions and jets, while maintaining good performance for electron and photon reconstruction. We also used a GNN with edge convolution to assess the importance of timing information in the shower development for improved energy reconstruction. In addition, we implemented a simple simulation-based correction to the energy sum, derived from the fraction of energy deposited in the electromagnetic shower component, which serves as an approximate dual-readout analogue for our benchmark comparison. Although this study does not include the simulation of detector effects such as electronic noise, the margin of improvement appears robust enough to suggest these benefits will endure in real-world applications. We also find evidence that the CNN/GNN methods leverage latent features that concur with our current understanding of the physics of calorimeter measurement.
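As a toy illustration of the kind of electromagnetic-fraction correction the abstract describes (not the paper's actual calibration), a non-compensating calorimeter under-responds to the hadronic shower component, so the raw energy sum can be rescaled by the expected response for a given electromagnetic fraction. The response ratio `H_OVER_E = 0.7` below is an assumed value for the sketch:

```python
import numpy as np

# Assumed hadronic-to-electromagnetic response ratio (h/e < 1 for a
# non-compensating calorimeter); illustrative value, not from the paper.
H_OVER_E = 0.7

def correct_energy(e_raw, f_em, h_over_e=H_OVER_E):
    """Scale the raw energy sum by the expected detector response,
    which depends on the fraction f_em deposited electromagnetically."""
    response = f_em + (1.0 - f_em) * h_over_e
    return e_raw / response
```

For a purely electromagnetic shower (f_em = 1) the correction is the identity; the smaller f_em is, the larger the upward correction, which mimics the event-by-event compensation a dual-readout measurement provides.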

2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Georges Aad ◽  
Anne-Sophie Berthold ◽  
Thomas Calvet ◽  
Nemer Chiedde ◽  
Etienne Marie Fortin ◽  
...  

Abstract The ATLAS experiment at the Large Hadron Collider (LHC) is operated at CERN and measures proton–proton collisions at multi-TeV energies with a repetition frequency of 40 MHz. Within the phase-II upgrade of the LHC, the readout electronics of the liquid-argon (LAr) calorimeters of ATLAS are being prepared for high-luminosity operation, with an expected pileup of up to 200 simultaneous proton–proton interactions. Moreover, the calorimeter signals of up to 25 subsequent collisions overlap, which increases the difficulty of energy reconstruction. Real-time processing of digitized pulses sampled at 40 MHz is performed using field-programmable gate arrays (FPGAs). To cope with the signal pileup, new machine learning approaches are explored: convolutional and recurrent neural networks outperform the optimal signal filter currently used, both in assigning the reconstructed energy to the correct proton bunch crossing and in energy resolution. The improvements concern in particular energies derived from overlapping pulses. Since the implementation of the neural networks targets an FPGA, the number of parameters and the mathematical operations need to be well controlled. The trained neural-network structures are converted into FPGA firmware using automated implementations in hardware description language and high-level synthesis tools. Very good agreement between the neural-network implementations in FPGA and software-based calculations is observed. The prototype implementations on an Intel Stratix-10 FPGA reach maximum operating frequencies of 344–640 MHz. Applying time-division multiplexing allows one FPGA to process 390–576 calorimeter channels for the most resource-efficient networks. Moreover, the achieved latency is about 200 ns.
These performance parameters show that a neural-network based energy reconstruction can be considered for the processing of the ATLAS LAr calorimeter signals during the high-luminosity phase of the LHC.
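For context, the "optimal signal filter" baseline the networks are compared against can be sketched as a linear combination of the digitized samples whose weights minimize the noise variance subject to unit response to the pulse shape and zero response to its time derivative. The 5-sample pulse shape and white-noise covariance below are illustrative assumptions, not the detector's actual values:

```python
import numpy as np

# Illustrative normalized pulse shape over 5 samples and its derivative.
g = np.array([0.0, 0.6, 1.0, 0.7, 0.3])
dg = np.gradient(g)
V = np.eye(len(g))  # noise autocorrelation matrix (assumed white here)

# Lagrange-multiplier solution: minimize a^T V a subject to a.g = 1
# (unbiased amplitude) and a.dg = 0 (insensitive to small time shifts).
Vi = np.linalg.inv(V)
q1, q2, q3 = g @ Vi @ g, dg @ Vi @ dg, g @ Vi @ dg
delta = q1 * q2 - q3 ** 2
a = (q2 * (Vi @ g) - q3 * (Vi @ dg)) / delta  # optimal filter coefficients

def reconstruct_amplitude(samples):
    """Energy estimate: dot product of the filter coefficients with the
    digitized samples."""
    return float(a @ samples)
```

By construction `a @ g == 1` and `a @ dg == 0`, so a noiseless pulse of amplitude A is reconstructed exactly as A; the networks in the abstract improve on this linear estimator when pulses from neighbouring bunch crossings overlap.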


2020 ◽  
Vol 34 (04) ◽  
pp. 5306-5314
Author(s):  
Takamasa Okudono ◽  
Masaki Waga ◽  
Taro Sekiyama ◽  
Ichiro Hasuo

We present a method to extract a weighted finite automaton (WFA) from a recurrent neural network (RNN). Our method is based on the WFA learning algorithm by Balle and Mohri, which is in turn an extension of Angluin's classic L* algorithm. Our technical novelty is in the use of regression methods for the so-called equivalence queries, thus exploiting the internal state space of an RNN to prioritize counterexample candidates. This way we achieve a quantitative/weighted extension of the recent work by Weiss, Goldberg and Yahav that extracts DFAs. We experimentally evaluate the accuracy, expressivity and efficiency of the extracted WFAs.
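To ground the terminology, a WFA computes a real-valued function f(w) = alpha^T A_w1 ... A_wn beta, and the equivalence query asks whether a hypothesis WFA agrees with the target network. The sketch below shows a sampled stand-in for that query, with a small WFA playing the role of the RNN oracle; the regression over internal states that the paper uses to prioritize counterexamples is not reproduced here, and all automata are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

class WFA:
    """Weighted finite automaton: f(w) = alpha^T A_w1 ... A_wn beta."""
    def __init__(self, alpha, transitions, beta):
        self.alpha, self.transitions, self.beta = alpha, transitions, beta
    def weight(self, word):
        v = self.alpha
        for sym in word:
            v = v @ self.transitions[sym]
        return float(v @ self.beta)

# Target oracle (stand-in for the RNN): f(w) = (#a's in w) * 0.5^|w|.
target = WFA(np.array([1.0, 0.0]),
             {"a": np.array([[0.5, 0.5], [0.0, 0.5]]),
              "b": np.array([[0.5, 0.0], [0.0, 0.5]])},
             np.array([0.0, 1.0]))

def equivalence_query(hypothesis, oracle, alphabet="ab",
                      n_samples=200, tol=1e-6):
    """Sampled approximation of the equivalence query: return a word on
    which hypothesis and oracle disagree, or None if none is found."""
    for _ in range(n_samples):
        w = "".join(rng.choice(list(alphabet), size=rng.integers(0, 8)))
        if abs(hypothesis.weight(w) - oracle.weight(w)) > tol:
            return w
    return None

# A deliberately wrong hypothesis (different final weights) is refuted.
wrong = WFA(target.alpha, target.transitions, np.array([1.0, 1.0]))
cex = equivalence_query(wrong, target)
```

In the actual algorithm the counterexample returned here would be fed back into the Balle–Mohri learner to refine the hypothesis WFA.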


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Bernard Lapeyre ◽  
Jérôme Lelong

Abstract The pricing of Bermudan options amounts to solving a dynamic programming principle, in which the main difficulty, especially in high dimension, comes from the conditional expectation involved in the computation of the continuation value. These conditional expectations are classically computed by regression techniques on a finite-dimensional vector space. In this work, we study neural network approximations of conditional expectations. We prove the convergence of the well-known Longstaff–Schwartz algorithm when the standard least-squares regression is replaced by a neural network approximation, assuming an efficient algorithm to compute this approximation. We illustrate the numerical efficiency of neural networks as an alternative to standard regression methods for approximating conditional expectations on several numerical examples.
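To make the algorithmic setting concrete, here is a minimal sketch of the Longstaff–Schwartz backward induction for a Bermudan put, with the regression step isolated so that a neural network could be substituted for the polynomial least-squares fit as the paper proposes. All market parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # illustrative market data
n_steps, n_paths = 10, 20000
dt = T / n_steps

# Simulate geometric Brownian motion paths of the underlying.
z = rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1))
S = np.hstack([np.full((n_paths, 1), S0), S])

def fit_continuation(x, y):
    """Regress discounted future cash flows on the current state.
    Polynomial least squares here; a neural network would replace this."""
    coeffs = np.polyfit(x, y, deg=3)
    return np.polyval(coeffs, x)

payoff = lambda s: np.maximum(K - s, 0.0)
cash = payoff(S[:, -1])                    # exercise value at maturity
for t in range(n_steps - 1, 0, -1):
    cash *= np.exp(-r * dt)                # discount one step back
    exercise = payoff(S[:, t])
    itm = exercise > 0                     # regress on in-the-money paths
    cont = np.full_like(cash, np.inf)
    cont[itm] = fit_continuation(S[itm, t], cash[itm])
    do_ex = exercise > cont                # exercise beats continuing
    cash[do_ex] = exercise[do_ex]

price = float(np.exp(-r * dt) * cash.mean())
```

The continuation value at each exercise date is exactly the conditional expectation discussed in the abstract; the paper's convergence result covers replacing `fit_continuation` by a trained neural network.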


Author(s):  
Qingyu Tian ◽  
Mao Ding ◽  
Hui Yang ◽  
Caibin Yue ◽  
Yue Zhong ◽  
...  

Background: Drug development is costly and time-consuming, and its outcome is uncertain, so researchers urgently need approaches that reduce costs. The identification of drug-target interactions (DTIs) has therefore become a critical step in the early stages of drug discovery. Computational methods aim to narrow the search space for novel DTIs and to elucidate the functional background of drugs. Most methods developed so far use binary classification to predict the presence or absence of an interaction between a drug and a target. However, it is more informative, but also more challenging, to predict the strength of the binding between a drug and its target: if the binding is not strong enough, the DTI may not be useful. Hence, the development of methods to predict drug-target affinity (DTA) is of significant importance.
Method: We improved the Graph DTA model from a dual-channel model to a triple-channel model. We interpret the target/protein sequence as a time series and extract its features using an LSTM network. For the drug, we consider both the molecular structure and the local chemical background, retaining the four variant networks used in Graph DTA to extract the topological features of the drug and capturing the local chemical background of its atoms with a BiGRU. We thus obtain one latent feature vector for the target and two for the drug. The concatenation of these three feature vectors is fed into a 2-layer fully connected (FC) network, which outputs the binding affinity.
Result: We use the Davis and Kiba datasets, with 80% of the data for training and 20% for validation. Comparison with the published Graph DTA results shows that our model performs better.
Conclusion: In this paper, we altered the Graph DTA model to predict drug-target affinity.
The model represents the drug as a graph and extracts two-dimensional drug information using a graph convolutional neural network. At the same time, the drug and the protein target are represented as word vectors, and recurrent networks are used to extract their sequence information. We demonstrate that our improved method performs better than the original method, in particular in the evaluation on the benchmark databases.
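As a shape-level illustration (not the trained model), the triple-channel fusion described above concatenates one protein latent vector and two drug latent vectors and passes them through a 2-layer FC head that outputs a scalar affinity. The feature sizes and random weights below are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed latent sizes for the three channels and the hidden FC layer.
d_protein, d_graph, d_bigru, d_hidden = 128, 128, 128, 256

W1 = rng.standard_normal((d_protein + d_graph + d_bigru, d_hidden)) * 0.01
b1 = np.zeros(d_hidden)
W2 = rng.standard_normal((d_hidden, 1)) * 0.01
b2 = np.zeros(1)

def predict_affinity(h_protein, h_graph, h_bigru):
    """Concatenate the LSTM (protein), graph, and BiGRU (drug) latent
    vectors and apply the 2-layer FC head to get a scalar affinity."""
    h = np.concatenate([h_protein, h_graph, h_bigru])
    h = np.maximum(W1.T @ h + b1, 0.0)     # ReLU hidden layer
    return float((W2.T @ h + b2)[0])

affinity = predict_affinity(rng.standard_normal(d_protein),
                            rng.standard_normal(d_graph),
                            rng.standard_normal(d_bigru))
```

In the real model the three input vectors would come from the trained LSTM, graph, and BiGRU encoders respectively; only the fusion arithmetic is shown here.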


2020 ◽  
Vol 17 (9) ◽  
pp. 3867-3872
Author(s):  
Aniv Chakravarty ◽  
Jagadish S. Kallimani

Text summarization is an active field of research that aims to provide short, meaningful gists of large collections of text documents. Extractive text summarization methods, which build summaries from text extracted from the documents, have been studied extensively. Multi-document inputs vary widely in format, domain, and topic. With recent advances in neural text generation, interest in abstractive text summarization research has increased significantly, and graph-based methods that handle semantic information have shown significant results. Given a set of English text documents, we use an abstractive method with predicate-argument structures to retrieve the necessary text information and pass it through a neural network for text generation. Recurrent neural networks are a subtype of recursive neural networks that predict the next element of a sequence from the current state and information carried over from previous states; their use allows summaries to be generated for long sentences as well. This paper implements a semantics-based filtering approach using a similarity matrix while keeping all stop-words. Similarity is computed from semantic concepts using the Jiang–Conrath measure, and a recurrent neural network with an attention mechanism generates the summary. ROUGE scores are used to measure accuracy, precision, and recall.
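The similarity-based filtering step can be sketched as scoring sentences pairwise and dropping near-duplicates before generation. The paper uses the WordNet-based Jiang–Conrath measure; the bag-of-words overlap below is a stand-in placeholder so the sketch stays self-contained, and the threshold is an assumed value:

```python
def sentence_similarity(s1, s2):
    """Placeholder for the semantic (Jiang–Conrath) similarity score:
    Jaccard overlap of the word sets, in [0, 1]."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / max(len(w1 | w2), 1)

def filter_redundant(sentences, threshold=0.8):
    """Keep each sentence unless it is too similar to one already kept,
    i.e. walk the similarity matrix row by row and prune duplicates."""
    kept = []
    for s in sentences:
        if all(sentence_similarity(s, k) < threshold for k in kept):
            kept.append(s)
    return kept

docs = ["the cat sat on the mat",
        "the cat sat on the mat",          # exact duplicate, dropped
        "neural networks generate summaries"]
filtered = filter_redundant(docs)
```

The surviving sentences would then be passed to the attention-equipped recurrent network for abstractive generation.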


2020 ◽  
Author(s):  
Benjamin A. Toms ◽  
Karthik Kashinath ◽  
Da Yang ◽  

Abstract. We test the reliability of two neural network interpretation techniques, backward optimization and layerwise relevance propagation, within geoscientific applications by applying them to a commonly studied geophysical phenomenon, the Madden-Julian Oscillation. The Madden-Julian Oscillation is a multi-scale pattern within the tropical atmosphere that has been extensively studied over the past decades, which makes it an ideal test case to ensure the interpretability methods can recover the current state of knowledge regarding its spatial structure. The neural networks can, indeed, reproduce the current state of knowledge and can also provide new insights into the seasonality of the Madden-Julian Oscillation and its relationships with atmospheric state variables. The neural network identifies the phase of the Madden-Julian Oscillation twice as accurately as linear regression, which means that nonlinearities used by the neural network are important to the structure of the Madden-Julian Oscillation. Interpretations of the neural network show that it accurately captures the spatial structures of the Madden-Julian Oscillation, suggest that the nonlinearities of the Madden-Julian Oscillation are manifested through the uniqueness of each event, and offer physically meaningful insights into its relationship with atmospheric state variables. We also use the interpretations to identify the seasonality of the MJO, and find that the conventionally defined extended seasons should be shifted later by one month. More generally, this study suggests that neural networks can be reliably interpreted for geoscientific applications and may thereby serve as a dependable method for testing geoscientific hypotheses.
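As an illustration of one of the two interpretation techniques named above, layerwise relevance propagation can be sketched for a small fully connected ReLU network: the output score is redistributed backwards, layer by layer, in proportion to each unit's contribution (the epsilon rule). The network, weights, and input below are illustrative assumptions, not the study's trained model; the key property the sketch demonstrates is conservation of relevance:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [8, 6, 1]                 # input -> hidden -> scalar output
Ws = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

# Forward pass (ReLU on the hidden layer, linear output).
x = rng.standard_normal(sizes[0])
acts = [x]
for i, W in enumerate(Ws):
    z = acts[-1] @ W
    acts.append(np.maximum(z, 0.0) if i < len(Ws) - 1 else z)

# Backward relevance propagation (LRP epsilon rule).
eps = 1e-9
R = acts[-1].copy()               # start from the output relevance
for W, a in zip(reversed(Ws), reversed(acts[:-1])):
    z = a @ W                     # pre-activations of the upper layer
    s = R / (z + eps * np.sign(z))
    R = a * (W @ s)               # redistribute to the lower layer

input_relevance = R               # per-input relevance scores
```

Because each redistribution step preserves the total (up to the epsilon stabilizer), the input relevances sum to the network's output score, which is what makes the resulting maps interpretable as attributions.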


Text summarization is an area of research that aims to produce short texts from large text documents. Extractive text summarization methods have been studied extensively by many researchers. Multi-document inputs vary in format, domain, and topic. With the application of neural networks to text generation, interest in abstractive text summarization research has increased significantly; this article attempts the approach for the English and Telugu languages. Recurrent neural networks are a subtype of recursive neural networks that predict the next element of a sequence from the current state and information carried over from previous states; their use allows summaries to be generated for long sentences as well. The work implements semantics-based filtering using a similarity matrix while keeping all stop-words. Similarity is computed from semantic concepts using the Jiang similarity measure, and a Recurrent Neural Network (RNN) with an attention mechanism generates the summary. ROUGE scores are used to measure the performance of the method on Telugu and English.


2020 ◽  
Vol 34 (07) ◽  
pp. 12192-12199 ◽  
Author(s):  
Peisong Wang ◽  
Xiangyu He ◽  
Gang Li ◽  
Tianli Zhao ◽  
Jian Cheng

Binarization of the feature representation is critical for Binarized Neural Networks (BNNs). Currently, the sign function is the commonly used method for feature binarization. Although it works well on small datasets, performance on ImageNet remains unsatisfactory. Previous methods have mainly focused on minimizing quantization error, improving training strategies, and decomposing each convolution layer into several binary convolution modules; whether sign is the only option for binarization has been largely overlooked. In this work, we propose the Sparsity-inducing Binarized Neural Network (Si-BNN), which quantizes activations to either 0 or +1, introducing sparsity into the binary representation. We further introduce trainable thresholds into the backward function of binarization to guide gradient propagation. Our method dramatically outperforms the current state of the art, narrowing the performance gap between full-precision networks and BNNs on mainstream architectures and setting new state-of-the-art results for binarized AlexNet (Top-1 50.5%), ResNet-18 (Top-1 59.7%), and VGG-Net (Top-1 63.2%). At inference time, Si-BNN still enjoys the high efficiency of exclusive-NOR (XNOR) operations.
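The core idea can be sketched in a few lines: the forward pass quantizes activations to {0, +1} via a threshold (instead of sign's {-1, +1}), and the backward pass uses a straight-through-style estimator that passes gradients only inside a window around the threshold. The threshold and window values below are illustrative assumptions, not the paper's trained parameters:

```python
import numpy as np

def binarize_forward(x, threshold=0.0):
    """Sparsity-inducing binarization: activations become 0 or +1,
    so sub-threshold units contribute nothing (sparsity)."""
    return (x > threshold).astype(x.dtype)

def binarize_backward(x, grad_out, threshold=0.0, window=1.0):
    """Straight-through gradient, gated to a window around the
    threshold; in Si-BNN this threshold is trainable."""
    gate = (np.abs(x - threshold) < window).astype(x.dtype)
    return grad_out * gate

x = np.array([-1.5, -0.2, 0.3, 2.0])
b = binarize_forward(x)
g = binarize_backward(x, np.ones_like(x))
```

Activations far from the threshold are saturated and receive no gradient, while those near it keep learning; the {0, +1} codes still admit XNOR-style inference kernels.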


2020 ◽  
Vol 1 (6) ◽  
Author(s):  
Pablo Barros ◽  
Nikhil Churamani ◽  
Alessandra Sciutti

Abstract Current state-of-the-art models for automatic facial expression recognition (FER) are based on very deep neural networks that are effective but expensive to train. Given the dynamic conditions of FER, this characteristic prevents such models from being used for general affect recognition. In this paper, we address this problem by formalizing the FaceChannel, a light-weight neural network with far fewer parameters than common deep neural networks. We introduce an inhibitory layer that helps to shape the learning of facial features in the last layer of the network, improving performance while reducing the number of trainable parameters. To evaluate our model, we perform a series of experiments on different benchmark datasets and demonstrate that the FaceChannel achieves performance comparable, if not superior, to the current state of the art in FER. Our experiments include cross-dataset analysis to estimate how our model behaves under different affective recognition conditions. We conclude with an analysis of how the FaceChannel learns and adapts the learned facial features to the different datasets.


1990 ◽  
Vol 01 (02n03) ◽  
pp. 259-277 ◽  
Author(s):  
G. A. KOHRING

The current state of large-scale numerical simulations of neural networks is reviewed. Hardware and software improvements make it likely that biological-size networks, i.e., networks with more than 10^10 couplings, can be simulated in the near future. Sample programs for the efficient simulation of a few simple models are presented as an aid to researchers just entering the field.

