Fast simulation methods in ATLAS: from classical to generative models

2020 ◽  
Vol 245 ◽  
pp. 02035
Author(s):  
John Chapman ◽  
Kyle Cranmer ◽  
Stefan Gadatsch ◽  
Tobias Golling ◽  
Aishik Ghosh ◽  
...  

The ATLAS physics program relies on very large samples of Geant4 simulated events, which provide a highly detailed and accurate simulation of the ATLAS detector. However, this accuracy comes with a high price in CPU, and the sensitivity of many physics analyses is already limited by the available Monte Carlo statistics and will be even more so in the future. Therefore, sophisticated fast simulation tools have been developed. In Run 3 we aim to replace the calorimeter shower simulation for most samples with a new parametrised description of longitudinal and lateral energy deposits, including machine learning approaches, to achieve a fast and accurate description. Looking further ahead, prototypes are being developed using cutting-edge machine learning approaches to learn the appropriate calorimeter response, which are expected to improve the modeling of correlations within showers. Two different approaches, using Variational Auto-Encoders (VAEs) or Generative Adversarial Networks (GANs), are trained to model the shower simulation. Additional fast simulation tools will replace the inner detector simulation, as well as digitization and reconstruction algorithms, achieving up to two orders of magnitude improvement in speed. In this talk, we will describe the new tools for fast production of simulated events and an exploratory analysis of the deep learning methods.
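Parametrised longitudinal shower descriptions of the kind mentioned above are commonly modelled with a Gamma-distribution profile (as in GFlash-style parameterisations). The sketch below is a minimal illustration of that idea, not the ATLAS tool itself; the shape parameters `alpha` and `beta` and the toy layer boundaries are purely illustrative assumptions.

```python
import numpy as np
from math import gamma

def longitudinal_profile(t, alpha, beta):
    """Gamma-distribution shower shape:
    dE/dt = beta * (beta*t)**(alpha-1) * exp(-beta*t) / Gamma(alpha),
    with t the shower depth in radiation lengths."""
    return beta * (beta * t) ** (alpha - 1) * np.exp(-beta * t) / gamma(alpha)

alpha, beta = 4.0, 0.5                     # illustrative values, not ATLAS-tuned
depth = np.linspace(0.01, 30.0, 3000)      # depth grid in radiation lengths
dE = longitudinal_profile(depth, alpha, beta)

# Energy fraction deposited in each layer of a toy 5-layer calorimeter
edges = [0.0, 2.0, 6.0, 12.0, 20.0, 30.0]
dt = depth[1] - depth[0]
fractions = [dE[(depth >= a) & (depth < b)].sum() * dt
             for a, b in zip(edges[:-1], edges[1:])]
```

The profile peaks at depth (alpha − 1)/beta, and the per-layer fractions sum to (approximately) one, which is what a parametrisation of the longitudinal energy deposit needs to reproduce.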

2020 ◽  
Vol 245 ◽  
pp. 02002
Author(s):  
Sean Gasiorowski ◽  
Heather Gray

The ATLAS physics program at the LHC relies on very large samples of simulated events. Most of these samples are produced with Geant4, which provides a highly detailed and accurate simulation of the ATLAS detector. However, this accuracy comes with a high price in CPU, and the sensitivity of many physics analyses is already limited by the available Monte Carlo statistics and will be even more so in the future as datasets grow. To solve this problem, sophisticated fast simulation tools have been developed, and they will become the default tools in ATLAS production in Run 3 and beyond. The slowest component is the simulation of the calorimeter showers. These are replaced by a new parametrised description of the longitudinal and lateral energy deposits, including machine learning approaches, achieving a fast but accurate description. In this talk we will describe the new tool for fast calorimeter simulation that has been developed by ATLAS, review its technical and physics performance, and demonstrate its potential to transform physics analyses.


2019 ◽  
Vol 214 ◽  
pp. 02010 ◽  
Author(s):  
Sofia Vallecorsa ◽  
Federico Carminati ◽  
Gulrukh Khattak

Machine Learning techniques have been used in different applications by the HEP community: in this talk, we discuss the case of detector simulation. The need for simulated events, expected in the future for LHC experiments and their High Luminosity upgrades, is increasing dramatically and requires new fast simulation solutions. We describe an R&D activity aimed at providing a configurable tool capable of training a neural network to reproduce the detector response and speed up standard Monte Carlo simulation. The approach is generic: such a network could be designed and trained to simulate any kind of detector and, eventually, the whole data-processing chain, yielding the final reconstructed quantities directly in one step and in a small fraction of the time. We present the first application of three-dimensional convolutional Generative Adversarial Networks to the simulation of high-granularity electromagnetic calorimeters. We describe detailed validation studies comparing our results to Geant4 Monte Carlo simulation. Finally we show how this tool could be generalized to describe a whole class of calorimeters, opening the way to a generic machine learning based fast simulation approach.
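To make the shape of the problem concrete: a conditional generator for such a calorimeter maps a latent vector plus the primary-particle energy to a 3D tensor of energy deposits. The untrained toy below only illustrates that input/output structure with a single ReLU layer in plain NumPy; the layer sizes, the 8×8×8 grid, and all weights are hypothetical, not the architecture of the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(z, e_primary, w1, w2):
    """Toy conditional generator: latent z concatenated with the primary
    energy is mapped to a non-negative 8x8x8 'shower' tensor."""
    x = np.concatenate([z, [e_primary]])
    h = np.maximum(w1 @ x, 0.0)        # ReLU hidden layer
    out = np.maximum(w2 @ h, 0.0)      # energy deposits must be >= 0
    return out.reshape(8, 8, 8)

latent_dim, hidden = 16, 64
w1 = rng.normal(scale=0.1, size=(hidden, latent_dim + 1))
w2 = rng.normal(scale=0.1, size=(8 * 8 * 8, hidden))
shower = toy_generator(rng.normal(size=latent_dim), 50.0, w1, w2)
```

In the real application the dense layers are replaced by 3D transposed convolutions and the weights are learned adversarially against Geant4 showers.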


2020 ◽  
Vol 245 ◽  
pp. 02026
Author(s):  
Fedor Ratnikov

LHCb is one of the major experiments operating at the Large Hadron Collider at CERN. The richness of the physics program and the increasing precision of the measurements in LHCb lead to the need for ever larger simulated samples. This need will increase further when the upgraded LHCb detector starts collecting data in LHC Run 3. Given the computing resources pledged for the production of Monte Carlo simulated events in the coming years, the use of fast simulation techniques will be mandatory to cope with the expected dataset size. Generative models, which are nowadays widely used for computer vision and image processing, are being investigated in LHCb to accelerate the generation of showers in the calorimeter and the high-level response of the Cherenkov detectors. We demonstrate that this approach provides high-fidelity results and discuss possible implications of these results. We also present an implementation of this algorithm in the LHCb simulation software and validation tests.


2021 ◽  
Vol 251 ◽  
pp. 03055
Author(s):  
John Blue ◽  
Braden Kronheim ◽  
Michelle Kuchera ◽  
Raghuram Ramanujan

Detector simulation in high energy physics experiments is a key yet computationally expensive step in the event simulation process. There has been much recent interest in using deep generative models as a faster alternative to the full Monte Carlo simulation process in situations in which the utmost accuracy is not necessary. In this work we investigate the use of conditional Wasserstein Generative Adversarial Networks to simulate both hadronization and the detector response to jets. Our model takes the 4-momenta of jets formed from partons post-showering and pre-hadronization as inputs and predicts the 4-momenta of the corresponding reconstructed jet. Our model is trained on fully simulated tt̄ events using the publicly available GEANT-based simulation of the CMS Collaboration. We demonstrate that the model produces accurate conditional reconstructed jet transverse momentum (pT) distributions over a wide range of pT for the input parton jet. Our model takes only a fraction of the time necessary for conventional detector simulation methods, running on a CPU in less than a millisecond per event.
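A conditional Wasserstein GAN of this kind is usually trained with the gradient-penalty formulation of Gulrajani et al.; writing the condition as c (here, the parton-jet 4-momenta), the critic and generator losses take the form below. The exact loss details of the cited model are not given in the abstract, so this is the standard WGAN-GP objective, not necessarily theirs verbatim.

```latex
\mathcal{L}_D = \mathbb{E}_{z\sim p_z}\!\big[D(G(z\mid c),\,c)\big]
              - \mathbb{E}_{x\sim p_{\text{data}}}\!\big[D(x,\,c)\big]
              + \lambda\,\mathbb{E}_{\hat{x}}\!\big[(\lVert \nabla_{\hat{x}} D(\hat{x},\,c)\rVert_2 - 1)^2\big],
\qquad
\mathcal{L}_G = -\,\mathbb{E}_{z\sim p_z}\!\big[D(G(z\mid c),\,c)\big],
```

where x̂ is sampled uniformly along straight lines between real and generated samples, and λ weights the gradient penalty that enforces the critic's Lipschitz constraint.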


2017 ◽  
Author(s):  
Takafumi Arakaki ◽  
G. Barello ◽  
Yashar Ahmadian

Tuning curves characterizing the response selectivities of biological neurons often exhibit large degrees of irregularity and diversity across neurons. Theoretical network models that feature heterogeneous cell populations or random connectivity also give rise to diverse tuning curves. However, a general framework for fitting such models to experimentally measured tuning curves is lacking. We address this problem by proposing to view mechanistic network models as generative models whose parameters can be optimized to fit the distribution of experimentally measured tuning curves. A major obstacle for fitting such models is that their likelihood function is not explicitly available or is highly intractable to compute. Recent advances in machine learning provide ways for fitting generative models without the need to evaluate the likelihood and its gradient. Generative Adversarial Networks (GAN) provide one such framework which has been successful in traditional machine learning tasks. We apply this approach in two separate experiments, showing how GANs can be used to fit commonly used mechanistic models in theoretical neuroscience to datasets of measured tuning curves. This fitting procedure avoids the computationally expensive step of inferring latent variables, e.g., the biophysical parameters of individual cells or the particular realization of the full synaptic connectivity matrix, and directly learns model parameters which characterize the statistics of connectivity or of single-cell properties. Another strength of this approach is that it fits the entire, joint distribution of experimental tuning curves, instead of matching a few summary statistics picked a priori by the user. More generally, this framework opens the door to fitting theoretically motivated dynamical network models directly to simultaneously or non-simultaneously recorded neural responses.


2021 ◽  
Author(s):  
Fabio Urbina ◽  
Christopher Lowden ◽  
Christopher Culberson ◽  
Sean Ekins

Drug discovery is a multi-stage process, often beginning with the identification of active molecules from a high-throughput screen or machine learning model. Once structure-activity relationship trends become well established, identifying new analogs with better properties is important. Synthesizing these new compounds is a logical next step, and is key to research groups that have a synthetic chemistry team or external collaborators. Generative machine learning models have become widely adopted to generate new molecules and explore molecular space, with the goal of discovering novel compounds with desired properties. These generative models have been built from recurrent neural networks (RNNs), Variational Autoencoders (VAEs), and Generative Adversarial Networks (GANs), and are often combined with transfer learning or scoring of physicochemical properties to steer generative design. While these generative models have proven useful in generating new molecular libraries, often they are not capable of addressing a wide variety of potential problems, and often converge into similar molecular space when combined with a scoring function for desired properties. In addition, generated compounds are often not synthetically feasible, reducing their usefulness outside of virtual composition and limiting their applicability in real-world scenarios. Here we introduce a suite of automated tools called MegaSyn comprising three components: a new hill-climb algorithm which makes use of SMILES-based RNN generative models, analog generation software, and retrosynthetic analysis coupled with fragment analysis to score molecules for their synthetic feasibility. We describe the development and testing of this suite of tools and propose how they might be used to optimize molecules or prioritize promising lead compounds using test case examples.
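The hill-climb idea referenced above can be reduced to an accept-if-better loop over candidate strings. The actual MegaSyn algorithm fine-tunes an RNN generative model toward higher-scoring molecules; the sketch below only illustrates the loop itself, and `toy_score`, `mutate`, and the three-character vocabulary are hypothetical stand-ins, not chemically meaningful.

```python
import random

def toy_score(smiles: str) -> float:
    """Hypothetical scoring stand-in; real pipelines score
    physicochemical properties and synthetic feasibility."""
    return smiles.count("C") - 0.5 * smiles.count("(")

def mutate(smiles: str) -> str:
    """Toy mutation: drop a random character or append one
    from a tiny vocabulary."""
    vocab = ["C", "O", "N"]
    if smiles and random.random() < 0.3:
        i = random.randrange(len(smiles))
        return smiles[:i] + smiles[i + 1:]
    return smiles + random.choice(vocab)

def hill_climb(seed: str, steps: int = 200) -> str:
    """Keep the current best candidate; accept a mutation
    only if it improves the score."""
    random.seed(0)
    best, best_score = seed, toy_score(seed)
    for _ in range(steps):
        cand = mutate(best)
        score = toy_score(cand)
        if score > best_score:
            best, best_score = cand, score
    return best

result = hill_climb("CCO")
```

By construction the returned candidate never scores worse than the seed; the interesting design question, as the abstract notes, is keeping such greedy optimization from collapsing into a narrow region of molecular space.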


Energies ◽  
2020 ◽  
Vol 13 (18) ◽  
pp. 4868
Author(s):  
Raghuram Kalyanam ◽  
Sabine Hoffmann

Solar radiation data is essential for the development of many solar energy applications, ranging from thermal collectors to building simulation tools, but its availability is limited, especially for the diffuse radiation component. There are several studies aimed at predicting this value, but very few cover the generalizability of such models across varying climates. Our study investigates how well these models generalize and also shows how to enhance their generalizability across different climates. Since machine learning approaches are known to generalize well, we apply them to understand how well they perform on climates other than those they were originally trained on. Therefore, we trained them on datasets from the U.S. and tested them on several European climates. The machine learning model developed for U.S. climates not only showed a low mean absolute error (MAE) of 23 W/m2, but also generalized very well to European climates, with MAE in the range of 20 to 27 W/m2. Further investigation into the factors influencing the generalizability revealed that careful selection of the training data can improve the results significantly.
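For reference, the mean absolute error quoted above (in W/m²) is simply the mean of the absolute prediction errors. A minimal sketch, with made-up diffuse-radiation values for illustration:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, the metric quoted in W/m^2 above."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy measured vs. predicted diffuse radiation (W/m^2), illustrative only:
# errors 10, 10, 20 -> MAE = 40/3
example_mae = mae([100, 200, 300], [110, 190, 320])
```

Because MAE keeps the units of the target, the reported 20-27 W/m² values can be read directly against typical diffuse-radiation magnitudes.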


Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 292 ◽  
Author(s):  
Md Zahangir Alom ◽  
Tarek M. Taha ◽  
Chris Yakopcic ◽  
Stefan Westberg ◽  
Paheding Sidike ◽  
...  

In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many others. This paper presents a brief survey of the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), the Auto-Encoder (AE), the Deep Belief Network (DBN), the Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we discuss recent developments, such as advanced variant DL techniques based on these approaches. This work considers most of the papers published after 2012, when the current wave of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We also include recently developed frameworks, SDKs, and benchmark datasets used for implementing and evaluating deep learning approaches. Some surveys have been published on DL using neural networks, as well as a survey on Reinforcement Learning (RL). However, those papers have not discussed individual advanced techniques for training large-scale deep learning models or the recently developed generative models.


2021 ◽  
Vol 13 (19) ◽  
pp. 4011
Author(s):  
Husam A. H. Al-Najjar ◽  
Biswajeet Pradhan ◽  
Raju Sarkar ◽  
Ghassan Beydoun ◽  
Abdullah Alamri

Landslide susceptibility mapping has significantly progressed with improvements in machine learning techniques. However, the inventory/data imbalance (DI) problem remains one of the challenges in this domain. This problem exists because a good-quality landslide inventory map, including a complete record of historical data, is difficult or expensive to collect, which can considerably affect one's ability to obtain a sufficient inventory or representative samples. This research developed a new approach based on generative adversarial networks (GAN) to correct imbalanced landslide datasets. The proposed method was tested at Chukha Dzongkhag, Bhutan, one of the most landslide-prone areas in the Himalayan region. The proposed approach was then compared with standard methods such as the synthetic minority oversampling technique (SMOTE), dense imbalanced sampling, and sparse sampling (i.e., producing as many non-landslide samples as landslide samples). The comparisons were based on five machine learning models: artificial neural networks (ANN), random forests (RF), decision trees (DT), k-nearest neighbours (kNN), and the support vector machine (SVM). The model evaluation was carried out based on overall accuracy (OA), Kappa Index, F1-score, and area under the receiver operating characteristic curve (AUROC). The spatial database was established with a total of 269 landslides and 10 conditioning factors, including altitude, slope, aspect, total curvature, slope length, lithology, distance from the road, distance from the stream, topographic wetness index (TWI), and sediment transport index (STI). The findings of this study show that both the GAN and SMOTE data balancing approaches helped to improve the accuracy of the machine learning models. According to AUROC, the GAN method boosted the models to maximum accuracies of 0.918 (ANN), 0.933 (RF), 0.927 (DT), 0.878 (kNN), and 0.907 (SVM) when default parameters were used.
With optimum parameters, all models performed best with GAN, reaching their highest accuracies of 0.927 (ANN), 0.943 (RF), 0.923 (DT), and 0.889 (kNN), except for SVM, which obtained its highest accuracy (0.906) with SMOTE. Our findings suggest that RF balanced with GAN provides the most reasonable criterion for landslide prediction. This research indicates that landslide data balancing may substantially affect the predictive capabilities of machine learning models. Therefore, the issue of DI in the spatial prediction of landslides should not be ignored. Future studies could explore other generative models for landslide data balancing. By using state-of-the-art GAN, the proposed model can be considered in areas where the data are limited or imbalanced.
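The SMOTE baseline used in the comparison above oversamples the minority class by interpolating each sample toward one of its k nearest minority-class neighbours. A minimal NumPy sketch of that idea (not the library implementation; the toy 2-feature "landslide" points are illustrative assumptions):

```python
import numpy as np

def smote_like(minority, n_new, k=3, rng=None):
    """SMOTE-style oversampling: for each synthetic point, pick a random
    minority sample, one of its k nearest minority neighbours, and a
    random point on the segment between them."""
    rng = rng or np.random.default_rng(0)
    X = np.asarray(minority, dtype=float)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        dists = np.linalg.norm(X - X[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                        # interpolation factor in [0, 1)
        out.append(X[i] + lam * (X[j] - X[i]))
    return np.array(out)

# Toy minority-class (landslide) samples with 2 conditioning factors
landslides = np.array([[0.10, 0.20], [0.20, 0.10], [0.15, 0.25], [0.30, 0.20]])
synthetic = smote_like(landslides, n_new=6)
```

Because synthetic points lie on segments between existing minority samples, they stay inside the region the minority class already occupies, whereas a GAN learns the minority distribution and can generate samples off those segments.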


2022 ◽  
Vol 54 (8) ◽  
pp. 1-49
Author(s):  
Abdul Jabbar ◽  
Xi Li ◽  
Bourahla Omar

Generative models have gained considerable attention in unsupervised learning via a new and practical framework called Generative Adversarial Networks (GAN), due to their outstanding data generation capability. Many GAN models have been proposed, and several practical applications have emerged in various domains of computer vision and machine learning. Despite GANs' excellent success, there are still obstacles to stable training. These problems include failure to converge to a Nash equilibrium, internal covariate shift, mode collapse, vanishing gradients, and the lack of proper evaluation metrics. Therefore, stable training is a crucial issue for the success of GANs in different applications. Herein, we survey several training solutions proposed by different researchers to stabilize GAN training. We discuss (I) the original GAN model and its modified versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions. Finally, we highlight several open issues and outline future research directions on the topic.
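The training obstacles surveyed above all trace back to the original GAN minimax objective (Goodfellow et al.):

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\!\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\!\big[\log\!\big(1 - D(G(z))\big)\big].
```

When the discriminator D saturates (confidently rejecting all generated samples), the gradient of the second term with respect to the generator vanishes; and because the objective rewards G for any samples D accepts, G can collapse onto a few modes of the data distribution rather than covering all of it. These are the vanishing-gradient and mode-collapse problems the surveyed training solutions target.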

