Enabling Data Intensive Science on Supercomputers for High Energy Physics R&D Projects in HL-LHC Era

2020, Vol 226, pp. 01007
Author(s): Alexei Klimentov, Douglas Benjamin, Alessandro Di Girolamo, Kaushik De, Johannes Elmsheuser, ...

The ATLAS experiment at CERN’s Large Hadron Collider uses the Worldwide LHC Computing Grid (WLCG) for its distributed computing infrastructure. Through the workload management system PanDA and the distributed data management system Rucio, ATLAS provides thousands of physicists with seamless access to hundreds of WLCG grid and cloud resources distributed worldwide. PanDA annually processes more than an exabyte of data using an average of 350,000 distributed batch slots, enabling hundreds of new scientific results from ATLAS. However, the resources available to the experiment have been insufficient to meet ATLAS simulation needs over the past few years as the volume of data from the LHC has grown. The problem will be even more severe for the next LHC phases: the High Luminosity LHC will be a multi-exabyte challenge in which the envisaged storage and compute needs are a factor of 10 to 100 above the expected technology evolution. The High Energy Physics (HEP) community needs to evolve its current computing and data organization models, changing the way it uses and manages the infrastructure, with a focus on optimizations that improve performance and efficiency while also simplifying operations. In this paper we highlight recent R&D projects in HEP related to the data lake prototype, federated data storage, and the data carousel.
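The data carousel mentioned above is, in essence, a rolling staging model: bulk data stays on tape and only a sliding window is resident on a limited disk buffer while it is processed. The Python sketch below illustrates that idea only; the function names and the buffer policy are hypothetical placeholders and do not reflect the actual PanDA/Rucio implementation.

```python
# Minimal sketch of the "data carousel" idea: bulk data stays on tape and is
# staged through a small disk buffer in a rolling window, so processing never
# needs the full dataset on disk at once. All names here are hypothetical.
from collections import deque

def stage_from_tape(chunk):
    # placeholder for a tape recall / staging request
    return f"staged:{chunk}"

def process(staged_chunk):
    # placeholder for the reconstruction or analysis payload
    print(f"processing {staged_chunk}")

def data_carousel(dataset_chunks, disk_buffer_slots=3):
    """Cycle dataset chunks through a small disk buffer (the 'carousel')."""
    buffer = deque()
    for chunk in dataset_chunks:
        if len(buffer) == disk_buffer_slots:   # buffer full: consume oldest slice
            process(buffer.popleft())          # ... and free its disk space
        buffer.append(stage_from_tape(chunk))  # stage the next slice from tape
    while buffer:                              # drain the remaining staged data
        process(buffer.popleft())

data_carousel([f"RAW.block{i:03d}" for i in range(8)])
```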

1990
Author(s): A. S. Johnson, M. I. Briedenbach, H. Hissen, P. F. Kunz, D. J. Sherden, ...

2018, Vol 68 (1), pp. 291-312
Author(s): Celine Degrande, Valentin Hirschi, Olivier Mattelaer

The automation of one-loop amplitudes plays a key role in addressing several computational challenges of hadron collider phenomenology: they are needed for simulations including next-to-leading-order corrections, which can be large at hadron colliders, and they also allow the exact computation of loop-induced processes. A high degree of automation has now been achieved in public codes that do not require expert knowledge and can be widely used in the high-energy physics community. In this article, we review many of the methods and tools used for the different steps of automated one-loop amplitude calculations: renormalization of the Lagrangian, derivation and evaluation of the amplitude, its decomposition onto a basis of scalar integrals and their subsequent evaluation, as well as the computation of the rational terms.
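For context, the decomposition referred to above is conventionally written as an expansion of the one-loop amplitude onto scalar box, triangle, bubble, and tadpole integrals plus a purely rational remainder. The block below is the textbook form of this basis, shown only as background; it is not a result specific to the article.

```latex
% Standard decomposition of a one-loop amplitude onto scalar integrals
% (boxes, triangles, bubbles, tadpoles) plus the rational term R.
\begin{equation}
  \mathcal{A}^{\text{1-loop}}
    = \sum_i d_i\,\mathrm{Box}_i
    + \sum_i c_i\,\mathrm{Triangle}_i
    + \sum_i b_i\,\mathrm{Bubble}_i
    + \sum_i a_i\,\mathrm{Tadpole}_i
    + R ,
\end{equation}
% where the coefficients a_i, b_i, c_i, d_i are rational functions of the
% external kinematics and R collects the rational terms.
```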


2021, Vol 9
Author(s): N. Demaria

The High Luminosity Large Hadron Collider (HL-LHC) at CERN will open a new frontier for particle physics after 2027. The experiments will undergo a major upgrade to meet this challenge, and innovative sensors and electronics will play a central role in it. This paper describes recent developments in 65 nm CMOS technology for readout ASICs in future High Energy Physics (HEP) experiments. These enable unprecedented performance in terms of speed, noise, power consumption, and granularity of the tracking detectors.


2020, Vol 245, pp. 06042
Author(s): Oliver Gutsche, Igor Mandrichenko

A columnar data representation is known to be an efficient way to store data, particularly when the analysis typically touches only a small fraction of the available data structures. A data representation like Apache Parquet goes a step beyond a plain columnar layout: it also splits data horizontally, which allows data analysis to be parallelized easily. Based on the general idea of columnar data storage, working on the [LDRD Project], we have developed a striped data representation which, we believe, is better suited to the needs of High Energy Physics data analysis. A traditional columnar approach allows for efficient analysis of complex data structures; while keeping all the benefits of columnar data representations, the striped mechanism goes further by enabling easy parallelization of computations without requiring special hardware. We will present an implementation and some performance characteristics of such a data representation mechanism using a distributed NoSQL database or a local file system, unified under the same API and data representation model. The representation is efficient and at the same time simple, which allows a common data model and API to be used across a wide range of underlying storage mechanisms such as distributed NoSQL databases and local file systems. Striped storage adopts NumPy arrays as its basic data representation format, which makes it easy and efficient to use in Python applications. The Striped Data Server is a web service that hides the server implementation details from the end user, easily exposes data to WAN users, and allows well-established data caching solutions to be used to further increase data access efficiency. We consider the Striped Data Server to be the core of an enterprise-scale data analysis platform for High Energy Physics and similar areas of data processing. We have been testing this architecture with a 2 TB dataset from a CMS dark matter search and plan to expand it to the multi-hundred-terabyte or even petabyte scale. We will present the striped format, the Striped Data Server architecture, and performance test results.
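As a rough illustration of the striped idea (not the actual Striped Data Server code), the sketch below stores each column of an event record as NumPy arrays cut into fixed-size stripes, so that independent stripes can be read and processed in parallel. The column names, stripe size, and storage layout are assumptions made only for illustration.

```python
# Illustrative sketch of a striped columnar layout: each column is cut into
# fixed-size "stripes" (NumPy arrays) keyed by (column, stripe index), so that
# independent stripes can be processed in parallel. This is an assumption-based
# illustration, not the Striped Data Server implementation.
import numpy as np

STRIPE_SIZE = 4  # events per stripe (tiny, for demonstration only)

def stripe_column(name, values, stripe_size=STRIPE_SIZE):
    """Split one column into {(column, stripe_index): numpy_array} pieces."""
    values = np.asarray(values)
    return {
        (name, i // stripe_size): values[i:i + stripe_size]
        for i in range(0, len(values), stripe_size)
    }

# A toy "dataset" with two columns; in practice these would be event attributes.
storage = {}
storage.update(stripe_column("muon_pt", np.random.exponential(30.0, size=10)))
storage.update(stripe_column("n_jets", np.random.poisson(3, size=10)))

# An analysis touching only one column reads only that column's stripes,
# and each stripe could be handed to a different worker.
selected = [
    stripe[stripe > 40.0]                      # per-stripe computation
    for (col, idx), stripe in sorted(storage.items())
    if col == "muon_pt"
]
print(np.concatenate(selected))
```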


2019, Vol 214, pp. 02019
Author(s): V. Daniel Elvira

Detector simulation has become fundamental to the success of modern high-energy physics (HEP) experiments. For example, the Geant4-based simulation applications developed by the ATLAS and CMS experiments played a major role in enabling them to produce physics measurements of unprecedented quality and precision, with a faster turnaround from data taking to journal submission than any previous hadron collider experiment. The material presented here contains highlights of a recent review, published in Ref. [1], on the impact of detector simulation in particle physics collider experiments. It includes examples of applications to detector design and optimization, software development and testing of computing infrastructure, and modeling of physics objects and their kinematics. The cost and economic impact of simulation in the CMS experiment is also presented. A discussion of future detector simulation needs, challenges, and potential solutions to address them concludes the paper.


2019, Vol 214, pp. 04020
Author(s): Martin Barisits, Fernando Barreiro, Thomas Beermann, Karan Bhatia, Kaushik De, ...

Transparent use of commercial cloud resources for scientific experiments is a hard problem. In this article, we describe the first steps of the Data Ocean R&D collaboration between the high-energy physics experiment ATLAS and Google Cloud Platform, aimed at the seamless use of Google Compute Engine and Google Cloud Storage for physics analysis. We start by describing the three preliminary use cases that were identified at the beginning of the project. The following sections then detail the work done in the data management system Rucio and the workload management systems PanDA and Harvester to interface Google Cloud Platform with the ATLAS distributed computing environment, and show the results of the integration tests. Afterwards, we describe the setup and results of a full ATLAS user analysis executed natively on Google Cloud Platform, and give estimates of the projected costs. We close with a summary and an outlook on future work.
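The integration itself lives inside Rucio, PanDA, and Harvester, but at the storage layer it ultimately reduces to object upload, listing, and download operations against Google Cloud Storage. The sketch below shows that underlying access pattern with the standard google-cloud-storage Python client; the bucket and object names are made up, and this is not the actual Rucio or PanDA code path.

```python
# Illustration of the object-store operations (upload, listing, download) that
# a data management integration with Google Cloud Storage ultimately relies on.
# Bucket and object names are hypothetical; this is not the Rucio/PanDA code.
from google.cloud import storage

client = storage.Client()                        # uses ambient GCP credentials
bucket = client.bucket("atlas-data-ocean-demo")  # hypothetical bucket name

# Upload a local file as an object (what a "transfer to GCS" reduces to).
blob = bucket.blob("user/analysis/ntuple_001.root")
blob.upload_from_filename("ntuple_001.root")

# List objects under a prefix, e.g. to verify that replicas exist.
for obj in client.list_blobs(bucket, prefix="user/analysis/"):
    print(obj.name, obj.size)

# Download an object so a worker node can run over it locally.
blob.download_to_filename("/tmp/ntuple_001.root")
```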


2020, Vol 33, pp. 100409
Author(s): D. Sarkar, Mahesh P., Padmini S., N. Chouhan, C. Borwankar, ...

Author(s): Mahantesh Halappanavar, Malachi Schram, Luis de la Torre, Kevin Barker, Nathan R. Tallent, ...

2008, Vol 01 (01), pp. 259-302
Author(s): Stanley Wojcicki

This article describes the beginnings of the Superconducting Super Collider (SSC). The narrative starts in the early 1980s with the process that led to the recommendation by the US high energy physics community to initiate work on a multi-TeV hadron collider. The article then describes the formation in 1984 of the Central Design Group (CDG), charged with directing and coordinating SSC R&D, and the subsequent activities that led in early 1987 to the endorsement of the SSC by President Reagan. The last part of the article deals with the site selection process, the steps leading to the initial Congressional appropriation of SSC construction funds, and the creation of the management structure for the SSC Laboratory.

