Dynamic Integration and Management of Opportunistic Resources for HEP

2019 ◽  
Vol 214 ◽  
pp. 08009 ◽  
Author(s):  
Matthias J. Schnepf ◽  
R. Florian von Cube ◽  
Max Fischer ◽  
Manuel Giffels ◽  
Christoph Heidecker ◽  
...  

Demand for computing resources in high energy physics (HEP) is highly dynamic, while the resources provided by the Worldwide LHC Computing Grid (WLCG) remain static. It has become evident that opportunistic resources such as High Performance Computing (HPC) centers and commercial clouds are well suited to cover peak loads. However, the utilization of these resources gives rise to new levels of complexity: resources need to be managed highly dynamically, and HEP applications require a very specific software environment that is usually not provided at opportunistic resources. Furthermore, limitations in network bandwidth can cause I/O-intensive workflows to run inefficiently and have to be taken into account. The key component to dynamically run HEP applications on opportunistic resources is the utilization of modern container and virtualization technologies. Based on these technologies, the Karlsruhe Institute of Technology (KIT) has developed ROCED, a resource manager to dynamically integrate and manage a variety of opportunistic resources. In combination with ROCED, the HTCondor batch system acts as a powerful single entry point to all available computing resources, leading to a seamless and transparent integration of opportunistic resources into HEP computing. KIT is currently improving resource management and job scheduling by focusing on the I/O requirements of individual workflows, the available network bandwidth, and scalability. For these reasons, we are currently developing a new resource manager called TARDIS. In this paper, we give an overview of the utilized technologies, the dynamic management and integration of resources, and the status of the I/O-based resource and job scheduling.
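The "single entry point" described here means that users always submit to the same HTCondor pool, independently of whether a job ultimately runs on WLCG, HPC, or cloud resources. A minimal sketch of such a submission via the HTCondor Python bindings is shown below; the executable, file names, and resource requests are illustrative assumptions, not values from the paper.

```python
import htcondor

# Describe the job; the overlay batch system acting as the single entry
# point decides whether it runs on local, HPC, or cloud resources.
sub = htcondor.Submit({
    "executable": "/usr/bin/python3",   # placeholder executable
    "arguments": "analysis.py",          # placeholder payload
    "request_cpus": "1",
    "request_memory": "2GB",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

schedd = htcondor.Schedd()               # connect to the local schedd
result = schedd.submit(sub, count=1)     # queue one instance of the job
print("Submitted cluster", result.cluster())
```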

2019 ◽  
Vol 214 ◽  
pp. 08004 ◽  
Author(s):  
R. Du ◽  
J. Shi ◽  
J. Zou ◽  
X. Jiang ◽  
Z. Sun ◽  
...  

Two production clusters co-exist at the Institute of High Energy Physics (IHEP). One is a High Throughput Computing (HTC) cluster with HTCondor as the workload manager, the other is a High Performance Computing (HPC) cluster with Slurm as the workload manager. The resources of the HTCondor cluster are funded by multiple experiments, and resource utilization has reached more than 90% by adopting a dynamic resource-sharing mechanism. Nevertheless, a bottleneck arises when multiple experiments request additional resources at the same moment. On the other hand, parallel jobs running on the Slurm cluster exhibit specific attributes, such as a high degree of parallelism, small job counts, and long wall times. Such attributes frequently leave free resource slots that are suitable for jobs from the HTCondor cluster. As a result, a mechanism that transparently schedules jobs from the HTCondor cluster onto the Slurm cluster would improve the resource utilization of the Slurm cluster and reduce job queue times for the HTCondor cluster. In this paper, we present three methods to migrate HTCondor jobs to the Slurm cluster and conclude that HTCondor-C is the preferred one. Furthermore, because HTCondor and Slurm differ in design philosophy and application scenarios, some issues and possible solutions related to job scheduling are also presented.
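With HTCondor-C, a job submitted to the local schedd is forwarded to a remote schedd, which in a setup like the one described would hand work over to the Slurm cluster. A hedged sketch using the HTCondor Python bindings follows; the gateway and pool host names are placeholders, and the actual routing to Slurm depends on how the remote schedd is configured.

```python
import htcondor

# HTCondor-C submission: "grid" universe jobs are forwarded to a remote
# schedd (first token after "condor") registered with a remote collector
# (second token). Host names below are placeholders for illustration.
sub = htcondor.Submit({
    "universe": "grid",
    "grid_resource": "condor slurm-gateway.example.org slurm-pool.example.org",
    "executable": "analysis.sh",   # placeholder payload
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

htcondor.Schedd().submit(sub, count=1)
```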


2020 ◽  
Vol 245 ◽  
pp. 07038
Author(s):  
Max Fischer ◽  
Manuel Giffels ◽  
Andreas Heiss ◽  
Eileen Kuehn ◽  
Matthias Schnepf ◽  
...  

Increased operational effectiveness and the dynamic integration of only temporarily available compute resources (opportunistic resources) will become more and more important in the next decade, due to the scarcity of resources for future high energy physics experiments as well as the desired integration of cloud and high performance computing resources. This results in a more heterogeneous compute environment, which gives rise to huge challenges for the computing operation teams of the experiments. At the Karlsruhe Institute of Technology (KIT) we design solutions to tackle these challenges. In order to ensure an efficient utilization of opportunistic resources and unified access to the entire infrastructure, we developed the Transparent Adaptive Resource Dynamic Integration System (TARDIS), a scalable multi-agent resource manager that provides interfaces to provision resources from various providers and to integrate them dynamically and transparently into one common overlay batch system. Operational effectiveness is ensured by relying on COBalD, the Opportunistic Balancing Daemon, whose simple approach of taking into account the utilization and allocation of the different resource types allows the individual workflows to run on the best-suited resources. In this contribution we will present the current status of integrating various HPC centers and cloud providers into the compute infrastructure at KIT as well as the experience gained in a production environment.
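The balancing idea behind COBalD can be illustrated with a simplified sketch: per resource pool, demand is raised while the acquired resources are well occupied and lowered while they sit idle. The following Python snippet is a toy illustration of that feedback loop under assumed thresholds, not the actual COBalD API or its controller implementation.

```python
# Simplified illustration of the COBalD balancing idea (not the real API):
# raise demand for a pool while it is saturated, lower it while idle.
from dataclasses import dataclass

@dataclass
class Pool:
    demand: float       # how many resources we ask the provider for
    utilisation: float  # fraction of acquired resources doing useful work
    allocation: float   # fraction of acquired resources occupied at all

def balance(pool: Pool, rate: float = 1.0,
            low: float = 0.5, high: float = 0.9) -> float:
    """Return the new demand for one control cycle (thresholds are assumptions)."""
    if pool.utilisation < low:      # resources are idle: scale down
        return max(pool.demand - rate, 0.0)
    if pool.allocation > high:      # pool is saturated: acquire more
        return pool.demand + rate
    return pool.demand              # otherwise keep the current level

# Example control cycle for a hypothetical HPC pool
hpc = Pool(demand=10, utilisation=0.95, allocation=0.97)
hpc.demand = balance(hpc)           # -> 11.0, since the pool is saturated
```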


2021 ◽  
Vol 251 ◽  
pp. 02070
Author(s):  
Matthew Feickert ◽  
Lukas Heinrich ◽  
Giordon Stark ◽  
Ben Galewsky

In High Energy Physics, facilities that provide High Performance Computing environments offer an opportunity to efficiently perform the statistical inference required for the analysis of data from the Large Hadron Collider, but they can pose problems with orchestration and efficient scheduling. The compute architectures at these facilities do not easily support the Python compute model, and the configuration and scheduling of batch jobs for physics often requires expertise in multiple job scheduling services. The combination of the pure-Python libraries pyhf and funcX reduces the common problem in HEP analyses of performing statistical inference with binned models, which would traditionally take multiple hours and bespoke scheduling, to an on-demand (fitting) “function as a service” that can scalably execute across workers in just a few minutes, offering reduced time to insight and inference. We demonstrate the execution of a scalable workflow using funcX to simultaneously fit 125 signal hypotheses from a published ATLAS search for new physics using pyhf with a wall time of under 3 minutes. We additionally show performance comparisons for other physics analyses with openly published probability models and argue for a blueprint of fitting-as-a-service systems at HPC centers.
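The pattern can be sketched in a few lines: the fit itself is an ordinary pyhf hypothesis test, and funcX dispatches it to a registered compute endpoint, one submission per signal hypothesis. The endpoint UUID, model numbers, and signal strengths below are placeholders for illustration, not values from the ATLAS analysis in the paper.

```python
# Hedged sketch of "fitting as a service" with pyhf + funcX.
from funcx import FuncXExecutor

def fit_hypothesis(mu):
    import pyhf  # imported inside the function so it runs on the remote worker
    model = pyhf.simplemodels.uncorrelated_background(
        signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
    )
    data = [51, 48] + model.config.auxdata
    # Observed CLs value for the signal-strength hypothesis mu
    return float(pyhf.infer.hypotest(mu, data, model, test_stat="qtilde"))

ENDPOINT = "00000000-0000-0000-0000-000000000000"  # placeholder endpoint UUID

with FuncXExecutor(endpoint_id=ENDPOINT) as executor:
    hypotheses = (0.5, 1.0, 1.5)
    futures = [executor.submit(fit_hypothesis, mu) for mu in hypotheses]
    for mu, future in zip(hypotheses, futures):
        print(f"mu = {mu}: CLs = {future.result():.3f}")
```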


2018 ◽  
Vol 182 ◽  
pp. 02063 ◽  
Author(s):  
Vladimir Kekelidze ◽  
Alexander Kovalenko ◽  
Richard Lednicky ◽  
Victor Matveev ◽  
Igor Meshkov ◽  
...  

The NICA (Nuclotron-based Ion Collider fAcility) is the new international research facility under construction at the Joint Institute for Nuclear Research (JINR) in Dubna. The main targets of the facility are the following: 1) study of hot and dense baryonic matter in the energy range of the maximum baryonic density; 2) investigation of nucleon spin structure and polarization phenomena; 3) development of the JINR accelerator facility for high energy physics research, based on the new collider of relativistic ions from protons to gold as well as polarized protons and deuterons, with a maximum collision energy of √s_NN ≈ 11 GeV (Au⁷⁹⁺ + Au⁷⁹⁺) and ≈ 27 GeV (p + p). Two collider detector setups, MPD and SPD, are foreseen. The setup BM@N (Baryonic Matter at Nuclotron) is being commissioned for data taking at the existing Nuclotron fixed-target beam area. The MPD construction is in progress, whereas the SPD is still at an early design stage. The average luminosity of the collider is expected at the level of 10²⁷ cm⁻² s⁻¹ for Au⁷⁹⁺ and 10³² cm⁻² s⁻¹ for polarized protons at 27 GeV. The status of the NICA design and construction work is briefly described below.


2020 ◽  
Vol 245 ◽  
pp. 07036
Author(s):  
Christoph Beyer ◽  
Stefan Bujack ◽  
Stefan Dietrich ◽  
Thomas Finnern ◽  
Martin Flemming ◽  
...  

DESY is one of the largest accelerator laboratories in Europe. It develops and operates state-of-the-art accelerators for fundamental science in the areas of high energy physics, photon science, and accelerator development. While for decades high energy physics (HEP) has been the most prominent user of the DESY compute, storage, and network infrastructure, other scientific areas such as photon science and accelerator development have caught up and are now dominating the demands on the DESY infrastructure resources, with significant consequences for IT resource provisioning. In this contribution, we will present an overview of the computational, storage, and network resources covering the various physics communities on site, ranging from high-throughput computing (HTC) batch-like offline processing in the Grid and the interactive user analysis resources in the National Analysis Factory (NAF) for the HEP community, to the computing needs of accelerator development or of photon science facilities such as PETRA III or the European XFEL. Since DESY is involved in these experiments and their data taking, the requirements include fast low-latency online processing for data taking and calibration as well as offline processing, i.e. high-performance computing (HPC) workloads, which are run on the dedicated Maxwell HPC cluster. As all communities face significant challenges due to changing environments and increasing data rates in the coming years, we will discuss how this is reflected in the necessary changes to the computing and storage infrastructures. We will present DESY's compute cloud and container orchestration plans as a basis for infrastructure and platform services. We will show examples of Jupyter notebooks for small-scale interactive analysis, as well as their integration into large-scale resources such as batch systems or Spark clusters. To overcome the fragmentation of the various resources for all scientific communities at DESY, we explore how to integrate them into a seamless user experience in an Interdisciplinary Data Analysis Facility.
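As a concrete illustration of the notebook-to-cluster integration mentioned above, the following is a minimal PySpark sketch of how an interactive Jupyter session might attach to an external Spark cluster; the master URL, memory setting, and data path are placeholders, not DESY-specific endpoints.

```python
# Hedged sketch: interactive analysis from a notebook against a Spark cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("interactive-analysis")
    .master("spark://spark-cluster.example.org:7077")  # placeholder cluster URL
    .config("spark.executor.memory", "4g")             # illustrative setting
    .getOrCreate()
)

# A typical small interactive step: load a columnar dataset and aggregate it.
df = spark.read.parquet("/data/example/events.parquet")  # placeholder path
df.groupBy("run").count().show()

spark.stop()
```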


2014 ◽  
Vol 2014 ◽  
pp. 1-13 ◽  
Author(s):  
Florin Pop

Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios such as subatomic dimensions, high energies, and extremely low absolute temperatures are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high-accuracy analysis, experimental validation, and visualization. High performance computing offers the possibility to run simulations at large scale and in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data science. This paper presents existing computational methods for high energy physics (HEP), analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
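The Monte Carlo idea underlying many of these HEP simulations is to estimate an expectation value by random sampling and to quote its statistical uncertainty. The toy sketch below estimates a detector acceptance for exponentially distributed decay lengths; the decay length, detector size, and sample count are arbitrary illustrative values.

```python
# Toy Monte Carlo: fraction of decays occurring inside a detector of
# (hypothetical) length 2.0, with the statistical uncertainty of the estimate.
import numpy as np

rng = np.random.default_rng(seed=42)
n_events = 1_000_000
mean_decay_length = 1.5   # illustrative value, arbitrary units
detector_length = 2.0     # illustrative value, arbitrary units

decay_lengths = rng.exponential(scale=mean_decay_length, size=n_events)
accepted = decay_lengths < detector_length

efficiency = accepted.mean()
# Binomial (statistical) uncertainty of the Monte Carlo estimate
uncertainty = np.sqrt(efficiency * (1 - efficiency) / n_events)
print(f"acceptance = {efficiency:.4f} +/- {uncertainty:.4f}")
```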


Author(s):  
Supratik Mukherjee ◽  
Aiswarya T ◽  
Subrata Mondal ◽  
Ganapathy Vaitheeswaran

Abstract: This article thoroughly addresses the structural, mechanical, vibrational, electronic band structure, and optical properties of the hitherto unexplored thallous perchlorate and perbromate from ab-initio calculations. The zone-centered vibrational phonon frequencies show a blue shift in the mid and high frequency range from Cl → Br, due to the change in mass and force constant with respect to the oxygen atom. From the band structure it is clear that the top of the valence band is due to thallium s states, whereas the bottom of the conduction band is due to halogen s and oxygen p states, showing a similar magnitude of dispersion and exhibiting a charge-transfer character. These characteristics and the obtained band gap are consistent with those of favourable scintillators. Our findings provide directions for the design of efficient, high-performance TlXO4-based scintillators, which are desirable for applications such as medical imaging, high energy physics experiments, and nuclear security.


2018 ◽  
Vol 191 ◽  
pp. 01003 ◽  
Author(s):  
Alexander Kovalenko ◽  
Vladimir Kekelidze ◽  
Richard Lednicky ◽  
Viktor Matveev ◽  
Igor Meshkov ◽  
...  

The NICA (Nuclotron-based Ion Collider fAcility) is the new international research facility under construction at the Joint Institute for Nuclear Research (JINR) in Dubna. The main targets of the facility are the following: 1) study of hot and dense baryonic matter in the energy range of the maximum baryonic density; 2) investigation of nucleon spin structure and polarization phenomena; 3) development of the JINR accelerator facility for high energy physics research, based on the new collider of relativistic ions from protons to gold as well as polarized protons and deuterons, with a maximum collision energy of √s_NN ≈ 11 GeV (Au⁷⁹⁺ + Au⁷⁹⁺) and ≈ 27 GeV (p + p). Two collider detector setups, MPD and SPD, are foreseen. The setup BM@N (Baryonic Matter at Nuclotron) is being commissioned for data taking at the existing Nuclotron fixed-target beam area. The MPD construction is in progress, whereas the SPD is still at an early design stage. The average luminosity of the collider is expected at the level of 10²⁷ cm⁻² s⁻¹ for Au⁷⁹⁺ and 10³² cm⁻² s⁻¹ for polarized protons at 27 GeV. The status of the NICA design and construction work is briefly described below.


Author(s):  
Peter H Beckman

On 1 October 2004, the most ambitious high-performance Grid project in the United States—the TeraGrid—became fully operational. Resources at nine sites—the San Diego Supercomputer Center, the California Institute of Technology, the National Center for Supercomputing Applications, the University of Chicago/Argonne National Laboratory, Pittsburgh Supercomputing Center, Texas Advanced Computing Center, Purdue University, Indiana University and Oak Ridge National Laboratory—were joined via an ultra-fast optical network, unified policies and security procedures, and a sophisticated distributed computing software environment. Funded by the National Science Foundation, the TeraGrid enables scientists and engineers to combine multiple distributed data sources with computation at any of the sites, or to link massively parallel computer simulations to extreme-resolution visualizations at remote sites. A single shared utility lets multiple resources be easily leveraged and provides improved access to advanced computational capabilities. One of the demonstrations of this new model for using distributed resources, Teragyroid, linked the infrastructure of the TeraGrid with computing resources in the United Kingdom via a transatlantic data fibre link. Once connected, the software framework of the RealityGrid project was used to successfully explore lattice-Boltzmann simulations involving lattices of over one billion sites.

