Research and Exploit of Resource Sharing Strategy at IHEP

2019 ◽  
Vol 214 ◽  
pp. 03014
Author(s):  
Xiaowei JIANG ◽  
Jingyan Shi ◽  
Jiaheng Zou ◽  
Qingbao Hu ◽  
Ran Du ◽  
...  

At IHEP (Institute of High Energy Physics, Chinese Academy of Sciences), computing resources are contributed by different experiments, including BES, JUNO, DYW and HXMT. The resources were divided into separate partitions to satisfy the dedicated data processing requirements of each experiment. IHEP ran a local Torque/Maui cluster with about 50 queues serving more than 10 experiments. The separated partitions led to an imbalanced resource load: in a typical situation the BES partition was fully occupied with many jobs still idle in its queue, while the JUNO partition sat largely free and its resources were wasted. After migrating from Torque/Maui to HTCondor in 2016, job scheduling efficiency improved considerably. To balance the resource load, we designed a sharing strategy that improves the overall resource utilization. We created a unified pool shared by all experiments. For each experiment, resources are divided into two parts: dedicated resources and shared resources. Slots in the dedicated part run only jobs from the owning experiment, while slots in the shared part accept jobs from all experiments. The default ratio of dedicated to shared resources is 1:4, and to maximize sharing effectiveness the ratio is dynamically adjusted between 0:5 and 4:1 based on the number of jobs submitted by each experiment. We have developed a central control system that decides how many resources are allocated to each experiment group. The system has a server side and a client side. A management database on the server side stores resource, group and experiment information. Whenever the sharing ratio needs to be adjusted, the resource groups are changed and updated in the database, and the resource group information is published to a server buffer in real time. Clients periodically pull the resource group information from the server buffer over HTTPS and update the local resource scheduling configuration accordingly. In this way the sharing ratio can be modified and deployed dynamically. The sharing strategy is implemented with HTCondor; the ClassAd mechanism and accounting groups in HTCondor make it straightforward to apply the strategy to the IHEP computing cluster. With the sharing strategy in place, resource usage has improved dramatically.
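To make the client-side mechanism concrete, the following is a minimal sketch: it pulls the published resource group information over HTTPS and rewrites a configuration fragment using HTCondor's GROUP_NAMES / GROUP_QUOTA knobs. All URLs, file paths and group names are placeholders, and the mapping of group information to accounting-group quotas is an assumption, not the actual IHEP implementation.

```python
import json
import urllib.request

# Placeholder URL of the server-side buffer publishing group information.
GROUP_INFO_URL = "https://scheduler.example.org/resource-groups"
CONFIG_FRAGMENT = "/etc/condor/config.d/99-group-quotas.conf"

def fetch_group_info(url: str) -> dict:
    """Pull the current resource group allocation from the server buffer."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)

def write_group_quotas(groups: dict, path: str) -> None:
    """Render HTCondor accounting-group quotas from the pulled information.

    `groups` is assumed to map accounting group names (e.g. "group_bes")
    to the number of slots currently allocated to that group.
    """
    lines = ["GROUP_NAMES = " + ", ".join(groups)]
    for name, slots in groups.items():
        lines.append(f"GROUP_QUOTA_{name} = {slots}")
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    info = fetch_group_info(GROUP_INFO_URL)
    write_group_quotas(info, CONFIG_FRAGMENT)
    # A real client would now trigger `condor_reconfig` so the negotiator
    # picks up the new quotas.
```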

2019 ◽  
Vol 214 ◽  
pp. 08009 ◽  
Author(s):  
Matthias J. Schnepf ◽  
R. Florian von Cube ◽  
Max Fischer ◽  
Manuel Giffels ◽  
Christoph Heidecker ◽  
...  

Demand for computing resources in high energy physics (HEP) shows a highly dynamic behavior, while the resources provided by the Worldwide LHC Computing Grid (WLCG) remain static. It has become evident that opportunistic resources such as High Performance Computing (HPC) centers and commercial clouds are well suited to cover peak loads. However, the utilization of these resources gives rise to new levels of complexity, e.g. resources need to be managed highly dynamically and HEP applications require a very specific software environment usually not provided at opportunistic resources. Furthermore, limitations in network bandwidth can cause I/O-intensive workflows to run inefficiently. The key component for dynamically running HEP applications on opportunistic resources is the utilization of modern container and virtualization technologies. Based on these technologies, the Karlsruhe Institute of Technology (KIT) has developed ROCED, a resource manager to dynamically integrate and manage a variety of opportunistic resources. In combination with ROCED, the HTCondor batch system acts as a powerful single entry point to all available computing resources, leading to a seamless and transparent integration of opportunistic resources into HEP computing. KIT is currently improving resource management and job scheduling by focusing on the I/O requirements of individual workflows, the available network bandwidth, and scalability. For these reasons, we are currently developing a new resource manager, called TARDIS. In this paper, we give an overview of the utilized technologies, the dynamic management and integration of resources, and the status of the I/O-based resource and job scheduling.
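The core of such a dynamic resource manager can be sketched as a simple control loop. This is a generic illustration only, with hypothetical helper functions and thresholds; it is not code from ROCED or TARDIS.

```python
import time

# Hypothetical helpers: in a real deployment these would query the batch
# system (e.g. the HTCondor collector) and the cloud/HPC provisioning API.
def count_idle_jobs() -> int:
    return 0        # placeholder

def count_booted_workers() -> int:
    return 0        # placeholder

def boot_worker() -> None:
    print("booting one opportunistic worker")    # e.g. start a VM or HPC pilot

def drain_worker() -> None:
    print("draining one opportunistic worker")   # finish running jobs, then stop

JOBS_PER_WORKER = 8      # assumed job slots provided by one opportunistic node
POLL_INTERVAL = 60       # seconds between scheduling decisions

def control_loop(iterations: int = 3) -> None:
    """Scale the number of opportunistic workers with the current demand."""
    for _ in range(iterations):
        idle, booted = count_idle_jobs(), count_booted_workers()
        wanted = -(-idle // JOBS_PER_WORKER)     # ceiling division
        for _ in range(max(0, wanted - booted)):
            boot_worker()
        for _ in range(max(0, booted - wanted)):
            drain_worker()
        time.sleep(POLL_INTERVAL)

if __name__ == "__main__":
    control_loop()
```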


2019 ◽  
Vol 214 ◽  
pp. 08004 ◽  
Author(s):  
R. Du ◽  
J. Shi ◽  
J. Zou ◽  
X. Jiang ◽  
Z. Sun ◽  
...  

Two production clusters coexist at the Institute of High Energy Physics (IHEP). One is a High Throughput Computing (HTC) cluster with HTCondor as the workload manager, the other is a High Performance Computing (HPC) cluster with Slurm as the workload manager. The resources of the HTCondor cluster are funded by multiple experiments, and resource utilization has reached more than 90% by adopting a dynamic resource sharing mechanism. Nevertheless, a bottleneck appears when several experiments request more resources at the same time. On the other hand, parallel jobs running on the Slurm cluster exhibit specific characteristics, such as a high degree of parallelism, low job counts and long wall times. These characteristics leave free resource slots that are suitable for jobs from the HTCondor cluster. As a result, a mechanism that transparently schedules jobs from the HTCondor cluster onto the Slurm cluster would improve the resource utilization of the Slurm cluster and reduce job queue time for the HTCondor cluster. In this paper, we present three methods to migrate HTCondor jobs to the Slurm cluster and conclude that HTCondor-C is the preferred one. Furthermore, because the design philosophies and use cases of HTCondor and Slurm differ, some job scheduling issues and possible solutions are presented.
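For illustration, a minimal HTCondor-C style (grid universe) submission through the htcondor Python bindings might look as follows. The remote schedd and collector host names are placeholders, and this is a generic sketch rather than the IHEP production configuration.

```python
import htcondor

# Minimal HTCondor-C submission: the job is handed from the local schedd
# to a remote schedd (placeholder host names below).
submit = htcondor.Submit({
    "universe": "grid",
    # "condor <remote schedd> <remote collector>" selects HTCondor-C routing.
    "grid_resource": "condor remote-schedd.example.org remote-pool.example.org",
    "executable": "run_analysis.sh",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

schedd = htcondor.Schedd()
result = schedd.submit(submit)          # queue one job in the local schedd
print("submitted cluster", result.cluster())
```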


2021 ◽  
Vol 251 ◽  
pp. 02070
Author(s):  
Matthew Feickert ◽  
Lukas Heinrich ◽  
Giordon Stark ◽  
Ben Galewsky

In High Energy Physics, facilities that provide High Performance Computing environments offer an opportunity to efficiently perform the statistical inference required for the analysis of data from the Large Hadron Collider, but they can pose problems with orchestration and efficient scheduling. The compute architectures at these facilities do not easily support the Python compute model, and the configuration and scheduling of batch jobs for physics often require expertise in multiple job scheduling services. The combination of the pure-Python libraries pyhf and funcX reduces a common problem in HEP analyses, performing statistical inference with binned models, which would traditionally take multiple hours and bespoke scheduling, to an on-demand (fitting) “function as a service” that can scalably execute across workers in just a few minutes, offering reduced time to insight and inference. We demonstrate execution of a scalable workflow using funcX to simultaneously fit 125 signal hypotheses from a published ATLAS search for new physics using pyhf, with a wall time of under 3 minutes. We additionally show performance comparisons for other physics analyses with openly published probability models and argue for a blueprint of fitting-as-a-service systems at HPC centers.
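A minimal sketch of this fitting-as-a-service pattern, assuming the funcX SDK and a toy pyhf model, is shown below. The endpoint UUID, the model and the tested signal strengths are placeholders, not the published ATLAS probability models used in the paper.

```python
from funcx.sdk.client import FuncXClient

def fit_hypothesis(signal_strength):
    # Imports live inside the function so the funcX worker can execute it.
    import pyhf
    model = pyhf.simplemodels.uncorrelated_background(
        signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
    )
    data = [51, 48] + model.config.auxdata
    # Observed CLs for the tested value of the signal strength (POI).
    cls_obs = pyhf.infer.hypotest(signal_strength, data, model, test_stat="qtilde")
    return float(cls_obs)

fxc = FuncXClient()
func_id = fxc.register_function(fit_hypothesis)
endpoint_id = "REPLACE-WITH-YOUR-ENDPOINT-UUID"   # funcX endpoint at the HPC site

# Fan the fits out across the endpoint's workers, one task per hypothesis.
hypotheses = (0.5, 1.0, 1.5, 2.0)
task_ids = [fxc.run(mu, endpoint_id=endpoint_id, function_id=func_id)
            for mu in hypotheses]

# In practice one polls until all tasks have completed before collecting.
results = [fxc.get_result(tid) for tid in task_ids]
print(dict(zip(hypotheses, results)))
```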


2019 ◽  
Vol 214 ◽  
pp. 06007
Author(s):  
Malachi Schram ◽  
Nathan Tallent ◽  
Ryan Friese ◽  
Alok Singh ◽  
Ilkay Altintas

In this research, we investigated two approaches to detect job anomalies and/or contention in large-scale computing efforts: (1) preemptive job scheduling using binomial-classification long short-term memory (LSTM) networks, and (2) forecasting intra-node computing loads from the active jobs and additional job(s). For approach 1, we achieved a 14% improvement in computational resource utilization and an overall classification accuracy of 85% on real tasks executed in a High Energy Physics computing workflow. In this paper, we also present preliminary results for the second approach.
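As a rough illustration of approach 1 (not the authors' actual model; the input shape, hyperparameters and data below are assumptions), a binomial-classification LSTM can be set up in Keras as follows:

```python
import numpy as np
import tensorflow as tf

# Assumed input: sequences of 60 time steps with 8 per-job / per-node metrics.
TIMESTEPS, FEATURES = 60, 8

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIMESTEPS, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(contention / anomaly)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy data standing in for monitored job metrics and contention labels.
x = np.random.rand(256, TIMESTEPS, FEATURES).astype("float32")
y = np.random.randint(0, 2, size=(256, 1))
model.fit(x, y, epochs=2, batch_size=32, verbose=0)

# A scheduler could preempt or delay a job when the predicted probability
# of contention exceeds a chosen threshold.
print(model.predict(x[:4]))
```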


2021 ◽  
Vol 251 ◽  
pp. 02063
Author(s):  
Michal Simon ◽  
Andrew Hanushevsky

Over the years, as the backbone of numerous data management solutions used within the WLCG collaboration, the XRootD framework and protocol have become one of the most important building blocks for storage solutions in the High Energy Physics (HEP) community. The latest big milestone for the project, release 5, introduced a multitude of architectural improvements and functional enhancements, including the new client-side declarative API, which is the main focus of this study. In this contribution, we give an overview of the new client API, discuss its motivation, and describe its positive impact on overall software quality (coupling, cohesion), readability and composability.


2020 ◽  
Vol 245 ◽  
pp. 07039
Author(s):  
Eileen Kuehn ◽  
Max Fischer ◽  
Sven Lange ◽  
Andreas Petzold ◽  
Andreas Heiss

To overcome the computing challenges in High Energy Physics, available resources must be utilized as efficiently as possible. This concerns algorithmic challenges in the workflows themselves as well as the scheduling of jobs onto compute resources. To enable the best possible scheduling, job schedulers require accurate information about the resource consumption of a job before it is even executed. It is the responsibility of the user to provide accurate resource estimates for their jobs. However, this is quite a challenge for users, as they (i) want to ensure that their jobs run correctly, (ii) must deal with heterogeneous compute resources, and (iii) face opaque library dependencies and frequent updates. Users therefore tend to specify resource requests with an ample buffer. This inaccuracy results in inefficient utilisation, either by blocking unused resources or by exceeding reserved resources. Especially in the context of opportunistic resource provisioning, these inaccuracies have an even broader impact, affecting not only the utilisation of resources but also the composition of the most suitable resources. The contribution of this paper is an analysis of production and end-user workflows in HEP with regard to optimizing the various resource types. We further propose a method to improve user estimates.
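As a generic illustration of the problem, and explicitly not the method proposed in this paper, a resource request could be derived from the recorded peak usage of earlier jobs of the same workflow, e.g. via a percentile plus a safety margin; all names and numbers below are placeholders.

```python
import statistics

def suggest_memory_request(peak_usages_mb, quantile=0.95, safety_margin=1.1):
    """Suggest a memory request from observed peak usages of past jobs.

    A percentile of the history plus a small safety margin replaces the
    user's hand-picked "ample buffer"; the parameters are illustrative only.
    """
    ordered = sorted(peak_usages_mb)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index] * safety_margin

# Example: peak memory (MB) observed for earlier jobs of the same workflow.
history = [1800, 1950, 2100, 1750, 2050, 1900, 2200, 1850]
print(f"requested memory: {suggest_memory_request(history):.0f} MB")
print(f"mean observed usage: {statistics.mean(history):.0f} MB")
```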


Author(s):  
Preeti Kumari ◽  
Kavita Lalwani ◽  
Ranjit Dalal ◽  
Ashutosh Bhardwaj ◽  
...  
