A Debugging Standard for High-Performance Computing

2000 ◽  
Vol 8 (2) ◽  
pp. 95-108 ◽  
Author(s):  
Joan M. Francioni ◽  
Cherri M. Pancake

Throughout 1998, the High Performance Debugging Forum worked on defining a base-level standard for high performance debuggers. The standard had to meet the sometimes conflicting constraints of being useful to users, realistically implementable by developers, and architecturally independent across multiple platforms. To meet criteria for timeliness, the standard had to be defined in one year and in such a way that it could be implemented within an additional year. The Forum was successful, and in November 1998 released Version 1 of the HPD Standard. Implementations of the standard are currently underway. This paper presents an overview of Version 1 of the standard and an analysis of the process by which the standard was developed. The status of implementation efforts and plans for follow-on efforts are discussed as well.

2020 ◽  
Vol 245 ◽  
pp. 07006
Author(s):  
Cécile Cavet ◽  
Martin Souchal ◽  
Sébastien Gadrat ◽  
Gilles Grasseau ◽  
Andrea Satirana ◽  
...  

The High Performance Computing (HPC) domain aims to optimize code in order to exploit the latest multicore and parallel technologies, including specific processor instructions. In this computing framework, portability and reproducibility are key concepts. One way to meet these requirements is to use Linux containers. These “light virtual machines” make it possible to encapsulate an application together with its environment in Linux processes. Containers have recently been rediscovered because they provide both a multi-infrastructure environment for developers and system administrators and reproducibility through the image build file. Two container solutions are emerging: Docker for microservices and Singularity for computing applications. We present here the status of the ComputeOps project, whose goal is to study the benefit of containers for HPC applications.
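As a minimal sketch of the encapsulation idea (not taken from the ComputeOps project; the image name, bind path, and application command below are assumptions), an HPC application can be launched inside a Singularity image so the same software environment is reproduced on any host that provides the container runtime:

```python
import subprocess

# Hypothetical image and paths; real names depend on the site setup.
IMAGE = "computeops_app.sif"    # assumed Singularity image built from a definition file
WORKDIR = "/scratch/run01"      # assumed host directory bound into the container

def run_in_container(command):
    """Run `command` inside the container; the host only needs the runtime."""
    return subprocess.run(
        ["singularity", "exec", "--bind", f"{WORKDIR}:/data", IMAGE] + command,
        check=True,
    )

if __name__ == "__main__":
    # The application and all of its libraries come from the image, not the host.
    run_in_container(["mpirun", "-np", "4", "/opt/app/simulate", "/data/input.cfg"])
```

Because the image file fully describes the software stack, the same invocation can be reproduced on another cluster without reinstalling dependencies.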


Author(s):  
Claus-Peter Rückemann

This chapter gives a comprehensive overview of the current status of accounting and billing for up-to-date computing environments. Accounting is the key to managing information system resources. At this stage in the evolution of accounting systems it is appropriate not to separate computing environments into High Performance Computing and Grid Computing environments, which allows a “holistic” view of the different approaches and of the state of the art for integrated accounting and billing in distributed computing environments. Requirements gathered from a public survey across all communities of the German Grid infrastructure, from computing centres and resource providers of High Performance Computing resources such as HLRN and ZIVGrid within the German e-Science framework, and from various information systems and the virtualisation of organisations and resources have been considered. Conceptual, technical, economic, and legal questions also had to be taken into account. Now that the requirements have been consolidated and the implementations were completed over a year ago, the overall results and conclusions are presented in the following sections as a case study based on the GISIG framework and the Grid-GIS framework. The focus is on how an integrated architecture can be built and used in heterogeneous environments. A prototypical implementation is outlined that is able to manage and visualise relevant accounting and billing information, based on suitable monitoring data, in a virtual organisation (VO) specific way while addressing basic business, economic, and security issues.
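As a minimal sketch of the kind of VO-specific aggregation such an architecture must support (the record fields and rates below are illustrative assumptions, not the GISIG or Grid-GIS data model), usage records from heterogeneous resources could be normalised and charged per virtual organisation before billing:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class UsageRecord:
    """Normalised monitoring record; field names are illustrative assumptions."""
    vo: str            # virtual organisation the job is charged to
    resource: str      # e.g. an HPC centre or Grid site
    cpu_hours: float

# Assumed per-resource price list (currency units per CPU hour).
RATES = {"HLRN": 0.05, "ZIVGrid": 0.03}

def bill_per_vo(records):
    """Aggregate CPU hours per VO and apply the resource-specific rate."""
    totals = defaultdict(float)
    for r in records:
        totals[r.vo] += r.cpu_hours * RATES.get(r.resource, 0.0)
    return dict(totals)

if __name__ == "__main__":
    demo = [UsageRecord("vo-climate", "HLRN", 1200.0),
            UsageRecord("vo-climate", "ZIVGrid", 300.0),
            UsageRecord("vo-physics", "HLRN", 800.0)]
    print(bill_per_vo(demo))  # per-VO charges for the billing period
```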


2020 ◽  
Vol 245 ◽  
pp. 07060
Author(s):  
Ran Du ◽  
Jingyan Shi ◽  
Xiaowei Jiang ◽  
Jiaheng Zou

HTCondor was adopted to manage the High Throughput Computing (HTC) cluster at IHEP in 2016. In 2017 a Slurm cluster was set up to run High Performance Computing (HPC) jobs. To provide accounting services for these two clusters, we implemented a unified accounting system named Cosmos. Multiple workloads bring different accounting requirements. Briefly speaking, there are four types of jobs to account for. First, 30 million single-core jobs run in the HTCondor cluster every year. Second, Virtual Machine (VM) jobs run in the legacy HTCondor VM cluster. Third, parallel jobs run in the Slurm cluster, and some of these jobs run on GPU worker nodes to accelerate computing. Lastly, selected HTC jobs are migrated from the HTCondor cluster to the Slurm cluster for research purposes. To satisfy all of these requirements, Cosmos is implemented with four layers: acquisition, integration, statistics and presentation. Details of the issues and solutions at each layer are presented in the paper. Cosmos has run in production for two years, and its status shows that it is a well-functioning system that meets the requirements of both the HTCondor and Slurm clusters.
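A minimal sketch of the integration idea, using hypothetical field names rather than the actual Cosmos schema: job records collected from the two batch systems are mapped onto one unified accounting record before the statistics layer aggregates them.

```python
from dataclasses import dataclass

@dataclass
class AccountingRecord:
    """Unified record consumed by the statistics layer; field names are assumptions."""
    cluster: str        # "htcondor" or "slurm"
    user: str
    cpu_seconds: float
    gpu_seconds: float = 0.0

def from_htcondor(ad):
    """Map a simplified HTCondor job ClassAd (given as a dict) to the unified record."""
    return AccountingRecord(cluster="htcondor",
                            user=ad["Owner"],
                            cpu_seconds=float(ad["RemoteWallClockTime"]))

def from_slurm(row):
    """Map a simplified Slurm accounting row (given as a dict) to the unified record.
    The GPU field is an assumed, site-specific extension, not a standard sacct column."""
    return AccountingRecord(cluster="slurm",
                            user=row["User"],
                            cpu_seconds=float(row["CPUTimeRAW"]),
                            gpu_seconds=float(row.get("gpu_seconds", 0.0)))

if __name__ == "__main__":
    records = [from_htcondor({"Owner": "alice", "RemoteWallClockTime": 3600}),
               from_slurm({"User": "bob", "CPUTimeRAW": 7200, "gpu_seconds": 1800})]
    print("total CPU seconds accounted:", sum(r.cpu_seconds for r in records))
```

Once every job, whether single-core HTC, VM, parallel or GPU, is expressed in one schema, per-user and per-cluster statistics can be produced by a single presentation layer.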


MRS Bulletin ◽  
1997 ◽  
Vol 22 (10) ◽  
pp. 5-6
Author(s):  
Horst D. Simon

Recent events in the high-performance computing industry have raised concern among scientists and the general public about a crisis or a lack of leadership in the field. That concern is understandable considering the industry's history from 1993 to 1996. Cray Research, the historic leader in supercomputing technology, was unable to survive financially as an independent company and was acquired by Silicon Graphics. Two ambitious new companies that introduced new technologies in the late 1980s and early 1990s—Thinking Machines and Kendall Square Research—were commercial failures and went out of business. And Intel, which introduced its Paragon supercomputer in 1994, discontinued production only two years later. During the same time frame, scientists who had finished the laborious task of writing scientific codes to run on vector parallel supercomputers learned that those codes would have to be rewritten if they were to run on the next-generation, highly parallel architecture. Scientists who are not yet involved in high-performance computing are understandably hesitant about committing their time and energy to such an apparently unstable enterprise. However, beneath the commercial chaos of the last several years, a technological revolution has been occurring. The good news is that the revolution is over, leading to five to ten years of predictable stability, steady improvements in system performance, and increased productivity for scientific applications. It is time for scientists who were sitting on the fence to jump in and reap the benefits of the new technology.

