Writing, Running, and Analyzing Large-scale Scientific Simulations with Jupyter Notebooks

Increasing processing capabilities and input/output constraints of supercomputers have increased the use of co-processing approaches, i.e., visualizing and analyzing data sets of simulations on the fly. We present a method that evaluates the importance of different regions of simulation data and a data-driven approach that uses the proposed method to accelerate in-transit co-processing of large-scale simulations. We use the importance metrics to simultaneously employ multiple compression methods on different data regions to accelerate the in-transit co-processing. Our approach strives to adaptively compress data on the fly and uses load balancing to counteract memory imbalances. We demonstrate the method’s efficiency through a fluid mechanics application, a Richtmyer–Meshkov instability simulation, showing how to accelerate the in-transit co-processing of simulations. The results show that the proposed method expeditiously can identify regions of interest, even when using multiple metrics. Our approach achieved a speedup of 1.29× in a lossless scenario. The data decompression time was sped up by 2× compared to using a single compression method uniformly.

Download Full-text

Configuration and Performance of a Beowulf Cluster for Large-Scale Scientific Simulations

Computing in Science & Engineering ◽

10.1109/mcse.2005.29 ◽

2005 ◽

Vol 7 (2) ◽

pp. 14-26 ◽

Cited By ~ 4

Author(s):

M.K. Gobbert

Keyword(s):

Large Scale ◽

Beowulf Cluster ◽

Scientific Simulations ◽

And Performance

Download Full-text

Cloud Computing for Scientific Simulation and High Performance Computing

Principles, Methodologies, and Service-Oriented Approaches for Cloud Computing ◽

10.4018/978-1-4666-2854-0.ch003 ◽

2013 ◽

pp. 51-70

Author(s):

Adrian Jackson ◽

Michèle Weiland

Keyword(s):

Cloud Computing ◽

High Performance Computing ◽

High Performance ◽

Large Scale ◽

Parallel Programs ◽

Small Scale ◽

Cloud Infrastructure ◽

Scientific Simulations ◽

Cloud Infrastructures ◽

Performance Computing

This chapter describes experiences using Cloud infrastructures for scientific computing, both for serial and parallel computing. Amazon’s High Performance Computing (HPC) Cloud computing resources were compared to traditional HPC resources to quantify performance as well as assessing the complexity and cost of using the Cloud. Furthermore, a shared Cloud infrastructure is compared to standard desktop resources for scientific simulations. Whilst this is only a small scale evaluation these Cloud offerings, it does allow some conclusions to be drawn, particularly that the Cloud can currently not match the parallel performance of dedicated HPC machines for large scale parallel programs but can match the serial performance of standard computing resources for serial and small scale parallel programs. Also, the shared Cloud infrastructure cannot match dedicated computing resources for low level benchmarks, although for an actual scientific code, performance is comparable.

Download Full-text

Delayed Difference Scheme for Large Scale Scientific Simulations

Physical Review Letters ◽

10.1103/physrevlett.113.218701 ◽

2014 ◽

Vol 113 (21) ◽

Cited By ~ 6

Author(s):

Dheevatsa Mudigere ◽

Sunil D. Sherlekar ◽

Santosh Ansumali

Keyword(s):

Difference Scheme ◽

Large Scale ◽

Scientific Simulations

Download Full-text

Building high accuracy emulators for scientiﬁc simulations with deep neural architecture search

Machine Learning: Science and Technology ◽

10.1088/2632-2153/ac3ffa ◽

2021 ◽

Author(s):

Muhammad Firmansyah Kasim ◽

D. Watson-Parris ◽

L. Deaconu ◽

S. Oliver ◽

P. Hatﬁeld ◽

...

Keyword(s):

Large Scale ◽

Scientific Discovery ◽

Fusion Energy ◽

High Energy ◽

Training Data ◽

High Energy Density ◽

Neural Architecture ◽

Large Scale Data ◽

Scientific Simulations ◽

Computational Discovery

Abstract Computer simulations are invaluable tools for scientiﬁc discovery. However, accurate simulations are often slow to execute, which limits their applicability to extensive parameter exploration, large-scale data analysis, and uncertainty quantiﬁcation. A promising route to accelerate simulations by building fast emulators with machine learning requires large training datasets, which can be prohibitively expensive to obtain with slow simulations. Here we present a method based on neural architecture search to build accurate emulators even with a limited number of training data. The method successfully emulates simulations in 10 scientiﬁc cases including astrophysics, climate sci-ence, biogeochemistry, high energy density physics, fusion energy, and seismology, using the same super-architecture, algorithm, and hyperparameters. Our approach also inherently provides emulator uncertainty estimation, adding further conﬁdence in their use. We anticipate this work will accelerate research involving expensive simulations, allow more extensive parameters exploration, and enable new, previously unfeasible computational discovery.

Download Full-text

Optimizing In-Situ Data Compression for Large-Scale Scientific Simulations

24th High Performance Computing Symposium ◽

10.22360/springsim.2016.hpc.043 ◽

2016 ◽

Keyword(s):

Data Compression ◽

Large Scale ◽

In Situ Data ◽

Scientific Simulations

Download Full-text

The Zoltan and Isorropia Parallel Toolkits for Combinatorial Scientific Computing: Partitioning, Ordering and Coloring

Scientific Programming ◽

10.1155/2012/713587 ◽

2012 ◽

Vol 20 (2) ◽

pp. 129-150 ◽

Cited By ~ 20

Author(s):

Erik G. Boman ◽

Ümit V. Çatalyürek ◽

Cédric Chevalier ◽

Karen D. Devine

Keyword(s):

Load Balancing ◽

Graph Coloring ◽

User Interfaces ◽

Large Scale ◽

Scientific Computing ◽

Combinatorial Problems ◽

Parallel Applications ◽

Matrix Algorithms ◽

Scientific Simulations ◽

Matrix Ordering

Partitioning and load balancing are important problems in scientific computing that can be modeled as combinatorial problems using graphs or hypergraphs. The Zoltan toolkit was developed primarily for partitioning and load balancing to support dynamic parallel applications, but has expanded to support other problems in combinatorial scientific computing, including matrix ordering and graph coloring. Zoltan is based on abstract user interfaces and uses callback functions. To simplify the use and integration of Zoltan with other matrix-based frameworks, such as the ones in Trilinos, we developed Isorropia as a Trilinos package, which supports most of Zoltan's features via a matrix-based interface. In addition to providing an easy-to-use matrix-based interface to Zoltan, Isorropia also serves as a platform for additional matrix algorithms. In this paper, we give an overview of the Zoltan and Isorropia toolkits, their design, capabilities and use. We also show how Zoltan and Isorropia enable large-scale, parallel scientific simulations, and describe current and future development in the next-generation package Zoltan2.

Download Full-text