Parallel STEPS: Large Scale Stochastic Spatial Reaction-Diffusion Simulation with High Performance Computers

2017 ◽  
Vol 11 ◽  
Author(s):  
Weiliang Chen ◽  
Erik De Schutter


2020 ◽
Vol 23 (1-4) ◽  
Author(s):  
Ruth Schöbel ◽  
Robert Speck

To extend prevailing scaling limits when solving time-dependent partial differential equations, the parallel full approximation scheme in space and time (PFASST) has been shown to be a promising parallel-in-time integrator. Like space–time multigrid, PFASST is able to compute multiple time-steps simultaneously and is therefore particularly suitable for large-scale applications on high performance computing systems. In this work we couple PFASST with a parallel spectral deferred correction (SDC) method, forming an unprecedented doubly time-parallel integrator. While PFASST provides global, large-scale “parallelization across the step”, the inner parallel SDC method integrates each individual time-step “parallel across the method” using a diagonalized local quasi-Newton solver. This new method, which we call “PFASST with Enhanced concuRrency” (PFASST-ER), therefore exposes even more temporal concurrency. For two challenging nonlinear reaction-diffusion problems, we show that PFASST-ER works more efficiently than the classical variants of PFASST and can use more processors than time-steps.
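To make the terminology concrete, the sketch below shows the serial building block that both PFASST variants iterate on: a single SDC sweep on one time-step, here for the scalar test problem u' = λu with an implicit-Euler preconditioner. The node choice, sweep count and solver are illustrative assumptions, not the authors' implementation, which additionally diagonalizes the local quasi-Newton solve across the collocation nodes.

```python
# A minimal sketch of spectral deferred correction (SDC) on one time-step,
# for u' = lam*u.  Nodes, sweep count and the implicit-Euler preconditioner
# are illustrative assumptions, not the PFASST-ER implementation.
import numpy as np

def node_to_node_weights(nodes):
    """S[m, j] = integral of the j-th Lagrange basis polynomial over
    [nodes[m], nodes[m+1]], computed via a Vandermonde solve."""
    M = len(nodes)
    V = np.vander(nodes, M, increasing=True)          # V[i, p] = nodes[i]**p
    S = np.zeros((M - 1, M))
    for j in range(M):
        coeffs = np.linalg.solve(V, np.eye(M)[j])     # l_j in monomial form
        anti = np.concatenate(([0.0], coeffs / np.arange(1, M + 1)))
        prim = np.polynomial.polynomial.Polynomial(anti)  # antiderivative of l_j
        for m in range(M - 1):
            S[m, j] = prim(nodes[m + 1]) - prim(nodes[m])
    return S

def sdc_step(u0, lam, dt, n_nodes=5, n_sweeps=4):
    """Implicit-Euler-preconditioned SDC sweeps on [0, dt]; returns u(dt)."""
    # Chebyshev-Lobatto collocation nodes on [0, dt] (an assumption here)
    nodes = 0.5 * dt * (1.0 - np.cos(np.pi * np.arange(n_nodes) / (n_nodes - 1)))
    S = node_to_node_weights(nodes)
    u = np.full(n_nodes, float(u0))                   # initial guess: spread u0
    for _ in range(n_sweeps):
        u_old = u.copy()
        for m in range(n_nodes - 1):
            dtau = nodes[m + 1] - nodes[m]
            quad = S[m] @ (lam * u_old)               # quadrature of old iterate
            # implicit Euler on the correction: solve for u[m+1] (linear here)
            u[m + 1] = (u[m] - dtau * lam * u_old[m + 1] + quad) / (1.0 - dtau * lam)
    return u[-1]

# Converges toward the exact solution exp(lam*dt)*u0 as sweeps/nodes increase:
print(sdc_step(1.0, lam=-1.0, dt=0.5), np.exp(-0.5))
```

PFASST distributes many such steps across processors ("across the step"); PFASST-ER additionally parallelizes the work inside each sweep over the collocation nodes ("across the method").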


1996 ◽  
Vol 07 (03) ◽  
pp. 295-303 ◽  
Author(s):  
P. D. CODDINGTON

Large-scale Monte Carlo simulations require high-quality random number generators to ensure correct results. The contrapositive of this statement is also true: the quality of random number generators can be tested by using them in large-scale Monte Carlo simulations. We have tested many commonly used random number generators with high-precision Monte Carlo simulations of the 2-d Ising model using the Metropolis, Swendsen-Wang, and Wolff algorithms. This work is being extended to the testing of random number generators for parallel computers. The results of these tests are presented, along with recommendations for random number generators for high-performance computers, particularly for lattice Monte Carlo simulations.
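As a concrete illustration of this testing methodology, the sketch below plugs a candidate generator into a Metropolis simulation of the 2-d Ising model and compares the measured energy per site against the exactly known infinite-lattice value at the critical temperature (u_c = -√2 ≈ -1.414 per site for J = 1). The generator interface, lattice size and sweep counts are illustrative assumptions; finite-size effects shift the estimate slightly, while a biased generator shows up as a systematic deviation beyond the statistical errors.

```python
# Toy version of an RNG quality test: a biased generator produces a Metropolis
# estimate of the Ising energy that deviates systematically from the exact
# value.  Lattice size and sweep counts are kept small for illustration.
import numpy as np

def metropolis_energy(rng, L=16, T=2.269, sweeps=1000, thermalize=200):
    """Mean energy per site of an L x L periodic Ising lattice at temperature T,
    drawing all randomness from `rng` (anything with a .random() method)."""
    spins = np.ones((L, L), dtype=int)
    energies = []
    for sweep in range(sweeps):
        for _ in range(L * L):
            i = int(rng.random() * L)                 # random site
            j = int(rng.random() * L)
            nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j] +
                  spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
            dE = 2.0 * spins[i, j] * nb               # energy change of a flip
            if dE <= 0.0 or rng.random() < np.exp(-dE / T):
                spins[i, j] = -spins[i, j]            # Metropolis acceptance
        if sweep >= thermalize:
            # each bond counted once via shifted lattices
            E = -np.sum(spins * (np.roll(spins, 1, 0) + np.roll(spins, 1, 1)))
            energies.append(E / (L * L))
    return float(np.mean(energies))

# Exact infinite-lattice value at T_c is -sqrt(2) ~ -1.414 per site:
print(metropolis_energy(np.random.default_rng(42)))
```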


Author(s):  
Valentin Cristea ◽  
Ciprian Dobre ◽  
Corina Stratan ◽  
Florin Pop

The architectural shift presented in the previous chapters, towards high performance computers assembled from large numbers of commodity resources, raises numerous design issues and assumptions pertaining to traceability, fault tolerance and scalability. Hence, one of the key challenges faced by high performance distributed systems is scalable monitoring of system state. The aim of this chapter is to survey existing work and trends in distributed systems monitoring by introducing the relevant concepts, requirements, techniques, models and related standardization activities. Monitoring can be defined as the process of dynamic collection, interpretation and presentation of information concerning the characteristics and status of resources of interest. It is needed for various purposes such as debugging, testing, program visualization and animation. It may also be used for general management activities, which have a more permanent and continuous nature (performance management, configuration management, fault management, security management, etc.). In this case the behavior of the system is observed and monitoring information is gathered; this information is then used to make management decisions and perform the appropriate control actions on the system. Unlike monitoring, which is generally a passive process, control actively changes the behavior of the managed system and has to be considered and modeled separately. Monitoring thus proves essential for observing and improving the reliability and performance of large-scale distributed systems.
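A toy sketch of the monitoring cycle just defined, with collection, interpretation and presentation as separate passive stages and control deliberately left outside the loop; the node names, metrics and threshold are invented for illustration only:

```python
# Toy monitoring loop: collect -> interpret -> present, repeated periodically.
# Control actions are intentionally excluded, mirroring the passive/active
# distinction drawn in the text.  All names and values here are illustrative.
import time, random

def collect(nodes):
    """Dynamic collection: poll each resource for its current status."""
    return {n: {"cpu": random.uniform(0, 1), "up": True} for n in nodes}

def interpret(samples, cpu_limit=0.9):
    """Interpretation: turn raw samples into events of interest."""
    return [n for n, s in samples.items() if s["cpu"] > cpu_limit or not s["up"]]

def present(samples, alerts):
    """Presentation: report status to the management layer."""
    print(f"{time.strftime('%H:%M:%S')} monitored={len(samples)} alerts={alerts}")

nodes = [f"node-{i}" for i in range(4)]
for _ in range(3):                       # one monitoring round per iteration
    samples = collect(nodes)
    alerts = interpret(samples)
    present(samples, alerts)
    # Control (migrating load, restarting nodes, ...) would act on `alerts`
    # here, but is modeled separately because it changes system behavior.
    time.sleep(1)
```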


2002 ◽  
Vol 1 (4) ◽  
pp. 403-420 ◽  
Author(s):  
D. Stanescu ◽  
J. Xu ◽  
M.Y. Hussaini ◽  
F. Farassat

The purpose of this paper is to demonstrate the feasibility of computing the fan inlet noise field around a real twin-engine aircraft, including the radiation of the main spinning modes from the engine as well as the reflection and scattering by the fuselage and the wing. This first-cut large-scale computation is based on time-domain and frequency-domain approaches that employ spectral element methods for spatial discretization. The numerical algorithms are designed to exploit high-performance computers such as the IBM SP4. Although the simulations could not match the exact conditions of the only available experimental data set, they predict the trends of the measured noise field fairly well.


2019 ◽  
Author(s):  
Cristiano Capone ◽  
Matteo di Volo ◽  
Alberto Romagnoni ◽  
Maurizio Mattia ◽  
Alain Destexhe

Increasing interest has been shown in recent years in large-scale spiking simulations of cerebral neuronal networks, driven both by the availability of high performance computers and by the increasing detail of experimental observations. In this context it is important to understand how population dynamics are generated by the designed parameters of the networks, which is the question addressed by mean-field theories. Although analytic solutions for the mean-field dynamics have already been proposed for current-based (CUBA) neurons, no complete analytic model has yet been achieved for more realistic neural properties, such as conductance-based (COBA) networks of adaptive exponential (AdEx) neurons. Here, we propose a novel principled approach to map a COBA model onto a CUBA model. This approach provides a state-dependent approximation capable of reliably predicting the firing rate properties of an AdEx neuron with non-instantaneous COBA integration. We also apply our theory to population dynamics, predicting the dynamical properties of the network in very different regimes, such as asynchronous irregular (AI) and synchronous irregular (SI) (slow oscillations, SO). These results show that a state-dependent approximation can be successfully introduced to take into account the subtle effects of COBA integration, yielding a theory that correctly predicts the activity in regimes of alternating states such as slow oscillations.
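A hedged sketch of the kind of conductance-to-current mapping the abstract describes: instantaneous synaptic conductances are converted into a state-dependent effective current and an effective membrane time constant evaluated at the current membrane voltage. The formulas below are the standard effective-time-constant approximation, shown only to illustrate why the mapping must depend on state; the paper's approximation for AdEx neurons with non-instantaneous COBA integration is more elaborate, and all parameter values here are generic assumptions.

```python
# Illustrative COBA -> CUBA mapping via the effective-time-constant
# approximation.  Parameter values (leak, reversal potentials, capacitance)
# are generic textbook numbers, not those of the paper.

def coba_to_cuba(gE, gI, V, gL=10e-9, EL=-65e-3, EE=0.0, EI=-80e-3, Cm=200e-12):
    """Map instantaneous conductances (gE, gI) at membrane voltage V to an
    effective input current, effective time constant and effective rest."""
    g_tot = gL + gE + gI                          # total conductance (S)
    I_eff = gE * (EE - V) + gI * (EI - V)         # state-dependent current (A)
    tau_eff = Cm / g_tot                          # effective time constant (s)
    mu_V = (gL * EL + gE * EE + gI * EI) / g_tot  # effective resting voltage (V)
    return I_eff, tau_eff, mu_V

# The same conductances inject less net current at depolarized voltages, the
# kind of subtle COBA effect a fixed-current (CUBA) model misses:
for V in (-70e-3, -60e-3, -50e-3):
    I, tau, mu = coba_to_cuba(gE=6e-9, gI=10e-9, V=V)
    print(f"V={V*1e3:.0f} mV  I_eff={I*1e12:.0f} pA  tau_eff={tau*1e3:.1f} ms")
```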


2019 ◽  
Author(s):  
Satya N. V. Arjunan ◽  
Atsushi Miyauchi ◽  
Kazunari Iwamoto ◽  
Koichi Takahashi

Background: Studies using quantitative experimental methods have shown that the intracellular spatial distribution of molecules plays a central role in many cellular systems. Spatially resolved computer simulations can integrate quantitative data from these experiments to construct physically accurate models of the systems. Although computationally expensive, microscopic-resolution reaction-diffusion simulators such as Spatiocyte can directly capture intracellular effects comprising diffusion-limited reactions and volume exclusion from crowded molecules by explicitly representing individual diffusing molecules in space. To alleviate the steep computational cost typically associated with the simulation of large or crowded intracellular compartments, we present a parallelized Spatiocyte method called pSpatiocyte.

Results: The new high-performance method employs unique parallelization schemes on a hexagonal close-packed (HCP) lattice to efficiently exploit the resources of common workstations and large distributed-memory parallel computers. We introduce a coordinate system for fast access to HCP lattice voxels, a parallelized event scheduler, a parallelized Gillespie direct method for unimolecular reactions, and a parallelized event for diffusion and bimolecular reaction processes. We verified the correctness of pSpatiocyte reaction and diffusion processes by comparison to theory. To evaluate the performance of pSpatiocyte, we performed a series of parallelized diffusion runs on the RIKEN K computer. In the case of fine lattice discretization with low voxel occupancy, pSpatiocyte exhibited 74% parallel efficiency and achieved a speedup of 7686 times with 663552 cores compared to the runtime with 64 cores. In weak scaling, pSpatiocyte obtained efficiencies of at least 60% with up to 663552 cores. When executing the Michaelis-Menten benchmark model on an eight-core workstation, pSpatiocyte required 45- and 55-fold shorter runtimes than Smoldyn and the parallel version of ReaDDy, respectively. As a high-performance application example, we study the dual phosphorylation-dephosphorylation cycle of the MAPK system, a typical reaction network motif in cell signaling pathways.

Conclusions: pSpatiocyte demonstrates good accuracy, fast runtimes and a significant performance advantage over well-known microscopic particle simulators for large-scale simulations of intracellular reaction-diffusion systems. The source code of pSpatiocyte is available at https://spatiocyte.org.
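For readers unfamiliar with the serial kernel being parallelized, the sketch below implements Gillespie's direct method for a set of unimolecular reactions, the algorithm the abstract names; the lattice, diffusion and parallelization machinery of pSpatiocyte are omitted, and the species, rates and copy numbers are invented for illustration.

```python
# Gillespie's direct method for unimolecular reactions: draw an exponential
# waiting time from the total propensity, then pick the firing reaction in
# proportion to its propensity.  Serial sketch only; no lattice, no MPI.
import math, random

def gillespie_direct(counts, reactions, t_end, seed=1):
    """counts: dict species -> copy number; reactions: list of
    (rate, reactant, product) unimolecular reactions.  Returns final counts."""
    rng = random.Random(seed)
    t = 0.0
    while t < t_end:
        # propensity of each unimolecular reaction: rate * reactant count
        props = [k * counts[r] for k, r, p in reactions]
        a0 = sum(props)
        if a0 == 0.0:
            break                                  # no reaction can fire
        t += -math.log(rng.random()) / a0          # exponential waiting time
        # pick reaction mu with probability props[mu] / a0
        target, acc = rng.random() * a0, 0.0
        for mu, a in enumerate(props):
            acc += a
            if acc >= target:
                break
        _, r, p = reactions[mu]
        counts[r] -= 1                             # fire the chosen reaction
        counts[p] += 1
    return counts

# Two-step conversion A -> B -> C:
print(gillespie_direct({"A": 1000, "B": 0, "C": 0},
                       [(1.0, "A", "B"), (0.5, "B", "C")], t_end=5.0))
```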


2020 ◽  
Vol 643 ◽  
pp. A42 ◽  
Author(s):  
Y. Akrami ◽  
K. J. Andersen ◽  
M. Ashdown ◽  
C. Baccigalupi ◽  
...  

We present the NPIPE processing pipeline, which produces calibrated frequency maps in temperature and polarization from data from the Planck Low Frequency Instrument (LFI) and High Frequency Instrument (HFI) using high-performance computers. NPIPE represents a natural evolution of previous Planck analysis efforts, and combines some of the most powerful features of the separate LFI and HFI analysis pipelines. For example, following the LFI 2018 processing procedure, NPIPE uses foreground polarization priors during the calibration stage in order to break scanning-induced degeneracies. Similarly, NPIPE employs the HFI 2018 time-domain processing methodology to correct for bandpass mismatch at all frequencies. In addition, NPIPE introduces several improvements, including, but not limited to: inclusion of the 8% of data collected during repointing manoeuvres; smoothing of the LFI reference load data streams; in-flight estimation of detector polarization parameters; and construction of maximally independent detector-set split maps. For component-separation purposes, important improvements include: maps that retain the CMB Solar dipole, allowing for high-precision relative calibration in higher-level analyses; well-defined single-detector maps, allowing for robust CO extraction; and HFI temperature maps between 217 and 857 GHz that are binned into 0′.9 pixels (Nside = 4096), ensuring that the full angular information in the data is represented in the maps even at the highest Planck resolutions. The net effect of these improvements is lower levels of noise and systematics in both frequency and component maps at essentially all angular scales, as well as notably improved internal consistency between the various frequency channels. Based on the NPIPE maps, we present the first estimate of the Solar dipole determined through component separation across all nine Planck frequencies. The amplitude is (3366.6 ± 2.7) μK, consistent with, albeit slightly higher than, earlier estimates. From the large-scale polarization data, we derive an updated estimate of the optical depth of reionization of τ = 0.051 ± 0.006, which appears robust with respect to data and sky cuts. There are 600 complete signal, noise and systematics simulations of the full-frequency and detector-set maps. As a Planck first, these simulations include full time-domain processing of the beam-convolved CMB anisotropies. The release of NPIPE maps and simulations is accompanied by a complete suite of raw and processed time-ordered data and the software, scripts, auxiliary data, and parameter files needed to improve further on the analysis and to run matching simulations.
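The quoted pixel scale follows directly from the HEALPix pixelization, in which an Nside = 4096 map has 12 × Nside² equal-area pixels over the full sky; a short arithmetic check (the only inputs are Nside and the steradian-to-arcminute conversion):

```python
# Check of the 0'.9 pixel size quoted for the Nside = 4096 HFI maps.
import math

nside = 4096
npix = 12 * nside ** 2                        # HEALPix pixel count
sky_arcmin2 = 4.0 * math.pi * (180.0 * 60.0 / math.pi) ** 2  # full sky, arcmin^2
pix_arcmin = math.sqrt(sky_arcmin2 / npix)    # side of an equal-area square pixel
print(f"{npix} pixels, about {pix_arcmin:.2f} arcmin each")  # ~0.86 arcmin
```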

