ITERATIVE AND PARALLEL PERFORMANCE ANALYSIS OF NON-BLOCKING COMMUNICATION ALGORITHMS IN THE MASSIVELY PARALLEL NEUTRON TRANSPORT CODE PIDOTS

2021 ◽  
Vol 247 ◽  
pp. 03016
Author(s):  
Raffi Yessayan ◽  
Yousry Y. Azmy ◽  
R. Joseph Zerr

The PIDOTS neutral particle transport code utilizes a red/black implementation of the Parallel Gauss-Seidel algorithm to solve the SN approximation of the neutron transport equation on 3D Cartesian meshes. PIDOTS is designed for execution on massively parallel platforms and is capable of using the full resources of modern, leadership-class high-performance computers. Initial testing revealed that some configurations of PIDOTS’s Integral Transport Matrix Method solver demonstrated unexpectedly poor parallel scaling. Work at Idaho and Los Alamos National Laboratories then revealed that this inefficiency was a result of the accumulation of high-cost latency events in the complex blocking communication networks employed during each PIDOTS iteration. That work explored the possibility of minimizing those inefficiencies while maintaining a blocking communications model. While significant speedups were obtained, it was shown that fully mitigating the problem on general-purpose platforms was highly unlikely for a blocking code. This work continues that analysis by implementing a deeply interleaved non-blocking communication model into PIDOTS. This new model benefits from the optimization work performed on the blocking model while also providing significant opportunities to overlap the remaining unmitigated communication costs with computation. Additionally, our new approach is easily transferable to other similarly spatially decomposed codes. The resulting algorithm was tested on LANL’s Trinity system at up to 32,768 processors and was found at that processor count to effectively hide 100% of MPI communication cost – equivalently 20% of the red/black phase time. It is expected that the implemented interleaving algorithm can fully support far higher processor counts and completely hide communication costs of up to ~50% of total iteration time.
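
As a rough illustration of the red/black ordering mentioned in this abstract, the sketch below applies red/black Gauss-Seidel to a toy 1D problem, -u'' = f with zero boundary values. This is an assumption-laden stand-in, not the PIDOTS transport solver: the model problem, the function name `red_black_gauss_seidel`, and its parameters are hypothetical. The point it shows is that every update within one colour depends only on points of the other colour, so each colour phase could in principle be executed in parallel.

```python
def red_black_gauss_seidel(f, h, sweeps):
    """Toy red/black Gauss-Seidel for -u'' = f on a uniform 1D grid.

    f      : source values at every grid point (including the two ends)
    h      : grid spacing
    sweeps : number of red+black sweeps to perform
    Boundary values are held at zero (hypothetical model problem).
    """
    n = len(f)
    u = [0.0] * n
    for _ in range(sweeps):
        # First the odd-indexed ("red") interior points, then the
        # even-indexed ("black") ones. Within one colour, each update
        # reads only neighbours of the other colour.
        for start in (1, 2):
            for i in range(start, n - 1, 2):
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u
```

Calling `red_black_gauss_seidel([2.0] * 17, 1.0 / 16, 2000)` approximates u(x) = x(1 - x) on [0, 1]. In a distributed code, the exchange of boundary values between the two colour phases is where non-blocking communication could be overlapped with interior updates.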

2021 ◽  
Vol 247 ◽  
pp. 03017
Author(s):  
Dylan S. Hoagland ◽  
Yousry Y. Azmy

Parallel Block Jacobi (PBJ) [1] is an asynchronous spatial domain decomposition with application in solving the neutron transport equation due to its extendibility to massively parallel solution in unstructured spatial meshes (grids) without the use of the computationally complex and expensive sweeps required by the Source Iteration (SI) method in these applications. [2] However, PBJ iterative methods suffer a lack of iterative robustness in problems with optically thin cells, [1] which we have previously demonstrated to be a consequence of PBJ’s asynchronicity. To mitigate this effect, we have developed multiple PBJ / SI hybrid methods which employ a PBJ method (Parallel Block Jacobi - Integral Transport Matrix Method (PBJ-ITMM) or Inexact Parallel Block Jacobi (IPBJ)) along with SI. [3,4] In this work, we perform a parametric study to determine performance of numerous PBJ / SI hybrid methods as a function of multiple problem parameters. This parametric study reached 5 main conclusions: 1) our hybrid approach is more effective with PBJ-ITMM than with IPBJ, 2) for PBJ-ITMM, there is a hybrid method that mitigates the aforementioned iterative slowdown in optically thin cells without diminishing the method’s potential parallelism in unstructured grids, 3) this hybrid method is most effective in problems with large, continuous regions of very thin cells, 4) the best performing hybrid method consistently executes within a factor of ten slower than current state-of-the-art acceleration methods that are not efficiently extendable to the massively parallel regime, and 5) both PBJ-ITMM and IPBJ are observed to be viable approaches for our desired applications. In the pursuit of implementing PBJ-ITMM in unstructured grids, we conclude with a description of the Green’s Function ITMM Construction (GFIC) algorithm, which allows for the ITMM matrices to be constructed using the pre-existing SI sweep algorithm already present in unstructured grid SN transport codes.
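
To make the block Jacobi idea concrete, here is a minimal sketch on a toy 1D problem, -u'' = f with zero boundary values, rather than on the transport equation. The model problem and the names `solve_block` and `block_jacobi` are hypothetical and are not taken from the paper. Each block is solved exactly using only the previous iterate's values at its neighbours, which is what lets all blocks be processed independently.

```python
def solve_block(rhs):
    """Thomas algorithm for a tridiagonal system with 2 on the
    diagonal and -1 on both off-diagonals."""
    m = len(rhs)
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def block_jacobi(f, h, block_size, iterations):
    """Toy block Jacobi iteration for -u'' = f with zero end values."""
    n = len(f)
    u = [0.0] * n
    # Contiguous blocks of interior points, stored as [a, b) index ranges.
    blocks = [(a, min(a + block_size, n - 1))
              for a in range(1, n - 1, block_size)]
    for _ in range(iterations):
        u_old = u[:]  # every block reads only lagged neighbour values
        for a, b in blocks:
            rhs = [h * h * f[i] for i in range(a, b)]
            rhs[0] += u_old[a - 1]   # lagged left neighbour
            rhs[-1] += u_old[b]      # lagged right neighbour
            u[a:b] = solve_block(rhs)
    return u
```

Because the loop over `blocks` reads only `u_old`, its iterations are independent of one another. The lagged interface values are also why information crosses only one block boundary per iteration.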


Author(s):  
N. VENKATESWARAN ◽  
S. PATTABIRAMAN ◽  
R. DEVANATHAN ◽  
B. KUMARAN ◽  
ASHRAF AHMED ◽  
...  

Very Large Array Processors (VLAP) will be the need of the future for solving computationally intense Very Large Problems (VLP) common in pattern recognition, image processing and other related areas of digital signal processing. Design methodology of such VLAPs for massively parallel dedicated/general purpose applications is highly complex. Two companion papers (Part 1 and Part 2) on VLAP are presented in this issue. In Part 1, we propose a VLAP called Reconfigurable GIPOP Processor Array (RGPA). The RGPA is made up of high performance processing elements called the Generalized Inner Product Outer Product (GIPOP) processor. Unlike the traditional special/general purpose processors, ours has a totally different and new architecture and organization involving higher level functional units to match with the complex computational structures of numeric algorithms and suitable for massively parallel processing. We also present a strategy for mapping VLPs on VLAPs. In Part 2, we propose a novel VLSI design methodology for implementing cost effective and very high performance processors meant for special purpose applications and in particular, for VLAPs.


2020 ◽  
Vol 239 ◽  
pp. 22012
Author(s):  
Qu Wu ◽  
Xingjie Peng ◽  
Guanlin Shi ◽  
Yingrui Yu ◽  
Qing Li ◽  
...  

Nuclear data sensitivity analysis and uncertainty propagation have been extensively applied to nuclear data adjustment and uncertainty quantification in the field of nuclear engineering. Sensitivity and Uncertainty (S&U) analysis is developed in the KYADJ whole-core transport code in order to meet the requirements of advanced reactor design. KYADJ aims to use a coupled two-dimensional Method of Characteristics (MOC) and one-dimensional discrete ordinates (SN) method to solve the neutron transport equation and achieve one-step direct transport calculation of the reactor core. Developing a sensitivity and uncertainty analysis module in KYADJ can minimize deviations caused by modeling approximations and enhance calculation efficiency. This work describes the application of classic perturbation theory to the KYADJ transport solver. In order to obtain uncertainties, a technique is proposed for processing a covariance data file on a 45-group energy grid instead of the 44-group SCALE 6.1 covariance data, which is extensively used in various codes. Numerical results for the Uncertainty Analysis in Modelling (UAM) benchmarks and the SF96 benchmark are presented. The results agree well with the reference, and the capability of S&U analysis in KYADJ is verified.


2014 ◽  
Vol 36 (4) ◽  
pp. 790-798
Author(s):  
Kai ZHANG ◽  
Shu-Ming CHEN ◽  
Yao-Hua WANG ◽  
Xi NING

2011 ◽  
Vol 28 (1) ◽  
pp. 1-14 ◽  
Author(s):  
W. van Straten ◽  
M. Bailes

dspsr is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, dspsr is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures.


2021 ◽  
Vol 11 (10) ◽  
pp. 4610
Author(s):  
Simone Berneschi ◽  
Giancarlo C. Righini ◽  
Stefano Pelli

Glasses, in their different forms and compositions, have special properties that are not found in other materials. The combination of transparency and hardness at room temperature, combined with a suitable mechanical strength and excellent chemical durability, makes this material indispensable for many applications in different technological fields (as, for instance, the optical fibres which constitute the physical carrier for high-speed communication networks as well as the transducer for a wide range of high-performance sensors). For its part, ion-exchange from molten salts is a well-established, low-cost technology capable of modifying the chemical-physical properties of glass. The synergy between ion-exchange and glass has always been a happy marriage, from its ancient historical background for the realisation of wonderful artefacts, to the discovery of novel and fascinating solutions for modern technology (e.g., integrated optics). Getting inspiration from some hot topics related to the application context of this technique, the goal of this critical review is to show how ion-exchange in glass, far from being an obsolete process, can still have an important impact in everyday life, both at a merely commercial level as well as at that of frontier research.


2021 ◽  
pp. 107915
Author(s):  
Sooyoung Choi ◽  
Wonkyeong Kim ◽  
Jiwon Choe ◽  
Woonghee Lee ◽  
Hanjoo Kim ◽  
...  
