Simulation of Gas Dynamics of Hypersonic Aircrafts with the Use of Model of High-Temperature Air and Graphics Processor Units

Проводится численное моделирование обтекания гиперзвукового летательного аппарата с использованием модели высокотемпературного воздуха и гибридной архитектуры на основе высокопроизводительных графических процессорных устройств. Расчеты проводятся на основе уравнений Эйлера, для дискретизации которых применяется метод конечных объемов на неструктурированных сетках. Приводятся результаты исследования эффективности расчета гиперзвуковых течений газа на графических процессорах. Обсуждается время счета, достигнутое при использовании моделей совершенного и реального газа. Numerical simulation of the flow around a hypersonic aircraft is carried out using a high-temperature air model and a hybrid architecture based on high-performance graphics processing units. The calculations are performed with the Euler equations discretized by the finite volume method on unstructured meshes. The scalability of the developed implementations of the model is studied and the results of the study of the efficiency of calculating hypersonic gas flows on graphics processors are analyzed. The computational time spent with the perfect and real gas models is discussed.

Download Full-text

Discharge hydrograph estimation at upstream-ungauged sections by coupling a Bayesian methodology and a 2-D GPU shallow water model

Hydrology and Earth System Sciences ◽

10.5194/hess-22-5299-2018 ◽

2018 ◽

Vol 22 (10) ◽

pp. 5299-5316 ◽

Cited By ~ 3

Author(s):

Alessia Ferrari ◽

Marco D'Oria ◽

Renato Vacondio ◽

Alessandro Dal Palù ◽

Paolo Mignosa ◽

...

Keyword(s):

Shallow Water ◽

Graphics Processing Units ◽

High Performance ◽

Optimization Procedure ◽

Water Model ◽

Computational Time ◽

Geostatistical Approach ◽

River Reach ◽

Graphics Processing ◽

Flow Propagation

Abstract. This paper presents a novel methodology for estimating the unknown discharge hydrograph at the entrance of a river reach when no information is available. The methodology couples an optimization procedure based on the Bayesian geostatistical approach (BGA) with a forward self-developed 2-D hydraulic model. In order to accurately describe the flow propagation in real rivers characterized by large floodable areas, the forward model solves the 2-D shallow water equations (SWEs) by means of a finite volume explicit shock-capturing algorithm. The two-dimensional SWE code exploits the computational power of graphics processing units (GPUs), achieving a ratio of physical to computational time of up to 1000. With the aim of enhancing the computational efficiency of the inverse estimation, the Bayesian technique is parallelized, developing a procedure based on the Secure Shell (SSH) protocol that allows one to take advantage of remote high-performance computing clusters (including those available on the Cloud) equipped with GPUs. The capability of the methodology is assessed by estimating irregular and synthetic inflow hydrographs in real river reaches, also taking into account the presence of downstream corrupted observations. Finally, the procedure is applied to reconstruct a real flood wave in a river reach located in northern Italy.

Download Full-text

Discharge hydrograph estimation at upstream-ungauged sections by coupling a Bayesian methodology and a 2D GPU Shallow Water model

10.5194/hess-2018-118 ◽

2018 ◽

Author(s):

Alessia Ferrari ◽

Marco D'Oria ◽

Renato Vacondio ◽

Alessandro Dal Palù ◽

Paolo Mignosa ◽

...

Keyword(s):

Shallow Water ◽

Graphics Processing Units ◽

High Performance ◽

Optimization Procedure ◽

Water Model ◽

Computational Time ◽

Coupled Models ◽

Geostatistical Approach ◽

River Reach ◽

Graphics Processing

Abstract. In this paper a novel methodology to estimate the unknown discharge hydrograph at the entrance of a river reach, where no information is available, is presented. The methodology is obtained by coupling an optimization procedure, based on the Bayesian Geostatistical Approach (BGA), with a forward self-developed 2D hydraulic model of the stream. In order to accurately describe the flow propagation in real rivers characterized by large floodable areas, the forward model solves the 2D Shallow Water Equations by means of a Finite Volume explicit shock-capturing algorithm. The forward code exploits the computational power of Graphics Processing Units (GPUs) achieving ratio of physical to computational time up to 1000. With the aim of enhancing the computational efficiency of the inverse estimation, the Bayesian technique is parallelized developing a procedure based on the Secure Shell (SSH) protocol which allows to take advantage of remote High Performance Computing clusters (including those available on the Cloud) equipped with GPUs. The capability of the coupled models is assessed estimating irregular and synthetic inflow hydrographs in real river reaches, taking into account also the presence of downstream corrupted observations. Finally, the capability to adopt this methodology for real cases is demonstrated by reconstructing a real flood wave in a river reach located in Northern Italy.

Download Full-text

Assessment of the Impact of Sand Mining on Bottom Morphology in the Mekong River in An Giang Province, Vietnam, Using a Hydro-Morphological Model with GPU Computing

Water ◽

10.3390/w12102912 ◽

2020 ◽

Vol 12 (10) ◽

pp. 2912 ◽

Cited By ~ 1

Author(s):

Tran Thi Kim ◽

Nguyen Thi Mai Huong ◽

Nguyen Dam Quoc Huy ◽

Pham Anh Tai ◽

Sumin Hong ◽

...

Keyword(s):

Numerical Model ◽

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Mekong River ◽

Computational Time ◽

Sand Mining ◽

Graphics Processing ◽

The Impact ◽

Performance Computing

Sand mining, among the many activities that have significant effects on the bed changes of rivers, has increased in many parts of the world in recent decades. Numerical modeling plays a vital role in simulation in the long term; however, computational time remains a challenge. In this paper, we propose a sand mining component integrated into the bedload continuity equation and combine it with high-performance computing using graphics processing units to boost the speed of the simulation. The developed numerical model is applied to the Mekong river segment, flowing through Tan Chau Town, An Giang Province, Vietnam. The 20 years from 1999 to 2019 is examined in this study, both with and without sand mining activities. The results show that the numerical model can simulate the bed change for the period from 1999 to 2019. By adding the sand mining component (2002–2006), the bed change in the river is modeled closely after the actual development. The Tan An sand mine in the area (2002–2006) caused the channel to deviate slightly from that of An Giang and created a slight erosion channel in 2006 (−23 m). From 2006 to 2014, although Tan An mine stopped operating, the riverbed recovered quite slowly with a small accretion rate (0.25 m/year). However, the Tan An sand mine eroded again from 2014–2019 due to a lack of sand. In 2014, in the Vinh Hoa communes, An Giang Province, the Vinh Hoa sand mine began to operate. The results of simulating with sand mining incidents proved that sand mining caused the erosion channel to move towards the sand mines, and the erosion speed was faster when there was no sand mining. Combined with high-performance computing, harnessing the power of accelerators such as graphics processing units (GPUs) can help run numerical simulations up to 23x times faster.

Download Full-text

High performance computing on graphics processing units

Pollack Periodica ◽

10.1556/pollack.3.2008.2.3 ◽

2008 ◽

Vol 3 (2) ◽

pp. 27-34 ◽

Cited By ~ 2

Author(s):

Balázs Tukora ◽

Tibor Szalay

Keyword(s):

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Graphics Processing ◽

Performance Computing

Download Full-text

Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software

ACM Transactions on Mathematical Software ◽

10.1145/3441850 ◽

2021 ◽

Vol 47 (2) ◽

pp. 1-28

Author(s):

Goran Flegar ◽

Hartwig Anzt ◽

Terry Cojean ◽

Enrique S. Quintana-Ortí

Keyword(s):

Linear Algebra ◽

Graphics Processing Units ◽

High Performance ◽

Numerical Algorithms ◽

Mixed Precision ◽

Before And After ◽

Memory Accesses ◽

Specialized Hardware ◽

The Individual ◽

Graphics Processing

The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts aiming at carefully reducing the working precision in order to speed up the computations. For algorithms whose performance is bound by the memory bandwidth, the idea of compressing its data before (and after) memory accesses has received considerable attention. One idea is to store an approximate operator–like a preconditioner–in lower than working precision hopefully without impacting the algorithm output. We realize the first high-performance implementation of an adaptive precision block-Jacobi preconditioner which selects the precision format used to store the preconditioner data on-the-fly, taking into account the numerical properties of the individual preconditioner blocks. We implement the adaptive block-Jacobi preconditioner as production-ready functionality in the Ginkgo linear algebra library, considering not only the precision formats that are part of the IEEE standard, but also customized formats which optimize the length of the exponent and significand to the characteristics of the preconditioner blocks. Experiments run on a state-of-the-art GPU accelerator show that our implementation offers attractive runtime savings.

Download Full-text

DSPSR: Digital Signal Processing Software for Pulsar Astronomy

Publications of the Astronomical Society of Australia ◽

10.1071/as10021 ◽

2011 ◽

Vol 28 (1) ◽

pp. 1-14 ◽

Cited By ~ 172

Author(s):

W. van Straten ◽

M. Bailes

Keyword(s):

Signal Processing ◽

Digital Signal Processing ◽

Graphics Processing Units ◽

High Performance ◽

Digital Signal ◽

General Purpose ◽

Design Decisions ◽

Extensive Range ◽

Processing Software ◽

Graphics Processing

Abstractdspsr is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, dspsr is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures.

Download Full-text

A lightweight approach to performance portability with targetDP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016682071 ◽

2016 ◽

Vol 32 (2) ◽

pp. 288-301

Author(s):

Alan Gray ◽

Kevin Stratford

Keyword(s):

Particle Physics ◽

Message Passing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Message Passing Interface ◽

Graphics Processing Unit ◽

Processing Unit ◽

Performance Portability ◽

Graphics Processing

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.

Download Full-text

Big Data and IT Network Data Visualization

International Journal of Mathematical Engineering and Management Sciences ◽

10.33889/ijmems.2018.3.1-002 ◽

2018 ◽

Vol 3 (1) ◽

pp. 9-16 ◽

Cited By ~ 3

Author(s):

Lidong Wang

Keyword(s):

Big Data ◽

Network Analysis ◽

Graphics Processing Units ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Network Visualization ◽

Network Data ◽

Graphics Processing ◽

Performance Computing

Visualization with graphs is popular in the data analysis of Information Technology (IT) networks or computer networks. An IT network is often modelled as a graph with hosts being nodes and traffic being flows on many edges. General visualization methods are introduced in this paper. Applications and technology progress of visualization in IT network analysis and big data in IT network visualization are presented. The challenges of visualization and Big Data analytics in IT network visualization are also discussed. Big Data analytics with High Performance Computing (HPC) techniques, especially Graphics Processing Units (GPUs) helps accelerate IT network analysis and visualization.

Download Full-text

The VOLNA-OP2 tsunami code (version 1.5)

Geoscientific Model Development ◽

10.5194/gmd-11-4621-2018 ◽

2018 ◽

Vol 11 (11) ◽

pp. 4621-4635 ◽

Cited By ~ 7

Author(s):

Istvan Z. Reguly ◽

Daniel Giles ◽

Devaraj Gopinathan ◽

Laure Quivy ◽

Joakim H. Beck ◽

...

Keyword(s):

Graphics Processing Units ◽

High Performance ◽

Shallow Water Equation ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Central Processing ◽

Domain Specific ◽

Computing Platforms ◽

Graphics Processing ◽

Intel Xeon

Abstract. In this paper, we present the VOLNA-OP2 tsunami model and implementation; a finite-volume non-linear shallow-water equation (NSWE) solver built on the OP2 domain-specific language (DSL) for unstructured mesh computations. VOLNA-OP2 is unique among tsunami solvers in its support for several high-performance computing platforms: central processing units (CPUs), the Intel Xeon Phi, and graphics processing units (GPUs). This is achieved in a way that the scientific code is kept separate from various parallel implementations, enabling easy maintainability. It has already been used in production for several years; here we discuss how it can be integrated into various workflows, such as a statistical emulator. The scalability of the code is demonstrated on three supercomputers, built with classical Xeon CPUs, the Intel Xeon Phi, and NVIDIA P100 GPUs. VOLNA-OP2 shows an ability to deliver productivity as well as performance and portability to its users across a number of platforms.

Download Full-text

Accelerated FDPS: Algorithms to use accelerators with FDPS

Publications of the Astronomical Society of Japan ◽

10.1093/pasj/psz133 ◽

2020 ◽

Vol 72 (1) ◽

Cited By ~ 2

Author(s):

Masaki Iwasawa ◽

Daisuke Namekata ◽

Keigo Nitadori ◽

Kentaro Nomura ◽

Long Wang ◽

...

Keyword(s):

Graphics Processing Units ◽

High Performance ◽

General Purpose ◽

Performance Model ◽

Performance Tuning ◽

Data Types ◽

Interaction Function ◽

Current Implementation ◽

And Performance ◽

Graphics Processing

Abstract We describe algorithms implemented in FDPS (Framework for Developing Particle Simulators) to make efficient use of accelerator hardware such as GPGPUs (general-purpose computing on graphics processing units). We have developed FDPS to make it possible for researchers to develop their own high-performance parallel particle-based simulation programs without spending large amounts of time on parallelization and performance tuning. FDPS provides a high-performance implementation of parallel algorithms for particle-based simulations in a “generic” form, so that researchers can define their own particle data structure and interparticle interaction functions. FDPS compiled with user-supplied data types and interaction functions provides all the necessary functions for parallelization, and researchers can thus write their programs as though they are writing simple non-parallel code. It has previously been possible to use accelerators with FDPS by writing an interaction function that uses the accelerator. However, the efficiency was limited by the latency and bandwidth of communication between the CPU and the accelerator, and also by the mismatch between the available degree of parallelism of the interaction function and that of the hardware parallelism. We have modified the interface of the user-provided interaction functions so that accelerators are more efficiently used. We also implemented new techniques which reduce the amount of work on the CPU side and the amount of communication between CPU and accelerators. We have measured the performance of N-body simulations on a system with an NVIDIA Volta GPGPU using FDPS and the achieved performance is around 27% of the theoretical peak limit. We have constructed a detailed performance model, and found that the current implementation can achieve good performance on systems with much smaller memory and communication bandwidth. Thus, our implementation will be applicable to future generations of accelerator system.

Download Full-text