Prediction of Residual Stresses in a Multipass Pipe Weld by a Novel 3D Finite Element Approach

Due to the enormous computational cost, current residual stress simulations of multipass girth welds are mostly performed using two-dimensional (2D) axisymmetric models. A 2D model can provide only a limited estimate of the residual stresses because it assumes an axisymmetric distribution. In this study, a highly efficient thermal-mechanical finite element code for three-dimensional (3D) models has been developed based on high performance Graphics Processing Unit (GPU) computers. Our code is further accelerated by considering the unique physics associated with welding processes, which are characterized by steep temperature gradients and a moving arc heat source. It is capable of modeling large-scale welding problems that cannot be easily handled by existing commercial simulation tools. To demonstrate its accuracy and efficiency, our code was compared with a commercial software package by simulating a 3D multi-pass girth weld model with over 1 million elements. Our code achieved solution accuracy comparable to that of the commercial package while reducing the computational cost by a factor of more than 100. Moreover, the three-dimensional analysis demonstrated a more realistic stress distribution that is not axisymmetric in the hoop direction.


A lightweight approach to performance portability with targetDP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016682071 ◽

2016 ◽

Vol 32 (2) ◽

pp. 288-301

Author(s):

Alan Gray ◽

Kevin Stratford

Keyword(s):

Particle Physics ◽

Message Passing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Message Passing Interface ◽

Graphics Processing Unit ◽

Processing Unit ◽

Performance Portability ◽

Graphics Processing

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform-agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus a separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.
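For readers unfamiliar with the Roofline model mentioned in this abstract, the following minimal sketch shows the bound it places on a kernel: attainable performance is the smaller of the device's peak compute rate and its memory bandwidth multiplied by the kernel's arithmetic intensity. The function name and the device numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gb_s, intensity_flop_per_byte):
    """Attainable performance (GFLOP/s) under the basic Roofline model.

    A kernel is memory-bound when bandwidth * intensity is below the
    compute peak, and compute-bound otherwise.
    """
    return min(peak_gflops, bandwidth_gb_s * intensity_flop_per_byte)


# Hypothetical device: 1000 GFLOP/s peak, 200 GB/s memory bandwidth.
PEAK = 1000.0
BANDWIDTH = 200.0

# Low arithmetic intensity (0.5 flop/byte): limited by memory bandwidth.
print(roofline_gflops(PEAK, BANDWIDTH, 0.5))   # 100.0

# High arithmetic intensity (20 flop/byte): limited by peak compute.
print(roofline_gflops(PEAK, BANDWIDTH, 20.0))  # 1000.0
```

Comparing a measured rate against this bound on each architecture is one way to judge whether a single code base is making comparably good use of different hardware.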


Finite element method completely implemented for graphic processor units using parallel algorithm libraries

The International Journal of High Performance Computing Applications ◽

10.1177/1094342017694703 ◽

2017 ◽

Vol 33 (1) ◽

pp. 53-66 ◽

Cited By ~ 1

Author(s):

Franz Pichler ◽

Gundolf Haase

Keyword(s):

Finite Element ◽

Graphics Processing Unit ◽

Computational Cost ◽

Processing Unit ◽

Time Step ◽

Device Architecture ◽

Transient Problems ◽

Speed Up ◽

Automotive Batteries ◽

Graphics Processing

A finite element code is developed in which all of the computationally expensive steps are performed on a graphics processing unit via the THRUST and the PARALUTION libraries. The code focuses on the simulation of transient problems where the repeated computations per time-step create the computational cost. It is used to solve partial and ordinary differential equations as they arise in thermal-runaway simulations of automotive batteries. The speed-up obtained by utilizing the graphics processing unit for every critical step is compared against the single-core and multi-threading solutions, which are also supported by the chosen libraries. In this way, a high total speed-up on the graphics processing unit is achieved without the need to program a single classical Compute Unified Device Architecture kernel.


Fast Three-Dimensional Finite Element Analysis of Weld Overlay Application on a Formed Feeder Elbow

ASME 2011 Pressure Vessels and Piping Conference: Volume 6, Parts A and B ◽

10.1115/pvp2011-57368 ◽

2011 ◽

Author(s):

Francis H. Ku ◽

Pete C. Riccardella

Keyword(s):

Residual Stress ◽

Finite Element Analysis ◽

Finite Element ◽

Residual Stresses ◽

Nuclear Reactor ◽

Three Dimensional ◽

Fe Method ◽

Element Analysis ◽

Weld Overlay ◽

Welding Processes

This paper presents a fast finite element analysis (FEA) model to efficiently predict the residual stresses in a feeder elbow in a CANDU nuclear reactor coolant system throughout the various stages of the manufacturing and welding processes, including elbow forming, Grayloc hub weld, and weld overlay application. The finite element (FE) method employs an optimized FEA procedure along with three-dimensional (3-D) elastic-plastic technology and large deformation capability to predict the residual stresses due to the feeder forming and various welding processes. The results demonstrate that the fast FEA method captures the residual stress trends with acceptable accuracy and, hence, provides an efficient and practical tool for performing complicated parametric 3-D weld residual stress studies.


HPC simulations of brownout: A noninteracting particles dynamic model

The International Journal of High Performance Computing Applications ◽

10.1177/1094342020905971 ◽

2020 ◽

Vol 34 (3) ◽

pp. 267-281

Author(s):

Roberto Porcù ◽

Edie Miglio ◽

Nicola Parolini ◽

Mattia Penati ◽

Noemi Vergopolan

Keyword(s):

Message Passing ◽

High Performance ◽

Message Passing Interface ◽

Graphics Processing Unit ◽

Time Integration ◽

Computational Cost ◽

Aircraft Design ◽

Euler Method ◽

Processing Unit ◽

Integration Algorithm

Helicopters can experience brownout when flying close to a dusty surface. The uplifting of dust in the air can severely restrict the pilot’s field of view. Consequently, a brownout can disorient the pilot and lead to a collision of the helicopter with the ground. Given its risks, brownout has become a high-priority problem for civil and military operations. Proper helicopter design is thus critical, as it has a strong influence over the shape and density of the cloud of dust that forms when brownout occurs. A way forward to improve aircraft design against brownout is the use of particle simulations. For simulations to be accurate and comparable to the real phenomenon, billions of particles are required. However, with such a large number of particles, serial simulations can be too slow and computationally expensive to perform. In this work, we investigate a message passing interface (MPI) + multiple graphics processing unit (multi-GPU) approach to simulate brownout. Specifically, we use a semi-implicit Euler method to treat the particle dynamics in a Lagrangian way, and we adopt a precomputed aerodynamic field. Here, we do not include particle–particle collisions in the model; this allows for independent trajectories and effective model parallelization. To support our methodology, we provide a speedup analysis of the parallelization with respect to the serial and pure-MPI simulations. The results show (i) very high speedups of the MPI + multi-GPU implementation with respect to the serial and pure-MPI ones, (ii) excellent weak and strong scalability properties of the implemented time-integration algorithm, and (iii) the possibility to run realistic simulations of brownout with billions of particles at a relatively small computational cost. This work paves the way toward more realistic brownout simulations, and it highlights the potential of high-performance computing for aiding and advancing aircraft design for brownout mitigation.
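The time-integration idea described in this abstract can be sketched as follows. This is a minimal illustration of one common semi-implicit Euler variant (velocity is updated first, then position is advanced with the updated velocity) applied to noninteracting particles. The linear drag law, the relaxation time `tau`, and the uniform-updraft `air_velocity` field are assumptions made for this sketch and are not taken from the paper, whose force model and field lookup are not given in the abstract.

```python
def air_velocity(pos):
    """Hypothetical stand-in for a precomputed aerodynamic field lookup.

    Returns a uniform 1 m/s updraft regardless of position.
    """
    return (0.0, 0.0, 1.0)


def step(pos, vel, dt, tau=0.1, g=(0.0, 0.0, -9.81)):
    """One semi-implicit Euler step for a single particle.

    Assumed model: linear drag toward the local air velocity plus gravity.
    The velocity is advanced from the current state, and the position is
    then advanced with the *updated* velocity.
    """
    u = air_velocity(pos)
    new_vel = tuple(v + dt * ((ui - v) / tau + gi)
                    for v, ui, gi in zip(vel, u, g))
    new_pos = tuple(x + dt * v for x, v in zip(pos, new_vel))
    return new_pos, new_vel


def advance(particles, dt, n_steps):
    """Advance all particles; each update reads only that particle's state."""
    for _ in range(n_steps):
        particles = [step(p, v, dt) for p, v in particles]
    return particles


# Two particles starting at rest at different heights.
cloud = [((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
         ((1.0, 0.0, 2.0), (0.0, 0.0, 0.0))]
cloud = advance(cloud, dt=0.01, n_steps=500)
```

Because `step` never reads another particle's state, the list comprehension in `advance` could be replaced by one GPU thread per particle or split across MPI ranks without any communication between particles, which is the property the abstract attributes to omitting particle–particle collisions.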


Process modeling, joint-property characterization and construction of joint connectors for mechanical fastening by self-piercing riveting

Multidiscipline Modeling in Materials and Structures ◽

10.1108/mmms-04-2014-0024 ◽

2014 ◽

Vol 10 (4) ◽

pp. 631-658 ◽

Cited By ~ 14

Author(s):

Mica Grujicic ◽

Jennifer Snipes ◽

S. Ramaswami ◽

Fadi Abu-Farha

Keyword(s):

Mechanical Properties ◽

Finite Element ◽

Large Scale ◽

Constitutive Relations ◽

Computational Cost ◽

Three Dimensional ◽

Modeling And Simulations ◽

Content Type ◽

Material Parameters ◽

Computational Analyses

Purpose – The purpose of this paper is to propose a computational approach in order to help establish the effect of various self-piercing rivet (SPR) process and material parameters on the quality and the mechanical performance of the resulting SPR joints. Design/methodology/approach – Toward that end, a sequence of three distinct computational analyses is developed. These analyses include: (a) finite-element modeling and simulations of the SPR process; (b) determination of the mechanical properties of the resulting SPR joints through the use of three-dimensional, continuum finite-element-based numerical simulations of various mechanical tests performed on the SPR joints; and (c) determination, parameterization and validation of the constitutive relations for the simplified SPR connectors, using the results obtained in (b) and the available experimental results. The availability of such connectors is mandatory in large-scale computational analyses of whole-vehicle crash or even in simulations of vehicle component manufacturing, e.g. car-body electro-coat paint-baking process. In such simulations, explicit three-dimensional representation of all SPR joints is associated with a prohibitive computational cost. Findings – It is found that the approach developed in the present work can be used, within an engineering optimization procedure, to adjust the SPR process and material parameters (design variables) in order to obtain a desired combination of the SPR-joint mechanical properties (objective function). Originality/value – To the authors’ knowledge, the present work is the first public-domain report of the comprehensive modeling and simulations including: self-piercing process; virtual mechanical testing of the SPR joints; and derivation of the constitutive relations for the SPR connector elements.


Splotch

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016652713 ◽

2016 ◽

Vol 31 (6) ◽

pp. 550-563

Author(s):

Timothy Dykes ◽

Claudio Gheller ◽

Marzia Rivi ◽

Mel Krokos

Keyword(s):

High Performance ◽

Large Scale ◽

Graphics Processing Unit ◽

Processing Unit ◽

Xeon Phi ◽

The Many ◽

Many Core ◽

Performance Results ◽

Graphics Processing ◽

Performance Computing

With the increasing size and complexity of data produced by large-scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogeneous high-performance computing environments for increased throughput and efficiency. We focus on the porting and optimization of Splotch, a scalable visualization algorithm, to utilize the Xeon Phi, Intel’s coprocessor based upon the new many integrated core architecture. We discuss steps taken to offload data to the coprocessor and algorithmic modifications to aid faster processing on the many-core architecture and make use of the uniquely wide vector capabilities of the device, with accompanying performance results using multiple Xeon Phi devices. Finally, we compare performance against results achieved with the Graphics Processing Unit (GPU) based implementation of Splotch.


Granular layEr Simulator: Design and Multi-GPU Simulation of the Cerebellar Granular Layer

Frontiers in Computational Neuroscience ◽

10.3389/fncom.2021.630795 ◽

2021 ◽

Vol 15 ◽

Author(s):

Giordana Florimbi ◽

Emanuele Torti ◽

Stefano Masoli ◽

Egidio D'Angelo ◽

Francesco Leporati

Keyword(s):

High Performance ◽

Large Scale ◽

Granular Layer ◽

Graphics Processing Unit ◽

Mossy Fibers ◽

Processing Unit ◽

Large Network ◽

Processing Times ◽

3D Space ◽

High Level

In modern computational modeling, neuroscientists need to reproduce long-lasting activity of large-scale networks, where neurons are described by highly complex mathematical models. These aspects strongly increase the computational load of the simulations, which can be efficiently performed by exploiting parallel systems to reduce the processing times. Graphics Processing Unit (GPU) devices meet this need by providing High Performance Computing on the desktop. In this work, the authors describe the development of a novel Granular layEr Simulator implemented on a multi-GPU system capable of reconstructing the cerebellar granular layer in a 3D space and reproducing its neuronal activity. The reconstruction is characterized by a high level of novelty and realism, considering axonal/dendritic field geometries oriented in the 3D space and following convergence/divergence rates provided in the literature. Neurons are modeled using Hodgkin and Huxley representations. The network is validated by reproducing typical behaviors that are well documented in the literature, such as the center-surround organization. The reconstruction of a network whose volume is 600 × 150 × 1,200 μm³ with 432,000 granules, 972 Golgi cells, 32,399 glomeruli, and 4,051 mossy fibers takes 235 s on an Intel i9 processor. The 10 s activity reproduction takes only 4.34 and 3.37 h on a single-GPU and a multi-GPU desktop system (with one or two NVIDIA RTX 2080 GPUs, respectively). Moreover, the code takes only 3.52 and 2.44 h if run on one or two NVIDIA V100 GPUs, respectively. The relevant speedups reached (up to ~38× in the single-GPU version and ~55× in the multi-GPU version) clearly demonstrate that GPU technology is highly suitable for realistic large network simulations.


Efficient Parallel Algorithms for 3D Laplacian Smoothing on the GPU

Applied Sciences ◽

10.3390/app9245437 ◽

2019 ◽

Vol 9 (24) ◽

pp. 5437

Author(s):

Lei Xiao ◽

Guoxiang Yang ◽

Kunyang Zhao ◽

Gang Mei

Keyword(s):

Large Scale ◽

Graphics Processing Unit ◽

Parallel Implementation ◽

Three Dimensional ◽

Smoothing Method ◽

Three Dimensions ◽

Processing Unit ◽

Mesh Quality ◽

Mesh Smoothing ◽

Laplacian Smoothing

In numerical modeling, mesh quality is one of the decisive factors that strongly affect the accuracy of calculations and the convergence of iterations. To improve mesh quality, the Laplacian mesh smoothing method, which repositions nodes to the barycenter of adjacent nodes without changing the mesh topology, has been widely used. However, smoothing a large-scale three-dimensional mesh is computationally expensive, and few studies have focused on accelerating the Laplacian mesh smoothing method by utilizing the graphics processing unit (GPU). This paper presents a GPU-accelerated parallel algorithm for Laplacian smoothing in three dimensions by considering the influence of different data layouts and iteration forms. To evaluate the efficiency of the GPU implementation, the parallel solution is compared with the original serial solution. Experimental results show that our parallel implementation is up to 46 times faster than the serial version.
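The smoothing rule described in this abstract (move each node to the barycenter of its adjacent nodes, leaving connectivity unchanged) can be illustrated with a short serial sketch. This is not the paper's GPU algorithm: it assumes a Jacobi-style iteration in which every new position is computed from the previous iterate, which is only one possible iteration form, and the function name, the adjacency-list layout, and the `fixed` set of boundary nodes are choices made for this example.

```python
def laplacian_smooth(coords, neighbors, fixed, iterations=1):
    """Jacobi-style Laplacian smoothing of a 3D node set.

    coords    : list of (x, y, z) tuples, one per node
    neighbors : list of lists; neighbors[i] holds the indices adjacent to node i
    fixed     : set of node indices that must not move (e.g. boundary nodes)
    """
    for _ in range(iterations):
        new_coords = list(coords)
        for i, adj in enumerate(neighbors):
            if i in fixed or not adj:
                continue
            n = len(adj)
            # Barycenter of the adjacent nodes, read from the previous iterate.
            new_coords[i] = tuple(sum(coords[j][k] for j in adj) / n
                                  for k in range(3))
        coords = new_coords
    return coords


# Four fixed corners of a square and one badly placed interior node.
nodes = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0),
         (0.3, 1.7, 0.0)]
adjacency = [[1, 3, 4], [0, 2, 4], [1, 3, 4], [0, 2, 4], [0, 1, 2, 3]]
smoothed = laplacian_smooth(nodes, adjacency, fixed={0, 1, 2, 3})
print(smoothed[4])  # (1.0, 1.0, 0.0)
```

In the Jacobi form each node's update depends only on the previous iterate, so all nodes can be processed independently within one sweep, which is what makes this form a natural candidate for one-thread-per-node GPU execution.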


Large-scale sound field rendering with graphics processing unit cluster for three-dimensional audio with loudspeaker array

10.1121/1.4798996 ◽

2013 ◽

Author(s):

Takao Tsuchiya ◽

Yukio Iwaya ◽

Makoto Otani

Keyword(s):

Large Scale ◽

Graphics Processing Unit ◽

Three Dimensional ◽

Sound Field ◽

Processing Unit ◽

Graphics Processing


Finite Element Modeling of GMAW Process: Evolution and Formation of Residual Stresses Upon Cooling

Heat Transfer, Volume 2 ◽

10.1115/imece2004-59241 ◽

2004 ◽

Cited By ~ 1

Author(s):

Mohammad S. Davoud ◽

Xiaomin Deng

Keyword(s):

Finite Element ◽

Residual Stresses ◽

Arc Welding ◽

Three Dimensional ◽

Residual Stress Field ◽

Full Field ◽

Elastic And Plastic Deformations ◽

2D And 3D ◽

Welding Processes ◽

Field View

Fusion arc welding processes often generate substantial residual stresses, which may alter the performance of welded structures. Residual stresses are the result of incompatible elastic and plastic deformations in a body. Destructive techniques are generally used to determine residual stresses experimentally, but using these methods is often not possible or practical in industry. In this study, three-dimensional (3D) and two-dimensional (2D) finite element simulations and experimental work have been performed to analyze the thermomechanical problem of GMAW and to obtain a full-field view of the residual stress field. One of the purposes of this study is to examine the formation of residual stresses upon cooling of a weldment. Comparisons of the results of 2D and 3D finite element models reveal many three-dimensional features in the thermomechanical problem of GMAW. The magnitude of longitudinal residual stresses obtained from the 2D model, however, compares well with the results obtained from the 3D model.
