An Implicit Harmonic Balance Method in Graphics Processing Units for Oscillating Blades

2015 ◽  
Vol 138 (3) ◽  
Author(s):  
Javier Crespo ◽  
Roque Corral ◽  
Jesus Pueblas

An implicit harmonic balance (HB) method for modeling the unsteady nonlinear periodic flow about vibrating airfoils in turbomachinery is presented. An implicit edge-based three-dimensional Reynolds-averaged Navier–Stokes (RANS) solver for unstructured grids, which runs on both central processing units (CPUs) and graphics processing units (GPUs), is used. The HB method performs a spectral discretization of the time derivatives and marches in pseudotime a new system of equations in which the unknowns are the variables at different time samples. The application of the method to vibrating airfoils is discussed. It is shown that a time-spectral scheme may achieve the same temporal accuracy at a much lower computational cost than a backward finite-difference method, at the expense of using more memory. The performance of the implicit solver has been assessed with several application examples. A speed-up factor of 10 is obtained between the spectral and finite-difference versions of the code, and an additional speed-up factor of 10 is obtained when the code is ported to GPUs, for a total speed-up factor of 100. The performance of the solver on GPUs has been assessed using the tenth standard aeroelastic configuration and a transonic compressor.
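The spectral discretization of the time derivative mentioned in the abstract can be illustrated with a minimal sketch. It assumes an odd number of equally spaced time samples over one period and uses the standard Fourier differentiation matrix; the function names `time_spectral_matrix` and `apply_matrix` are hypothetical, and this is not the authors' solver.

```python
import math


def time_spectral_matrix(n_samples, period):
    """Fourier differentiation matrix for an odd number of equally spaced samples.

    Entry (i, j) weights sample j in the time derivative evaluated at sample i.
    Assumption of this sketch: n_samples = 2K + 1, which resolves K harmonics.
    """
    if n_samples % 2 == 0:
        raise ValueError("this sketch assumes an odd number of time samples")
    omega = 2.0 * math.pi / period
    d = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            if i != j:
                k = i - j
                sign = -1.0 if k % 2 else 1.0
                d[i][j] = 0.5 * omega * sign / math.sin(math.pi * k / n_samples)
    return d


def apply_matrix(d, u):
    """Return the time derivative at every sample: (D u)_i = sum_j D_ij u_j."""
    return [sum(d_ij * u_j for d_ij, u_j in zip(row, u)) for row in d]
```

Because every time sample is coupled to all the others through this matrix, all samples have to be held in memory at once, which is consistent with the memory penalty the abstract mentions.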

Author(s):  
Javier Crespo ◽  
Roque Corral ◽  
Jesus Pueblas

An implicit harmonic balance method for modeling the unsteady nonlinear periodic flow about vibrating airfoils in turbomachinery is presented. As a starting point, an implicit edge-based three-dimensional Reynolds-averaged Navier–Stokes solver for unstructured grids that runs on both central processing units (CPUs) and graphics processing units (GPUs) is used. The harmonic balance method performs a spectral discretization of the time derivatives and marches in pseudo-time a new system of equations in which the unknowns are the variables at different time samples. The application of the method to vibrating airfoils is discussed. It is shown that a time-spectral scheme may achieve the same temporal accuracy at a much lower computational cost than a backward finite-difference method, at the expense of using more memory. The performance of the implicit solver has been assessed with several application examples. A speed-up factor of 10 is obtained between the spectral and finite-difference versions of the code, whereas an additional speed-up factor of 10 is obtained when the code is ported to GPUs, for a total speed-up factor of 100. The performance of the solver on GPUs has been assessed using the 10th standard aeroelastic configuration and a transonic compressor.


Author(s):  
Liam Dunn ◽  
Patrick Clearwater ◽  
Andrew Melatos ◽  
Karl Wette

The F-statistic is a detection statistic used widely in searches for continuous gravitational waves with terrestrial, long-baseline interferometers. A new implementation of the F-statistic is presented which accelerates the existing "resampling" algorithm using graphics processing units (GPUs). The new implementation runs between 10 and 100 times faster than the existing implementation on central processing units without sacrificing numerical accuracy. The utility of the GPU implementation is demonstrated on a pilot narrowband search for four newly discovered millisecond pulsars in the globular cluster Omega Centauri using data from the second Laser Interferometer Gravitational-Wave Observatory observing run. The computational cost is 17.2 GPU-hours using the new implementation, compared to 1092 core-hours with the existing implementation.


2013 ◽  
Vol 135 (6) ◽  
Author(s):  
S. P. Vanka

This paper discusses the various issues of using graphics processing units (GPUs) for computing fluid flows. GPUs, used primarily for processing graphics functions in a computer, are massively parallel multicore processors, which can also perform scientific computations in a data-parallel mode. In the past ten years, GPUs have become quite powerful and have challenged the central processing units (CPUs) in their price and performance characteristics. However, in order to fully benefit from the GPUs' performance, the numerical algorithms must be made data parallel and converge rapidly. In addition, the hardware features of the GPUs require that memory access be managed carefully to avoid suffering from high latency. Fully explicit algorithms for the Euler and Navier–Stokes equations and the lattice Boltzmann method for mesoscopic flows have been widely implemented on GPUs, with significant speed-up over a scalar algorithm. However, more complex algorithms with implicit formulations and unstructured grids require innovative thinking in data access and management. This article reviews the literature on linear solvers and computational fluid dynamics (CFD) algorithms on GPUs, including the author's own research on simulations of fluid flows using GPUs.


Geophysics ◽  
2016 ◽  
Vol 81 (2) ◽  
pp. T35-T43 ◽  
Author(s):  
Jon Marius Venstad

The difference in computational power between the few- and multicore architectures represented by central processing units (CPUs) and graphics processing units (GPUs) is significant today, and this difference is likely to increase in the years ahead. GPUs are, therefore, ever more popular for applications in computational physics, such as wave modeling. Finite-difference methods are popular for wave modeling and are well suited for the GPU architecture, but developing an efficient and capable GPU implementation is hindered by the limited size of the GPU memory. I revealed how the out-of-core technique can be used to circumvent the memory limit on the GPU, increasing the available memory to that of the CPU (the main memory) instead, with no significant computational overhead. This approach has several advantages over a parallel scheme in terms of applicability, flexibility, and hardware requirements. Choices in the numerical scheme — the numerical differentiators in particular — also greatly affect computational efficiency. These factors are considered explicitly for GPU implementations of wave modeling because GPUs are special purpose with a visible architecture.
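The out-of-core idea described above can be sketched in a few lines: the full field stays in main memory, and only one block plus its halo cells is copied into a small buffer that stands in for GPU memory. This is a minimal illustration under stated assumptions (a 1D three-point stencil, a single pass over the data, hypothetical function names), not the scheme used in the paper.

```python
def laplacian_1d(u):
    """Second difference on interior points; the two boundary values stay zero."""
    out = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        out[i] = u[i - 1] - 2.0 * u[i] + u[i + 1]
    return out


def laplacian_out_of_core(u, chunk):
    """Same stencil, processed block by block.

    Only `chunk` values plus one halo cell per side are held in the
    (hypothetical) device buffer at any time.
    """
    n = len(u)
    out = [0.0] * n
    for start in range(1, n - 1, chunk):
        stop = min(start + chunk, n - 1)
        device_buffer = u[start - 1:stop + 1]  # block plus one halo cell per side
        block = laplacian_1d(device_buffer)    # compute on the small buffer only
        out[start:stop] = block[1:-1]          # copy interior results back
    return out
```

The halo width equals the stencil half-width; a wider numerical differentiator would need more halo cells per block, which is one way the choice of differentiator affects efficiency.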


Author(s):  
Roque Corral ◽  
Javier Crespo

A harmonic balance method for modeling unsteady nonlinear periodic flows in turbomachinery is presented. The method solves the Reynolds-averaged Navier–Stokes equations in the time domain and may be implemented in a relatively simple way in an existing code, including all the standard convergence acceleration techniques used for steady problems. The application of the method to vibrating airfoils and rotor-stator interaction is discussed. It is demonstrated that the time-spectral scheme may achieve the same temporal accuracy at a lower computational cost, at the expense of using more memory.


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1974 ◽  
Author(s):  
Yibin Huang ◽  
Congying Qiu ◽  
Xiaonan Wang ◽  
Shijun Wang ◽  
Kui Yuan

The advent of convolutional neural networks (CNNs) has accelerated the progress of computer vision in many respects. However, the majority of existing CNNs rely heavily on expensive GPUs (graphics processing units) to support large computations. Therefore, CNNs have not yet been widely used to inspect surface defects in manufacturing. In this paper, we develop a compact CNN-based model that not only achieves high performance on tiny defect inspection but can also be run on low-frequency CPUs (central processing units). Our model consists of a lightweight (LW) bottleneck and a decoder. Using a pyramid of lightweight kernels, the LW bottleneck provides rich features at a lower computational cost. The decoder is also built in a lightweight way and consists of an atrous spatial pyramid pooling (ASPP) module and depthwise separable convolution layers. These lightweight designs greatly reduce redundant weights and computation. We train our models on groups of surface datasets. The model can successfully classify or segment surface defects on an Intel i3-4010U CPU within 30 ms. Our model obtains accuracy similar to that of MobileNetV2 with less than 1/3 of its FLOPs (floating-point operations) and 1/8 of its weights. Our experiments indicate that CNNs can be compact and hardware-friendly for future applications in automated surface inspection (ASI).
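To see why depthwise separable convolutions reduce weights, the following sketch counts the weights of a standard convolution and of its depthwise separable counterpart. The layer sizes in the usage note are hypothetical and not taken from the paper, biases are ignored, and the function names are illustrative.

```python
def standard_conv_weights(kernel, c_in, c_out):
    """Weights of a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_weights(kernel, c_in, c_out):
    """Weights of a depthwise kernel x kernel convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For a hypothetical 3 × 3 layer with 64 input and 128 output channels, `standard_conv_weights(3, 64, 128)` gives 73728 weights, while `separable_conv_weights(3, 64, 128)` gives 8768, roughly an eightfold reduction for that single layer.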


2018 ◽  
Vol 32 (12n13) ◽  
pp. 1840021
Author(s):  
Ziwei Wang ◽  
Xiong Jiang ◽  
Ti Chen ◽  
Yan Hao ◽  
Min Qiu

Simulating the unsteady flow in a compressor under circumferential inlet distortion and rotor/stator interference requires a full-annulus grid with a dual time method. This process is time-consuming and needs a large amount of computational resources. The harmonic balance method simulates the unsteady flow in a compressor on a single-passage grid with a series of steady simulations, which greatly increases computational efficiency in comparison with the dual time method. However, most simulations with the harmonic balance method are conducted for flow under either circumferential inlet distortion or rotor/stator interference. Based on an in-house CFD code, the harmonic balance method is applied to the simulation of flow in NASA Stage 35 under both circumferential inlet distortion and rotor/stator interference. Because the unsteady flow is influenced by two different unsteady disturbances, computational instability arises. The instability can be avoided by coupling the harmonic balance method with an optimization algorithm. The result of the harmonic balance method is compared with the result of the full-annulus simulation. The comparison shows that the harmonic balance method simulates the flow under circumferential inlet distortion and rotor/stator interference as precisely as the full-annulus simulation, with a speed-up of about 8 times.


Author(s):  
Franz Pichler ◽  
Gundolf Haase

A finite element code is developed in which all of the computationally expensive steps are performed on a graphics processing unit via the THRUST and the PARALUTION libraries. The code focuses on the simulation of transient problems where the repeated computations per time-step create the computational cost. It is used to solve partial and ordinary differential equations as they arise in thermal-runaway simulations of automotive batteries. The speed-up obtained by utilizing the graphics processing unit for every critical step is compared against the single core and the multi-threading solutions which are also supported by the chosen libraries. This way a high total speed-up on the graphics processing unit is achieved without the need for programming a single classical Compute Unified Device Architecture kernel.


Author(s):  
Aaron F. Shinn ◽  
S. P. Vanka

A semi-implicit pressure based multigrid algorithm for solving the incompressible Navier-Stokes equations was implemented on a Graphics Processing Unit (GPU) using CUDA (Compute Unified Device Architecture). The multigrid method employed was the Full Approximation Scheme (FAS), which is used for solving nonlinear equations. This algorithm is applied to the 2D driven cavity problem and compared to the CPU version of the code (written in Fortran) to assess computational speed-up.


2019 ◽  
Author(s):  
Robert Haase ◽  
Loic A. Royer ◽  
Peter Steinbach ◽  
Deborah Schmidt ◽  
Alexandr Dibrov ◽  
...  

Graphics processing units (GPUs) allow image processing at unprecedented speed. We present CLIJ, a Fiji plugin enabling end-users with entry-level programming experience to benefit from GPU-accelerated image processing. Freely programmable workflows can speed up image processing in Fiji by a factor of 10 or more using high-end GPU hardware and on affordable mobile computers with built-in GPUs.

