Acceleration Techniques for System-Hyper-Pipelined Soft-Processors on FPGAs

Author(s):  
Tobias Strauch

2021 ◽  
Vol 13 (3) ◽  
pp. 434
Author(s):  
Ana del Águila ◽  
Dmitry S. Efremenko

Fast radiative transfer models (RTMs) are required to process the large volumes of satellite-based atmospheric composition data. Specifically designed acceleration techniques can be incorporated into RTMs to simulate the reflected radiances at a fine spectral resolution while avoiding time-consuming computations on the fine-resolution grid. In particular, in the cluster low-streams regression (CLSR) method, the computations on the fine-resolution grid are performed with the fast two-stream RTM, and the spectra are then corrected by using regression models between the two-stream and multi-stream RTMs. The performance enhancement due to this scheme can be about two orders of magnitude. In this paper, we consider a modification of the CLSR method (referred to as the double CLSR method), in which the single-scattering approximation is used for the computations on the fine-resolution grid, while the two-stream spectra are computed by using the regression model between the two-stream RTM and the single-scattering approximation. Once the two-stream spectra are known, the CLSR method is applied a second time to restore the multi-stream spectra. Through a numerical analysis, it is shown that the double CLSR method yields an acceleration factor of about three orders of magnitude as compared to the reference multi-stream fine-resolution computations, with an error below 0.05%. In addition, it is analysed how the CLSR method can be adapted for efficient computations in atmospheric scenarios containing aerosols; in particular, it is discussed how precomputed clear-sky data can be reused for computing the aerosol spectra in the framework of the CLSR method. The simulations are performed for the Hartley–Huggins, O2 A-, water vapour and CO2 weak absorption bands and for five aerosol models from the optical properties of aerosols and clouds (OPAC) database.
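The core regression step of the CLSR family is simple enough to sketch. The Python/NumPy fragment below illustrates the idea under stated assumptions: spectral points have already been clustered, the fast model (two-stream, or single-scattering in the double CLSR variant) has been evaluated on the full fine grid, and the accurate multi-stream model only on a few training wavenumbers per cluster (at least two per cluster). All array names are hypothetical, not taken from the paper's code.

```python
import numpy as np

def clsr_correct(r_fast_fine, r_fast_train, r_acc_train, labels_fine, labels_train):
    """CLSR sketch: per cluster, fit a linear map from fast-model to
    accurate-model radiances on a few training wavenumbers, then apply
    that map to the fast-model spectrum on the full fine grid."""
    r_acc_fine = np.empty_like(r_fast_fine)
    for c in np.unique(labels_fine):
        # regression coefficients from the sparse training points of cluster c
        x = r_fast_train[labels_train == c]
        y = r_acc_train[labels_train == c]
        slope, intercept = np.polyfit(x, y, 1)
        mask = labels_fine == c
        r_acc_fine[mask] = slope * r_fast_fine[mask] + intercept
    return r_acc_fine
```

In the double CLSR variant this correction would be applied twice: once from the single-scattering approximation to the two-stream spectra, and once more from the restored two-stream spectra to the multi-stream spectra.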



2017 ◽  
Vol 2017 ◽  
pp. 1-10
Author(s):  
Hsuan-Ming Huang ◽  
Ing-Tsung Hsiao

Background and Objective. Over the past decade, image quality in low-dose computed tomography (CT) has been greatly improved by various compressive sensing (CS)-based reconstruction methods. However, these methods have some disadvantages, including high computational cost and slow convergence. Many speed-up techniques for CS-based reconstruction algorithms have been developed. The purpose of this paper is to propose a fast reconstruction framework that combines a CS-based reconstruction algorithm with several speed-up techniques. Methods. First, total difference minimization (TDM) was implemented using soft-threshold filtering (STF). Second, we combined TDM-STF with the ordered subsets transmission (OSTR) algorithm to accelerate convergence. To further speed up the convergence of the proposed method, we applied the power factor and the fast iterative shrinkage thresholding algorithm (FISTA) to OSTR and TDM-STF, respectively. Results. Results from simulation and phantom studies showed that many speed-up techniques can be combined to greatly improve the convergence speed of a CS-based reconstruction algorithm. More importantly, the added computation time (≤10%) was minor compared with the acceleration provided by the proposed method. Conclusions. We have presented a CS-based reconstruction framework that combines several acceleration techniques. Both simulation and phantom studies provide evidence that the proposed method has the potential to satisfy the requirements of fast image reconstruction in practical CT.
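Two of the generic building blocks named here, soft-threshold filtering and FISTA-style extrapolation, can be sketched compactly. The Python fragment below is a minimal illustration of those two primitives, not the authors' TDM-STF/OSTR implementation; step sizes, the ordered-subsets loop, and the power factor are omitted.

```python
import numpy as np

def soft_threshold(x, tau):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm,
    # used to filter the total-difference coefficients.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista_step(x, x_prev, t):
    # FISTA momentum: extrapolate from the two most recent iterates to
    # accelerate a first-order method's convergence.
    t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
    y = x + ((t - 1.0) / t_next) * (x - x_prev)
    return y, t_next
```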



2021 ◽  
Vol 11 (04) ◽  
pp. 1-11
Author(s):  
Wanwan Li

In mechanical engineering education, simulating fluid thermodynamics helps students understand a fluid's natural behavior. However, rendering high-quality fluid dynamics simulations in real time is a challenging task because of the intensive computations involved. To speed up the simulations, we take advantage of GPU acceleration techniques to simulate interactive fluid thermodynamics in real time. In this paper, we present an elegant, basic, but practical OpenGL/SL framework for fluid simulation with heat-map rendering. By solving the Navier-Stokes equations coupled with the heat diffusion equation, we validate our framework through real-case studies of smoke-like fluid rendering, such as interactions with moving obstacles and heat diffusion effects. As shown in Fig. 1, a group of experimental results demonstrates that our GPU-accelerated solver of the Navier-Stokes equations with heat transfer gives observers impressive real-time and realistic rendering results.
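The heat diffusion component can be sketched independently of the GPU pipeline. The NumPy fragment below shows the Jacobi relaxation commonly used for implicit diffusion in such solvers (a GPU framework like this one would run the equivalent pass in a shader); grid spacing, diffusivity, and iteration count are illustrative assumptions.

```python
import numpy as np

def diffuse_heat(T, kappa, dt, dx, iters=20):
    """Jacobi relaxation for implicit heat diffusion on a 2D grid:
    solves (I - alpha * Laplacian) T_new = T, with boundary values
    held fixed (Dirichlet)."""
    alpha = kappa * dt / (dx * dx)
    T_new = T.copy()
    for _ in range(iters):
        T_next = T_new.copy()
        # each interior cell relaxes toward the average of its neighbors
        T_next[1:-1, 1:-1] = (T[1:-1, 1:-1] + alpha * (
            T_new[2:, 1:-1] + T_new[:-2, 1:-1] +
            T_new[1:-1, 2:] + T_new[1:-1, :-2])) / (1.0 + 4.0 * alpha)
        T_new = T_next
    return T_new
```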



Author(s):  
Luca Accorsi ◽  
Daniele Vigo

In this paper, we propose a fast, scalable, yet effective metaheuristic called FILO to solve large-scale instances of the Capacitated Vehicle Routing Problem. Our approach consists of a main iterative part, based on the Iterated Local Search paradigm, which employs a carefully designed combination of existing acceleration techniques as well as novel strategies to keep the optimization localized, controlled, and tailored to the current instance and solution. A Simulated Annealing-based neighbor acceptance criterion provides continuous diversification, ensuring the exploration of different regions of the search space. Results on extensively studied benchmark instances from the literature, supported by a thorough analysis of the algorithm's main components, show the effectiveness of the proposed design choices, making FILO highly competitive with existing state-of-the-art algorithms in terms of both computing time and solution quality. Finally, guidelines for efficient implementation, the algorithm's source code, and a library of reusable components are open-sourced to allow reproduction of our results and promote further investigation.
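The Simulated Annealing-based acceptance criterion mentioned above is a standard metaheuristic ingredient that can be sketched briefly. The following Python skeleton is an illustrative reading of an ILS loop with SA acceptance, not FILO's actual implementation; `perturb`, `local_search`, and `cost` are hypothetical callables, and the geometric cooling schedule is an assumption.

```python
import math
import random

def sa_accept(cost_new, cost_cur, temperature):
    """SA acceptance: always accept improvements, accept worsening
    moves with probability exp(-delta / T)."""
    delta = cost_new - cost_cur
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / temperature)

def ils(solution, perturb, local_search, cost, t0=1.0, cooling=0.999, iters=10000):
    """Iterated Local Search skeleton with SA-based acceptance."""
    best = cur = solution
    t = t0
    for _ in range(iters):
        cand = local_search(perturb(cur))
        if sa_accept(cost(cand), cost(cur), t):
            cur = cand          # accepted as the new incumbent
        if cost(cur) < cost(best):
            best = cur          # track the best solution seen so far
        t *= cooling            # geometric cooling keeps diversification decaying
    return best
```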



Author(s):  
Mustafa C. Camur ◽  
Thomas Sharkey ◽  
Chrysafis Vogiatzis

We consider the problem of identifying the induced star with the largest cardinality open neighborhood in a graph. This problem, also known as the star degree centrality (SDC) problem, is shown to be NP-complete. In this work, we first propose a new integer programming (IP) formulation, which has fewer constraints and fewer nonzero coefficients than the existing formulation in the literature. We present classes of networks in which the problem is solvable in polynomial time and offer a new proof of NP-completeness that shows the problem remains NP-complete for both bipartite and split graphs. In addition, we propose a decomposition framework that is suitable for both the existing and our formulations. We implement several acceleration techniques in this framework, motivated by techniques used in Benders decomposition. We test our approaches on networks generated by the Barabási–Albert, Erdős–Rényi, and Watts–Strogatz models. Our decomposition approach outperforms solving the IP formulations in most instances in terms of both solution time and quality; this is especially true for larger and denser graphs. We then test the decomposition algorithm on large-scale protein–protein interaction networks, for which SDC has been shown to be an important centrality metric. Summary of Contribution: In this study, we first introduce a new integer programming (NIP) formulation for the star degree centrality (SDC) problem, in which the goal is to identify the induced star with the largest open neighborhood. We then show that, although SDC can be solved efficiently on tree graphs, it remains NP-complete on both split and bipartite graphs, via a reduction from the set cover problem. In addition, we apply a decomposition algorithm motivated by Benders decomposition, together with several acceleration techniques, to both the NIP formulation and the existing formulation in the literature. Our experimental results indicate that the decomposition implementation on the NIP is the best solution method in terms of both solution time and quality.
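For readers unfamiliar with the objective, the quantity being maximized is easy to state programmatically. The networkx-based sketch below evaluates the open-neighborhood size of one given induced star; it is a brute-force check of the objective for a fixed star, not the paper's IP or decomposition approach.

```python
from itertools import combinations
import networkx as nx

def star_open_neighborhood(G, center, leaves):
    """|N(S)| for the star S = {center} ∪ leaves: the vertices outside S
    adjacent to at least one star vertex. SDC maximizes this quantity
    over all induced stars (leaves pairwise non-adjacent)."""
    star = {center} | set(leaves)
    # induced-star sanity checks
    assert all(G.has_edge(center, u) for u in leaves)
    assert all(not G.has_edge(u, v) for u, v in combinations(leaves, 2))
    neighborhood = set()
    for node in star:
        neighborhood.update(G.neighbors(node))
    return len(neighborhood - star)

G = nx.star_graph(4)   # center 0 with leaves 1..4
G.add_edge(4, 5)       # one vertex outside the star's closed neighborhood
print(star_open_neighborhood(G, 0, [1, 2]))  # -> 2 (vertices 3 and 4)
```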



2009 ◽  
Vol 8 (3) ◽  
pp. 57-62 ◽  
Author(s):  
Mohd Shahrizal Sunar ◽  
Mohamed Adi Bin Mohamed Azahar ◽  
Mohd Khalid Mokhtar ◽  
Daut Daman

Previously, crowd simulation has played only a small part in virtual heritage, or has been ignored altogether; reconstruction efforts have focused on bringing architecture and artifacts into the virtual environment. Inserting crowds into a virtual heritage simulation gives the reconstructed site more impact and a higher level of realism. Before crowds can be inserted into the virtual environment, however, research is needed on managing the complex virtual heritage environment and the crowd itself. This paper presents a framework aimed at reducing the computational cost of rendering crowd simulations in virtual heritage environments while maintaining the realism of the scene. We first review the existing acceleration techniques applied to crowd rendering. We then introduce a framework that integrates these acceleration techniques for crowd simulation in virtual heritage.



2018 ◽  
Vol 15 (05) ◽  
pp. 1850036 ◽  
Author(s):  
Yasunori Yusa ◽  
Hiroshi Okada ◽  
Yosuke Yumoto

Improvements to the coupling-matrix-free iterative s-version finite element method (FEM) are proposed to shorten its computational time, and the improved method is applied to three-dimensional stress concentration problems. To make the computational time sufficiently small for practical use, two key techniques are introduced. First, the iteration is accelerated drastically by the proposed convergence acceleration techniques. Second, stress transfers between the global and local meshes are accelerated considerably by a bucket search algorithm. The proposed method was more than one hundred times faster than the straightforward algorithm of the coupling-matrix-free iterative s-version FEM.
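A bucket (uniform-grid) search of the kind used here for global-local stress transfer can be sketched in a few lines. The Python fragment below is a generic illustration assuming 3D point data and a hypothetical cell size; the actual method would pair local-mesh integration points with global-mesh elements, which adds an exact containment test after the bucket lookup.

```python
import numpy as np

def build_buckets(points, cell):
    """Hash each point into a uniform-grid bucket of edge length `cell`."""
    buckets = {}
    for i, p in enumerate(points):
        key = tuple((p // cell).astype(int))
        buckets.setdefault(key, []).append(i)
    return buckets

def nearby(point, buckets, cell):
    """Candidate point indices from the 3x3x3 block of buckets around
    `point`; only these few candidates need an exact test, instead of
    scanning all points."""
    base = (point // cell).astype(int)
    cand = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                cand.extend(buckets.get((base[0] + dx, base[1] + dy, base[2] + dz), []))
    return cand
```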



2021 ◽  
Author(s):  
Megan Karalus ◽  
Piyush Thakre ◽  
Graham Goldin ◽  
Dustin Brandt

A Honeywell liquid-fueled gas turbine test combustor at idle conditions is numerically investigated in Simcenter STAR-CCM+ version 2020.3. This work presents Large Eddy Simulation (LES) results using both the Flamelet Generated Manifold (FGM) and detailed chemistry combustion models. Both take advantage of a hybrid chemical mechanism (HyChem), which has previously demonstrated very good accuracy for real fuels such as Jet-A with only 47 species. The objective of this work is to investigate the ability of FGM and detailed chemistry modeling to capture pollutant formation in an aero-engine combustor. Comparisons for NOx, CO, unburned hydrocarbons, and soot are made, along with the radial temperature profile. To fully capture potential emissions, a soot moment model and a Zeldovich NOx model are employed, along with radiation. A comparison of detailed chemistry results with and without chemistry acceleration techniques is included. Computational costs are then assessed by comparing the performance and scalability of the simulations with each combustion model. It is found that the detailed chemistry case with clustering can reproduce nearly identical results to detailed chemistry without any acceleration if CO is added as a clustering variable. With the Lagrangian model settings chosen for this study, the detailed chemistry results compared more favorably with the experimental data than FGM; however, there is uncertainty in the secondary breakup parameters. Sensitivity of the results to a key parameter in the spray breakup model is provided for both FGM and Complex Chemistry (CC). By varying this breakup rate, the FGM case can predict CO, NOx, and unburned hydrocarbons equally well. The smoke number, however, is predicted most accurately by CC. The cost of running detailed chemistry with clustering is found to be about four times that of FGM for this combustor and chemical mechanism.
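Chemistry clustering of the kind referenced here amortizes one stiff ODE integration over many cells with similar thermochemical states. The Python sketch below is a toy binning-based grouping under stated assumptions: STAR-CCM+'s actual clustering algorithm is not described here, and the feature set (for example temperature, equivalence ratio, and, as this study finds necessary, CO mass fraction) is the point of interest rather than the binning itself.

```python
import numpy as np

def cluster_cells(features, n_bins):
    """Group cells whose clustering variables fall into the same
    multi-dimensional bin; the chemistry solver then integrates one
    representative state per cluster and maps the result back to all
    member cells, instead of integrating every cell."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    ids = np.floor((features - lo) / (hi - lo + 1e-12) * n_bins).astype(int)
    clusters = {}
    for cell, key in enumerate(map(tuple, ids)):
        clusters.setdefault(key, []).append(cell)
    return clusters
```

Adding CO as a column of `features`, as the study does, forces cells with different CO levels into different clusters, which is what preserves the CO prediction under clustering.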



Author(s):  
Mohamed Aymen Ben HajKacem ◽  
Chiheb-Eddine Ben N′Cir ◽  
Nadia Essoussi

Big Data clustering has become an important challenge in data analysis, since many applications require scalable clustering methods to organize such data into groups of similar objects. Given the computational cost of most existing clustering methods, we propose in this paper a new clustering method, referred to as STiMR k-means, that provides a good tradeoff between scalability and clustering quality. The proposed method is based on the combination of three acceleration techniques: sampling, the triangle inequality and MapReduce. Sampling is used to reduce the number of data points when building cluster prototypes, the triangle inequality is used to reduce the number of comparisons when looking for the nearest clusters, and MapReduce is used to configure a parallel framework for running the proposed method. Experiments performed on simulated and real datasets have shown the effectiveness of the proposed method, compared with existing ones, in terms of running time, scalability and internal validity measures.
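The triangle-inequality acceleration follows the classic Elkan-style bound: if d(c_best, c_j) ≥ 2·d(x, c_best), then c_j cannot be closer to x than the current best center, so that distance never needs computing. The NumPy sketch below illustrates the pruning in a plain assignment step, leaving out the method's sampling and MapReduce layers.

```python
import numpy as np

def assign_with_triangle_inequality(X, centers):
    """k-means assignment pruned by the triangle inequality:
    d(x, c_j) >= d(c_best, c_j) - d(x, c_best), so whenever
    d(c_best, c_j) >= 2 * d(x, c_best) the center c_j is skipped."""
    cc = np.linalg.norm(centers[:, None] - centers[None], axis=-1)  # center-center distances
    labels = np.zeros(len(X), dtype=int)
    for i, x in enumerate(X):
        best, d_best = 0, np.linalg.norm(x - centers[0])
        for j in range(1, len(centers)):
            if cc[best, j] >= 2.0 * d_best:
                continue  # pruned without computing d(x, c_j)
            d = np.linalg.norm(x - centers[j])
            if d < d_best:
                best, d_best = j, d
        labels[i] = best
    return labels
```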



Author(s):  
Justine Cris Borromeo ◽  
Koteswararao Kondepu ◽  
Nicola Andriolli ◽  
Luca Valcarenghi

