Maximum entropy snapshot sampling for reduced basis modelling

Author(s):  
Marcus W.F.M. Bannenberg ◽  
Fotios Kasolis ◽  
Michael Günther ◽  
Markus Clemens

Purpose: The maximum entropy snapshot sampling (MESS) method aims to reduce the computational cost required for obtaining the reduced basis for the purpose of model reduction. Hence, it can significantly reduce the original system dimension whilst maintaining an adequate level of accuracy. The purpose of this paper is to show how these beneficial results are obtained.

Design/methodology/approach: The MESS method is used for reducing two nonlinear circuit models. The MESS directly reduces the number of snapshots by recursively identifying and selecting the snapshots that strictly increase an estimate of the correlation entropy of the considered systems. Reduced bases are then obtained with the orthogonal-triangular (QR) decomposition.

Findings: Two case studies have been used for validating the reduction performance of the MESS. These numerical experiments verify the performance of the advocated approach, in terms of computational costs and accuracy, relative to gappy proper orthogonal decomposition.

Originality/value: The novel MESS has been successfully used for reducing two nonlinear circuits: in particular, a diode chain model and a thermal-electric coupled system. In both cases, the MESS removed unnecessary data, and hence reduced the snapshot matrix, before calling the QR basis generation routine. As a result, the QR decomposition has been called on a reduced snapshot matrix, and the offline stage has been significantly scaled down in terms of central processing unit time.
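
For readers unfamiliar with the workflow described above, the following minimal Python sketch illustrates the greedy snapshot-selection idea followed by a QR basis: a snapshot is kept only if it strictly increases an entropy-like measure of the already selected set. The log-determinant surrogate used here is an illustrative assumption, not the correlation-entropy estimator of the paper.

import numpy as np

def mess_like_selection(snapshots, eps=1e-10):
    # Greedy thinning of the snapshot matrix: keep column j only if it
    # strictly increases an entropy-like measure of the selected set.
    # The log-det of the normalized Gram matrix below is an illustrative
    # surrogate for the paper's correlation-entropy estimate.
    def entropy_like(cols):
        S = snapshots[:, cols]
        G = S.T @ S / S.shape[0]
        return np.linalg.slogdet(G + eps * np.eye(len(cols)))[1]

    selected = [0]
    for j in range(1, snapshots.shape[1]):
        if entropy_like(selected + [j]) > entropy_like(selected):
            selected.append(j)

    # Reduced basis from the thinned snapshot matrix via QR (orthogonal-triangular)
    Q, _ = np.linalg.qr(snapshots[:, selected])
    return Q, selected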

Author(s):  
Baptiste Ristagno ◽  
Dominique Giraud ◽  
Julien Fontchastagner ◽  
Denis Netter ◽  
Noureddine Takorabet ◽  
...  

Purpose: Optimization processes and movement modeling usually require a high number of simulations. The purpose of this paper is to reduce the global central processing unit (CPU) time by decreasing each evaluation time.

Design/methodology/approach: Remeshing the geometry at each iteration is avoided in the proposed method. The idea consists of using a fixed mesh onto which functions are projected to represent the geometry and the supply.

Findings: Results are very promising. CPU time is reduced for three-dimensional problems by almost a factor of two, while keeping a low relative deviation from usual methods. The CPU time saving is obtained by avoiding the meshing step and by a better initialization of the iterative resolution. Optimization, movement modeling and transient-state simulation are very efficient and give the same results as the usual finite element method.

Research limitations/implications: The method is restricted to simple geometries owing to the difficulty of finding spatial mathematical functions describing the geometry. Moreover, a compromise must be found between the imprecision caused by the boundary evaluation and the time saving.

Originality/value: The method can be applied to optimize the design of rotating machines. Moreover, movement modeling is performed by shifting the functions corresponding to the moving parts.
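
As a rough illustration of the fixed-mesh idea, the Python fragment below evaluates material properties on a fixed grid through a spatial indicator function and simply shifts that function to model movement; the geometry, names and values are assumptions for illustration only.

import numpy as np

def material_map(x, y, centre, radius, sigma_in, sigma_out):
    # Geometry is represented by a function on the fixed mesh rather than by
    # the mesh itself: points inside the moving part get its conductivity.
    inside = (x - centre[0])**2 + (y - centre[1])**2 <= radius**2
    return np.where(inside, sigma_in, sigma_out)

# One fixed mesh, reused for every optimization iteration / position
x, y = np.meshgrid(np.linspace(0.0, 1.0, 201), np.linspace(0.0, 1.0, 201))
for cx in np.linspace(0.3, 0.7, 5):          # movement = shifting the function
    sigma = material_map(x, y, (cx, 0.5), 0.1, 5.8e7, 1.0)
    # ... assemble and solve the finite element system with these properties ...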


Author(s):  
Satyavir Singh ◽  
Mohammad Abid Bazaz ◽  
Shahkar Ahmad Nahvi

Purpose: The purpose of this paper is to demonstrate the applicability of the discrete empirical interpolation method (DEIM) for simulating the swing dynamics of benchmark power system problems. The authors demonstrate that considerable savings in computational time and resources are obtained using this methodology. Another purpose is to apply a recently developed modified DEIM strategy, with a reduced on-line computational burden, to this problem.

Design/methodology/approach: The on-line computational cost of the power system dynamics problem is reduced by using DEIM, which reduces the complexity of evaluating the nonlinear function in the reduced model to a cost proportional to the number of reduced modes. The on-line computational cost is further reduced by using an approximate snapshot ensemble to construct the reduced basis.

Findings: Considerable savings in computational resources and time are obtained when DEIM is used for simulating swing dynamics. The on-line cost implications of DEIM are also reduced considerably by using approximate snapshots to construct the reduced basis.

Originality/value: The applicability of DEIM (with and without an approximate ensemble) to a large-scale power system dynamics problem is demonstrated for the first time.
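
The classical DEIM index selection underlying the approach (not the authors' modified variant) can be sketched in a few lines of Python; the basis U would come from a POD of nonlinear-function snapshots.

import numpy as np

def deim_indices(U):
    # Classical DEIM greedy point selection from a nonlinear-term basis U (n x m).
    # The returned indices are where the nonlinearity is actually evaluated,
    # making its reduced-model cost proportional to the number of modes m.
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        c = np.linalg.solve(U[p, :j], U[p, j])   # interpolate new vector at chosen points
        r = U[:, j] - U[:, :j] @ c               # interpolation residual
        p.append(int(np.argmax(np.abs(r))))      # next point: largest residual entry
    return np.array(p)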


Author(s):  
Liam Dunn ◽  
Patrick Clearwater ◽  
Andrew Melatos ◽  
Karl Wette

Abstract The F-statistic is a detection statistic used widely in searches for continuous gravitational waves with terrestrial, long-baseline interferometers. A new implementation of the F-statistic is presented which accelerates the existing "resampling" algorithm using graphics processing units (GPUs). The new implementation runs between 10 and 100 times faster than the existing implementation on central processing units without sacrificing numerical accuracy. The utility of the GPU implementation is demonstrated on a pilot narrowband search for four newly discovered millisecond pulsars in the globular cluster Omega Centauri using data from the second Laser Interferometer Gravitational-Wave Observatory observing run. The computational cost is 17.2 GPU-hours using the new implementation, compared to 1092 core-hours with the existing implementation.
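
The quoted speed-up comes essentially from moving the batched Fourier transforms at the heart of the resampling algorithm onto the GPU. The Python sketch below only illustrates that pattern, with CuPy assumed as an optional drop-in for NumPy; it is not the LALSuite implementation.

import numpy as np
try:
    import cupy as xp      # GPU arrays if CuPy is available (assumption)
except ImportError:
    xp = np                # CPU fallback keeps the sketch runnable

def batched_detection_statistic(resampled, weights):
    # Schematic core of a resampling-style search: barycentred, heterodyned
    # data segments are weighted and Fourier transformed in one batched call,
    # and a power-like statistic is formed per frequency bin.
    x = xp.asarray(resampled) * xp.asarray(weights)
    F = xp.fft.fft(x, axis=-1)             # the batched FFT dominates the cost
    power = xp.abs(F) ** 2
    return power if xp is np else xp.asnumpy(power)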


Author(s):  
Theodoros Zygiridis ◽  
Stamatis A. Amanatiadis ◽  
Theodosios Karamanos ◽  
Nikolaos V. Kantartzis

Purpose: The extraordinary properties of graphene render it ideal for diverse contemporary and future applications. Aiming at the investigation of certain aspects commonly overlooked in pertinent works, the authors study wave-propagation phenomena supported by graphene layers within a stochastic framework, i.e. when uncertainty in various factors affects the graphene's surface conductivity. Given that the consideration of an increasing number of graphene sheets may increase the stochastic dimensionality of the corresponding problem, efficient surrogates with reasonable computational cost need to be developed.

Design/methodology/approach: The authors exploit the potential of generalized polynomial chaos (PC) expansions and develop low-cost surrogates that enable the efficient extraction of the necessary statistical properties displayed by stochastic graphene-related quantities of interest (QoI). A key step is the incorporation of an initial variance estimation, which unveils the significance of each input parameter and facilitates the selection of the most appropriate basis functions, by favoring anisotropic formulae. In addition, the impact of controlling the allowable input interactions in the expansion terms is investigated, aiming at further PC-basis elimination.

Findings: The proposed stochastic methodology is assessed via comparisons with reference Monte Carlo results, and the developed reduced basis models are shown to be sufficiently reliable, being at the same time computationally cheaper than standard PC expansions. In this context, different graphene configurations with varying numbers of random inputs are modeled, and interesting conclusions are drawn regarding their stochastic responses.

Originality/value: The statistical properties of surface-plasmon polaritons and other QoIs are predicted reliably in diverse graphene configurations, when the surface conductivity displays non-trivial uncertainty levels. The suggested PC methodology features simple implementation and low complexity, yet its performance is not compromised, compared to other standard approaches, and it is shown to be capable of delivering valid results.
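
A minimal sketch of how an anisotropic polynomial-chaos basis can be truncated with the help of a preliminary variance estimate is given below in Python; the weighting rule is an assumption for illustration and is not the authors' exact criterion.

import itertools
import numpy as np

def anisotropic_pc_multi_indices(var_estimates, total_order):
    # Inputs with a larger estimated variance contribution are allowed higher
    # polynomial degrees; unimportant inputs are capped early, shrinking the basis.
    v = np.asarray(var_estimates, dtype=float)
    per_dim = np.maximum(1, np.round(total_order * v / v.max())).astype(int)
    basis = [alpha for alpha in itertools.product(*(range(o + 1) for o in per_dim))
             if sum(alpha) <= total_order]            # total-degree truncation
    return basis

# Three uncertain conductivity parameters, the first clearly dominant:
print(len(anisotropic_pc_multi_indices([0.9, 0.1, 0.05], total_order=4)))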


Sensors ◽  
2019 ◽  
Vol 19 (9) ◽  
pp. 2124 ◽  
Author(s):  
Yingzhong Tian ◽  
Xining Liu ◽  
Long Li ◽  
Wenbin Wang

Iterative closest point (ICP) is a method commonly used to perform scan matching and registration. Although it is a simple and robust algorithm, it is still computationally expensive, which remains a crucial challenge, especially in real-time applications such as the simultaneous localization and mapping (SLAM) problem. For these reasons, this paper presents a new method for accelerating ICP with an assisted intensity. Unlike conventional ICP, the proposed method reduces the computational cost and avoids divergence. An initial transformation guess for the relative rigid-body transformation is computed with the assisted intensity. Moreover, a target function is proposed to determine the best initial transformation guess based on the statistics of the spatial distances and intensity residuals. Additionally, the method reduces the number of iterations: Anderson acceleration is used to increase the iteration speed, and it performs better than the Picard iteration procedure. The proposed algorithm runs in real time on a single central processing unit (CPU) thread; hence, it is suitable for robots with limited computational resources. The proposed method is evaluated on the SEMANTIC3D.NET benchmark dataset, and the comparative results show that it achieves better accuracy and robustness than conventional ICP methods.
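
Anderson acceleration of a generic fixed-point map, the ingredient credited above with beating plain Picard iteration, can be written compactly; the Python sketch below is the textbook type-II scheme, not the paper's ICP-specific implementation.

import numpy as np

def anderson_accelerate(g, x0, m=5, max_iter=30, tol=1e-8):
    # Accelerates the fixed-point iteration x <- g(x) using the last m
    # iterate/residual pairs (textbook type-II Anderson acceleration).
    x = np.asarray(x0, dtype=float)
    X, F = [], []
    for _ in range(max_iter):
        gx = g(x)
        f = gx - x                                   # fixed-point residual
        if np.linalg.norm(f) < tol:
            return gx
        X.append(x.copy()); F.append(f.copy())
        X, F = X[-m:], F[-m:]                        # keep only the last m pairs
        if len(F) > 1:
            dF = np.column_stack([F[i + 1] - F[i] for i in range(len(F) - 1)])
            dX = np.column_stack([X[i + 1] - X[i] for i in range(len(X) - 1)])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - (dX + dF) @ gamma               # accelerated update
        else:
            x = gx                                   # plain Picard step first
    return x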


2018 ◽  
Vol 8 (10) ◽  
pp. 1985 ◽  
Author(s):  
Yoshihiro Maeda ◽  
Norishige Fukushima ◽  
Hiroshi Matsuo

In this paper, we propose acceleration methods for edge-preserving filtering. The filters natively produce denormalized numbers, which are defined in IEEE Standard 754. Processing denormalized numbers has a higher computational cost than processing normal numbers; thus, the computational performance of edge-preserving filtering is severely diminished. We propose approaches that prevent the occurrence of denormalized numbers for acceleration. Moreover, we verify an effective vectorization of edge-preserving filtering based on changes in the microarchitectures of central processing units by carefully treating kernel weights. The experimental results show that the proposed methods are up to five times faster than the straightforward implementations of bilateral filtering and non-local means filtering, while the filters maintain high accuracy. In addition, we show effective vectorization for each central processing unit microarchitecture. Our implementation of the bilateral filter is up to 14 times faster than that of OpenCV. The proposed methods and the vectorization are practical for real-time tasks such as image editing.
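
One simple way to avoid the denormalized-number penalty mentioned above is to flush would-be subnormal kernel weights to zero explicitly; the NumPy fragment below illustrates the idea and is not the authors' exact implementation.

import numpy as np

SMALLEST_NORMAL = np.finfo(np.float32).tiny      # below this lie the denormals

def gaussian_range_weights(diff, sigma):
    # Range-kernel weights for a bilateral-type filter; weights that would
    # fall into the subnormal range are set to zero so that the slow
    # denormalized-number path is never taken.
    w = np.exp(-(diff.astype(np.float32) ** 2) / np.float32(2.0 * sigma**2))
    w[w < SMALLEST_NORMAL] = 0.0
    return w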


2015 ◽  
Vol 82 (1) ◽  
pp. 157-166 ◽  
Author(s):  
Jiarong Guo ◽  
James R. Cole ◽  
Qingpeng Zhang ◽  
C. Titus Brown ◽  
James M. Tiedje

Shotgun metagenomic sequencing does not depend on gene-targeted primers or PCR amplification; thus, it is not affected by primer bias or chimeras. However, searching for rRNA genes in large shotgun Illumina data sets is computationally expensive, and no approach exists for unsupervised community analysis of small-subunit (SSU) rRNA gene fragments retrieved from shotgun data. We present a pipeline, SSUsearch, to achieve faster identification of SSU rRNA gene fragments and to enable unsupervised community analysis with shotgun data. It also includes classification and copy number correction, and the output can be used by traditional amplicon analysis platforms. Shotgun metagenome data processed with this pipeline yielded higher diversity estimates than amplicon data but retained the grouping of samples in ordination analyses. We applied this pipeline to soil samples with paired shotgun and amplicon data and confirmed bias against Verrucomicrobia in a commonly used V6-V8 primer set, as well as discovering likely bias against Actinobacteria and for Verrucomicrobia in a commonly used V4 primer set. This pipeline can utilize all variable regions in SSU rRNA and can also be applied to large-subunit (LSU) rRNA genes for confirmation of community structure. The pipeline can scale to handle large amounts of soil metagenomic data (5 Gb of memory and 5 central processing unit hours to process 38 Gb [1 lane] of trimmed Illumina HiSeq2500 data) and is freely available at https://github.com/dib-lab/SSUsearch under a BSD license.
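
The copy number correction step mentioned above amounts to weighting taxon counts by their rRNA gene copy numbers; the toy Python example below uses made-up values purely to show the idea, not the pipeline's actual correction tables.

# Toy illustration of copy-number correction (values are made up):
rrna_copies = {"Verrucomicrobia": 1, "Actinobacteria": 3, "Firmicutes": 6}
raw_counts  = {"Verrucomicrobia": 120, "Actinobacteria": 450, "Firmicutes": 900}

corrected = {taxon: raw_counts[taxon] / rrna_copies.get(taxon, 1)
             for taxon in raw_counts}
print(corrected)   # abundances no longer over-count multi-copy taxa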


2021 ◽  
Vol 13 (11) ◽  
pp. 2107
Author(s):  
Shiyu Wu ◽  
Zhichao Xu ◽  
Feng Wang ◽  
Dongkai Yang ◽  
Gongjian Guo

Global Navigation Satellite System Reflectometry Bistatic Synthetic Aperture Radar (GNSS-R BSAR) is becoming increasingly important in remote sensing because of its low power, low mass, low cost and real-time global coverage capability. The Back Projection Algorithm (BPA) is usually selected as the GNSS-R BSAR imaging algorithm because it can process echo signals of complex geometric configurations. However, its huge computational cost is a challenge for its application in GNSS-R BSAR. Graphics processing units (GPUs) provide an efficient computing platform for GNSS-R BSAR processing. In this paper, a solution accelerating the BPA of GNSS-R BSAR using a GPU is proposed to improve imaging efficiency, and a matching pre-processing program is proposed to synchronize the direct and echo signals to improve imaging quality. To process the hundreds of gigabytes of data collected by a long synthetic aperture in fixed-station mode, a stream processing structure is used to overcome the problem of limited GPU memory. To improve imaging efficiency, the imaging task is divided into pre-processing and the BPA, which are performed on the Central Processing Unit (CPU) and the GPU, respectively, and a pixel-oriented parallel processing method in the back projection is adopted to avoid the memory access conflicts caused by the excessive data volume. The improved BPA with a long synthetic aperture time is verified through simulations and experiments on the GPS-L5 signal. The results show that the proposed accelerating solution takes approximately 128.04 s, 156 times less than the pure CPU framework, to produce a 600 m × 600 m image with an 1800 s synthetic aperture time; in addition, the same imaging quality as the existing processing solution is retained.
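
The pixel-oriented parallelism exploited on the GPU rests on each image pixel accumulating its own bistatic-delay sample independently. The plain-Python sketch below shows that per-pixel computation (one GPU thread per pixel in a CUDA mapping); the geometry, names and interpolation are simplified assumptions.

import numpy as np

def back_project(echo, t_axis, tx_pos, rx_pos, pixels, fc, c=3.0e8):
    # Each pixel integrates the range-compressed echo at its own bistatic delay,
    # so pixels never write to shared memory locations (conflict-free on a GPU).
    img = np.zeros(len(pixels), dtype=complex)
    for k, p in enumerate(pixels):                  # one GPU thread per pixel
        delay = (np.linalg.norm(p - tx_pos) + np.linalg.norm(p - rx_pos)) / c
        sample = np.interp(delay, t_axis, echo.real) + 1j * np.interp(delay, t_axis, echo.imag)
        img[k] += sample * np.exp(2j * np.pi * fc * delay)   # phase compensation
    return img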


2021 ◽  
Vol 249 ◽  
pp. 06003
Author(s):  
François Nader ◽  
Patrick Pizette ◽  
Nicolin Govender ◽  
Daniel N. Wilke ◽  
Jean-François Ferellec

The use of the Discrete Element Method (DEM) to model engineering structures involving granular materials has proven to be an efficient way to study their response under various behaviour conditions. However, the computational cost of the simulations increases rapidly as the number of particles and the particle shape complexity increase. An affordable solution to render such problems computationally tractable is to use graphics processing units (GPUs) for computing. Modern GPUs offer up to 10496 compute cores, which allows for far greater parallelisation relative to the 32 cores offered by high-end central processing unit (CPU) compute. This study outlines the application of BlazeDEM-GPU, using an RTX 2080Ti GPU (4352 cores), to investigate the influence of the modelling of particle shape on the lateral pull behaviour of granular ballast systems used in railway applications. The idea is to validate the model and show the benefits of simulating non-spherical shapes in future large-scale tests. The algorithm created to generate the shape of the ballast, based on real grain scans and using polyhedral shape approximations of varying degrees of complexity, is presented. The particle size is modelled to scale. A preliminary investigation of the effect of grain shape is conducted, where a sleeper lateral pull test is carried out in a spherical-grain sample and in a cubic-grain sample. Preliminary results show that elementary polyhedral shape representations (cubic) recreate some of the characteristic responses in the lateral pull test, such as stick/slip phenomena and force chain distributions, which looks promising for future work on railway simulations. These responses cannot be recreated with simple spherical grains unless heuristics are added, which requires additional calibration and approximations. The significant reduction in computing time when simulating non-spherical grains also implies that larger granular systems can be investigated.
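
For context, the sphere-only contact check that keeps spherical-grain DEM cheap, but unable to reproduce interlocking without added heuristics, reduces to a few lines; the Python fragment below is an illustrative linear-spring contact, not BlazeDEM-GPU code.

import numpy as np

def sphere_contact_force(x1, x2, r1, r2, kn=1.0e6):
    # Minimal linear-spring normal contact between two spherical grains.
    d = x2 - x1
    dist = np.linalg.norm(d)
    overlap = (r1 + r2) - dist
    if overlap <= 0.0:
        return np.zeros(3)          # no contact
    n = d / dist                    # unit normal from grain 1 towards grain 2
    return -kn * overlap * n        # repulsive force acting on grain 1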


2017 ◽  
Vol 27 (12) ◽  
pp. 2768-2774
Author(s):  
Rainald Löhner ◽  
Fumiya Togashi ◽  
Joseph David Baum

Purpose: A common observation made when computing chemically reacting flows is how central processing unit (CPU)-intensive they are in comparison with cold-flow cases. The update of tens or hundreds of species with hundreds or thousands of reactions can easily consume more than 95% of the total CPU time. In many cases, the region where reactions (combustion) are actually taking place comprises only a very small percentage of the volume; typical examples are flame fronts propagating through a domain. In such cases, only a small fraction of points/cells needs a full chemistry update, which leads to extreme load imbalances on parallel machines. The purpose of the present work is to develop a methodology to balance the work in an optimal way.

Design/methodology/approach: Points that require a full chemistry update are identified, gathered and distributed across the network, so that the work is evenly distributed. Once the chemistry has been updated, the unknowns are gathered back.

Findings: The procedure has been found to work extremely well, leading to optimal load balance with insignificant communication overheads.

Research limitations/implications: In many production runs, the procedure leads to a reduction in CPU requirements of more than an order of magnitude. This allows much larger and longer runs, improving accuracy and statistics.

Practical implications: The procedure has allowed the calculation of chemically reacting flow cases that were hitherto not possible.

Originality/value: To the authors' knowledge, this type of load balancing has not been published before.
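
The gather-and-redistribute step described above can be pictured with a small Python sketch: the indices of points needing a full chemistry update are dealt out evenly across ranks so that no processor owns the whole flame front. This is a schematic of the idea only; the actual MPI gather/scatter details of the paper differ.

import numpy as np

def balance_chemistry_points(active_mask, n_ranks):
    # Indices of points inside the reacting region, dealt out round-robin so
    # every rank integrates roughly the same number of stiff chemistry ODEs.
    active = np.flatnonzero(active_mask)
    return [active[r::n_ranks] for r in range(n_ranks)]

# Example: 1,000,000 cells, ~3% reacting, 64 ranks -> roughly 470 cells per rank
mask = np.random.rand(1_000_000) < 0.03
chunks = balance_chemistry_points(mask, 64)
print([len(c) for c in chunks[:4]])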

