Parallel Algorithms for Spatial Rainfall Distribution

Jurnal INKOM ◽  
2014 ◽  
Vol 8 (1) ◽  
pp. 29 ◽  
Author(s):  
Arnida Lailatul Latifah ◽  
Adi Nurhadiyatna

This paper proposes parallel algorithms for the precipitation input to flood modelling, specifically for spatial rainfall distribution. As an important input to flood modelling, the spatial distribution of rainfall is always needed as a pre-condition of the model. In this paper, two interpolation methods, inverse distance weighting (IDW) and ordinary kriging (OK), are discussed. Both are implemented as parallel algorithms in order to reduce the computation time. To measure the computational efficiency, the performance of the parallel algorithms is compared to that of the serial algorithms for both methods. Findings indicate that: (1) the computation time of the OK algorithm is up to 23% longer than that of IDW; (2) the computation time of the OK and IDW algorithms increases linearly with the number of cells/points; (3) the computation time of the parallel algorithms for both methods decays exponentially with the number of processors, with a decay factor of 0.52 for the parallel IDW algorithm and 0.53 for OK; (4) the parallel algorithms achieve near-ideal speed-up.
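
As a rough illustration of the interpolation being parallelised, the sketch below computes an IDW rainfall field from a handful of gauges and distributes the grid rows over worker processes. The station coordinates, power parameter and grid size are invented for the example and are not the authors' setup; OK would additionally require fitting a variogram.

```python
# Minimal sketch of inverse distance weighting (IDW) over a grid of cells,
# parallelised across rows with multiprocessing. Illustrative only; the
# station coordinates, power parameter p and grid layout are assumptions.
import numpy as np
from multiprocessing import Pool

stations = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.0]])  # gauge x, y
rainfall = np.array([12.0, 30.0, 21.0])                      # observed depths (mm)
p = 2.0                                                      # IDW power parameter

def idw_row(y):
    """Interpolate one grid row at fixed y for x = 0..19."""
    out = np.empty(20)
    for i, x in enumerate(np.arange(20.0)):
        d = np.hypot(stations[:, 0] - x, stations[:, 1] - y)
        if np.any(d < 1e-9):                 # cell coincides with a station
            out[i] = rainfall[np.argmin(d)]
        else:
            w = 1.0 / d**p
            out[i] = np.sum(w * rainfall) / np.sum(w)
    return out

if __name__ == "__main__":
    with Pool(4) as pool:                    # one task per grid row
        grid = np.vstack(pool.map(idw_row, np.arange(15.0)))
    print(grid.shape)                        # (15, 20) interpolated field
```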

Author(s):  
Hunter Scott Stephens ◽  
Q Jackie Wu ◽  
Qiuwen Wu

Abstract Deep learning algorithms for radiation therapy treatment planning automation require large patient datasets and complex architectures that often take hundreds of hours to train. Some of these algorithms require constant dose updating (such as with reinforcement learning) and may take days. When these algorithms rely on commercial treatment planning systems to perform dose calculations, the data pipeline becomes the bottleneck of the entire algorithm’s efficiency. Further, uniformly accurate distributions are not always needed for the training, and approximations can be introduced to speed up the process without affecting the outcome. These approximations not only speed up the calculation process, but allow custom algorithms to be written specifically for AI/ML applications where the dose and fluence must be calculated many times for many different situations. Here we present and investigate the effect of introducing matrix sparsity through kernel truncation on the dose calculation for the purposes of fluence optimization within these AI/ML algorithms. The basis for this algorithm relies on voxel discrimination, in which numerous voxels are pruned from the computationally expensive part of the calculation. This results in a significant reduction in computation time and storage. Comparing our dose calculation against calculations in both a water phantom and patient anatomy in Eclipse without heterogeneity corrections produced gamma index passing rates around 99% for individual and composite beams with uniform fluence and around 98% for beams with a modulated fluence. The resulting sparsity introduces a reduction in computational time and space proportional to the square of the sparsity tolerance, with a potential decrease in cost greater than 10 times that of a dense calculation, allowing not only for faster calculations but for calculations that a dense algorithm could not perform on the same system.
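
A minimal sketch of the kernel-truncation idea, assuming a random dense dose-influence matrix (the matrix shape, tolerance and values are not the authors' data): entries below a relative tolerance are zeroed and the dose is then computed with a sparse product, trading a small dose error for storage and time.

```python
# Illustrative sketch of introducing sparsity by kernel truncation: entries of
# a dense dose-influence matrix below a relative tolerance are dropped, and the
# dose is then computed with a sparse product. Shapes and values are assumed.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
D_dense = rng.exponential(scale=1.0, size=(5000, 200))   # voxels x beamlets

tol = 1e-2                                               # truncation tolerance
cutoff = tol * D_dense.max(axis=0, keepdims=True)        # per-beamlet threshold
D_trunc = np.where(D_dense >= cutoff, D_dense, 0.0)
D_sparse = sparse.csr_matrix(D_trunc)                    # prune small kernel tails

fluence = rng.uniform(size=200)
dose_dense = D_dense @ fluence
dose_sparse = D_sparse @ fluence

print("stored fraction:", D_sparse.nnz / D_dense.size)
print("max relative dose error:",
      np.max(np.abs(dose_sparse - dose_dense)) / dose_dense.max())
```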


2020 ◽  
Vol 1 (2) ◽  
pp. 137-158
Author(s):  
Kotaro Yamazaki ◽  
Tomoki Sato ◽  
Hiroaki Shiokawa ◽  
Hiroyuki Kitagawa

The demands for graph data analysis methods are increasing. RankClus is a framework that extracts clusters by integrating clustering and ranking on heterogeneous graphs; it enhances the clustering results by alternately updating the results of clustering and ranking for a better understanding of the clusters. However, RankClus is computationally expensive if a graph is large, since it needs to iterate both clustering and ranking over all nodes. In this paper, to address this problem, we propose a novel fast RankClus algorithm for heterogeneous graphs. To speed up the entire RankClus procedure, our proposed algorithm reduces the computational cost of the ranking process in each iteration. Our proposal measures how much each node affects the clustering result; if the effect is not significant, we prune the node. Furthermore, we also present a parallel algorithm that extends our proposed algorithm to fully exploit a modern many-core CPU. Our extensive evaluations clarify that our fast and parallel algorithms drastically reduce the computation time of the original RankClus algorithm.
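
The pruning idea can be pictured on a generic iterative ranking loop (a PageRank-style update, not the actual RankClus ranking): nodes whose score change falls below a threshold are frozen and skipped in later iterations. The graph, damping factor and threshold below are assumptions.

```python
# Hedged sketch of node pruning inside an iterative ranking update. This is a
# simplified PageRank-style loop, not the RankClus implementation.
import numpy as np

A = np.random.default_rng(1).integers(0, 2, size=(100, 100)).astype(float)
np.fill_diagonal(A, 0.0)
out_deg = np.maximum(A.sum(axis=1), 1.0)
d, eps = 0.85, 1e-6

rank = np.full(100, 1.0 / 100)
active = np.ones(100, dtype=bool)            # nodes still worth updating

for _ in range(50):
    contrib = (rank / out_deg) @ A           # ranking propagation step
    new_rank = np.where(active, (1 - d) / 100 + d * contrib, rank)
    active &= np.abs(new_rank - rank) > eps  # prune nodes with negligible change
    rank = new_rank
    if not active.any():                     # all nodes converged / pruned
        break

print("active nodes at exit:", active.sum())
```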


2015 ◽  
Vol 18 (1) ◽  
pp. 33-48 ◽  
Author(s):  
Damjan Ivetić ◽  
Željko Vasilić ◽  
Miloš Stanić ◽  
Dušan Prodanović

To optimize the design of a water distribution network (WDN), a large number of possible solutions need to be examined; hence, computational efficiency is an important issue. To accelerate the computation, one can use more powerful computers, parallel computing systems with adapted hydraulic solvers, hybrid algorithms, more efficient hydraulic methods, or any combination of these techniques. This paper explores the possibility of speeding up optimization using variations of the ΔQ method to solve the network hydraulics. First, the ΔQ method was used inside the evaluation function, where each tested alternative was hydraulically solved and ranked. Then, the convergence criterion was relaxed in order to reduce the computation time. Although the accuracy of the obtained hydraulic results was reduced, the solutions remained feasible and of interest. Another modification was tested, in which the ΔQ method was used just once to solve the hydraulics of the initial network, and the unknown flow corrections were added to the list of other unknown variables subject to optimization. Two case networks were used for testing, and the results were compared to those obtained using EPANET2. The obtained results show that the use of the ΔQ method in hydraulic computations can significantly accelerate the optimization of WDNs.
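
As a toy illustration of how relaxing the ΔQ convergence criterion trades accuracy for iterations, the sketch below applies a loop flow-correction (Hardy Cross type) iteration to two parallel pipes; the resistances, head-loss law and tolerances are assumed values, not the case networks of the paper.

```python
# Minimal loop flow-correction (ΔQ) iteration for two parallel pipes carrying a
# fixed total flow, with a relaxable convergence tolerance. All parameters are
# illustrative assumptions.
def solve_parallel_pipes(q_total=100.0, r1=0.002, r2=0.005, tol=1e-6, max_iter=100):
    q1, q2 = q_total / 2, q_total / 2          # initial guess satisfying continuity
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        # loop head-loss imbalance with h = r * Q * |Q|
        imbalance = r1 * q1 * abs(q1) - r2 * q2 * abs(q2)
        gradient = 2 * (r1 * abs(q1) + r2 * abs(q2))
        dq = -imbalance / gradient             # loop flow correction ΔQ
        q1, q2 = q1 + dq, q2 - dq
        if abs(dq) < tol:                      # relaxing tol reduces iterations
            break
    return q1, q2, iterations

print(solve_parallel_pipes(tol=1e-6))          # tight criterion
print(solve_parallel_pipes(tol=1e-2))          # relaxed criterion, fewer iterations
```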


2022 ◽  
Vol 16 (1) ◽  
pp. 0-0

A secure and efficient authentication mechanism is a major concern in cloud computing because data are shared between the cloud server and the user over the internet. This paper proposes an efficient Hashing, Encryption and Chebyshev (HEC)-based authentication scheme in order to secure data communication. Formal and informal security analyses demonstrate that the proposed HEC-based authentication approach provides data security in the cloud more efficiently. The proposed approach addresses the security issues and ensures privacy and data security for the cloud user. Moreover, the proposed HEC-based authentication approach makes the system more robust and secure, and it has been verified in multiple scenarios. Furthermore, the proposed authentication approach requires less computation time and memory than existing authentication techniques: for a data size of 100 Kb, it requires 26 ms of computation time and 1878 bytes of memory.
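
The abstract does not give the protocol details, but two of its named building blocks can be sketched: a hash digest for integrity and the semigroup property of Chebyshev polynomials, T_r(T_s(x)) = T_rs(x), which underlies Chebyshev-map key agreement. The seed, secrets and use of SHA-256 below are illustrative assumptions, not the authors' HEC scheme.

```python
# Hedged sketch of Chebyshev-polynomial key agreement plus a hash-derived
# session key. Not the authors' HEC protocol; all values are assumptions.
import hashlib
import math

def chebyshev(n: int, x: float) -> float:
    """T_n(x) = cos(n * arccos(x)) for x in [-1, 1]."""
    return math.cos(n * math.acos(x))

x = 0.42                     # public seed
r, s = 7, 11                 # user and server secrets (small, for numerical stability)

user_public = chebyshev(r, x)
server_public = chebyshev(s, x)
shared_user = chebyshev(r, server_public)    # T_r(T_s(x))
shared_server = chebyshev(s, user_public)    # T_s(T_r(x))
assert math.isclose(shared_user, shared_server, rel_tol=1e-9)

# derive a session key from the shared secret with a hash digest
session_key = hashlib.sha256(f"{shared_user:.12f}".encode()).hexdigest()
print("session key:", session_key[:16], "...")
```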


2010 ◽  
Vol 3 (6) ◽  
pp. 1555-1568 ◽  
Author(s):  
B. Mijling ◽  
O. N. E. Tuinder ◽  
R. F. van Oss ◽  
R. J. van der A

Abstract. The Ozone Profile Algorithm (OPERA), developed at KNMI, retrieves the vertical ozone distribution from nadir spectral satellite measurements of backscattered sunlight in the ultraviolet and visible wavelength range. To produce consistent global datasets, the algorithm needs to have good global performance, while a short computation time facilitates the use of the algorithm in near-real-time applications. To test the global performance of the algorithm, we examine the convergence behaviour as a diagnostic tool for the ozone profile retrievals from the GOME instrument (on board ERS-2) for February and October 1998. In this way, we uncover different classes of retrieval problems, related to the South Atlantic Anomaly, low cloud fractions over deserts, desert dust outflow over the ocean, and the intertropical convergence zone. The influence of the first guess and of the external input data, including the ozone cross-sections and the ozone climatologies, on the retrieval performance is also investigated. By using a priori ozone profiles which are selected on the expected total ozone column, retrieval problems due to anomalous ozone distributions (such as in the ozone hole) can be avoided. By applying the algorithm adaptations, the convergence statistics improve considerably, not only increasing the number of successful retrievals, but also reducing the average computation time, owing to fewer iteration steps per retrieval. For February 1998, non-convergence was brought down from 10.7% to 2.1%, while the mean number of iteration steps (which dominates the computation time) dropped by 26%, from 5.11 to 3.79.
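
The a priori selection step can be pictured as a simple lookup of a column-classified climatology keyed by the expected total ozone column; the bins and profile values below are invented placeholders, not the climatology used by OPERA.

```python
# Toy sketch of selecting an a priori ozone profile by expected total column.
# Bins and layer values are illustrative placeholders only.
import numpy as np

# climatology keyed by total-column bins (Dobson units), four coarse layers each
apriori_climatology = {
    (0, 225):   np.array([10.0, 40.0, 90.0, 60.0]),   # e.g. ozone-hole conditions
    (225, 325): np.array([15.0, 70.0, 140.0, 75.0]),
    (325, 600): np.array([20.0, 95.0, 180.0, 90.0]),
}

def select_apriori(expected_column_du: float) -> np.ndarray:
    for (lo, hi), profile in apriori_climatology.items():
        if lo <= expected_column_du < hi:
            return profile
    raise ValueError("total column outside climatology range")

print(select_apriori(190.0))   # anomalous (ozone-hole) case picks a matching prior
```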


Geophysics ◽  
2013 ◽  
Vol 78 (1) ◽  
pp. V1-V9 ◽  
Author(s):  
Zhonghuan Chen ◽  
Sergey Fomel ◽  
Wenkai Lu

When plane-wave destruction (PWD) is implemented by implicit finite differences, the local slope is estimated by an iterative algorithm. We propose an analytical estimator of the local slope that is based on a convergence analysis of the iterative algorithm. Using the analytical estimator, we design a noniterative method to estimate slopes by a three-point PWD filter. Compared with the iterative estimation, the proposed method needs only one regularization step, which reduces computation time significantly. With directional decoupling of the plane-wave filter, the proposed algorithm is also applicable to 3D slope estimation. We present synthetic and field experiments to demonstrate that the proposed algorithm yields correct estimates in less computation time.


Author(s):  
Jérôme Limido ◽  
Mohamed Trabia ◽  
Shawoon Roy ◽  
Brendan O’Toole ◽  
Richard Jennings ◽  
...  

A series of experiments was performed to study the plastic deformation of metallic plates under hypervelocity impact at the University of Nevada, Las Vegas (UNLV) Center for Materials and Structures using a two-stage light gas gun. In these experiments, cylindrical Lexan projectiles were fired at A36 steel target plates at velocities ranging from 4.5 to 6.0 km/s. The experiments were designed to produce a front-side impact crater and a permanent bulging deformation on the back surface of the target without inducing complete perforation of the plates. Free-surface velocities from the back surface of the target plate were measured using the newly developed Multiplexed Photonic Doppler Velocimetry (MPDV) system. To simulate the experiments, a Lagrangian-based smoothed particle hydrodynamics (SPH) method is typically used to avoid the problems associated with mesh instability. Despite their intrinsic capability for simulating violent impacts, particle methods have a few drawbacks that may considerably affect their accuracy and performance, including lack of interpolation completeness, tensile instability, and spurious pressures. Moreover, computation time is a strong limitation that often necessitates the use of reduced 2D axisymmetric models. To address these shortcomings, the IMPETUS Afea Solver® implemented a newly developed SPH formulation that addresses the problems of spurious pressures and tensile instability. The algorithm takes full advantage of GPU technology to parallelize the computation and opens the door to running large 3D models (20,000,000 particles). The combination of accurate algorithms and drastically reduced computation time now makes it possible to run a high-fidelity hypervelocity impact model.


Quantum ◽  
2021 ◽  
Vol 5 ◽  
pp. 410
Author(s):  
Johnnie Gray ◽  
Stefanos Kourtis

Tensor networks represent the state of the art in computational methods across many disciplines, including the classical simulation of quantum many-body systems and quantum circuits. Several applications of current interest give rise to tensor networks with irregular geometries. Finding the best possible contraction path for such networks is a central problem, with an exponential effect on computation time and memory footprint. In this work, we implement new randomized protocols that find very high quality contraction paths for arbitrary and large tensor networks. We test our methods on a variety of benchmarks, including the random quantum circuit instances recently implemented on Google quantum chips. We find that the paths obtained can be very close to optimal, and often many orders of magnitude better than the most established approaches. As different underlying geometries suit different methods, we also introduce a hyper-optimization approach, in which both the method applied and its algorithmic parameters are tuned during the path finding. The increase in quality of the contraction schemes found has significant practical implications for the simulation of quantum many-body systems and particularly for the benchmarking of new quantum chips. Concretely, we estimate a speed-up of over 10,000× compared to the original expectation for the classical simulation of the Sycamore `supremacy' circuits.
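
A small illustration of why the contraction path matters, using NumPy's built-in path search rather than the randomized hyper-optimization protocols of the paper (the tensor shapes are arbitrary): contracting a three-tensor chain in a poor order inflates the FLOP count by orders of magnitude, and the chosen path can be reused for the actual contraction.

```python
# Contraction-path illustration with NumPy's einsum_path. Shapes are chosen so
# that the contraction order matters; this is not the paper's path optimizer.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 1000))
b = rng.standard_normal((1000, 1000))
c = rng.standard_normal((1000, 1000))

# search for a good pairwise contraction order for the chain a-b-c
path, info = np.einsum_path("ij,jk,kl->il", a, b, c, optimize="greedy")
print(path)   # e.g. ['einsum_path', (0, 1), (0, 1)]
print(info)   # report comparing naive and optimized FLOP counts

result = np.einsum("ij,jk,kl->il", a, b, c, optimize=path)  # reuse the found path
print(result.shape)   # (2, 1000)
```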


2021 ◽  
Author(s):  
Brett W. Larsen ◽  
Shaul Druckmann

Abstract Lateral and recurrent connections are ubiquitous in biological neural circuits. The strong computational abilities of feedforward networks have been extensively studied; on the other hand, while certain roles for lateral and recurrent connections in specific computations have been described, a more complete understanding of the role and advantages of recurrent computations that might explain their prevalence remains an important open challenge. Previous key studies by Minsky and later by Roelfsema argued that the sequential, parallel computations for which recurrent networks are well suited can be highly effective approaches to complex computational problems. Such “tag propagation” algorithms perform repeated, local propagation of information and were introduced in the context of detecting connectedness, a task that is challenging for feedforward networks. Here, we advance the understanding of the utility of lateral and recurrent computation by first performing a large-scale empirical study of neural architectures for the computation of connectedness to explore feedforward solutions more fully and establish robustly the importance of recurrent architectures. In addition, we highlight a tradeoff between computation time and performance and demonstrate hybrid feedforward/recurrent models that perform well even in the presence of varying computational time limitations. We then generalize tag propagation architectures to multiple, interacting propagating tags and demonstrate that these are efficient computational substrates for more general computations by introducing and solving an abstracted biologically inspired decision-making task. More generally, our work clarifies and expands the set of computational tasks that can be solved efficiently by recurrent computation, yielding hypotheses for structure in population activity that may be present in such tasks.

Author Summary Lateral and recurrent connections are ubiquitous in biological neural circuits; intriguingly, this stands in contrast to the majority of current-day artificial neural network research which primarily uses feedforward architectures except in the context of temporal sequences. This raises the possibility that part of the difference in computational capabilities between real neural circuits and artificial neural networks is accounted for by the role of recurrent connections, and as a result a more detailed understanding of the computational role played by such connections is of great importance. Making effective comparisons between architectures is a subtle challenge, however, and in this paper we leverage the computational capabilities of large-scale machine learning to robustly explore how differences in architectures affect a network’s ability to learn a task. We first focus on the task of determining whether two pixels are connected in an image which has an elegant and efficient recurrent solution: propagate a connected label or tag along paths. Inspired by this solution, we show that it can be generalized in many ways, including propagating multiple tags at once and changing the computation performed on the result of the propagation. To illustrate these generalizations, we introduce an abstracted decision-making task related to foraging in which an animal must determine whether it can avoid predators in a random environment. Our results shed light on the set of computational tasks that can be solved efficiently by recurrent computation and how these solutions may appear in neural activity.
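
The connectedness task and its recurrent solution can be sketched directly: a tag starting at one pixel is repeatedly propagated to 4-connected foreground neighbours until a fixed point, and the two pixels are connected exactly when the tag reaches the target. The image and pixel coordinates below are illustrative, and the sketch is a plain iterative loop rather than a trained recurrent network.

```python
# Toy tag-propagation test of pixel connectedness on a binary image.
import numpy as np

def connected(image: np.ndarray, src: tuple, dst: tuple) -> bool:
    tag = np.zeros_like(image, dtype=bool)
    tag[src] = bool(image[src])
    while True:
        spread = tag.copy()
        # propagate the tag one step up/down/left/right onto foreground pixels
        spread[1:, :] |= tag[:-1, :]
        spread[:-1, :] |= tag[1:, :]
        spread[:, 1:] |= tag[:, :-1]
        spread[:, :-1] |= tag[:, 1:]
        spread &= image.astype(bool)
        if np.array_equal(spread, tag):      # fixed point: no new pixels tagged
            return bool(tag[dst])
        tag = spread

img = np.array([[1, 1, 0, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 1]])
print(connected(img, (0, 0), (0, 3)))   # True: a path of 1-pixels exists
print(connected(img, (0, 0), (2, 0)))   # False: (2, 0) is background
```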


2011 ◽  
Vol 11 (04) ◽  
pp. 571-587 ◽  
Author(s):  
WILLIAM ROBSON SCHWARTZ ◽  
HELIO PEDRINI

Fractal image compression is one of the most promising techniques for image compression due to advantages such as resolution independence and fast decompression. It exploits the fact that natural scenes present self-similarity to remove redundancy and obtain high compression rates with smaller quality degradation compared to traditional compression methods. The main drawback of fractal compression is its computationally intensive encoding process, due to the need to search for regions with high similarity in the image. Several approaches have been developed to reduce the computational cost of locating similar regions. In this work, we propose a method based on robust feature descriptors to speed up the encoding time. The use of robust features provides more discriminative and representative information for regions of the image. When the regions are better represented, the search for similar parts of the image can be reduced to focus only on the most likely matching candidates, which leads to a reduction in computation time. Our experimental results show that the use of robust feature descriptors reduces the encoding time while keeping high compression rates and reconstruction quality.
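
A hedged sketch of the candidate-reduction idea, using a much simpler descriptor (block mean, standard deviation and gradient energy) than the robust descriptors of the paper: domain-block descriptors are indexed in a KD-tree, and each range block only examines its nearest neighbours instead of every domain block.

```python
# Restricting the fractal-encoding search with simple block descriptors.
# Block size, descriptor and image are illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

def block_descriptor(block: np.ndarray) -> np.ndarray:
    gy, gx = np.gradient(block.astype(float))
    return np.array([block.mean(), block.std(), np.mean(gx**2 + gy**2)])

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128)).astype(float)

B = 8                                            # block size
blocks = [image[i:i + B, j:j + B]
          for i in range(0, 128, B) for j in range(0, 128, B)]
descriptors = np.array([block_descriptor(b) for b in blocks])

tree = cKDTree(descriptors)                      # index domain-block descriptors
query = block_descriptor(blocks[5])              # descriptor of one range block
_, candidates = tree.query(query, k=10)          # only 10 candidates, not 256
print("candidate domain blocks:", candidates)
```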

