PROVABLY CONSISTENT DISTRIBUTED DELAUNAY TRIANGULATION

Author(s):  
M. Brédif ◽  
L. Caraffa ◽  
M. Yirci ◽  
P. Memari

Abstract. This paper deals with the distributed computation of Delaunay triangulations of massive point sets, mainly motivated by the needs of a scalable out-of-core surface reconstruction workflow from massive urban LIDAR datasets. Such data often correspond to a huge point cloud represented through a set of tiles of relatively homogeneous point sizes. This is the input of our algorithm, which naturally partitions the data across multiple processing elements. The distributed computation and communication between processing elements are orchestrated efficiently through a decentralized model to represent, manage and locally construct the triangulation corresponding to each tile. Initially inspired by the star splaying approach, we review the Tile & Merge algorithm for computing Distributed Delaunay Triangulations on the cloud, provide a theoretical proof of correctness of this algorithm, and analyse the performance of our Spark implementation in terms of speedup and strong scaling on both synthetic and real use case datasets. An HPC implementation (e.g. using MPI), left for future work, would benefit from its more efficient message passing paradigm but lose the robustness and failure resilience of our Spark approach.

Author(s):  
Amanda Bienz ◽  
William D Gropp ◽  
Luke N Olson

Algebraic multigrid (AMG) is often viewed as a scalable [Formula: see text] solver for sparse linear systems. Yet, AMG lacks parallel scalability due to increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI), particularly in the MPI-everywhere approach, to arrange inter-process communication so that messages are sent regardless of where the send and receive processes are located. Performance tests show notable differences in the cost of intra- and internode communication, motivating a restructuring of communication. In this case, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of internode messages. Node-centric communication extends to the range of components in both the setup and solve phases of AMG, yielding an increase in the weak and strong scaling of the entire method.
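The idea of routing traffic through cheaper intra-node communication can be illustrated with a small sketch. This is not the authors' implementation: the `(source rank, destination rank, bytes)` message format, the contiguous rank-to-node mapping, and all function names are assumptions made for illustration. It models only the reduction in the number of internode messages obtained by combining messages that share a source node and destination node; the reduction in message size described in the abstract is not modelled.

```python
from collections import defaultdict


def node_of(rank, ranks_per_node):
    """Hypothetical mapping: ranks are assigned to nodes in contiguous blocks."""
    return rank // ranks_per_node


def count_internode_messages(messages, ranks_per_node):
    """Count messages whose sender and receiver live on different nodes."""
    return sum(
        1
        for src, dst, _ in messages
        if node_of(src, ranks_per_node) != node_of(dst, ranks_per_node)
    )


def aggregate_by_node_pair(messages, ranks_per_node):
    """Combine all internode messages with the same (source node, destination node)
    into one message carrying the summed payload size in bytes."""
    aggregated = defaultdict(int)
    for src, dst, size in messages:
        src_node = node_of(src, ranks_per_node)
        dst_node = node_of(dst, ranks_per_node)
        if src_node != dst_node:
            aggregated[(src_node, dst_node)] += size
    return dict(aggregated)


# Illustrative data only: four ranks per node, so ranks 0-3 are node 0, ranks 4-7 are node 1.
messages = [(0, 4, 8), (1, 5, 8), (2, 4, 16), (0, 1, 8)]
before = count_internode_messages(messages, ranks_per_node=4)
after = aggregate_by_node_pair(messages, ranks_per_node=4)
print(before, len(after), after)
```

In a real MPI code the aggregated message would additionally require intra-node gather and scatter steps on each side, which this sketch leaves out.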


Technologies ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 116 ◽  
Author(s):  
Francisco Lacueva-Pérez ◽  
Lea Hannola ◽  
Jan Nierhoff ◽  
Stelios Damalas ◽  
Soumyajit Chatterjee ◽  
...  

The introduction of innovative digital tools for supporting manufacturing processes has far-reaching effects at an organizational and individual level due to the development of Industry 4.0. The FACTS4WORKERS project funded by H2020, i.e., Worker-Centric Workplaces in Smart Factories, aims to develop user-centered assistance systems in order to demonstrate their impact and applicability on the shop floor. To achieve this, understanding how to develop such tools is as important as assessing whether advantages can be derived from the ICT system created. This study introduces the technology of a workplace solution linked to the industrial challenge of self-learning manufacturing workplaces. Subsequently, a two-step approach to evaluate the presented system is discussed, consisting of the one used in FACTS4WORKERS and the one used in the “Heuristics for Industry 4.0” project. Both approaches and the use case are introduced as a base for presenting the comparison of the results collected in this paper. The comparison of the results for the presented use case is extended with the results for the rest of the FACTS4WORKERS use cases and with future work in the framework.


1991 ◽  
Vol 20 (356) ◽  
Author(s):  
Padmanabhan Krishnan

In this paper we describe a technique to extend a process language such as CCS, which does not model many aspects of distributed computation, to one which does. The idea is to use a concept of location which represents a virtual node. Processes at different locations can evolve independently. Furthermore, communication between the processes at different locations occurs via explicit message passing. We extend CCS with locations and message passing primitives and present its operational semantics. We show that the equivalences induced by the new semantics and its properties are similar to the equivalences in CCS. We also show how the semantics of configuration and routing can be handled.


Author(s):  
John Anderson Gómez Múnera ◽  
Alejandro Giraldo Quintero

The considerable increase in the computational cost of optimal control problems has in many cases exceeded the computing capacity available to handle complex systems in real time. For this reason, this article studies alternatives such as parallel computing, in which the problem is solved by distributing the tasks among several processors in order to accelerate the computation, and analyzes the reduction in total computation time as the number of processors is gradually increased. We explore the use of these methods in a case study of a rolling mill process, making use of the strategy of updating the Phase Finals values to construct the final penalty matrix for the solution of the differential Riccati equation. In addition, the order of the problem studied is increased gradually in order to compare the improvements achieved in higher-dimensional models. Parallel computing alternatives are also studied through multiple processing elements within a single machine or in a cluster via OpenMP, which is an application programming interface (API) that allows the creation of shared memory programs.


Author(s):  
VIPIN CHAUDHARY ◽  
K. KUMARI ◽  
P. ARUNACHALAM ◽  
J.K. AGGARWAL

Octrees offer a powerful means for representing and manipulating 3-D objects. This paper presents an implementation of octree manipulations using a new approach on a shared memory architecture. Octrees are hierarchical data structures used to model 3-D objects. The manipulation of these data structures involves performing independent computations on each node of the octree. Octrees are much easier to deal with than other forms of representation used to model 3-D objects, especially where extensive manipulations are involved. When these operations are distributed among multiple processing elements (PEs) and executed simultaneously, a significant speedup may be achieved. The manipulations described in this paper, such as complement, union and intersection, and other operations such as finding the volume and centroid, are implemented on the Sequent Balance multiprocessor. In this approach the PEs are allocated dynamically, resulting in uniform load balancing among them. The experimental results presented illustrate the feasibility of the approach. Although this evaluation was originally done for shared memory machines, it will provide insight for the evaluation of other architectures.


2019 ◽  
Author(s):  
Stuart Byma ◽  
Akash Dhasade ◽  
Adrian Altenhoff ◽  
Christophe Dessimoz ◽  
James R. Larus

Abstract. This paper presents a new, parallel implementation of clustering and demonstrates its utility in greatly speeding up the process of identifying homologous proteins. Clustering is a technique to reduce the number of comparisons needed to find similar pairs in a set of n elements such as protein sequences. Precise clustering ensures that each pair of similar elements appears together in at least one cluster, so that similarities can be identified by all-to-all comparison in each cluster rather than on the full set. This paper introduces ClusterMerge, a new algorithm for precise clustering that uses transitive relationships among the elements to enable parallel and scalable implementations of this approach. We apply ClusterMerge to the important problem of finding similar amino acid sequences in a collection of proteins. ClusterMerge identifies 99.8% of similar pairs found by a full O(n²) comparison, with only half as many operations. More importantly, ClusterMerge is highly amenable to parallel and distributed computation. Our implementation achieves a speedup of 604× on 768 cores (1400× faster than a comparable single-threaded clustering implementation), a strong scaling efficiency of 90%, and a weak scaling efficiency of nearly 100%.


Author(s):  
Matthias Wenzl ◽  
Peter Roessler ◽  
Andreas Puhm

Abstract. This work presents a proof-of-concept of a new approach to the automatic generation of digital hardware that is able to check application-level properties of an embedded system, such as faulty system behavior, at runtime. The approach makes use of assertion-based verification setups, which today are very common in the area of digital hardware design but focus solely on logic simulation. Thus, a PSL-to-VHDL compiler is introduced that generates VHDL (Very High Speed Integrated Circuit Hardware Description Language) code out of PSL (Property Specification Language) assertions, which can be further processed by a traditional digital logic synthesis tool. That way, runtime checker units can be automatically generated with little effort because of the already existing assertion-based test benches. Furthermore, a model railway demonstrator is presented herein as an example of a safety-critical application to demonstrate the proposed tool flow on a use case. Implementation results based on that use case are discussed. Finally, the paper concludes with a brief outlook on related future work of the authors.


2020 ◽  
Author(s):  
Jason Louis Turner ◽  
Samuel N. Stechmann

Abstract. Parallel computing can offer substantial speedup of numerical simulations in comparison to serial computing, as parallel computing uses many processors simultaneously rather than a single processor. However, it typically also requires substantial time and effort to convert a serial code into a parallel code. Here, a new module is developed to reduce the time and effort required to parallelize a serial code. The tested version of the module is written in the Fortran programming language, while the framework could also be extended to other languages (C++, Python, Julia, etc.). The Message Passing Interface is used to allow for either shared-memory or distributed-memory computer architectures. The software is designed for solving partial differential equations on a rectangular two-dimensional or three-dimensional domain, using finite difference, finite volume, pseudo-spectral, or other similar numerical methods. Examples are provided for two idealized models of atmospheric and oceanic fluid dynamics: the two-level quasi-geostrophic equations, and the stochastic heat equation as a model for turbulent advection–diffusion of either water vapor and clouds or sea surface height variability. In tests of the parallelized code, the strong scaling efficiency for the finite difference code is seen to be roughly 80 % to 90 %, which is achieved by adding only roughly 10 new lines to the serial code. Therefore, EZ Parallel provides great benefits with minimal additional effort.
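For readers unfamiliar with the metric quoted above, the following minimal sketch shows the usual definition of strong scaling efficiency: the speedup on p processors for a fixed problem size, divided by p. The function names and the timings are hypothetical and are not taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Ratio of the one-processor runtime to the p-processor runtime."""
    return t_serial / t_parallel


def strong_scaling_efficiency(t_serial, t_parallel, n_procs):
    """Speedup divided by processor count, for a fixed total problem size."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
timings = {1: 100.0, 2: 55.0, 4: 29.0, 8: 15.0}
for n_procs, t in timings.items():
    eff = strong_scaling_efficiency(timings[1], t, n_procs)
    print(f"{n_procs} procs: speedup {speedup(timings[1], t):.2f}, efficiency {eff:.0%}")
```

An efficiency of 1.0 would mean that doubling the processor count halves the runtime; values of 0.8 to 0.9 indicate that 10 % to 20 % of the ideal speedup is lost, typically to communication and other overhead.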


2021 ◽  
Vol 17 (9) ◽  
pp. e1009037
Author(s):  
Jack B. Maguire ◽  
Daniele Grattarola ◽  
Vikram Khipple Mulligan ◽  
Eugene Klyshko ◽  
Hans Melo

Graph representations are traditionally used to represent protein structures in sequence design protocols in which the protein backbone conformation is known. This infrequently extends to machine learning projects: existing graph convolution algorithms have shortcomings when representing protein environments. One reason for this is the lack of emphasis on edge attributes during message-passing operations. Another reason is the traditionally shallow nature of graph neural network architectures. Here we introduce an improved message-passing operation that is better equipped to model local kinematics problems such as protein design. Our approach, XENet, pays special attention to both incoming and outgoing edge attributes. We compare XENet against existing graph convolutions in an attempt to decrease rotamer sample counts in Rosetta’s rotamer substitution protocol, used for protein side-chain optimization and sequence design. This use case is motivating because it both reduces the size of the search space for classical side-chain optimization algorithms, and allows larger protein design problems to be solved with quantum algorithms on near-term quantum computers with limited qubit counts. XENet outperformed competing models while also displaying a greater tolerance for deeper architectures. We found that XENet was able to decrease rotamer counts by 40% without loss in quality. This decreased the memory consumption for classical pre-computation of rotamer energies in our use case by more than a factor of 3, the qubit consumption for an existing sequence design quantum algorithm by 40%, and the size of the solution space by a factor of 165. Additionally, XENet displayed an ability to handle deeper architectures than competing convolutions.


2021 ◽  
Author(s):  
Zhi Liu ◽  
Qiyue Li ◽  
Xianfu Chen ◽  
Celimuge Wu ◽  
Susumu Ishihara ◽  
...  

Volumetric video (or hologram video), the medium for representing natural content in VR/AR/MR, is presumably the next generation of video technology and a typical use case for 5G and beyond wireless communications. To realize volumetric video applications, efficient volumetric video streaming is in critical demand. This article responds to the challenges of, and proposes solutions for, wireless transmission systems of point cloud video, which is the most popular and favored way to represent volumetric media and significantly differs from the other types of videos. In particular, we first introduce point cloud video technology and its applications, and then discuss the challenges of and solutions to point cloud video streaming, including encoding, tiling, viewing angle prediction, decoding, quality assessment and transmission optimization. Furthermore, we explain a prototype of an MPEG DASH-based point cloud video streaming system as a preliminary study, along with more simulation results to verify its performance. Finally, we identify future research directions for providing high-quality point cloud video streaming.

