BENCHMARKING LATTICE-BASED APPLICATIONS ON PARALLEL ARCHITECTURES

1996 ◽  
Vol 06 (03) ◽  
pp. 309-320
Author(s):  
GIULIO DESTRI ◽  
PAOLO MARENZONI

The numerical analysis and solution of many physics and engineering problems is based on lattice-oriented algorithms. The Cellular Neural Network (CNN) computational paradigm embodies a wide set of grid problems characterized by locality of information exchanges among lattice points. Performance tests using CNN-based algorithms may therefore provide insight into the performance achievable by a given parallel architecture with respect to a wide class of lattice problems. In this paper a message-passing version of a general CNN-based algorithm is implemented and optimized for three general-purpose parallel architectures: the Connection Machine CM-5, the Cray T3D, and the IBM SP2. Separate measurements of the computation and communication phases of the algorithm allow us to evaluate the processing-node and network-communication performance of the machines. Moreover, the overall performance of the full application is analyzed in order to understand the scalability and the range of applicability of this prototypical lattice problem.
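To make the measured pattern concrete, the following is a minimal sketch of such a message-passing lattice code: a one-dimensional domain decomposition with nearest-neighbour halo exchange, written in C with MPI. The 5-point averaging rule, the grid sizes, and the assumption that the row count divides evenly across ranks are illustrative only; the paper's actual CNN template and per-machine optimizations are not reproduced here.

```c
/* Minimal sketch of a lattice update with nearest-neighbour halo
 * exchange, the message-passing pattern such benchmarks measure.
 * Assumes NX is divisible by the number of ranks.
 * Compile: mpicc -O2 lattice.c -o lattice */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

#define NX 256                 /* global rows, split across ranks */
#define NY 256                 /* columns */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local = NX / size;               /* rows owned by this rank */
    /* two extra rows hold the halos received from the neighbours */
    double (*u)[NY]  = calloc(local + 2, sizeof *u);
    double (*un)[NY] = calloc(local + 2, sizeof *un);

    int up   = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int down = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    for (int step = 0; step < 100; ++step) {
        /* communication phase: exchange boundary rows (the part
         * measured separately from computation) */
        MPI_Sendrecv(u[1],         NY, MPI_DOUBLE, up,   0,
                     u[local + 1], NY, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(u[local],     NY, MPI_DOUBLE, down, 1,
                     u[0],         NY, MPI_DOUBLE, up,   1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* local computation phase: 5-point neighbourhood rule */
        for (int i = 1; i <= local; ++i)
            for (int j = 1; j < NY - 1; ++j)
                un[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                 + u[i][j-1] + u[i][j+1]);
        memcpy(u, un, (local + 2) * sizeof *u);
    }

    free(u); free(un);
    MPI_Finalize();
    return 0;
}
```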

2021 ◽  
pp. 1-63
Author(s):  
Jin Lixing ◽  
Duan Xingguang ◽  
Li Changsheng ◽  
Shi Qingxin ◽  
Wen Hao ◽  
...  

This paper presents a novel parallel architecture with seven active degrees of freedom (DOFs) for general-purpose haptic devices. The prime features of the proposed mechanism are partial decoupling, a large dexterous working area, and fixed actuators. The processes of design, modeling, and optimization are described in detail, and the performance is simulated. A mechanical prototype is then fabricated and tested. Results of the simulations and experiments show that the proposed mechanism performs well in terms of motion flexibility and force feedback. This paper aims to provide a solution for general-purpose haptic devices in teleoperation systems facing uncertain missions in complex applications.


Author(s):  
Pierre Collet

Evolutionary computation is an old field of computer science that started in the 1960s nearly simultaneously in different parts of the world. It is an optimization technique that mimics the principles of Darwinian evolution in order to find good solutions to intractable problems faster than a random search. Artificial evolution is only one among many stochastic optimization methods, but recently developed hardware (General-Purpose Graphics Processing Units, or GPGPUs) gives it a tremendous edge over the other algorithms, because its inherently parallel nature can directly benefit from the difficult-to-use Single Instruction Multiple Data (SIMD) parallel architecture of these cheap yet very powerful cards.
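The SIMD fit comes from fitness evaluation running the same instructions on different data for every individual. The C sketch below shows this shape of work: on a GPGPU, each iteration of the evaluation loop becomes one thread. The sphere objective, population size, and operators are illustrative assumptions, not a specific published setup.

```c
/* Toy generational GA; the evaluation loop is the embarrassingly
 * parallel, SIMD-friendly step. Compile: cc -O2 ga.c -o ga */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define POP 1024
#define DIM 32

static double frand(void) { return 2.0 * rand() / RAND_MAX - 1.0; }

/* fitness: identical code for every individual -> maps to SIMD */
static double sphere(const double *x) {
    double s = 0.0;
    for (int i = 0; i < DIM; ++i) s += x[i] * x[i];
    return s;                               /* to be minimised */
}

int main(void) {
    static double pop[POP][DIM], next[POP][DIM];
    double fit[POP], best = 1e30;

    for (int p = 0; p < POP; ++p)           /* random initial population */
        for (int i = 0; i < DIM; ++i) pop[p][i] = frand();

    for (int gen = 0; gen < 200; ++gen) {
        /* evaluation: on a GPGPU, one thread per individual */
        for (int p = 0; p < POP; ++p) {
            fit[p] = sphere(pop[p]);
            if (fit[p] < best) best = fit[p];
        }
        /* binary-tournament selection plus small uniform mutation */
        for (int p = 0; p < POP; ++p) {
            int a = rand() % POP, b = rand() % POP;
            int w = fit[a] < fit[b] ? a : b;
            for (int i = 0; i < DIM; ++i)
                next[p][i] = pop[w][i] + 0.02 * frand();
        }
        memcpy(pop, next, sizeof pop);
    }
    printf("best fitness found: %g\n", best);
    return 0;
}
```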


Electronics ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. 1342
Author(s):  
Gianvito Urgese ◽  
Francesco Barchi ◽  
Emanuele Parisi ◽  
Evelina Forno ◽  
Andrea Acquaviva ◽  
...  

SpiNNaker is a neuromorphic globally asynchronous locally synchronous (GALS) multi-core architecture designed for simulating spiking neural networks (SNNs) in real time. Several studies have shown that neuromorphic platforms allow flexible and efficient simulation of SNNs by exploiting a communication infrastructure optimised for transmitting small packets across the many cores of the platform. However, the effectiveness of neuromorphic platforms in executing massively parallel general-purpose algorithms, while promising, is still to be explored. In this paper, we present a parallel DNA-sequence-matching algorithm implemented using the MPI programming paradigm and ported to the SpiNNaker platform. In our implementation, all cores available on the board are configured to execute in parallel an optimised version of the Boyer-Moore (BM) algorithm. Using this application, we benchmarked the SpiNNaker platform in terms of scalability and synchronisation latency. Experimental results indicate that the SpiNNaker parallel architecture allows a linear performance increase with the number of cores used and shows better scalability than a general-purpose multi-core computing platform.
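As a rough illustration of the scheme, the C sketch below splits a sequence across MPI ranks with a pattern-length overlap so boundary matches are counted exactly once, and counts matches with the simpler Boyer-Moore-Horspool variant rather than the authors' optimised BM implementation; the toy sequence and pattern are assumptions, and a real run would use a sequence far longer than the rank count.

```c
/* Each rank scans its slice of the sequence with a Boyer-Moore-style
 * search; match counts are reduced to rank 0.
 * Compile: mpicc -O2 bm.c -o bm */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

static long horspool_count(const char *t, long n, const char *p, long m) {
    long skip[256], count = 0;
    for (int c = 0; c < 256; ++c) skip[c] = m;   /* default shift */
    for (long i = 0; i < m - 1; ++i)
        skip[(unsigned char)p[i]] = m - 1 - i;   /* bad-character table */
    for (long i = 0; i + m <= n; i += skip[(unsigned char)t[i + m - 1]])
        if (memcmp(t + i, p, m) == 0) ++count;
    return count;
}

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const char *text = "ACGTACGTGACGGATTACAGATTACAACGT";   /* toy genome */
    const char *pat  = "GATTACA";
    long n = strlen(text), m = strlen(pat);

    /* slices overlap by m-1 chars so a match straddling a boundary is
     * counted exactly once, by the rank owning its start position */
    long chunk = n / size;
    long lo = rank * chunk;
    long hi = (rank == size - 1) ? n : lo + chunk + m - 1;
    if (hi > n) hi = n;

    long local = horspool_count(text + lo, hi - lo, pat, m), total = 0;
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("matches: %ld\n", total);   /* prints 2 */
    MPI_Finalize();
    return 0;
}
```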


2019 ◽  
Vol 19 (04) ◽  
pp. 1950023
Author(s):  
Ahmed S. Mashaly

Image segmentation is one of the most challenging research fields in image analysis and interpretation, and it serves as the primary step of various computer vision systems; the choice of a reliable and accurate segmentation method is therefore a non-trivial task. Since the selected segmentation method influences the overall performance of the remaining system steps, sky segmentation is a vital step for Unmanned Aerial Vehicle (UAV) autonomous obstacle-avoidance missions. In this paper, we present a comprehensive literature survey of the different types of image segmentation methodology, followed by a detailed illustration of general-purpose methods and state-of-the-art sky segmentation approaches. In addition, we introduce an improved version of our previously published sky segmentation method. For performance assessment, we test the proposed approach under different conditions and compare it with commonly used segmentation approaches on several datasets, in terms of several assessment indexes. The experimental results show that the proposed method gives promising results compared with the other image segmentation approaches.


2000 ◽  
Vol 12 (5) ◽  
pp. 521-526
Author(s):  
Masanori Hariyama ◽  
Michitaka Kameyama

This article presents a stereo-matching algorithm that establishes reliable correspondence between images by selecting a suitable window size for SAD (Sum of Absolute Differences) computation. In SAD computation, the parallelism between pixels in a window changes with the window size, while the parallelism between windows is fixed by the input-image size. Based on this observation, a window-parallel, pixel-serial architecture is proposed that achieves 100% utilization of the processing elements. The performance of the VLSI processor is evaluated to be more than 10,000 times higher than that of a general-purpose processor.
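A minimal C sketch of the SAD computation being accelerated: for a window centred on a left-image pixel, the right image is scanned along the same row over a disparity range, and the disparity with the smallest SAD wins. The window size, disparity range, and synthetic test pair are illustrative assumptions; the VLSI design evaluates windows in parallel and the pixels of each window serially, which the nested loops below only emulate sequentially.

```c
/* Block-matching disparity by SAD. Compile: cc -O2 sad.c -o sad */
#include <stdio.h>
#include <stdlib.h>

#define WIDTH  64
#define HEIGHT 64
#define W      5      /* window is W x W pixels */
#define DMAX   16     /* disparity search range */

/* best disparity for the window centred at (x, y); the caller must
 * keep the window inside the image vertically and horizontally */
static int sad_disparity(const unsigned char *L, const unsigned char *R,
                         int width, int height, int x, int y) {
    int h = W / 2, best_d = 0;
    long best = -1;
    (void)height;                      /* bounds are the caller's job */
    for (int d = 0; d <= DMAX && x - d - h >= 0; ++d) {
        long sad = 0;
        for (int dy = -h; dy <= h; ++dy)        /* pixel-serial scan */
            for (int dx = -h; dx <= h; ++dx)
                sad += labs((long)L[(y+dy)*width + (x+dx)]
                          - (long)R[(y+dy)*width + (x+dx-d)]);
        if (best < 0 || sad < best) { best = sad; best_d = d; }
    }
    return best_d;
}

int main(void) {
    static unsigned char L[HEIGHT * WIDTH], R[HEIGHT * WIDTH];
    /* synthetic pair: the right view is the left view shifted by a
     * true disparity of 4 pixels */
    for (int y = 0; y < HEIGHT; ++y)
        for (int x = 0; x < WIDTH; ++x)
            L[y * WIDTH + x] = (unsigned char)((x * 7 + y * 3) % 251);
    for (int y = 0; y < HEIGHT; ++y)
        for (int x = 0; x < WIDTH; ++x)
            R[y * WIDTH + x] = L[y * WIDTH + ((x + 4) % WIDTH)];
    printf("estimated disparity at (32,32): %d (true 4)\n",
           sad_disparity(L, R, WIDTH, HEIGHT, 32, 32));
    return 0;
}
```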


2004 ◽  
Vol 10 (9) ◽  
pp. 1335-1357
Author(s):  
Takashi Nagata

This paper presents a general and efficient formulation applicable to a vast variety of rigid and flexible multibody systems. It is based on a variable-gain error correction with scaling and adaptive control of the convergence parameter. The methodology has the following distinctive features. (i) All types of holonomic and non-holonomic equality constraints, as well as a class of inequalities, can be treated in a plain and unified manner. (ii) Stability of the constraints is assured. (iii) The formulation has an order-N computational cost in terms of both the constrained and unconstrained degrees of freedom, regardless of the system topology. (iv) Unlike the traditional recursive order-N algorithms, it is quite amenable to parallel computation. (v) Because virtually no matrix operations are involved, it can be implemented in very simple general-purpose simulation programs. Given these advantages, the algorithm has been realized as a C++ code supporting distributed processing through the Message-Passing Interface (MPI). The versatility, dynamical validity, and efficiency of the approach are demonstrated through numerical studies of several particular systems, including a crawler and a flexible space structure.
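The core idea of gain-based constraint error correction can be shown on a one-constraint toy system. The C sketch below (a point mass on a rigid rod, a fixed gain rather than the paper's scaled and adaptive gain, plus a tangent projection of the velocity) drives the constraint violation toward zero along the constraint gradient after each unconstrained step; note that only scalar operations appear, in line with feature (v). All values and the correction details are illustrative assumptions, not the paper's formulation.

```c
/* Point mass on a rigid rod of length LEN under gravity; the
 * constraint C = x^2 + y^2 - LEN^2 is corrected with gain GAMMA
 * after each Euler step. Compile: cc -O2 rod.c -o rod -lm */
#include <stdio.h>
#include <math.h>

#define LEN   1.0
#define DT    1e-3
#define GAMMA 0.5            /* correction gain (illustrative value) */

int main(void) {
    double x = LEN, y = 0.0, vx = 0.0, vy = 0.0;
    for (int step = 0; step < 10000; ++step) {   /* 10 s of motion */
        vy -= 9.81 * DT;                 /* gravity, unconstrained */
        x  += vx * DT;
        y  += vy * DT;

        /* constraint violation and its gradient (Jacobian row) */
        double C  = x * x + y * y - LEN * LEN;
        double jx = 2.0 * x, jy = 2.0 * y;
        double n2 = jx * jx + jy * jy;

        double g = GAMMA * C / n2;       /* position error correction */
        x -= g * jx;
        y -= g * jy;

        double vr = (vx * jx + vy * jy) / n2;  /* radial velocity */
        vx -= vr * jx;                   /* keep the velocity tangent */
        vy -= vr * jy;                   /* to the constraint surface */
    }
    printf("rod length after 10 s: %.6f (should stay near %.1f)\n",
           sqrt(x * x + y * y), LEN);
    return 0;
}
```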


1993 ◽  
Vol 04 (01) ◽  
pp. 5-16
Author(s):  
ALBERTO BROGGI ◽  
VINCENZO D'ANDREA ◽  
GIULIO DESTRI

In this paper we discuss the use of the Cellular Automata (CA) computational model in computer vision applications on massively parallel architectures. The motivations and guidelines of this approach to low-level vision within the framework of the PROMETHEUS project are discussed. The hard real-time requirements of the actual application can only be satisfied using an ad hoc massively parallel VLSI architecture (PAPRICA). The hardware solutions and the specific algorithms can be verified and tested efficiently only by using, as a simulator, a general-purpose machine with a parent architecture (CM-2). An example of an application related to feature extraction is discussed.
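As a minimal illustration of the CA model applied to low-level vision, the C sketch below applies one synchronous update step of a local rule (a binary erosion over the von Neumann neighbourhood) to an image grid. The rule and grid size are assumptions, and the serial loops merely emulate the all-cells-in-parallel update that a machine like PAPRICA or the CM-2 performs in hardware.

```c
/* One synchronous CA step on a binary image: every cell applies the
 * same local rule (survive only if the cell and its four von Neumann
 * neighbours are all set). Compile: cc -O2 ca.c -o ca */
#include <stdio.h>
#include <string.h>

#define N 16

static void ca_erode_step(unsigned char img[N][N]) {
    static unsigned char out[N][N];      /* next generation */
    memset(out, 0, sizeof out);
    for (int i = 1; i < N - 1; ++i)
        for (int j = 1; j < N - 1; ++j)
            out[i][j] = img[i][j] & img[i-1][j] & img[i+1][j]
                      & img[i][j-1] & img[i][j+1];
    memcpy(img, out, sizeof out);        /* synchronous update */
}

int main(void) {
    static unsigned char img[N][N];
    for (int i = 4; i < 12; ++i)         /* seed an 8x8 solid square */
        for (int j = 4; j < 12; ++j)
            img[i][j] = 1;

    ca_erode_step(img);                  /* square shrinks to 6x6 */
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j)
            putchar(img[i][j] ? '#' : '.');
        putchar('\n');
    }
    return 0;
}
```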

