High-Performance Parallel Implementation of Genetic Algorithm on FPGA

2019 ◽  
Vol 38 (9) ◽  
pp. 4014-4039 ◽  
Author(s):  
Matheus F. Torquato ◽  
Marcelo A. C. Fernandes
Author(s):  
Breno A. de Melo Menezes ◽  
Nina Herrmann ◽  
Herbert Kuchen ◽  
Fernando Buarque de Lima Neto

Parallel implementations of swarm intelligence algorithms such as ant colony optimization (ACO) have been widely used to shorten execution time when solving complex optimization problems. When targeting a GPU environment, developing efficient parallel versions of such algorithms using CUDA can be a difficult and error-prone task even for experienced programmers. To overcome this issue, the parallel programming model of algorithmic skeletons simplifies parallel programs by abstracting from low-level features. This is realized by defining common programming patterns (e.g., map, fold, and zip) that are later converted to efficient parallel code. In this paper, we show how algorithmic skeletons formulated in the domain-specific language Musket can cope with the development of a parallel implementation of ACO and how that compares to a low-level implementation. Our experimental results show that Musket suits the development of ACO. Besides making it easier for the programmer to deal with the parallelization aspects, Musket generates high-performance code with execution times similar to those of low-level implementations.
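The map, fold, and zip patterns named in this abstract can be illustrated with a small sequential sketch. The code below is plain Python, not Musket, and it is not taken from the paper: the toy tours, the function names, and the parameters `q` and `rho` are all assumptions chosen for illustration. It expresses a textbook ACO pheromone update (evaporation followed by a deposit of Q/L per tour edge) as a composition of the three patterns.

```python
from functools import reduce

# Hypothetical toy data: the tour length found by each of three ants,
# and the edges each ant used.
tour_lengths = [10.0, 20.0, 40.0]
tours = [[(0, 1), (1, 2)], [(0, 2), (1, 2)], [(0, 1), (0, 2)]]


def deposit(length, q=1.0):
    """Pheromone an ant deposits on each edge of its tour (Q / L)."""
    return q / length


def evaporate(pheromone, rho=0.5):
    """Scale every edge by (1 - rho); returns a new dict."""
    return {edge: (1.0 - rho) * tau for edge, tau in pheromone.items()}


def add_deposits(pheromone, pair):
    """Fold step: add one ant's deposit to every edge of its tour."""
    tour, amount = pair
    updated = dict(pheromone)
    for edge in tour:
        updated[edge] += amount
    return updated


def update_pheromone(pheromone, tours, tour_lengths, rho=0.5):
    deposits = map(deposit, tour_lengths)      # map: one deposit per ant
    pairs = zip(tours, deposits)               # zip: pair tours with deposits
    return reduce(add_deposits, pairs, evaporate(pheromone, rho))  # fold


pheromone = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}
new_pheromone = update_pheromone(pheromone, tours, tour_lengths)
```

In a skeleton framework, each of these three calls would be a candidate for parallel code generation; here they simply run sequentially.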


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 627
Author(s):  
David Marquez-Viloria ◽  
Luis Castano-Londono ◽  
Neil Guerrero-Gonzalez

A methodology for scalable and concurrent real-time implementation of highly recurrent algorithms is presented and experimentally validated using the AWS-FPGA. This paper presents a parallel implementation of a KNN algorithm focused on m-QAM demodulators, using high-level synthesis for fast prototyping, parameterization, and scalability of the design. The proposed design shows the successful implementation of the KNN algorithm for interchannel interference mitigation in a 3 × 16 Gbaud 16-QAM Nyquist WDM system. Additionally, we present a modified version of the KNN algorithm in which comparisons among data symbols are reduced by identifying the closest neighbor using the rule of 8-connected clusters used in image processing. The real-time implementation of the modified KNN on a Xilinx Virtex UltraScale+ VU9P AWS-FPGA board was compared with the results obtained in previous work using the same data from the same experimental setup but with offline DSP in Matlab. The results show that the difference is negligible below the FEC limit. Additionally, the modified KNN reduces the number of operations by 43 to 75 percent, depending on the symbol’s position in the constellation, achieving a 47.25% reduction in total computational time for 100 K input symbols processed on 20 parallel cores compared to the KNN algorithm.
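One way to read the 8-connected rule is sketched below in Python. This is an assumption about how the candidate set might be restricted, not the authors' implementation: the received symbol is first snapped to the ideal 4 × 4 16-QAM grid, and distances are then computed only to the cluster centre of that cell and its 8-connected neighbours. For brevity the sketch uses a single nearest-centroid decision in place of a full KNN vote, and every name in it is hypothetical.

```python
LEVELS = [-3.0, -1.0, 1.0, 3.0]  # ideal 16-QAM amplitude levels per axis


def nearest_level_index(x):
    """Index of the ideal amplitude level closest to x."""
    return min(range(4), key=lambda i: abs(x - LEVELS[i]))


def candidate_cells(row, col):
    """The cell itself plus its 8-connected neighbours, clipped to the grid."""
    return [(r, c)
            for r in range(max(0, row - 1), min(3, row + 1) + 1)
            for c in range(max(0, col - 1), min(3, col + 1) + 1)]


def classify(symbol, centroids):
    """Return (best cell, number of distance comparisons performed).

    centroids maps (row, col) to a complex cluster centre, which in
    practice would be learned from training symbols.
    """
    row = nearest_level_index(symbol.imag)
    col = nearest_level_index(symbol.real)
    cells = candidate_cells(row, col)
    best = min(cells, key=lambda cell: abs(symbol - centroids[cell]))
    return best, len(cells)


# Assumed centroids: the ideal constellation points.
centroids = {(r, c): complex(LEVELS[c], LEVELS[r])
             for r in range(4) for c in range(4)}

corner = classify(complex(2.9, 3.1), centroids)
inner = classify(complex(0.8, -1.2), centroids)
```

Under this reading, a corner symbol is compared against 4 of the 16 centres (75% fewer comparisons) and an inner symbol against 9 of 16 (43.75% fewer), which is consistent with the 43 to 75 percent range quoted in the abstract, although the paper's exact rule may differ.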


1997 ◽  
Vol 6 (1) ◽  
pp. 127-152
Author(s):  
Eric De Sturler ◽  
Volker Strumpen

Recently, the first commercial High Performance Fortran (HPF) subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very High Performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77) programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific to the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and present timing results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver.
Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler needs substantial support from the programmer.


2012 ◽  
Vol 479-481 ◽  
pp. 65-70
Author(s):  
Xiao Hui Zhang ◽  
Liu Qing ◽  
Mu Li

Building on alignment-template target detection, this paper designs a lane alignment template using a correlation matching method and combines it with a genetic algorithm for stochastic template matching and optimization to realize lane detection. To meet the real-time requirements of genetic-algorithm-based lane detection, the paper uses the high-performance multi-core DSP chip TMS320C6474 as the core and combines it with RapidIO high-speed data transmission technology to realize hardware parallel processing of the lane detection algorithm. Over the RapidIO bus, the data transmission speed between DSPs can reach 3.125 Gbps, which gives essentially delay-free transfer and thereby solves the problem of moving large amounts of data between processors at high speed. The experimental results show that both the computed lane lines and the running time on the parallel C6474 platform are better than those obtained with a single DSP or a PC. In addition, the road detection is accurate, reliable, and robust.


2011 ◽  
Vol 328-330 ◽  
pp. 1881-1886
Author(s):  
Cen Zeng ◽  
Qiang Zhang ◽  
Xiao Peng Wei

Genetic algorithms (GAs), a class of global, probabilistic, high-performance optimization algorithms, have received broad attention from researchers worldwide, and many results have been achieved. This paper presents an algorithm that uses a GA to plan a path through a given search space, achieving full-area coverage while avoiding obstacles automatically. Specific genetic operators (such as selection, crossover, and mutation) are introduced, and the handling of exceptional situations is described in detail. An active genetic algorithm is then introduced that overcomes the drawbacks of the earlier version of full-area coverage path-planning algorithms. The paper compares the genetic algorithm with several well-known algorithms; our path-planning genetic algorithm yields the best performance in flexibility and coverage, and it meets the needs of polygonal obstacles. For full-area coverage path planning, a genotype is used that is able to address more complicated search spaces.
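The three operators named in this abstract can be sketched generically in Python. The paper's genotype, fitness function, and exception handling are not described here, so everything below is an assumption: a chromosome is a visiting order over the free cells of a small grid, fitness is the total Manhattan length of that order (which ignores whether a step would cross the obstacle), and the operators are the common tournament selection, order crossover, and swap mutation.

```python
import random


def path_length(order, cells):
    """Total Manhattan length of visiting the cells in the given order."""
    return sum(abs(cells[a][0] - cells[b][0]) + abs(cells[a][1] - cells[b][1])
               for a, b in zip(order, order[1:]))


def tournament(population, cells, rng, k=3):
    """Selection: the shortest of k randomly chosen individuals."""
    return min(rng.sample(population, k), key=lambda ind: path_length(ind, cells))


def order_crossover(p1, p2, rng):
    """Crossover: keep a slice of p1 in place, fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    middle = p1[i:j]
    kept = set(middle)
    rest = [gene for gene in p2 if gene not in kept]
    return rest[:i] + middle + rest[i:]


def swap_mutation(ind, rng, rate=0.2):
    """Mutation: with probability `rate`, swap two positions."""
    ind = list(ind)
    if rng.random() < rate:
        a, b = rng.sample(range(len(ind)), 2)
        ind[a], ind[b] = ind[b], ind[a]
    return ind


def evolve(cells, pop_size=30, generations=50, seed=0):
    """Return the best order found and the best length per generation."""
    rng = random.Random(seed)
    base = list(range(len(cells)))
    population = [rng.sample(base, len(base)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(population, key=lambda ind: path_length(ind, cells))
        history.append(path_length(best, cells))
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = order_crossover(tournament(population, cells, rng),
                                    tournament(population, cells, rng), rng)
            children.append(swap_mutation(child, rng))
        population = children
    best = min(population, key=lambda ind: path_length(ind, cells))
    return best, history + [path_length(best, cells)]


# Hypothetical 3 x 3 grid with one obstacle in the centre.
cells = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]
best_order, history = evolve(cells)
```

Because the chromosome is a permutation of all free cells, every individual covers the full area by construction; the GA only searches for a short visiting order.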


2013 ◽  
Vol 2013 ◽  
pp. 1-5 ◽  
Author(s):  
Stelios A. Mitilineos ◽  
Symeon K. Symeonidis ◽  
Ioannis B. Mpatsis ◽  
Dimitrios Iliopoulos ◽  
Georgios S. Kliros ◽  
...  

Conformal antennas and antenna arrays (arrays) have become necessary for vehicular communications where a high degree of aerodynamic drag reduction is needed, as in avionics and ships. However, the necessity to conform to a predefined shape (e.g., of an aircraft’s nose) directly affects antenna performance, since it imposes strict constraints on the antenna array’s shape, element spacing, relative signal phase, and so forth. It is therefore necessary to investigate counterintuitive and arbitrary antenna shapes in order to compensate for these constraints. Since no theoretical framework exists for designing and developing arbitrary-shape antennas in a straightforward manner, we have developed a platform that combines a genetic algorithm-based design and optimization suite with an electromagnetic simulator for designing patch antennas whose shape is not known a priori (the genetic algorithm optimizes the shape of the patch antenna). The proposed platform is further enhanced by the ability to design and optimize antenna arrays and is intended to be used for the design of a series of antennas, including conformal antennas for shipping applications. The flexibility and performance of the proposed platform are demonstrated herein via the design of a high-performance GPS patch antenna.

