Realizing reconfigurable mesh algorithms on softcore arrays

Author(s):  
Heiner Giefers ◽  
Marco Platzner
1998 ◽  
Vol 09 (02) ◽  
pp. 199-211
Author(s):  
SANGUTHEVAR RAJASEKARAN ◽  
THEODORE MCKENDALL

In this paper we demonstrate the power of reconfiguration by presenting efficient randomized algorithms for both packet routing and sorting on a reconfigurable mesh connected computer. The run times of these algorithms are better than the best achievable time bounds on a conventional mesh. Many variations of the reconfigurable mesh can be found in the literature. We define yet another variation, which we call Mr. We also make use of the standard PARBUS model. We show that the permutation routing problem can be solved on a linear array Mr of size n in [Formula: see text] steps, whereas n-1 is the best possible run time without reconfiguration. A trivial lower bound for routing on Mr is [Formula: see text]. On the PARBUS linear array, n is a lower bound and hence any standard n-step routing algorithm is optimal. We also show that permutation routing on an n×n reconfigurable mesh Mr can be done in time n+o(n) using a randomized algorithm or in time 1.25n+o(n) deterministically. In contrast, 2n-2 is the diameter of a conventional mesh and hence routing and sorting need at least 2n-2 steps on a conventional mesh. A lower bound of [Formula: see text] also holds for routing on the 2D mesh Mr. On the other hand, n is a lower bound for routing on the PARBUS, and our algorithms have the same time bounds on the PARBUS. Thus our randomized routing algorithm is optimal up to a lower order term. In addition, we show that the problem of sorting can be solved in randomized time n+o(n) on Mr as well as on the PARBUS. Clearly, this sorting algorithm is optimal on the PARBUS model. The time bounds of our randomized algorithms hold with high probability.
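The 2n-2 figure quoted for the conventional mesh can be checked directly. The following is a minimal sketch, not taken from the paper: it assumes a plain n×n grid with only nearest-neighbour links (no buses) and computes the diameter by breadth-first search. The function name `mesh_diameter` is illustrative.

```python
from collections import deque


def mesh_diameter(n):
    """Longest shortest-path distance in an n x n mesh with nearest-neighbour links only."""

    def eccentricity(src):
        # Breadth-first search from src; returns the distance to the farthest node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (r + dr, c + dc)
                if 0 <= nxt[0] < n and 0 <= nxt[1] < n and nxt not in dist:
                    dist[nxt] = dist[(r, c)] + 1
                    queue.append(nxt)
        return max(dist.values())

    return max(eccentricity((r, c)) for r in range(n) for c in range(n))


# A packet travelling between opposite corners needs at least this many hops.
for n in (2, 3, 5, 8):
    assert mesh_diameter(n) == 2 * n - 2
```

Any routing scheme restricted to these links must spend at least the diameter in steps on some permutation, which is why bounds such as n+o(n) are only reachable once buses can be reconfigured.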


Author(s):  
Yosi Ben-Asher ◽  
Esti Stein ◽  
Vladislav Tartakovsky

Pass transistor logic (PTL) is a circuit design technique wherein transistors are used as switches. The reconfigurable mesh (RM) is a model that exploits the power of PTL's signal switching by enabling flexible bus connections in a grid of processing elements containing switches. RM algorithms have theoretical results proving that [Formula: see text] can speed up computations significantly. However, the RM assumes that the latency of broadcasting a signal through [Formula: see text] switches (bus length) is 1. This is an unrealistic assumption preventing physical realizations of the RM. We propose the restricted-RM (RRM) wherein the bus lengths are restricted to [Formula: see text], [Formula: see text]. We show that counting the number of 1-bits in an input of [Formula: see text] bits can be done in [Formula: see text] steps for [Formula: see text] by an [Formula: see text] RRM. An almost matching lower bound is presented, using a technique which adds to the few existing lower-bound techniques in this area. Finally, the algorithm was directly coded over an FPGA, outperforming an optimal tree of adders. This work presents an alternative way of counting, which is fundamental for summing, and which beats regular Boolean circuits for large numbers. Summing a vast amount of numbers is the basis of accelerators in embedded systems such as neural nets and streaming.
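To make the counting idea concrete, here is a small sketch of the classic unrestricted-RM "staircase" technique, in which a signal steps down one row at every column holding a 1-bit, next to a tree-of-adders baseline. This is not the RRM algorithm from the abstract (whose bounds are elided above); the function names and the software simulation are assumptions for illustration only.

```python
def count_ones_staircase(bits):
    """Trace a signal through one column of switches per input bit.

    A column holding 1 connects its west port on row r to its east port
    on row r + 1; a column holding 0 passes the signal straight across.
    The exit row equals the number of 1-bits.
    Returns (count, number of switches the signal passed through).
    """
    row = 0
    switches = 0
    for b in bits:
        row += 1 if b else 0
        switches += 1
    return row, switches


def count_ones_adder_tree(bits):
    """Baseline: sum neighbours pairwise, level by level.

    Returns (count, number of adder levels).
    """
    level = list(bits)
    depth = 0
    while len(level) > 1:
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
        depth += 1
    return (level[0] if level else 0), depth


bits = [1, 0, 1, 1, 0]
assert count_ones_staircase(bits) == (3, 5)
assert count_ones_adder_tree(bits) == (3, 3)
```

The second value returned by `count_ones_staircase` grows linearly with the input, which is the bus length that the unrestricted model treats as unit latency and that the RRM bounds.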


VLSI Design ◽  
1999 ◽  
Vol 9 (1) ◽  
pp. 55-67
Author(s):  
Hsiu-Niang Chen ◽  
Kuo-Liang Chung

The string matching (SM) problem is to find the occurrences of a pattern within a text. A variable length don't care (VLDC) is a special symbol, not belonging to a finite alphabet Σ but in Σ*. Each VLDC in the pattern can match any substring in the text. Given a run-length coded text of length 2n over Σ and a run-length coded pattern of length 2m over Σ*, this paper first presents an O(1) time parallel SM algorithm for run-length coded strings with VLDCs on a reconfigurable mesh (RM) using O(nm) processors. To account for the hardware limitations of VLSI and to suit modular VLSI implementation, a partitionable parallel algorithm on the RM with limited processors is further presented. For N < n and M < m, the SM for run-length coded strings with VLDCs can be solved in O(X̂Ŷ) time on the RM using O(NM) (= O((nm)/(X̂Ŷ))) processors, where X̂ = [(n – 1)/(N – 1)] and Ŷ = [(m – 1)/(M – 1)].
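As a point of reference for the problem statement, the sketch below shows a plain sequential version: run-length coding of a string, and a greedy test for whether a pattern with VLDCs occurs in a text. It is not the paper's parallel RM algorithm. The `*` character standing in for the VLDC symbol, the function names, and the reading of "length 2n" as n (symbol, count) pairs are all assumptions.

```python
from itertools import groupby

VLDC = "*"  # hypothetical stand-in for the variable length don't care symbol


def run_length_encode(s):
    """Encode a string as (symbol, run length) pairs; n runs give 2n values."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def run_length_decode(runs):
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in runs)


def matches_with_vldc(text, pattern):
    """Return True if the pattern occurs in the text, with each VLDC
    matching any (possibly empty) substring.

    The fixed segments between VLDCs must appear in order without
    overlapping; taking the leftmost occurrence of each is sufficient.
    """
    pos = 0
    for segment in pattern.split(VLDC):
        found = text.find(segment, pos)
        if found < 0:
            return False
        pos = found + len(segment)
    return True


text_runs = run_length_encode("aaabbbcc")
assert text_runs == [("a", 3), ("b", 3), ("c", 2)]
text = run_length_decode(text_runs)
assert matches_with_vldc(text, "ab*cc")
assert not matches_with_vldc(text, "cc*ab")
```

This version decodes before matching for clarity; an algorithm working on the coded form, as the abstract describes, would compare runs directly.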


2004 ◽  
Vol 14 (03n04) ◽  
pp. 337-350
Author(s):  
MARY M. ESHAGHIAN-WILNER ◽  
RUSS MILLER

In this paper, we introduce the Systolic Reconfigurable Mesh (SRM), which combines aspects of the reconfigurable mesh with those of systolic arrays. Every processor controls a local switch that can be reconfigured during every clock cycle in order to control the physical connections between its four bi-directional bus lines. Data is input on one side of the systolic reconfigurable mesh and output from another side, one row/column per unit time. Efficient algorithms are presented for intermediate-level vision tasks, including histogramming, connectivity, convexity, and proximity.
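One way to picture the local switch is as a grouping of the four bus lines into sets that are electrically joined. The sketch below enumerates every such grouping, assuming the switch may fuse any subset of its ports as in the general reconfigurable mesh model; the SRM paper may allow fewer settings. The port labels N, E, S, W and the function name `partitions` are illustrative.

```python
def partitions(items):
    """Yield every way to split a list into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Either the first item forms a group of its own ...
        yield [[first]] + part
        # ... or it joins one of the groups already formed.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


PORTS = ["N", "E", "S", "W"]  # assumed labels for the four bus lines
configs = list(partitions(PORTS))

# Four ports admit 15 groupings, from all isolated to all fused.
assert len(configs) == 15
assert [["N"], ["E"], ["S"], ["W"]] in configs
assert [["N", "E", "S", "W"]] in configs
```

A grouping such as {N, S}, {E, W} lets a vertical and a horizontal bus cross the processor independently, while {N, E, S, W} merges them into one bus.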

