Efficient Algorithms for Some Common Applications on GHCC

2005 ◽  
Vol 06 (04) ◽  
pp. 417-433
Author(s):  
Srabani Mukhopadhyaya ◽  
Bhabani P. Sinha

Generalized Hypercube-Connected-Cycles (GHCC) is a challenging interconnection network proposed earlier in the literature. In this paper, we discuss how some important and useful algorithms, such as matrix transpose, matrix multiplication and sorting, can be efficiently implemented on GHCC. Matrix transpose and matrix-by-matrix multiplication of matrices of order n × n, [Formula: see text], take O(l) and [Formula: see text] time, respectively, on GHCC(l,m) with lm^l processors. Using the same number of processors, a list of m^l numbers can be sorted in O(l^2 log^3 m) time.
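For a sense of how these bounds scale, here is a minimal Python sketch (with hypothetical parameters l and m) that evaluates the processor count l·m^l and the stated O(l^2 log^3 m) sorting bound; it does not implement the GHCC algorithms themselves.

```python
import math

def ghcc_scaling(l: int, m: int) -> None:
    """Illustrate how the stated GHCC(l, m) bounds scale (not an implementation)."""
    processors = l * m ** l                        # number of processors in GHCC(l, m)
    sort_bound = l ** 2 * math.log2(m) ** 3        # O(l^2 log^3 m) sorting bound, up to constants
    print(f"GHCC({l},{m}): {processors} processors, "
          f"sorts {m ** l} keys in ~{sort_bound:.0f} parallel steps (up to constants)")

# Hypothetical example parameters
ghcc_scaling(3, 4)
ghcc_scaling(4, 8)
```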

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 347
Author(s):  
Anne Berry ◽  
Geneviève Simonet

The atom graph of a graph is a graph whose vertices are the atoms obtained by clique minimal separator decomposition of this graph, and whose edges are the edges of all possible atom trees of this graph. We provide two efficient algorithms for computing this atom graph, with a time complexity in O(min(n^ω log n, nm, n(n+m¯))), where n is the number of vertices of G, m is the number of its edges, m¯ is the number of edges of the complement of G, and ω, also denoted by α in the literature, is a real number such that O(n^ω) is the best known time complexity for matrix multiplication, whose current value is 2.3728596. This time complexity is no more than the time complexity of computing the atoms in the general case. We extend our results to α-acyclic hypergraphs, which are hypergraphs having at least one join tree, a join tree of a hypergraph being defined by its hyperedges in the same way as an atom tree of a graph is defined by its atoms. We introduce the notion of the union join graph, which is the union of all possible join trees; we apply our algorithms for atom graphs to efficiently compute union join graphs.
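To illustrate why taking the minimum of the three terms matters, the following sketch (with hypothetical values of n and m) evaluates n^ω log n, nm and n(n+m¯) and reports which is smallest; it only compares the bounds and does not compute atom graphs.

```python
import math

OMEGA = 2.3728596  # best known matrix-multiplication exponent cited above

def compare_bounds(n: int, m: int) -> None:
    """Compare the three terms inside O(min(n^w log n, nm, n(n + m_bar)))."""
    m_bar = n * (n - 1) // 2 - m     # number of edges of the complement graph
    terms = {
        "n^w log n": n ** OMEGA * math.log2(n),
        "n m": n * m,
        "n (n + m_bar)": n * (n + m_bar),
    }
    best = min(terms, key=terms.get)
    print(f"n={n}, m={m}: smallest term is {best}")

compare_bounds(10_000, 30_000)        # sparse graph: the nm term wins
compare_bounds(10_000, 49_000_000)    # dense graph: the n(n + m_bar) term wins
```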


2003 ◽  
Vol 14 (01) ◽  
pp. 59-78
Author(s):  
MARTIN SCHMOLLINGER ◽  
MICHAEL KAUFMANN

Clusters of symmetric multiprocessor nodes (SMP clusters) are one of the most important parallel architectures at the moment. The architecture consists of shared-memory nodes with multiple processors and a fast interconnection network between the nodes. New programming models try to exploit this architecture by using threads within the nodes and message-passing libraries for inter-node communication. In order to develop efficient algorithms, it is necessary to consider the hybrid nature of the architecture and of the programming models. We present the κNUMA-model and a methodology that together form a good basis for designing efficient algorithms for SMP clusters. The κNUMA-model is a computational model that extends the bulk-synchronous parallel (BSP) model with the characteristics of SMP clusters and new hybrid programming models. The κNUMA-methodology suggests developing efficient overall algorithms by developing efficient algorithms for each level of the hierarchy. We use the problems of personalized one-to-all broadcast and dense matrix-vector multiplication for the presentation. The theoretical results of the analysis of the dense matrix-vector multiplication are verified in practice; we show results of experiments performed on a Linux cluster of dual Pentium-III nodes.
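As a toy illustration of the two-level design idea, and not the paper's κNUMA algorithms, the sketch below splits a dense matrix-vector product into block rows per "node" and then into threads inside each node; a real SMP-cluster implementation would use a message-passing library between nodes and threads within them.

```python
from concurrent.futures import ThreadPoolExecutor

def matvec_block(A_rows, x):
    """Sequential matrix-vector product on one block of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A_rows]

def hybrid_matvec(A, x, nodes=2, threads_per_node=2):
    """Two-level decomposition: block rows per node, then per thread inside a node."""
    n = len(A)
    node_chunk = (n + nodes - 1) // nodes
    result = []
    for node in range(nodes):                      # in a real cluster: one MPI rank per node
        block = A[node * node_chunk:(node + 1) * node_chunk]
        thread_chunk = (len(block) + threads_per_node - 1) // threads_per_node
        with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
            parts = pool.map(matvec_block,
                             [block[i * thread_chunk:(i + 1) * thread_chunk]
                              for i in range(threads_per_node)],
                             [x] * threads_per_node)
        for part in parts:
            result.extend(part)
    return result

A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
print(hybrid_matvec(A, x))   # [3, 7, 11, 15]
```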


2021 ◽  
Vol 3 (1) ◽  
pp. 12-23
Author(s):  
Agnes Ona Bliti Puka

The purpose of this study is to determine the ability of students of class XI Culture at SMAK St. Francis of Assisi Larantuka to understand mathematical concepts. The data were collected from the results of a test of students’ ability to understand mathematical concepts and from unstructured interviews. The test used to measure this ability consists of two essay questions on matrix transpose and matrix multiplication. The test results were analysed against three indicators of mathematical understanding: 1) the ability to explain a definition in one’s own words according to its essential traits/characteristics, 2) the ability to construct examples of mathematical concepts, and 3) the ability to use concepts in solving problems. The population in this study is the class XI Culture students of SMAK St. Francis of Assisi Larantuka, from which a sample of two students was drawn. The results show differences between students in their ability to understand mathematical concepts through Problem Based Learning.

Keywords: Mathematical concept understanding, Problem Based Learning, matrix.


Author(s):  
E. A. Ashcroft ◽  
A. A. Faustini ◽  
R. Jaggannathan ◽  
W. W. Wadge

In Chapter 1, we saw how Lucid could be used to express solutions to standard problems such as sorting and matrix multiplication. One of the unique characteristics of Lucid is that it can be used not only as a programming language but also as a “composition” language. That is, instead of using Lucid to specify computations, it can be used to express how computation components (expressed in some other language) can be “glued” together to form a coherent application. By doing so, the resulting application can enjoy some of the practical benefits attributable to Lucid, such as high performance through exploitation of implicit parallelism and robustness through software fault tolerance. In this chapter, we discuss one such use of Lucid: as part of a hybrid language to construct parallel applications to be executed on conventional parallel computers.

A conventional parallel computer consists either of a number of processors, each with local memory, interconnected by a network (as in distributed-memory architectures), or of a number of processors that share memory, possibly through an interconnection network (as in shared-memory architectures). The past decade has seen the advent of conventional parallel computers, starting with the Denelcor HEP, evolving to the CM-2 and Intel Hypercube, and further evolving to the CM-5, Intel Paragon, Cray T3D, and IBM SP-2. Even networks of workstations (or workstation clusters) are seen as low-cost (“poor man’s”) parallel computers.

Programming of conventional parallel computers has proven to be far more challenging than had been expected. Part of the reason is the continued use of low-level, explicitly parallel programming models such as PVM [42] and Linda [10]. Two factors have fueled the continuing use of such languages despite their limited success:

1. The need to reuse existing sequential code, because the cost of rewriting legacy applications from scratch is considered prohibitive in both economic and technical terms.

2. The need to run on conventional parallel computers that view a “parallel program” at a low level, as consisting of sequential processes that frequently synchronize and communicate with each other using some form of message passing.


2011 ◽  
Vol 22 (05) ◽  
pp. 1001-1018 ◽  
Author(s):  
YAMIN LI ◽  
SHIETUNG PENG ◽  
WANMING CHU

The recursive dual-net is a newly proposed interconnection network for massively parallel computers. The recursive dual-net is based on the recursive dual-construction of a symmetric base network. A k-level dual-construction for k > 0 creates a network containing (2n0)^(2^k)/2 nodes with node-degree d0 + k, where n0 and d0 are the number of nodes and the node-degree of the base network, respectively. The recursive dual-net is node- and edge-symmetric and can contain a huge number of nodes with small node-degree and short diameter. Disjoint-paths routing and fault-tolerant routing are fundamental and critical issues for the performance of an interconnection network. In this paper, we propose efficient algorithms for disjoint-paths routing and fault-tolerant routing on the recursive dual-net.
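A minimal sketch, assuming the node-count formula quoted above, that tabulates the size and degree of the recursive dual-net for a hypothetical base network (here a 3-dimensional hypercube with n0 = 8 nodes and degree d0 = 3):

```python
def rdn_size(n0: int, d0: int, k: int) -> tuple[int, int]:
    """Nodes and node-degree of a k-level recursive dual-net over a base network
    with n0 nodes and degree d0, using the formula (2*n0)^(2^k)/2 and d0 + k."""
    nodes = (2 * n0) ** (2 ** k) // 2
    degree = d0 + k
    return nodes, degree

# Hypothetical base network: the 3-cube (n0 = 8 nodes, degree d0 = 3)
for k in range(3):
    nodes, degree = rdn_size(8, 3, k)
    print(f"k={k}: {nodes} nodes, degree {degree}")
```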


1995 ◽  
Vol 05 (01) ◽  
pp. 37-48 ◽  
Author(s):  
ARNOLD L. ROSENBERG ◽  
VITTORIO SCARANO ◽  
RAMESH K. SITARAMAN

We propose a design for, and investigate the computational power of, a dynamically reconfigurable parallel computer that we call the Reconfigurable Ring of Processors ([Formula: see text], for short). The [Formula: see text] is a ring of identical processing elements (PEs) that are interconnected via a flexible multi-line reconfigurable bus, each of whose lines has one-packet width and can be configured, independently of the other lines, to establish an arbitrary PE-to-PE connection. A novel aspect of our design is a communication protocol we call COMET (for Cooperative MEssage Transmission), which allows the PEs of an [Formula: see text] to exchange one-packet messages with latency that is logarithmic in the number of PEs the message passes over in transit. The main contribution of this paper is an algorithm that allows an N-PE, N-line [Formula: see text] to simulate an N-PE hypercube executing a normal algorithm, with slowdown less than 4 log log N, provided that the local state of a hypercube PE can be encoded and transmitted using a single packet. This simulation provides a rich class of efficient algorithms for the [Formula: see text], including algorithms for matrix multiplication, sorting, and the Fast Fourier Transform (often using fewer than N bus lines). The resulting algorithms for the [Formula: see text] are often within a small constant factor of optimal.
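To put the simulation overhead in perspective, this small calculation (assuming base-2 logarithms) evaluates the 4 log log N slowdown bound for a few machine sizes, showing how slowly it grows as N increases:

```python
import math

for exp in (10, 16, 20):
    N = 2 ** exp                             # number of PEs in the simulated hypercube
    slowdown = 4 * math.log2(math.log2(N))   # the 4 log log N bound from the paper
    print(f"N=2^{exp}: 4 log log N = {slowdown:.1f}  (log N = {exp})")
```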


2018 ◽  
Vol 12 ◽  
pp. 25-41
Author(s):  
Matthew C. FONTAINE

Among the most interesting problems in competitive programming are those involving maximum flows. However, efficient algorithms for solving these problems are often difficult for students to understand at an intuitive level. One reason for this difficulty may be a lack of suitable metaphors relating these algorithms to concepts that the students already understand. This paper introduces a novel maximum-flow algorithm, Tidal Flow, that is designed to be intuitive to undergraduate and pre-university computer science students.
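The Tidal Flow algorithm itself is not reproduced here; for context, the sketch below is a minimal implementation of the classical BFS-based augmenting-path (Edmonds–Karp) maximum-flow method, the kind of standard algorithm a teaching-oriented alternative would be contrasted with.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds–Karp: repeatedly augment along a shortest residual path.
    `capacity` is a dict of dicts: capacity[u][v] = capacity of edge (u, v)."""
    # Residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Tiny example network
capacity = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
print(max_flow(capacity, "s", "t"))  # 4
```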

