CYBERPLUS, a High Performance Parallel Processing System

1985 ◽  
pp. 24-29 ◽  
Author(s):  
Wayne A. Ray


2021 ◽  
Vol 17 (11) ◽  
pp. 155014772110331
Author(s):  
Jung-hyun Seo ◽  
HyeongOk Lee

One method to create a high-performance computer is to use parallel processing to connect multiple computers. The structure of a parallel processing system is represented as an interconnection network. Traditionally, the communication links that connect the nodes in the interconnection network use electricity. With the advent of optical communication, however, optical transpose interconnection system (OTIS) networks have emerged, which combine the advantages of electronic communication and optical communication. OTIS networks use electronic communication for relatively short distances and optical communication for long distances. Regardless of whether the interconnection network uses electronic or optical communication, network cost is an important factor among the various measures used to evaluate networks. In this article, we first propose a novel OTIS-Petersen-star network with a small network cost and analyze its basic topological properties. The OTIS-Petersen-star network is an undirected graph whose factor graph is the Petersen-star network. OTIS-PSN_n has 10^(2n) nodes, degree n + 3, and diameter 6n − 1. Second, we compare the network cost of the OTIS-Petersen-star network with that of other OTIS networks. Finally, we propose a routing algorithm with time complexity 6n − 1 and a one-to-all broadcasting algorithm with time complexity 2n − 1.
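As an illustration of the OTIS construction this abstract builds on, the sketch below assembles an OTIS network in Python using the standard rule: each group is a copy of the factor graph (electronic links), and node ⟨g, p⟩ is joined to node ⟨p, g⟩ by a transpose (optical) link. For brevity it uses the plain 10-node Petersen graph as the factor graph rather than the paper's Petersen-star network PSN_n, so it is only a simplified sketch, not the proposed OTIS-PSN_n.

```python
# Minimal sketch of the standard OTIS construction, assuming the usual
# transpose rule <g, p> <-> <p, g>.  The plain 10-node Petersen graph is
# used as the factor graph purely for illustration; the paper's
# OTIS-PSN_n uses the larger Petersen-star network PSN_n instead.

# Adjacency of the Petersen graph: outer 5-cycle, inner pentagram,
# and spokes joining them.
PETERSEN = {
    0: {1, 4, 5}, 1: {0, 2, 6}, 2: {1, 3, 7}, 3: {2, 4, 8}, 4: {0, 3, 9},
    5: {0, 7, 8}, 6: {1, 8, 9}, 7: {2, 5, 9}, 8: {3, 5, 6}, 9: {4, 6, 7},
}

def otis_edges(factor):
    """Return the edge set of OTIS(factor).

    Nodes are pairs (g, p): g = group index, p = processor index.
    - intra-group 'electronic' edges copy the factor graph inside each group
    - inter-group 'optical' edges connect (g, p) to (p, g) for g != p
    """
    nodes = list(factor)
    edges = set()
    for g in nodes:
        for u, nbrs in factor.items():
            for v in nbrs:
                if u < v:                      # count each undirected edge once
                    edges.add(((g, u), (g, v)))
    for g in nodes:
        for p in nodes:
            if g < p:                          # transpose (optical) edges
                edges.add(((g, p), (p, g)))
    return edges

if __name__ == "__main__":
    E = otis_edges(PETERSEN)
    print(len(E))   # 10 groups * 15 intra edges + C(10, 2) = 150 + 45 = 195 edges
```

With the plain Petersen graph as factor, the result has 100 nodes and 195 edges; the node count 10^(2n) quoted above follows the same pattern, since an OTIS network over an N-node factor graph has N^2 nodes.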


2013 ◽  
Vol 278-280 ◽  
pp. 1043-1046
Author(s):  
Xin Cheng ◽  
Hua Chun Wu

Rapid increases in the complexity of algorithms for real-time signal processing applications have made multiprocessor parallel processing technology necessary. This paper proposes the design of a high-performance real-time bus (RTB), on top of which a distributed shared memory (DSM) mechanism is established to implement data exchange among multiple processors. Adopting the DSM mechanism reduces software overhead and significantly improves data processing performance. The definition and implementation details of the RTB and its data transmission model are discussed. Experimental results show that stable data transmission bandwidth is achieved and that performance is not affected by an increasing number of processors.
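The abstract's central point is that a DSM region lets processors exchange data by reading and writing shared addresses instead of copying messages through software. The following is a loose software analogy of that idea using Python's multiprocessing.shared_memory module; it is not the paper's hardware RTB or its DSM implementation, and the region name and layout are invented for the example.

```python
# Loose software analogy of the DSM idea: a producer writes a value directly
# into a shared region and a consumer reads it in place, with no per-message
# copy through a software protocol stack.
from multiprocessing import Process, shared_memory
import struct

REGION_NAME = "rtb_dsm_demo"   # hypothetical name; any unique string works
REGION_SIZE = 64               # bytes

def producer():
    shm = shared_memory.SharedMemory(name=REGION_NAME)
    shm.buf[0:8] = struct.pack("<q", 12345)   # write directly into shared memory
    shm.close()

def consumer():
    shm = shared_memory.SharedMemory(name=REGION_NAME)
    (value,) = struct.unpack("<q", bytes(shm.buf[0:8]))
    print("consumer read", value)
    shm.close()

if __name__ == "__main__":
    region = shared_memory.SharedMemory(name=REGION_NAME, create=True,
                                        size=REGION_SIZE)
    p = Process(target=producer)
    p.start(); p.join()
    c = Process(target=consumer)
    c.start(); c.join()
    region.close()
    region.unlink()
```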


2021 ◽  
Vol 13 (3) ◽  
pp. 78
Author(s):  
Chuanhong Li ◽  
Lei Song ◽  
Xuewen Zeng

The continuous increase in network traffic has sharply increased the demand for high-performance packet processing systems. For a high-performance packet processing system based on multi-core processors, the packet scheduling algorithm is critical because of the significant role it plays in load distribution, which determines system throughput and has therefore attracted intensive research attention. It is not an easy task, however, since the canonical flow-level packet scheduling algorithm is vulnerable to traffic locality, while the packet-level packet scheduling algorithm fails to maintain cache affinity. In this paper, we propose an adaptive throughput-first packet scheduling algorithm for DPDK-based packet processing systems. Combined with DPDK's burst-oriented packet receiving and transmitting, we propose using the Subflow as both the scheduling unit and the adjustment unit, so that the proposed algorithm not only retains the advantages of flow-level packet scheduling when no adjustment occurs but also avoids packet loss as much as possible when the target core may be overloaded. Experimental results show that the proposed method outperforms Round-Robin, HRW (Highest Random Weight), and CRC32 in system throughput and packet loss rate.
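To make the Subflow idea concrete, here is a much-simplified sketch of the adaptive, throughput-first behaviour described above, not the authors' actual algorithm: packets of a flow normally follow a CRC32 flow hash to preserve cache affinity, and only when the hashed core appears overloaded are the flow's remaining packets (a new subflow) redirected to the least-loaded core. The queue-length threshold and the load model are invented for illustration.

```python
# Simplified sketch: flow-hash scheduling with subflow redirection when the
# hashed target core looks overloaded.
import zlib

NUM_CORES = 4
QUEUE_LIMIT = 256                       # illustrative threshold
queue_len = [0] * NUM_CORES             # stand-in for per-core RX queue depth
redirect = {}                           # flow_id -> core chosen for its current subflow

def flow_hash(flow_id: bytes) -> int:
    return zlib.crc32(flow_id) % NUM_CORES   # CRC32 hashing, one of the baselines above

def schedule(flow_id: bytes) -> int:
    """Pick the target core for the next packet of flow_id."""
    core = redirect.get(flow_id, flow_hash(flow_id))
    if queue_len[core] >= QUEUE_LIMIT:
        # Target core looks overloaded: start a new subflow on the
        # least-loaded core and remember the decision for later packets.
        core = min(range(NUM_CORES), key=lambda c: queue_len[c])
        redirect[flow_id] = core
    queue_len[core] += 1                 # packet enqueued (dequeueing not modelled)
    return core

if __name__ == "__main__":
    for _ in range(1000):
        schedule(b"10.0.0.1:1234->10.0.0.2:80")   # one hot flow
    print(queue_len)   # load spreads once the hashed core fills up
```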


2021 ◽  
Vol 49 (4) ◽  
pp. 12-17
Author(s):  
Feilong Liu ◽  
Claude Barthels ◽  
Spyros Blanas ◽  
Hideaki Kimura ◽  
Garret Swart

Networks with Remote Direct Memory Access (RDMA) support are becoming increasingly common. RDMA, however, offers a limited programming interface to remote memory that consists of read, write, and atomic operations. With RDMA alone, completing the most basic operations on remote data structures often requires multiple round-trips over the network. Data-intensive systems strongly desire higher-level communication abstractions that support more complex interaction patterns. A natural candidate to consider is MPI, the de facto standard for developing high-performance applications in the HPC community. This paper critically evaluates the communication primitives of MPI and shows that using MPI in the context of a data processing system comes with its own set of insurmountable challenges. Based on this analysis, we propose a new communication abstraction named RDMO, or Remote Direct Memory Operation, that dispatches a short sequence of reads, writes, and atomic operations to remote memory and executes them in a single round-trip.
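The RDMO abstraction can be pictured as shipping a small program of dependent reads, writes, and atomics to the remote side so that it executes in one round trip instead of several. The sketch below is purely conceptual; the class and function names are hypothetical and do not correspond to the paper's interface or to any real RDMA verbs API.

```python
# Conceptual sketch of the RDMO idea: the client ships a short list of
# read/write/atomic steps, and the remote side executes them locally,
# so dependent operations cost one round trip rather than one each.
from dataclasses import dataclass

@dataclass
class Op:
    kind: str            # "read", "write", or "cas" (compare-and-swap)
    addr: int
    value: int = 0
    expected: int = 0

def execute_rdmo(memory: dict, ops: list[Op]) -> list[int]:
    """Run the whole operation sequence at the remote node: one round trip."""
    results = []
    for op in ops:
        if op.kind == "read":
            results.append(memory.get(op.addr, 0))
        elif op.kind == "write":
            memory[op.addr] = op.value
        elif op.kind == "cas":
            old = memory.get(op.addr, 0)
            if old == op.expected:
                memory[op.addr] = op.value
            results.append(old)
    return results

if __name__ == "__main__":
    remote = {0x10: 7}
    # Lock-then-update in a single round trip instead of two dependent ones.
    replies = execute_rdmo(remote, [
        Op("cas", addr=0x10, expected=7, value=1),   # acquire a lock word
        Op("write", addr=0x18, value=42),            # update the guarded data
    ])
    print(replies, remote)
```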

