CYBERPLUS, a High Performance Parallel Processing System

1985 ◽  
pp. 24-29 ◽  
Author(s):  
Wayne A. Ray


2021 ◽  
Vol 17 (11) ◽  
pp. 155014772110331
Author(s):  
Jung-hyun Seo ◽  
HyeongOk Lee

One method to create a high-performance computer is to use parallel processing to connect multiple computers. The structure of a parallel processing system is represented as an interconnection network. Traditionally, the communication links that connect the nodes in the interconnection network use electricity. With the advent of optical communication, however, optical transpose interconnection system (OTIS) networks have emerged, which combine the advantages of electronic communication and optical communication. OTIS networks use electronic communication for relatively short distances and optical communication for long distances. Regardless of whether the interconnection network uses electronic or optical communication, network cost is an important factor among the various measures used to evaluate networks. In this article, we first propose a novel OTIS-Petersen-star network with a small network cost and analyze its basic topological properties. The OTIS-Petersen-star network is an undirected graph whose factor graph is the Petersen-star network. OTIS-PSN_n has 10^(2n) nodes, degree n + 3, and diameter 6n − 1. Second, we compare the network cost of the OTIS-Petersen-star network with that of other OTIS networks. Finally, we propose a routing algorithm with time complexity 6n − 1 and a one-to-all broadcasting algorithm with time complexity 2n − 1.
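As an illustration of the OTIS construction this abstract builds on, the sketch below assembles an OTIS network in Python using the standard rule: each group is a copy of the factor graph (electronic links), and node ⟨g, p⟩ is joined to node ⟨p, g⟩ by a transpose (optical) link. For brevity it uses the plain 10-node Petersen graph as the factor graph rather than the paper's Petersen-star network PSN_n, so it is only a simplified sketch, not the proposed OTIS-PSN_n.

```python
# Minimal sketch of the standard OTIS construction, assuming the usual
# transpose rule <g, p> <-> <p, g>.  The plain 10-node Petersen graph is
# used as the factor graph purely for illustration; the paper's
# OTIS-PSN_n uses the larger Petersen-star network PSN_n instead.

# Adjacency of the Petersen graph: outer 5-cycle, inner pentagram,
# and spokes joining them.
PETERSEN = {
    0: {1, 4, 5}, 1: {0, 2, 6}, 2: {1, 3, 7}, 3: {2, 4, 8}, 4: {0, 3, 9},
    5: {0, 7, 8}, 6: {1, 8, 9}, 7: {2, 5, 9}, 8: {3, 5, 6}, 9: {4, 6, 7},
}

def otis_edges(factor):
    """Return the edge set of OTIS(factor).

    Nodes are pairs (g, p): g = group index, p = processor index.
    - intra-group 'electronic' edges copy the factor graph inside each group
    - inter-group 'optical' edges connect (g, p) to (p, g) for g != p
    """
    nodes = list(factor)
    edges = set()
    for g in nodes:
        for u, nbrs in factor.items():
            for v in nbrs:
                if u < v:                      # count each undirected edge once
                    edges.add(((g, u), (g, v)))
    for g in nodes:
        for p in nodes:
            if g < p:                          # transpose (optical) edges
                edges.add(((g, p), (p, g)))
    return edges

if __name__ == "__main__":
    E = otis_edges(PETERSEN)
    print(len(E))   # 10 groups * 15 intra edges + C(10, 2) = 150 + 45 = 195 edges
```

With the plain Petersen graph as factor, the result has 100 nodes and 195 edges; the node count 10^(2n) quoted above follows the same pattern, since an OTIS network over an N-node factor graph has N^2 nodes.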


2013 ◽  
Vol 278-280 ◽  
pp. 1043-1046
Author(s):  
Xin Cheng ◽  
Hua Chun Wu

Rapid increases in the complexity of algorithms for real-time signal processing applications have made multiprocessor parallel processing technology necessary. This paper proposes the design of a high-performance real-time bus (RTB), on top of which a distributed shared memory (DSM) mechanism is established to implement data exchange among multiple processors. Adopting the DSM mechanism reduces software overhead and significantly improves data processing performance. The definition and implementation details of the RTB and its data transmission model are discussed. Experimental results show that stable data transmission bandwidth is achieved and that performance is not affected by an increasing number of processors.
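The abstract's central point is that a DSM region lets processors exchange data by reading and writing shared addresses instead of copying messages through software. The following is a loose software analogy of that idea using Python's multiprocessing.shared_memory module; it is not the paper's hardware RTB or its DSM implementation, and the region name and layout are invented for the example.

```python
# Loose software analogy of the DSM idea: a producer writes a value directly
# into a shared region and a consumer reads it in place, with no per-message
# copy through a software protocol stack.
from multiprocessing import Process, shared_memory
import struct

REGION_NAME = "rtb_dsm_demo"   # hypothetical name; any unique string works
REGION_SIZE = 64               # bytes

def producer():
    shm = shared_memory.SharedMemory(name=REGION_NAME)
    shm.buf[0:8] = struct.pack("<q", 12345)   # write directly into shared memory
    shm.close()

def consumer():
    shm = shared_memory.SharedMemory(name=REGION_NAME)
    (value,) = struct.unpack("<q", bytes(shm.buf[0:8]))
    print("consumer read", value)
    shm.close()

if __name__ == "__main__":
    region = shared_memory.SharedMemory(name=REGION_NAME, create=True,
                                        size=REGION_SIZE)
    p = Process(target=producer)
    p.start(); p.join()
    c = Process(target=consumer)
    c.start(); c.join()
    region.close()
    region.unlink()
```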


2021 ◽  
Vol 13 (3) ◽  
pp. 78
Author(s):  
Chuanhong Li ◽  
Lei Song ◽  
Xuewen Zeng

The continuous increase in network traffic has sharply increased the demand for high-performance packet processing systems. For a high-performance packet processing system based on multi-core processors, the packet scheduling algorithm is critical because of the significant role it plays in load distribution, which determines system throughput and has therefore attracted intensive research attention. It is not an easy task, however, since the canonical flow-level packet scheduling algorithm is vulnerable to traffic locality, while the packet-level packet scheduling algorithm fails to maintain cache affinity. In this paper, we propose an adaptive throughput-first packet scheduling algorithm for DPDK-based packet processing systems. Combined with DPDK's burst-oriented packet receiving and transmitting, we propose using the Subflow as both the scheduling unit and the adjustment unit, so that the proposed algorithm not only retains the advantages of flow-level packet scheduling when no adjustment occurs but also avoids packet loss as much as possible when the target core may be overloaded. Experimental results show that the proposed method outperforms Round-Robin, HRW (Highest Random Weight), and CRC32 in system throughput and packet loss rate.
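To make the Subflow idea concrete, here is a much-simplified sketch of the adaptive, throughput-first behaviour described above, not the authors' actual algorithm: packets of a flow normally follow a CRC32 flow hash to preserve cache affinity, and only when the hashed core appears overloaded are the flow's remaining packets (a new subflow) redirected to the least-loaded core. The queue-length threshold and the load model are invented for illustration.

```python
# Simplified sketch: flow-hash scheduling with subflow redirection when the
# hashed target core looks overloaded.
import zlib

NUM_CORES = 4
QUEUE_LIMIT = 256                       # illustrative threshold
queue_len = [0] * NUM_CORES             # stand-in for per-core RX queue depth
redirect = {}                           # flow_id -> core chosen for its current subflow

def flow_hash(flow_id: bytes) -> int:
    return zlib.crc32(flow_id) % NUM_CORES   # CRC32 hashing, one of the baselines above

def schedule(flow_id: bytes) -> int:
    """Pick the target core for the next packet of flow_id."""
    core = redirect.get(flow_id, flow_hash(flow_id))
    if queue_len[core] >= QUEUE_LIMIT:
        # Target core looks overloaded: start a new subflow on the
        # least-loaded core and remember the decision for later packets.
        core = min(range(NUM_CORES), key=lambda c: queue_len[c])
        redirect[flow_id] = core
    queue_len[core] += 1                 # packet enqueued (dequeueing not modelled)
    return core

if __name__ == "__main__":
    for _ in range(1000):
        schedule(b"10.0.0.1:1234->10.0.0.2:80")   # one hot flow
    print(queue_len)   # load spreads once the hashed core fills up
```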


2021 ◽  
Vol 49 (4) ◽  
pp. 12-17
Author(s):  
Feilong Liu ◽  
Claude Barthels ◽  
Spyros Blanas ◽  
Hideaki Kimura ◽  
Garret Swart

Networks with Remote Direct Memory Access (RDMA) support are becoming increasingly common. RDMA, however, offers a limited programming interface to remote memory that consists of read, write, and atomic operations. With RDMA alone, completing the most basic operations on remote data structures often requires multiple round-trips over the network. Data-intensive systems strongly desire higher-level communication abstractions that support more complex interaction patterns. A natural candidate to consider is MPI, the de facto standard for developing high-performance applications in the HPC community. This paper critically evaluates the communication primitives of MPI and shows that using MPI in the context of a data processing system comes with its own set of insurmountable challenges. Based on this analysis, we propose a new communication abstraction named RDMO, or Remote Direct Memory Operation, that dispatches a short sequence of reads, writes, and atomic operations to remote memory and executes them in a single round-trip.
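The RDMO abstraction can be pictured as shipping a small program of dependent reads, writes, and atomics to the remote side so that it executes in one round trip instead of several. The sketch below is purely conceptual; the class and function names are hypothetical and do not correspond to the paper's interface or to any real RDMA verbs API.

```python
# Conceptual sketch of the RDMO idea: the client ships a short list of
# read/write/atomic steps, and the remote side executes them locally,
# so dependent operations cost one round trip rather than one each.
from dataclasses import dataclass

@dataclass
class Op:
    kind: str            # "read", "write", or "cas" (compare-and-swap)
    addr: int
    value: int = 0
    expected: int = 0

def execute_rdmo(memory: dict, ops: list[Op]) -> list[int]:
    """Run the whole operation sequence at the remote node: one round trip."""
    results = []
    for op in ops:
        if op.kind == "read":
            results.append(memory.get(op.addr, 0))
        elif op.kind == "write":
            memory[op.addr] = op.value
        elif op.kind == "cas":
            old = memory.get(op.addr, 0)
            if old == op.expected:
                memory[op.addr] = op.value
            results.append(old)
    return results

if __name__ == "__main__":
    remote = {0x10: 7}
    # Lock-then-update in a single round trip instead of two dependent ones.
    replies = execute_rdmo(remote, [
        Op("cas", addr=0x10, expected=7, value=1),   # acquire a lock word
        Op("write", addr=0x18, value=42),            # update the guarded data
    ])
    print(replies, remote)
```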

