Instruction Set Enhancements for High-Performance Multicore Execution on the REALJava Platform

Author(s):  
Joonas Tyystjarvi ◽  
Tero Saantti ◽  
Juha Plosila
2021 ◽  
Vol 336 ◽  
pp. 04018
Author(s):  
Ping Deng ◽  
Xiaolong Zhu ◽  
Haiyan Sun ◽  
Yi Ren

FT_MX is a high-performance processor developed independently by the National University of Defense Technology, with an innovative architecture and instruction set. LLVM is a widely used, efficient open-source compiler framework that originated at the University of Illinois. This paper introduces the basic architecture and functions of LLVM, analyzes its back-end porting mechanism in detail, describes the specific process of porting a back end for FT_MX, and thereby implements LLVM support for the FT_MX processor.


2012 ◽  
Vol 2012 ◽  
pp. 1-10 ◽  
Author(s):  
Dau-Chyrh Chang ◽  
Lihong Zhang ◽  
Xiaoling Yang ◽  
Shao-Hsiang Yen ◽  
Wenhua Yu

We introduce a hardware acceleration technique for the parallel finite-difference time-domain (FDTD) method using the SSE (Streaming SIMD Extensions, where SIMD stands for single instruction, multiple data) instruction set. Applying the SSE instruction set to the parallel FDTD method yields a significant improvement in simulation performance. Benchmarks of the SSE acceleration on both a multi-CPU workstation and a computer cluster demonstrate the advantages of vector arithmetic logic unit (VALU) acceleration over GPU acceleration. Several engineering applications are used to demonstrate the performance of the parallel FDTD method enhanced by the SSE instruction set.


1995 ◽  
Vol 04 (01n02) ◽  
pp. 33-53 ◽  
Author(s):  
ARVIND K. BANSAL

Associative computation is characterized by seamless intertwining of search-by-content and data parallel computation. The search-by-content paradigm is natural to scalable high performance heterogeneous computing since the use of tagged data avoids the need for explicit addressing mechanisms. In this paper, the author presents an algebra for associative logic programming, an associative resolution scheme, and a generic framework of an associative abstract instruction set. The model is based on the integration of data alignment and the use of two types of bags: data element bags and filter bags of Boolean values to select and restrict computation on data elements. The use of filter bags integrated with data alignment reduces computation and data transfer overhead, and the use of tagged data reduces overhead of preparing data before data transmission. The abstract instruction set has been illustrated by an example. Performance results are presented for a simulation in a homogeneous address space.
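The interplay of data element bags and Boolean filter bags described above can be illustrated with a minimal sketch. This is not the author's abstract instruction set: the function names `search_by_content`, `combine_filters`, and `apply_filtered` are hypothetical, and plain Python lists stand in for position-aligned bags.

```python
from typing import Callable, List, Sequence


def search_by_content(data: Sequence[int],
                      predicate: Callable[[int], bool]) -> List[bool]:
    """Build a filter bag: one Boolean per data element, aligned by position."""
    return [predicate(x) for x in data]


def combine_filters(a: Sequence[bool], b: Sequence[bool]) -> List[bool]:
    """Intersect two aligned filter bags (logical AND per position)."""
    if len(a) != len(b):
        raise ValueError("filter bags must be aligned")
    return [x and y for x, y in zip(a, b)]


def apply_filtered(data: Sequence[int], filter_bag: Sequence[bool],
                   op: Callable[[int], int]) -> List[int]:
    """Apply op only where the aligned filter entry is True.

    Elements whose filter entry is False pass through unchanged, so the
    filter bag restricts computation without any explicit addressing.
    """
    if len(data) != len(filter_bag):
        raise ValueError("data bag and filter bag must be aligned")
    return [op(x) if keep else x for x, keep in zip(data, filter_bag)]


# Hypothetical data: a bag may contain duplicate elements (12 appears twice).
prices = [5, 12, 7, 30, 12]
over_ten = search_by_content(prices, lambda p: p > 10)
under_twenty = search_by_content(prices, lambda p: p < 20)
selected = combine_filters(over_ten, under_twenty)
discounted = apply_filtered(prices, selected, lambda p: p - 2)
```

Here the selection is expressed entirely through content tests, and the operation touches only the elements that the combined filter bag marks as selected.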


Pipelining overlaps the execution of multiple instructions to make better use of time and hardware units. This paper presents the design and implementation of a 6-stage pipelined architecture for a high-performance 64-bit RISC (Reduced Instruction Set Computing) processor based on MIPS (Microprocessor without Interlocked Pipeline Stages). A pre-fetching unit, a forwarding unit, a branch and jump prediction unit, and a hazard unit work together to reduce hazards. A low-power unit minimizes power consumption. Cache memories, other devices, and especially balanced pipeline stages optimize speed. A DDR4 SDRAM (Double Data Rate 4 Synchronous Dynamic Random-Access Memory) controller is employed in this pipeline to achieve high-speed data transfers and to manage the entire system efficiently. Low-power, low-delay flip-flops are used in the pipeline registers, which implicitly enhances system performance. The proposed method provides better results than existing models. The simulation and synthesis results of the proposed architecture are evaluated with Xilinx 14.7 software, and supporting graphs are plotted with MATLAB.
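As an illustration of what a forwarding unit decides, the following is a minimal sketch using the textbook conditions for a classic 5-stage MIPS pipeline. It is an assumption for illustration only and does not reproduce the 6-stage design described above; the names `StageReg` and `forward_select` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StageReg:
    """Subset of a pipeline register relevant to forwarding."""
    reg_write: bool  # does the instruction held here write a register?
    rd: int          # destination register number


def forward_select(rs: int, ex_mem: StageReg, mem_wb: StageReg) -> str:
    """Choose the source of an ALU operand that reads register rs.

    The newest in-flight result (EX/MEM) takes priority over the older
    one (MEM/WB). Register 0 is hardwired to zero in MIPS, so it is
    never forwarded.
    """
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == rs:
        return "EX/MEM"
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == rs:
        return "MEM/WB"
    return "REGFILE"


# Both older instructions write register 3: the most recent value wins.
newest = forward_select(3, StageReg(True, 3), StageReg(True, 3))
# Only the instruction in MEM/WB writes register 3.
older = forward_select(3, StageReg(False, 3), StageReg(True, 3))
# Reads of register 0 always come from the register file.
zero = forward_select(0, StageReg(True, 0), StageReg(True, 0))
```

Forwarding of this kind removes most read-after-write stalls. The remaining cases, such as a load immediately followed by a use of its result, are typically left to a separate hazard unit.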

