Instruction Set Enhancements for High-Performance Multicore Execution on the REALJava Platform

FT_MX is a high-performance processor independently developed by the National University of Defense Technology, with an innovative architecture and instruction set. LLVM is a widely used, efficient open-source compiler framework that originated at the University of Illinois. This paper introduces the basic architecture and functions of LLVM, analyzes its back-end porting mechanism in detail, and describes the specific process of porting a back end to FT_MX, thereby adding LLVM support for the FT_MX processor.

A High-Performance Framework for Instruction-Set Simulator

Recent Advances in Computer Science and Information Engineering - Lecture Notes in Electrical Engineering ◽

10.1007/978-3-642-25792-6_2 ◽

2012 ◽

pp. 9-14 ◽

Cited By ~ 1

Author(s):

Zhu Hao ◽

Peng Chu ◽

Tiejun Zhang ◽

Donghui Wang ◽

Chaohuan Hou

Keyword(s):

High Performance ◽

Instruction Set ◽

Performance Framework

A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

International Journal of Antennas and Propagation ◽

10.1155/2012/851465 ◽

2012 ◽

Vol 2012 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Dau-Chyrh Chang ◽

Lihong Zhang ◽

Xiaoling Yang ◽

Shao-Hsiang Yen ◽

Wenhua Yu

Keyword(s):

High Performance ◽

FDTD Method ◽

Hardware Acceleration ◽

Single Instruction Multiple Data ◽

Instruction Set ◽

Computer Cluster ◽

Simulation Performance ◽

Acceleration Technique ◽

Multiple Data ◽

Difference Time

We introduce a hardware acceleration technique for the parallel finite-difference time-domain (FDTD) method that uses the SSE (Streaming SIMD Extensions, where SIMD stands for single instruction, multiple data) instruction set. Applying the SSE instruction set to the parallel FDTD method yields a significant improvement in simulation performance. Benchmarks of the SSE acceleration on both a multi-CPU workstation and a computer cluster demonstrate the advantages of vector arithmetic logic unit (VALU) acceleration over GPU acceleration. Several engineering applications are used to demonstrate the performance of the parallel FDTD method enhanced by the SSE instruction set.
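The data-parallel pattern behind this kind of acceleration can be sketched in a few lines. The following is only an illustration, not the authors' implementation: it emulates in plain Python how a 1D FDTD electric-field update can be processed four cells at a time, matching the four single-precision floats that fit in one 128-bit SSE register. The function names, the 1D grid, and the single coefficient `coeff` are assumptions made for this sketch; real SSE code would use compiler intrinsics in C or C++.

```python
LANES = 4  # a 128-bit SSE register holds four single-precision floats


def update_e_scalar(ez, hy, coeff):
    """One E-field update of a 1D FDTD grid, one cell per step."""
    out = list(ez)
    for i in range(1, len(ez)):
        out[i] = ez[i] + coeff * (hy[i] - hy[i - 1])
    return out


def update_e_packed(ez, hy, coeff):
    """Same update, but handling LANES adjacent cells per step.

    Each slice stands in for one packed load, and the list
    comprehension stands in for one packed subtract, multiply, and add.
    """
    out = list(ez)
    n = len(ez)
    i = 1
    while i + LANES <= n:
        h_hi = hy[i:i + LANES]          # four neighbouring H values
        h_lo = hy[i - 1:i - 1 + LANES]  # the same window shifted by one cell
        e = ez[i:i + LANES]
        out[i:i + LANES] = [ek + coeff * (a - b)
                            for ek, a, b in zip(e, h_hi, h_lo)]
        i += LANES
    for j in range(i, n):               # scalar loop for the leftover cells
        out[j] = ez[j] + coeff * (hy[j] - hy[j - 1])
    return out
```

Both functions compute the same values; the packed version only changes how many cells are handled per loop iteration, which is where a SIMD unit would save instructions.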

A FRAMEWORK FOR HETEROGENEOUS ASSOCIATIVE LOGIC PROGRAMMING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213095000036 ◽

1995 ◽

Vol 04 (01n02) ◽

pp. 33-53 ◽

Cited By ~ 2

Author(s):

ARVIND K. BANSAL

Keyword(s):

Logic Programming ◽

High Performance ◽

Heterogeneous Computing ◽

Data Transfer ◽

Instruction Set ◽

Data Parallel ◽

Data Alignment ◽

Resolution Scheme ◽

Data Elements ◽

Performance Results

Associative computation is characterized by seamless intertwining of search-by-content and data parallel computation. The search-by-content paradigm is natural to scalable high performance heterogeneous computing since the use of tagged data avoids the need for explicit addressing mechanisms. In this paper, the author presents an algebra for associative logic programming, an associative resolution scheme, and a generic framework of an associative abstract instruction set. The model is based on the integration of data alignment and the use of two types of bags: data element bags and filter bags of Boolean values to select and restrict computation on data elements. The use of filter bags integrated with data alignment reduces computation and data transfer overhead, and the use of tagged data reduces overhead of preparing data before data transmission. The abstract instruction set has been illustrated by an example. Performance results are presented for a simulation in a homogeneous address space.
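As a rough illustration of the two kinds of bags described in this abstract, the sketch below pairs a bag of data elements with an aligned bag of Boolean values that selects and restricts computation. It is not the paper's algebra or instruction set; the function names (`make_filter`, `and_filters`, `select`, `apply_where`) and the list-based representation are assumptions chosen for this example.

```python
def make_filter(data, predicate):
    """Search by content: build a filter bag aligned with the data bag."""
    return [bool(predicate(x)) for x in data]


def and_filters(a, b):
    """Combine two aligned filter bags element by element."""
    return [x and y for x, y in zip(a, b)]


def select(data, mask):
    """Keep only the data elements whose filter entry is True."""
    return [x for x, keep in zip(data, mask) if keep]


def apply_where(data, mask, fn):
    """Restrict a data-parallel operation to the selected elements."""
    return [fn(x) if keep else x for x, keep in zip(data, mask)]
```

For instance, with `ages = [12, 40, 67, 25]`, combining `make_filter(ages, lambda a: a >= 18)` and `make_filter(ages, lambda a: a < 65)` with `and_filters` gives a single mask that can then drive `select` or `apply_where` on any bag aligned with `ages`, without any explicit addressing of individual elements.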

ASiPEC: An Application Specific Instruction-Set Processor for High Performance Entropy Coding

Ubiquitous Computing Application and Wireless Sensor - Lecture Notes in Electrical Engineering ◽

10.1007/978-94-017-9618-7_7 ◽

2015 ◽

pp. 67-75

Author(s):

Seung-Hyun Choi ◽

Neungsoo Park ◽

Yong Ho Song ◽

Seong-Won Lee

Keyword(s):

High Performance ◽

Entropy Coding ◽

Instruction Set ◽

Specific Instruction ◽

Application Specific

A High Performance Java Card Virtual Machine Interpreter Based on an Application Specific Instruction-Set Processor

2014 17th Euromicro Conference on Digital System Design ◽

10.1109/dsd.2014.47 ◽

2014 ◽

Author(s):

Massimiliano Zilli ◽

Wolfgang Raschke ◽

Reinhold Weiss ◽

Johannes Loinig ◽

Christian Steger

Keyword(s):

Virtual Machine ◽

High Performance ◽

Instruction Set ◽

Specific Instruction ◽

Java Card ◽

Application Specific

Refining instruction set architecture for high-performance multimedia processing in constrained environments

Proceedings IEEE International Conference on Application- Specific Systems, Architectures, and Processors ◽

10.1109/asap.2002.1030724 ◽

2003 ◽

Cited By ~ 5

Author(s):

R.B. Lee ◽

A.M. Fiskiran ◽

Zhijie Shi ◽

Xiao Yang

Keyword(s):

High Performance ◽

Instruction Set ◽

Instruction Set Architecture ◽

Multimedia Processing ◽

Constrained Environments

Test methodology for Freescale's high performance e600 core based on PowerPC instruction set architecture

IEEE International Conference on Test, 2005. ◽

10.1109/test.2005.1583968 ◽

2006 ◽

Cited By ~ 9

Author(s):

N. Tendolkar ◽

D. Belete ◽

A. Razdan ◽

H. Reyes ◽

B. Schwarz ◽

...

Keyword(s):

High Performance ◽

Instruction Set ◽

Instruction Set Architecture ◽

Test Methodology

Design and Implementation of 6-Stage 64-bit MIPS Pipelined Architecture

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1201.0886s219 ◽

2019 ◽

Vol 8 (6S2) ◽

pp. 790-796

Keyword(s):

Low Power ◽

High Speed ◽

High Performance ◽

Random Access ◽

Instruction Set ◽

Cache Memories ◽

Design And Implementation ◽

Pipelined Architecture ◽

RISC Processor ◽

High Speed Data

Pipelining overlaps the execution of multiple instructions to make better use of time and of the hardware units. This paper presents the design and implementation of a 6-stage pipelined architecture for a high-performance 64-bit RISC (Reduced Instruction Set Computing) processor based on MIPS (Microprocessor without Interlocked Pipeline Stages). A pre-fetching unit, a forwarding unit, a branch and jump prediction unit, and a hazard unit work together to reduce hazards. A low-power unit minimizes power consumption. Cache memories, other devices, and especially balanced pipeline stages optimize speed. A DDR4 SDRAM (Double Data Rate 4 Synchronous Dynamic Random-Access Memory) controller is employed in the pipeline to achieve high-speed data transfers and to manage the entire system efficiently. Low-power, low-delay flip-flops are used in the pipeline registers, which implicitly enhances system performance. The proposed method provides better results than existing models. The simulation and synthesis results of the proposed architecture are evaluated with Xilinx 14.7 software, and supporting graphs are plotted with MATLAB.
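The benefit of overlapping instructions can be made concrete with the standard textbook cycle count for an ideal pipeline. The sketch below is not taken from the paper: it assumes one cycle per stage, that an unpipelined design would spend one cycle per stage on every instruction, and that hazards are modelled simply as a total number of stall cycles. The function names and the `stall_cycles` parameter are hypothetical.

```python
def pipelined_cycles(n_instr, stages=6, stall_cycles=0):
    """Cycles to run n_instr instructions on a pipeline with `stages` stages.

    The first instruction needs `stages` cycles to pass through; each
    later one completes one cycle after its predecessor, plus any stalls.
    """
    if n_instr == 0:
        return 0
    return stages + (n_instr - 1) + stall_cycles


def speedup(n_instr, stages=6, stall_cycles=0):
    """Speedup over a design that runs each instruction start to finish."""
    unpipelined = n_instr * stages
    return unpipelined / pipelined_cycles(n_instr, stages, stall_cycles)
```

Under these assumptions, 100 instructions on a 6-stage pipeline take 105 cycles instead of 600, and every stall cycle that forwarding or prediction removes moves the speedup closer to the stage count.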

Using AVX2 Instruction Set to Increase Performance of High Performance Computing Code

Computing and Informatics ◽

10.4149/cai_2017_5_1001 ◽

2017 ◽

Vol 36 (5) ◽

pp. 1001-1018 ◽

Cited By ~ 3

Author(s):

Pawel Gepner

Keyword(s):

High Performance Computing ◽

High Performance ◽

Instruction Set ◽

Performance Computing
