An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration

Background: With the growing demand for image processing and the widespread use of Digital Signal Processors (DSPs), the efficiency of Multiplier-Accumulator (MAC) units has become a bottleneck. We reviewed a few patents on Application Specific Instruction Set Processors (ASIPs), which propose design considerations for efficient application-specific computing that enhances throughput. Objective: The study aims to develop and analyze a computationally efficient method to optimize the speed of the MAC. Methods: The work proposes the design of an ASIP that integrates a MAC as dedicated hardware. This MAC is optimized for high-speed performance and forms the application-specific part of the processor (for example, the DSP block of an image processor), while a 16-bit Reduced Instruction Set Computer (RISC) processor core gives the design the flexibility to handle general computing. The design was emulated on a Xilinx Field Programmable Gate Array (FPGA) and tested on various real-time computing tasks. Results: Synthesis of the hardware logic with FPGA tools gave the operating frequencies of the legacy methods and the proposed method, and simulation of the logic verified the functionality. Conclusion: With the proposed method, a 16% increase in throughput was observed for 256-step iterations of multiply-accumulate operations on 8-bit sample data. Such an improvement can help reduce computation time in many digital signal processing applications where multiplication and addition are performed iteratively.
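For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of a 256-step multiply-accumulate over 8-bit data. It is not the hardware design from the abstract: the function name `mac`, the unsigned interpretation of the samples, and the 24-bit accumulator width are all assumptions made for illustration. The width is chosen because 256 products of two unsigned 8-bit values sum to at most 256 × 255 × 255 = 16,646,400, which fits in 24 bits.

```python
def mac(samples, coeffs, acc_bits=24):
    """Multiply-accumulate over paired unsigned 8-bit values.

    Hypothetical software model: 'acc_bits' is an assumed accumulator
    width, not a parameter taken from the abstract.
    """
    if len(samples) != len(coeffs):
        raise ValueError("samples and coeffs must have the same length")
    mask = (1 << acc_bits) - 1
    acc = 0
    for x, c in zip(samples, coeffs):
        if not (0 <= x <= 0xFF and 0 <= c <= 0xFF):
            raise ValueError("operands must be unsigned 8-bit values")
        # One MAC step: multiply, add to the accumulator, wrap to its width.
        acc = (acc + x * c) & mask
    return acc


# 256 iterations on 8-bit data, matching the workload size in the abstract.
worst_case = mac([255] * 256, [255] * 256)
ramp_sum = mac(list(range(256)), [1] * 256)
```

Each loop iteration corresponds to one multiply and one add, which is the step a dedicated MAC unit performs in hardware instead of as separate instructions on the RISC core.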

Application-Specific Instruction Set Architecture for an Ultralight Hardware Security Module

2020 IEEE International Symposium on Hardware Oriented Security and Trust (HOST) ◽

10.1109/host45689.2020.9300292 ◽

2020 ◽

Author(s):

Ahmed A. Ayoub ◽

Mark D. Aagaard

Keyword(s):

Hardware Security ◽

Instruction Set ◽

Instruction Set Architecture ◽

Specific Instruction ◽

Security Module ◽

Application Specific

Automatic application-specific instruction-set extensions under microarchitectural constraints

10.1109/dac.2003.1219004 ◽

2004 ◽

Cited By ~ 19

Author(s):

K. Atasu ◽

L. Pozzi ◽

P. Ienne

Keyword(s):

Instruction Set ◽

Specific Instruction ◽

Instruction Set Extensions ◽

Application Specific

Fine-Grained Checkpoint Recovery for Application-Specific Instruction-Set Processors

IEEE Transactions on Computers ◽

10.1109/tc.2016.2606378 ◽

2017 ◽

Vol 66 (4) ◽

pp. 647-660 ◽

Cited By ~ 3

Author(s):

Tuo Li ◽

Muhammad Shafique ◽

Jude Angelo Ambrose ◽

Jörg Henkel ◽

Sri Parameswaran

Keyword(s):

Instruction Set ◽

Specific Instruction ◽

Fine Grained ◽

Checkpoint Recovery ◽

Instruction Set Processors ◽

Application Specific

A Hardware/Software Cooperative Custom Register Binding Approach for Register Spill Elimination in Application-Specific Instruction Set Processors

ACM Transactions on Design Automation of Electronic Systems ◽

10.1145/2348839.2348844 ◽

2012 ◽

Vol 17 (4) ◽

pp. 1-19

Author(s):

Hai Lin ◽

Tiansi Hu ◽

Yunsi Fei

Keyword(s):

Instruction Set ◽

Specific Instruction ◽

Instruction Set Processors ◽

Application Specific

Application Specific Instruction Set DSP Processors

Handbook of Signal Processing Systems ◽

10.1007/978-1-4614-6859-2_21 ◽

2013 ◽

pp. 671-706

Author(s):

Dake Liu ◽

Jian Wang

Keyword(s):

Instruction Set ◽

Specific Instruction ◽

DSP Processors ◽

Application Specific

Instruction Set Optimization for Application Specific Processors

Lecture Notes in Computer Science - Reconfigurable Computing: Architectures, Tools, and Applications ◽

10.1007/978-3-319-05960-0_28 ◽

2014 ◽

pp. 268-274

Author(s):

Max Ferger ◽

Michael Hübner

Keyword(s):

Set Optimization ◽

Instruction Set ◽

Application Specific

An Optimization Methodology for Adapting Legacy SGX Applications to Use Switchless Calls

Applied Sciences ◽

10.3390/app11188379 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8379

Author(s):

Seongmin Kim

Keyword(s):

System Level ◽

Instruction Set ◽

Optimization Strategy ◽

Design Decisions ◽

Systematic Analysis ◽

Kernel Module ◽

Recent Innovation ◽

Optimization Methodology ◽

Application Specific ◽

Trusted Execution Environment

Recent innovation in trusted execution environment (TEE) technologies enables the delegation of privacy-preserving computation to cloud systems. In particular, Intel SGX, an extension of the x86 instruction set architecture (ISA), accelerates this trend by offering hardware-protected isolation with near-native performance. However, SGX inherently suffers from workload-dependent performance degradation, caused by hardware restrictions and by design decisions that primarily concern the security guarantee. System-level optimizations of the SGX runtime and kernel module have been proposed to resolve this, but they cannot effectively reflect the application-specific characteristics that largely determine the performance of legacy SGX applications. This work presents an optimization strategy that achieves application-level optimization by using asynchronous switchless calls to reduce enclave transitions, one of the dominant overheads of using SGX. Based on a systematic analysis, our methodology examines the performance benefit for each enclave transition wrapper and selectively applies switchless calls without modifying the legacy codebases. The evaluation shows that our optimization strategy successfully improves the end-to-end performance of our showcase application, an SGX-enabled network middlebox.
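The idea of selectively applying switchless calls per transition wrapper can be illustrated with a small sketch. This is not the criterion used in the paper, which the abstract does not spell out. The `WrapperProfile` type, the `select_switchless` function, the cost parameters, and every number below are hypothetical. The sketch only assumes the general observation that switchless calls tend to pay off for short, frequently invoked calls.

```python
from dataclasses import dataclass


@dataclass
class WrapperProfile:
    """Hypothetical per-wrapper measurements (not from the paper)."""
    name: str
    calls_per_sec: float    # how often the wrapper crosses the enclave boundary
    avg_call_cycles: float  # average work done per call


def select_switchless(profiles, transition_cycles, switchless_cycles,
                      max_call_cycles, min_saved_cycles_per_sec):
    """Return names of wrappers whose estimated saving justifies switchless mode.

    Assumed cost model: each switchless call saves
    (transition_cycles - switchless_cycles); long-running calls are excluded
    because they would keep a worker thread busy.
    """
    per_call_saving = transition_cycles - switchless_cycles
    selected = []
    for p in profiles:
        saved = p.calls_per_sec * per_call_saving
        if p.avg_call_cycles <= max_call_cycles and saved >= min_saved_cycles_per_sec:
            selected.append(p.name)
    return selected


# Illustrative, made-up profile of three wrappers.
profiles = [
    WrapperProfile("ocall_send", calls_per_sec=50_000, avg_call_cycles=2_000),
    WrapperProfile("ocall_log", calls_per_sec=10, avg_call_cycles=1_500),
    WrapperProfile("ocall_read_file", calls_per_sec=20_000, avg_call_cycles=900_000),
]
chosen = select_switchless(profiles, transition_cycles=8_000, switchless_cycles=1_000,
                           max_call_cycles=50_000, min_saved_cycles_per_sec=1_000_000)
```

With these made-up inputs only the frequent, short wrapper is selected; the rarely called one saves too little, and the long-running one would occupy a worker thread for too long.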