Hybridthreads Compiler: Generation of Application Specific Hardware Thread Cores from C

Author(s):  
Jim Stevens
2003 ◽  
Vol 12 (03) ◽  
pp. 353-375 ◽  
Author(s):  
Dirk Fischer ◽  
Jürgen Teich ◽  
Ralph Weper ◽  
Michael Thies

With the term Architecture/Compiler Co-exploration, we denote the problem of simultaneously optimizing an application-specific instruction set processor (ASIP) architecture as well as its generated compiler. In this paper, we characterize the design space of both compiler frontend (intermediate code optimization) and backend (changes of the machine model) and present the workflow of our framework BUILDABONG. The project consists of four phases: (a) architecture entry and composition, (b) automatic simulator generation, (c) compiler generation (in particular, retargeting), and (d) automatic architecture/compiler design space exploration. We demonstrate the feasibility of our approach by a detailed case study.


2012 ◽  
Vol E95-C (4) ◽  
pp. 534-545 ◽  
Author(s):  
Wei ZHONG ◽  
Takeshi YOSHIMURA ◽  
Bei YU ◽  
Song CHEN ◽  
Sheqin DONG ◽  
...  

2019 ◽  
Vol 13 (2) ◽  
pp. 174-180
Author(s):  
Poonam Sharma ◽  
Ashwani Kumar Dubey ◽  
Ayush Goyal

Background: With the growing demand of image processing and the use of Digital Signal Processors (DSP), the efficiency of the Multipliers and Accumulators has become a bottleneck to get through. We revised a few patents on an Application Specific Instruction Set Processor (ASIP), where the design considerations are proposed for application-specific computing in an efficient way to enhance the throughput. Objective: The study aims to develop and analyze a computationally efficient method to optimize the speed performance of MAC. Methods: The work presented here proposes the design of an Application Specific Instruction Set Processor, exploiting a Multiplier Accumulator integrated as the dedicated hardware. This MAC is optimized for high-speed performance and is the application-specific part of the processor; here it can be the DSP block of an image processor while a 16-bit Reduced Instruction Set Computer (RISC) processor core gives the flexibility to the design for any computing. The design was emulated on a Xilinx Field Programmable Gate Array (FPGA) and tested for various real-time computing. Results: The synthesis of the hardware logic on FPGA tools gave the operating frequencies of the legacy methods and the proposed method, the simulation of the logic verified the functionality. Conclusion: With the proposed method, a significant improvement of 16% increase in throughput has been observed for 256 steps iterations of multiplier and accumulators on an 8-bit sample data. Such an improvement can help in reducing the computation time in many digital signal processing applications where multiplication and addition are done iteratively.


Author(s):  
Shrikant S. Jadhav ◽  
Clay Gloster ◽  
Jannatun Naher ◽  
Christopher Doss ◽  
Youngsoo Kim

Sign in / Sign up

Export Citation Format

Share Document