Software acceleration of floating-point multiplication using runtime code generation — Student paper

Author(s):  
Charles Aracil ◽  
Damien Courousse
2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Anitha Juliette Albert ◽  
Seshasayanan Ramachandran

Floating-point multiplication is a critical part of high-dynamic-range, computationally intensive digital signal processing applications that require high precision and low power. This paper presents the design of an IEEE 754 single-precision floating-point multiplier using the asynchronous NULL convention logic paradigm. Rounding has not been implemented, to suit high-precision applications. The novelty of the research is that it is the first NULL convention logic multiplier designed to perform floating-point multiplication. The proposed multiplier offers a substantial decrease in power consumption compared with its synchronous version. Performance attributes of the NULL convention logic floating-point multiplier, obtained from Xilinx simulation and Cadence, are compared with those of its equivalent synchronous implementation.
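As a rough software illustration of the arithmetic this abstract refers to (not the authors' NULL convention logic circuit), the sketch below multiplies two IEEE 754 single-precision bit patterns and truncates the product instead of rounding it. The function names are hypothetical, and the sketch assumes normal operands and a normal result; zeros, subnormals, infinities, and NaNs are deliberately left out.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the binary32 bit pattern of x (rounded to single precision)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def fp32_mul_truncate(a_bits: int, b_bits: int) -> int:
    """Multiply two normal binary32 values, truncating instead of rounding.

    Assumption: both operands and the result are normal numbers.
    """
    sign = ((a_bits >> 31) ^ (b_bits >> 31)) & 1
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a in (0, 255) or exp_b in (0, 255):
        raise ValueError("only normal operands are handled in this sketch")

    # Restore the implicit leading 1 to get 24-bit significands.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000

    # Add the biased exponents and remove one copy of the bias (127).
    exp = exp_a + exp_b - 127

    # 48-bit product; its value lies in [1, 4) relative to 2**46.
    prod = man_a * man_b
    if prod >> 47:
        # Product in [2, 4): renormalize by one position.
        man = prod >> 24
        exp += 1
    else:
        man = prod >> 23

    if not 1 <= exp <= 254:
        raise ValueError("result is outside the normal range")

    # Low-order product bits were dropped by the shifts above (truncation).
    return (sign << 31) | (exp << 23) | (man & 0x7FFFFF)
```

For example, `bits_to_float(fp32_mul_truncate(float_to_bits(1.5), float_to_bits(1.5)))` evaluates to `2.25`. For products that are not exactly representable, the truncated result is at most one unit in the last place below the round-to-nearest result.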


Sensors ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 1362 ◽  
Author(s):  
Marco Bassoli ◽  
Valentina Bianchi ◽  
Ilaria De Munari

Recent research in wearable sensors has led to the development of an advanced platform capable of embedding complex algorithms such as machine learning algorithms, which are usually resource-demanding. To address the need for high computational power, one solution is to design custom hardware platforms dedicated to the specific application, for example by exploiting Field Programmable Gate Arrays (FPGAs). Recently, model-based techniques and automatic code generation have been introduced in FPGA design. In this paper, a new model-based floating-point accumulation circuit is presented. The architecture is based on the state-of-the-art delayed buffering algorithm. The circuit was conceived to compute the kernel function of a support vector machine. The proposed model was implemented in Simulink, and simulation results showed better performance in terms of speed and occupied area than other solutions. To better evaluate its performance, a practical case of a polynomial kernel function was considered. Simulink and VHDL post-implementation timing simulations and measurements on FPGA confirmed the good results of the stand-alone accumulator.
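To make the role of the accumulator concrete, here is a minimal software sketch of a polynomial kernel whose inner product is built by repeated floating-point accumulation. It does not model the delayed buffering algorithm or the hardware pipeline described in the abstract; the function name and the parameters `gamma`, `coef0`, and `degree` are assumptions chosen for illustration.

```python
from typing import Sequence


def poly_kernel(x: Sequence[float], y: Sequence[float],
                gamma: float = 1.0, coef0: float = 1.0,
                degree: int = 2) -> float:
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree.

    The dot product is formed one term at a time, which is the
    accumulation step a hardware accumulator would perform.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    acc = 0.0
    for xi, yi in zip(x, y):
        acc += xi * yi  # multiply, then accumulate

    return (gamma * acc + coef0) ** degree
```

With `x = [1.0, 2.0]` and `y = [3.0, 4.0]`, the accumulated dot product is 11, so the default parameters give (11 + 1)^2 = 144.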


2016 ◽  
Vol 86 (304) ◽  
pp. 881-898 ◽  
Author(s):  
Claude-Pierre Jeannerod ◽  
Peter Kornerup ◽  
Nicolas Louvet ◽  
Jean-Michel Muller
