Asynchronous Floating-Point Adders and Communication Protocols: A Survey

Electronics ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 1687
Author(s):  
Pallavi Srivastava ◽  
Edwin Chung ◽  
Stepan Ozana

Addition is the key operation in digital systems, and the floating-point adder (FPA) is frequently used for real number addition because floating-point representation provides a large dynamic range. Most existing FPA designs are synchronous, and their activities are coordinated by clock signal(s). However, technology scaling has imposed several challenges on synchronous design, such as clock skew and clock distribution, due to the presence of clock signal(s). Asynchronous design is an alternative approach that eliminates these clock-related challenges, as it replaces the global clock with handshaking signals and uses a communication protocol to indicate the completion of activities. Bundled data and dual-rail coding are the most common communication protocols used in asynchronous design. All existing asynchronous floating-point adder (AFPA) designs use dual-rail coding for completion detection, as it allows the circuit to acknowledge as soon as the computation is done, whereas bundled-data and synchronous designs using single-rail encoding have to wait for the worst-case delay irrespective of the actual completion time. This paper reviews all the existing AFPA designs and examines the effects of the selected communication protocol on their performance. It also discusses the probable outcome of AFPAs designed using protocols other than dual-rail coding.
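
The completion-detection idea in this abstract can be illustrated with a minimal Python sketch. It assumes the common dual-rail convention in which each bit travels on two wires: (1, 0) for logic 1, (0, 1) for logic 0, and (0, 0) as the "no data yet" spacer. The function names are hypothetical and are not taken from the surveyed designs.

```python
# Assumed convention: (true_rail, false_rail); (0, 0) is the spacer (no data yet).
NULL = (0, 0)


def encode_dual_rail(bit):
    """Encode one data bit onto two rails."""
    return (1, 0) if bit else (0, 1)


def is_complete(word):
    """A word is complete once every bit pair has exactly one rail high."""
    return all(t ^ f for t, f in word)


def decode(word):
    """Recover the data bits from a complete dual-rail word."""
    if not is_complete(word):
        raise ValueError("word is not complete yet")
    return [t for t, _ in word]


# A receiver can acknowledge as soon as is_complete() becomes true,
# instead of waiting for a fixed worst-case delay.
word = [encode_dual_rail(b) for b in [1, 0, 1]]
partial = [(1, 0), NULL, (0, 1)]  # the middle bit has not arrived
```

Here `is_complete(word)` is true while `is_complete(partial)` is false, which is the property that lets a dual-rail circuit signal completion from the data itself.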

2005 ◽  
Vol 76 (11) ◽  
pp. 115103 ◽  
Author(s):  
Ivo Viščor ◽  
Josef Halámek ◽  
Marco Villa

Digital designs are either synchronous or asynchronous. In a synchronous design, the global clock is one of the main consumers of power, and it consumes power even when no data processing takes place. An asynchronous design is clockless and driven by the data itself, so it consumes less power than a synchronous design, which makes asynchronous design the preferred choice for low power consumption. Besides low power consumption, asynchronous design has many other advantages over synchronous design. This paper proposes a new dual-rail completion detector (CD), 3-6 CD, 2-7 CD and 1-4 CD for on-chip communication, which are widely used in asynchronous communication systems. The design of the CD is based on the principle of the sum adder. The circuit is designed using the Altera Quartus II CAD tools, and the synthesis and implementation process is executed to check the design for syntax errors. Simulation of asynchronous on-chip communication shows that the design is successful.
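
As a rough illustration of what such detectors decide, the sketch below assumes that "3-6", "2-7" and "1-4" denote 3-of-6, 2-of-7 and 1-of-4 delay-insensitive codes, in which a codeword is valid once exactly m of its n wires are high. Counting the high wires loosely mirrors the adder-based principle mentioned in the abstract, but the paper's actual circuit is not reproduced here, and the function name is hypothetical.

```python
def m_of_n_complete(wires, m):
    """Return True once exactly m of the n wires are high.

    `wires` is a sequence of 0/1 values. Under the assumed m-of-n coding,
    this condition marks the arrival of a complete codeword.
    """
    if any(w not in (0, 1) for w in wires):
        raise ValueError("wires must contain only 0 or 1")
    return sum(wires) == m


# Assumed interpretations of the detectors named in the abstract.
arrived_3_of_6 = m_of_n_complete([1, 0, 1, 0, 0, 1], 3)   # complete
pending_3_of_6 = m_of_n_complete([1, 0, 1, 0, 0, 0], 3)   # one wire still low
arrived_1_of_4 = m_of_n_complete([0, 0, 1, 0], 1)         # complete
```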


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Anitha Juliette Albert ◽  
Seshasayanan Ramachandran

Floating point multiplication is a critical part of high dynamic range and computationally intensive digital signal processing applications, which require high precision and low power. This paper presents the design of an IEEE 754 single precision floating point multiplier using the asynchronous NULL convention logic paradigm. Rounding has not been implemented to suit high precision applications. The novelty of the research is that it is the first ever NULL convention logic multiplier designed to perform floating point multiplication. The proposed multiplier offers a substantial decrease in power consumption when compared with its synchronous version. Performance attributes of the NULL convention logic floating point multiplier, obtained from Xilinx simulation and Cadence, are compared with its equivalent synchronous implementation.


2011 ◽  
Vol 20 (05) ◽  
pp. 881-898 ◽  
Author(s):  
SHANNON M. KURTAS ◽  
BARIS TASKIN

Statistical static timing analysis (SSTA) methods, which model process variations statistically as probability distribution functions rather than deterministically, have been applied thoroughly to traditional zero clock skew circuits. In traditional zero clock skew circuits, the synchronizing clock signal is designed to arrive in phase at each register. However, designers will often schedule the clock skew to different registers in order to decrease the minimum clock period of the entire circuit. Clock skew scheduling imparts very different timing constraints that are based, in part, on the topology of the circuit. In this paper, SSTA is applied to nonzero clock skew circuits in order to determine the accuracy improvement relative to their zero skew counterparts, and also to assess how the results of skew scheduling might be impacted with more accurate statistical modeling. For 99.7% timing yield (3σ variation), SSTA is observed to improve the accuracy, and therefore increase the timing margin, of nonzero clock skew circuits by up to 2.5×, and on average by 1.3×, the amount seen by zero skew circuits.


1999 ◽  
Vol 34 (12) ◽  
pp. 1821-1834 ◽  
Author(s):  
D.X.D. Yang ◽  
A.E. Gamal ◽  
B. Fowler ◽  
H. Tian

Author(s):  
Toshiyuki Dobashi ◽  
Atsushi Tashiro ◽  
Masahiro Iwahashi ◽  
Hitoshi Kiya

A tone mapping operation (TMO) for HDR images with fixed-point arithmetic is proposed. A TMO generates a low dynamic range (LDR) image from a high dynamic range (HDR) image by compressing its dynamic range. Since HDR images are generally expressed in a floating-point data format, a TMO also deals with floating-point data even though the resulting LDR images have integer data. As a result, conventional TMOs require many resources, such as computational and memory cost. To reduce these resources, an integer TMO which treats a floating-point number as two 8-bit integer numbers was proposed. However, this method limits the available input HDR image formats. The proposed method introduces an intermediate format to relieve the limitation on input formats, and extends the integer TMO to the intermediate format. The proposed integer TMO can be applied to multiple formats such as RGBE and OpenEXR. Moreover, the method can conduct all calculations in the TMO with fixed-point arithmetic. Using both integer data and fixed-point arithmetic, the method reduces not only the memory cost but also the computational cost. The experimental and evaluation results show that the proposed method reduces the computational and memory cost, and gives almost the same quality of LDR images, compared with the conventional method with floating-point arithmetic.
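
To make the "8-bit integers instead of floats" idea concrete, here is a minimal sketch of decoding one RGBE pixel. It follows the widely used Radiance RGBE convention (three 8-bit mantissas sharing one 8-bit exponent biased by 128); it is not the paper's integer TMO, and the function name is hypothetical.

```python
import math


def rgbe_to_float(r, g, b, e):
    """Decode one RGBE pixel (four 8-bit integers) to linear float RGB.

    Assumed convention: value = mantissa / 256 * 2**(e - 128),
    with e == 0 reserved for black.
    """
    if e == 0:
        return (0.0, 0.0, 0.0)
    scale = math.ldexp(1.0, e - 136)  # 2**(e - 128) / 256
    return (r * scale, g * scale, b * scale)


white = rgbe_to_float(128, 128, 128, 129)  # mantissa 0.5 times 2**1
dim = rgbe_to_float(128, 64, 0, 128)       # mantissas 0.5, 0.25, 0 times 2**0
```

Because each channel is fully described by an 8-bit mantissa and the shared 8-bit exponent, an integer pipeline can operate on these two small integers and avoid building the floating-point value at all.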


2019 ◽  
Vol 49 (1) ◽  
pp. 383-404
Author(s):  
Marcin Bednarek ◽  
Tadeusz Dąbrowski ◽  
Wiktor Olchowik

Industrial networks combine elements of distributed control systems (DCS): process stations, operator stations and engineering stations. DCS stations often communicate using a dedicated, closed communication protocol. Industrial networks can also be used to manage communication between the stations of various systems that are separate in terms of configuration. The process station communicates here with an external operator station, that is, the SCADA system. For this purpose, the process stations and the SCADA systems can also communicate according to standard communication protocols, e.g. Modbus TCP. The paper examines selected variants of diagnosing the communication status between the process station and the external operator station conducted according to the Modbus TCP protocol. Practical methods of finding the causes of communication system unfitness are discussed.
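
For readers unfamiliar with Modbus TCP, the following sketch shows the kind of frame such a diagnosis inspects. It builds a "Read Holding Registers" request (function code 0x03) with its MBAP header and checks whether a reply is an exception response. The helper names are hypothetical, and the sketch illustrates the protocol framing only, not the diagnostic methods of the paper.

```python
import struct

READ_HOLDING_REGISTERS = 0x03


def build_read_request(transaction_id, unit_id, start_addr, quantity):
    """Build a Modbus TCP request: 7-byte MBAP header followed by the PDU."""
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_addr, quantity)
    # MBAP: transaction id, protocol id (0 for Modbus),
    # length of the bytes that follow (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def exception_code(response):
    """Return the exception code of a response frame, or None if it is normal."""
    if len(response) < 9:
        raise ValueError("frame too short to be a Modbus TCP response")
    function = response[7]  # first PDU byte, right after the MBAP header
    # An exception response echoes the function code with its top bit set.
    return response[8] if function & 0x80 else None


request = build_read_request(transaction_id=1, unit_id=1, start_addr=0, quantity=10)
```

A station that answers with function code 0x83 and exception code 0x02, for example, is reachable but rejects the requested address, which is a different fault from a connection that times out.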


Author(s):  
Ashok Rathish S ◽  
Dr. Paramasivam K

Basic emulation of a communication protocol involves replicating the frames of the protocol using the port pins. This is useful when a protocol is needed inside a microcontroller that does not provide it natively. This survey is intended to give users a brief understanding of emulation before they proceed. It gives brief information on the parameters and timing involved, and on the issues and problems faced during emulation. A brief comparison is made between emulations of several communication protocols using a simple timer module. This helps in characterizing the behavior of each communication protocol on the simple timer module with which the protocol is emulated.


Author(s):  
Thorben Moos

Semiconductor technology scaling faced tough engineering challenges while moving towards and beyond the deep sub-micron range. One of the most demanding issues, limiting the shrinkage process until the present day, is the difficulty of controlling the leakage currents in nanometer-scaled field-effect transistors. Previous articles have shown that this source of energy dissipation, at least in the case of digital CMOS logic, can successfully be exploited as a side-channel to recover the secrets of cryptographic implementations. In this work, we present the first fair technology comparison with respect to static power side-channel measurements on real silicon and demonstrate that the effect of down-scaling on the potency of this security threat is huge. To this end, we designed two ASICs in sub-100 nm CMOS nodes (90 nm, 65 nm) and got them fabricated by one of the leading foundries. Our experiments, which we performed at different operating conditions, show consistently that the ASIC technology with the smaller minimum feature size (65 nm) indeed exhibits substantially more informative leakages (factor of ~10) than the 90 nm one, even though all targeted instances have been derived from identical RTL code. However, the contribution of this work extends well beyond a mere technology comparison. With respect to the real-world impact of static power attacks, we present the first realistic scenarios that allow performing a static power side-channel analysis (including noise reduction) without requiring control over the clock signal of the target. Furthermore, as a follow-up to some proof-of-concept work indicating the vulnerability of masking schemes to static power attacks, we perform a detailed study on how the reduction of the noise level in static leakage measurements affects the security provided by masked implementations.
As a result of this study, we not only find that the threat to masking schemes is indeed real, but also that common leakage assessment techniques, such as Welch's t-test, together with essentially any moment-based analysis of the leakage traces, are simply not sufficient in low-noise contexts. In fact, we are able to show that either a conversion (resp. compression) of the leakage order or the recently proposed χ² test needs to be considered in assessment and attack to avoid false negatives.
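
For reference, the Welch's t-test mentioned above can be sketched in a few lines of Python. The statistic compares only the means of two groups of measurements taken at the same sample point, which gives some intuition for why a purely moment-based check can miss other differences between the two distributions. The function name is hypothetical, and real leakage assessments apply the test to every sample point of many traces.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(group_a)  # unbiased sample variance
    var_b = statistics.variance(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_diff / math.sqrt(var_a / len(group_a) + var_b / len(group_b))


# Toy "measurements" for two input classes at one sample point.
t_value = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```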


Author(s):  
Julio Villalba ◽  
Javier Hormigo

This article proposes a family of high-radix floating-point representations to deal efficiently with floating-point addition in FPGA devices with no native floating-point support. Since variable shifter implementation (required in any FP adder) has a very high cost in FPGAs, high-radix formats considerably reduce the number of possible shifts, greatly decreasing the execution time and area. Although the high-radix format also produces a significant penalty in the implementation of multipliers, the experimental results show that the adder improvement outweighs the multiplication penalty for most practical and common cases (digital filters, matrix multiplications, etc.). We also provide the designer with guidelines on selecting a suitable radix as a function of the ratio between the number of additions and multiplications of the targeted algorithm. For applications with similar numbers of additions and multiplications, the high-radix version may be up to 26% faster while even having a wider dynamic range and using a higher number of significant bits. Furthermore, thanks to the proposed efficient converters between the standard IEEE-754 format and our internal high-radix format, the cost of the input/output conversions in FPGA accelerators is negligible.
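
The claim that a higher radix reduces the number of possible shifts can be illustrated with a deliberately simplified model. It assumes a radix of 2**k, so that the alignment shifter only moves the significand in whole k-bit digits, and it counts shift amounts from zero up to "entirely shifted out". The function name and the counting model are assumptions made for illustration and are not taken from the article.

```python
import math


def alignment_shift_amounts(significand_bits, radix_log2):
    """Count distinct alignment shifts for a significand in radix 2**radix_log2.

    Simplified model: shifts happen in whole digits of radix_log2 bits,
    ranging from 0 digits up to the full significand width.
    """
    digits = math.ceil(significand_bits / radix_log2)
    return digits + 1  # shifts of 0, 1, ..., digits


binary_shifts = alignment_shift_amounts(24, 1)   # radix 2, 24-bit significand
radix16_shifts = alignment_shift_amounts(24, 4)  # radix 16, same width
```

Under this model a 24-bit significand needs 25 shift amounts in radix 2 but only 7 in radix 16, which suggests why the variable shifter becomes much cheaper.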

