Asynchronous Floating-Point Adders and Communication Protocols: A Survey

Pallavi Srivastava; Edwin Chung; Stepan Ozana

doi:10.3390/electronics9101687

Asynchronous Floating-Point Adders and Communication Protocols: A Survey

Electronics ◽

10.3390/electronics9101687 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1687

Author(s):

Pallavi Srivastava ◽

Edwin Chung ◽

Stepan Ozana

Keyword(s):

Dynamic Range ◽

Communication Protocol ◽

Communication Protocols ◽

Floating Point ◽

Clock Signal ◽

Clock Skew ◽

Worst Case ◽

Asynchronous Design ◽

Technology Scaling ◽

Global Clock

Addition is the key operation in digital systems, and the floating-point adder (FPA) is frequently used for real-number addition because floating-point representation provides a large dynamic range. Most existing FPA designs are synchronous, with their activities coordinated by clock signal(s). However, technology scaling has imposed several challenges on synchronous design, such as clock skew and clock distribution, owing to the presence of the clock signal(s). Asynchronous design is an alternative approach that eliminates these clock-imposed challenges: it replaces the global clock with handshaking signals and uses a communication protocol to indicate the completion of activities. Bundled data and dual-rail coding are the most common communication protocols used in asynchronous design. All existing asynchronous floating-point adder (AFPA) designs use dual-rail coding for completion detection, as it allows the circuit to acknowledge as soon as the computation is done, whereas bundled-data and synchronous designs using single-rail encoding have to wait for the worst-case delay irrespective of the actual completion time. This paper reviews all existing AFPA designs and examines the effect of the selected communication protocol on their performance. It also discusses the probable outcome of AFPAs designed using protocols other than dual-rail coding.
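
As a rough illustration of why dual-rail coding supports completion detection, the sketch below models each bit as a pair of rails and treats a word as complete once every pair carries exactly one asserted rail. It is a behavioural model only, not a circuit from the surveyed paper; the function names and the (true rail, false rail) ordering are assumptions made for this example.

```python
# Behavioural sketch of dual-rail coding (hypothetical names, not from the paper).
# Each bit travels on two rails: (1, 0) means logic 1, (0, 1) means logic 0,
# and (0, 0) is the NULL spacer meaning "no data yet".
NULL = (0, 0)


def encode_bit(bit):
    """Dual-rail encode one bit as a (true_rail, false_rail) pair."""
    return (1, 0) if bit else (0, 1)


def encode_word(value, width):
    """Encode an unsigned integer as dual-rail pairs, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]


def is_complete(word):
    """Completion detection: every pair has exactly one asserted rail."""
    return all(t ^ f for t, f in word)


def decode_word(word):
    """Recover the integer once the word is complete."""
    if not is_complete(word):
        raise ValueError("word is not yet valid")
    return sum(t << i for i, (t, _) in enumerate(word))
```

Because validity is carried by the data wires themselves, a receiver can acknowledge as soon as `is_complete` holds, instead of waiting for a fixed worst-case delay as a bundled-data design would.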

Clock signal requirement for high-frequency, high dynamic range acquisition systems

Review of Scientific Instruments ◽

10.1063/1.2130937 ◽

2005 ◽

Vol 76 (11) ◽

pp. 115103 ◽

Cited By ~ 2

Author(s):

Ivo Viščor ◽

Josef Halámek ◽

Marco Villa

Keyword(s):

High Frequency ◽

Dynamic Range ◽

High Dynamic Range ◽

Clock Signal ◽

High Dynamic

Design of Completion Detectors in Asynchronous Communication System

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8576.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 3329-3334

Keyword(s):

Power Consumption ◽

Low Power ◽

Communication System ◽

Implementation Process ◽

Asynchronous Communication ◽

Digital Design ◽

Low Power Consumption ◽

Asynchronous Design ◽

Global Clock ◽

On Chip

In digital design, there are two design styles: synchronous and asynchronous. In synchronous design, the global clock is one of the main consumers of power, and the clock consumes power even when no data processing takes place. Asynchronous design is clockless and data-driven, so it consumes less power than synchronous design, which makes it the preferred choice for low power consumption. Besides low power consumption, asynchronous design has many other advantages over synchronous design. This paper proposes new dual-rail, 3-6, 2-7 and 1-4 completion detectors (CDs) for on-chip communication that are widely used in asynchronous communication systems. The CD design is based on the principle of the sum adder. The circuit is designed using Altera Quartus II CAD tools, and the synthesis and implementation process is executed to check the design for syntax errors. Simulation shows that the design works successfully for asynchronous on-chip communication.
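
A minimal sketch of the idea behind a sum-based completion detector follows. It assumes that "3-6", "2-7" and "1-4" denote m-of-n codes (3-of-6, 2-of-7, 1-of-4), in which a codeword is valid when exactly m of the n wires are high; that reading, and the function names, are assumptions for illustration rather than details taken from the paper.

```python
from math import comb


def completion_detected(wires, m):
    """Sum the wire values and compare with m: True once a full codeword is present."""
    return sum(wires) == m


def codeword_count(m, n):
    """Number of distinct symbols an m-of-n code can carry."""
    return comb(n, m)
```

Under this reading, a 3-of-6 code offers 20 codewords, a 2-of-7 code 21, and a 1-of-4 code 4, so the choice of code trades wire count against the number of symbols carried per handshake.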

NULL Convention Floating Point Multiplier

The Scientific World JOURNAL ◽

10.1155/2015/749569 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Anitha Juliette Albert ◽

Seshasayanan Ramachandran

Keyword(s):

High Precision ◽

Dynamic Range ◽

Digital Signal ◽

High Dynamic Range ◽

Floating Point ◽

Single Precision ◽

Critical Part ◽

Point Multiplication ◽

Null Convention Logic ◽

High Dynamic

Floating point multiplication is a critical part of high dynamic range and computationally intensive digital signal processing applications, which require high precision and low power. This paper presents the design of an IEEE 754 single precision floating point multiplier using the asynchronous NULL convention logic paradigm. Rounding has not been implemented, to suit high precision applications. The novelty of the research is that it is the first NULL convention logic multiplier designed to perform floating point multiplication. The proposed multiplier offers a substantial decrease in power consumption compared with its synchronous version. Performance attributes of the NULL convention logic floating point multiplier, obtained from Xilinx simulation and Cadence, are compared with those of its equivalent synchronous implementation.

STATISTICAL TIMING ANALYSIS OF THE CLOCK PERIOD IMPROVEMENT THROUGH CLOCK SKEW SCHEDULING

Journal of Circuits System and Computers ◽

10.1142/s0218126611007669 ◽

2011 ◽

Vol 20 (05) ◽

pp. 881-898 ◽

Cited By ~ 1

Author(s):

SHANNON M. KURTAS ◽

BARIS TASKIN

Keyword(s):

Timing Analysis ◽

Process Variations ◽

Clock Signal ◽

Static Timing Analysis ◽

Clock Skew ◽

Clock Period ◽

Static Timing ◽

Statistical Static Timing Analysis ◽

Clock Skew Scheduling ◽

Zero Skew

Statistical static timing analysis (SSTA) methods, which model process variations statistically as probability distribution function rather than deterministically, have been thoroughly performed on traditional zero clock skew circuits. In the traditional zero clock skew circuits, the synchronizing clock signal is designed to arrive in phase with respect to each register. However, designers will often schedule the clock skew to different registers in order to decrease the minimum clock period of the entire circuit. Clock skew scheduling imparts very different timing constraints that are based, in part, on the topology of the circuit. In this paper, SSTA is applied to nonzero clock skew circuits in order to determine the accuracy improvement relative to their zero skew counterparts, and also to assess how the results of skew scheduling might be impacted with more accurate statistical modeling. For 99.7% timing yield (3σ variation), SSTA is observed to improve the accuracy, and therefore increase the timing margin, of nonzero clock skew circuits by up to 2.5×, and on average by 1.3×, the amount seen by zero skew circuits.

A 640×512 CMOS image sensor with ultrawide dynamic range floating-point pixel-level ADC

IEEE Journal of Solid-State Circuits ◽

10.1109/4.808907 ◽

1999 ◽

Vol 34 (12) ◽

pp. 1821-1834 ◽

Cited By ~ 184

Author(s):

D.X.D. Yang ◽

A.E. Gamal ◽

B. Fowler ◽

H. Tian

Keyword(s):

Dynamic Range ◽

Cmos Image Sensor ◽

Image Sensor ◽

Floating Point

A fixed-point implementation of tone mapping operation for HDR images expressed in floating-point format

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2014.9 ◽

2014 ◽

Vol 3 ◽

Cited By ~ 10

Author(s):

Toshiyuki Dobashi ◽

Atsushi Tashiro ◽

Masahiro Iwahashi ◽

Hitoshi Kiya

Keyword(s):

Fixed Point ◽

Dynamic Range ◽

Floating Point ◽

Tone Mapping ◽

Fixed Point Arithmetic ◽

Memory Cost ◽

Point Data ◽

Point Arithmetic ◽

Intermediate Format ◽

Hdr Image

A tone mapping operation (TMO) for HDR images with fixed-point arithmetic is proposed. A TMO generates a low dynamic range (LDR) image from a high dynamic range (HDR) image by compressing its dynamic range. Since HDR images are generally expressed in a floating-point data format, a TMO also deals with floating-point data even though the resulting LDR images have integer data. As a result, conventional TMOs are costly in both computation and memory. To reduce these costs, an integer TMO which treats a floating-point number as two 8-bit integer numbers was proposed. However, this method is limited in the input HDR image formats it supports. The proposed method introduces an intermediate format to relieve this limitation, and extends the integer TMO to the intermediate format. The proposed integer TMO can be applied to multiple formats such as RGBE and OpenEXR. Moreover, the method can conduct all calculations in the TMO with fixed-point arithmetic. Using both integer data and fixed-point arithmetic, the method reduces not only the memory cost but also the computational cost. The experimental and evaluation results show that the proposed method reduces the computational and memory cost, and gives almost the same quality of LDR images, compared with the conventional method with floating-point arithmetic.

Selected Practical Aspects of Communication Diagnosis in the Industrial Network

Journal of Konbin ◽

10.2478/jok-2019-0020 ◽

2019 ◽

Vol 49 (1) ◽

pp. 383-404

Author(s):

Marcin Bednarek ◽

Tadeusz Dąbrowski ◽

Wiktor Olchowik

Keyword(s):

Control Systems ◽

Distributed Control ◽

Communication System ◽

Communication Protocol ◽

Communication Protocols ◽

Distributed Control Systems ◽

Industrial Networks ◽

Tcp Protocol ◽

Scada System ◽

Industrial Network

Industrial networks combine elements of distributed control systems (DCS): process stations, operator stations and engineering stations. DCS stations often communicate using a dedicated, closed communication protocol. Industrial networks can also be used to manage communication between the stations of various systems that are separate in terms of configuration. In this case, the process station communicates with an external operator station, namely the SCADA system. For this purpose, the process stations and the SCADA systems can also communicate according to standard communication protocols, e.g. Modbus TCP. The paper examines selected variants of diagnosing the communication status between the process station and the external operator station when communication follows the Modbus TCP protocol. Practical methods of finding the causes of communication system unfitness are discussed.
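
To make the kind of check involved in diagnosing a Modbus TCP link more concrete, the sketch below builds a Read Holding Registers request (function code 0x03) and classifies a reply by inspecting its MBAP header. It performs no network I/O and is not the diagnostic procedure from the paper; the function names and the set of returned status strings are assumptions for illustration.

```python
import struct


def build_read_holding_registers(transaction_id, unit_id, start_address, quantity):
    """Build a Modbus TCP request frame: 7-byte MBAP header plus 5-byte PDU."""
    pdu = struct.pack(">BHH", 0x03, start_address, quantity)
    # MBAP: transaction id, protocol id (always 0), length of what follows, unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def classify_response(frame, transaction_id):
    """Return a short status string for a reply frame (hypothetical categories)."""
    if len(frame) < 8:
        return "truncated"
    tid, proto, length, _unit, function = struct.unpack(">HHHBB", frame[:8])
    if proto != 0:
        return "not modbus"
    if tid != transaction_id:
        return "transaction mismatch"
    if length != len(frame) - 6:
        return "length mismatch"
    if function & 0x80:
        # Exception replies set the high bit of the function code;
        # the next byte carries the exception code.
        if len(frame) < 9:
            return "truncated"
        return f"exception {frame[8]}"
    return "ok"
```

A monitoring loop could send the request periodically over a TCP socket and log each status, so that timeouts, mismatched transactions and exception replies are distinguished rather than lumped together as "no communication".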

A Survey on Emulation of Communication Protocols in Microcontrollers

Volume 5 - 2020, Issue 9 - September - International Journal of Innovative Science and Research Technology ◽

10.38124/ijisrt20jul480 ◽

2020 ◽

Vol 5 (7) ◽

pp. 395-399

Author(s):

Ashok Rathish S ◽

Dr. Paramasivam K

Keyword(s):

Communication Protocol ◽

Communication Protocols

Basic emulation of a communication protocol involves replicating the protocol's frames using the port pins. This is useful when a microcontroller needs a particular communication protocol that is not available on it. This survey gives users brief background on emulation before they proceed, covering the parameters, the timing, and the issues and problems faced during emulation. It also briefly compares the emulation of several communication protocols using a simple timer module, which helps in drawing conclusions about how each protocol behaves when emulated on such a module.
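
As one concrete picture of replicating a frame on a port pin, the sketch below lists the pin levels for a single UART frame in the common 8N1 format and the timer ticks per bit. UART is chosen here only as a familiar example and is not named in the abstract; the function names and the timer frequency are assumptions for illustration.

```python
def uart_frame_bits(byte):
    """Pin levels for one 8N1 frame: start bit (0), 8 data bits LSB first, stop bit (1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1]


def bit_period_ticks(timer_hz, baud):
    """Timer ticks to hold each level, rounded to the nearest tick."""
    return round(timer_hz / baud)
```

On a real microcontroller, a timer interrupt firing every `bit_period_ticks` ticks would write the next level to the pin; the rounding error in that period is one source of the timing issues such emulation has to manage.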

Static Power SCA of Sub-100 nm CMOS ASICs and the Insecurity of Masking Schemes in Low-Noise Environments

IACR Transactions on Cryptographic Hardware and Embedded Systems ◽

10.46586/tches.v2019.i3.202-232 ◽

2019 ◽

pp. 202-232

Author(s):

Thorben Moos

Keyword(s):

Field Effect Transistors ◽

Low Noise ◽

Operating Conditions ◽

Side Channel ◽

Clock Signal ◽

Channel Measurements ◽

Security Threat ◽

Technology Scaling ◽

Static Power ◽

Technology Comparison

Semiconductor technology scaling faced tough engineering challenges while moving towards and beyond the deep sub-micron range. One of the most demanding issues, limiting the shrinkage process until the present day, is the difficulty of controlling the leakage currents in nanometer-scaled field-effect transistors. Previous articles have shown that this source of energy dissipation, at least in the case of digital CMOS logic, can successfully be exploited as a side-channel to recover the secrets of cryptographic implementations. In this work, we present the first fair technology comparison with respect to static power side-channel measurements on real silicon and demonstrate that the effect of down-scaling on the potency of this security threat is huge. To this end, we designed two ASICs in sub-100 nm CMOS nodes (90 nm, 65 nm) and got them fabricated by one of the leading foundries. Our experiments, which we performed at different operating conditions, show consistently that the ASIC technology with the smaller minimum feature size (65 nm) indeed exhibits substantially more informative leakages (factor of ~10) than the 90 nm one, even though all targeted instances have been derived from identical RTL code. However, the contribution of this work extends well beyond a mere technology comparison. With respect to the real-world impact of static power attacks, we present the first realistic scenarios that allow a static power side-channel analysis (including noise reduction) to be performed without requiring control over the clock signal of the target. Furthermore, as a follow-up to some proof-of-concept work indicating the vulnerability of masking schemes to static power attacks, we perform a detailed study on how the reduction of the noise level in static leakage measurements affects the security provided by masked implementations. As a result of this study, we not only find that the threat for masking schemes is indeed real, but also that common leakage assessment techniques, such as Welch's t-test, together with essentially any moment-based analysis of the leakage traces, are simply not sufficient in low-noise contexts. In fact, we are able to show that either a conversion (resp. compression) of the leakage order or the recently proposed χ² test needs to be considered in assessment and attack to avoid false negatives.

High-Radix Formats for Enhancing Floating-Point FPGA Implementations

Circuits Systems and Signal Processing ◽

10.1007/s00034-021-01855-x ◽

2021 ◽

Author(s):

Julio Villalba ◽

Javier Hormigo

Keyword(s):

Execution Time ◽

Digital Filters ◽

Dynamic Range ◽

Floating Point ◽

Input Output ◽

Point Support ◽

Point Representation ◽

High Radix ◽

The Cost ◽

Very High

This article proposes a family of high-radix floating-point representations to deal efficiently with floating-point addition in FPGA devices with no native floating-point support. Since the variable shifter required in any FP adder has a very high implementation cost in FPGAs, high-radix formats considerably reduce the number of possible shifts, greatly decreasing the execution time and area. Although the high-radix format also incurs a significant penalty in the implementation of multipliers, the experimental results show that the adder improvement outweighs the multiplication penalty for most practical and common cases (digital filters, matrix multiplications, etc.). We also provide the designer with guidelines on selecting a suitable radix as a function of the ratio between the number of additions and multiplications of the targeted algorithm. For applications with similar numbers of additions and multiplications, the high-radix version may be up to 26% faster, while also having a wider dynamic range and using a higher number of significant bits. Furthermore, thanks to the proposed efficient converters between the standard IEEE-754 format and our internal high-radix format, the cost of the input/output conversions in FPGA accelerators is negligible.
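
The reduction in possible shifts can be illustrated with a simplified count. The sketch assumes a radix of 2^k, so that significand alignment only ever shifts by whole k-bit digits, and it counts the shift distances that leave at least one significand bit in place. The function names and the 24-bit example width are assumptions for illustration; the formats in the article may use different significand lengths.

```python
def alignment_shift_options(significand_bits, digit_bits):
    """Distinct alignment shifts (0, k, 2k, ...) smaller than the significand width."""
    return -(-significand_bits // digit_bits)  # ceiling division


def shifter_stages(options):
    """Multiplexer stages a logarithmic (barrel) shifter needs to select among the options."""
    return (options - 1).bit_length()
```

With a 24-bit significand, radix 2 (k = 1) gives 24 shift options and 5 shifter stages, while radix 16 (k = 4) gives 6 options and 3 stages, which is the kind of saving the abstract attributes to high-radix formats.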
