FPGAPRO: A Defense Framework Against Crosstalk-Induced Secret Leakage in FPGA

With the growth of cloud computing, FPGAs are being integrated into cloud servers for higher performance. Recent work has explored letting multiple users share the hardware resources of a remote FPGA, i.e., execute their own applications simultaneously. Although promising, multi-tenant FPGAs bring their own security concerns. It has been demonstrated that capacitive crosstalk between FPGA long-wires can serve as a side channel for extracting secret information, giving adversaries the opportunity to mount crosstalk-based side-channel attacks. Moreover, recent work reveals that medium-wires and multiplexers in the configurable logic block (CLB) are also vulnerable to crosstalk-based information leakage. In this work, we propose FPGAPRO: a defense framework leveraging Placement, Routing, and Obfuscation to mitigate secret leakage on FPGA components, including long-wires, medium-wires, and logic elements in the CLB. As a user-friendly defense strategy, FPGAPRO focuses on protecting security-sensitive instances while taking critical path delay into account to maintain performance. As a proof of concept, the experimental results demonstrate that FPGAPRO can reduce crosstalk-induced side-channel leakage by a factor of 138. In addition, the performance analysis shows that the strategy keeps the design free of timing violations at its maximum frequency.

Exploring Shared SRAM Tables in FPGAs for Larger LUTs and Higher Degree of Sharing

International Journal of Reconfigurable Computing ◽

10.1155/2017/7021056 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Ali Asghar ◽

Muhammad Mazher Iqbal ◽

Waqar Ahmed ◽

Mujahid Ali ◽

Husain Parvez ◽

...

Keyword(s):

High Performance ◽

Critical Path ◽

Path Delay ◽

Gate Arrays ◽

Area Reduction ◽

Area Overhead ◽

Logic Block ◽

Field Programmable ◽

Boolean Matching ◽

Programmable Gate Arrays

In modern SRAM-based Field Programmable Gate Arrays, the Look-Up Table (LUT) is the principal logic element and can realize every possible Boolean function of its inputs. However, this flexibility comes with a heavy area penalty. Part of this area overhead comes from the configuration memory, which grows exponentially with LUT size. In this paper, we first present a detailed analysis of a previously proposed FPGA architecture that allows LUT memory (SRAM) tables to be shared among NPN-equivalent functions, reducing both the area and the number of configuration bits. We then propose several methods to improve the existing architecture. A new clustering technique is proposed that packs NPN-equivalent functions together inside a Configurable Logic Block (CLB). We also make use of a recently proposed high-performance Boolean matching algorithm to perform NPN classification. To enhance area savings further, we evaluate the feasibility of more than two LUTs sharing the same SRAM table. Consequently, this work explores the SRAM table sharing approach for a range of LUT sizes (4–7), while varying the cluster sizes (4–16). Experimental results on the MCNC benchmark circuit set show an overall area reduction of ~7% while maintaining the same critical path delay.
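To make the notion of NPN equivalence concrete, here is a minimal brute-force sketch. It is not the Boolean matching algorithm used in the paper, which the abstract does not describe; the function names `npn_canonical` and `npn_equivalent` and the integer truth-table encoding are assumptions made for illustration. Two functions are NPN-equivalent when one can be turned into the other by negating inputs, permuting inputs, and optionally negating the output, so both map to the same smallest truth table in their class.

```python
from itertools import permutations


def npn_canonical(tt: int, n: int) -> int:
    """Return the smallest truth table in the NPN class of an n-input function.

    tt encodes the function as an integer: bit x of tt is f(x), where bit i
    of x is the value of input i. Brute force, so only practical for small n.
    """
    rows = 1 << n
    mask = (1 << rows) - 1                   # all-ones table, used to negate the output
    best = mask
    for perm in permutations(range(n)):      # P: permute the inputs
        for neg in range(rows):              # N: complement a subset of the inputs
            out = 0
            for x in range(rows):
                y = x ^ neg
                z = 0
                for i in range(n):           # move bit i of y to position perm[i]
                    if (y >> i) & 1:
                        z |= 1 << perm[i]
                if (tt >> z) & 1:
                    out |= 1 << x
            best = min(best, out, out ^ mask)  # N: optional output negation
    return best


def npn_equivalent(f: int, g: int, n: int) -> bool:
    """True when f and g fall in the same NPN class."""
    return npn_canonical(f, n) == npn_canonical(g, n)


# 2-input truth tables (hypothetical example values).
AND2, OR2, XOR2 = 0b1000, 0b1110, 0b0110
print(npn_equivalent(AND2, OR2, 2))    # AND and OR are related by De Morgan's law
print(npn_equivalent(AND2, XOR2, 2))
```

Functions in the same class could in principle read the same SRAM table through different input and output inversions, which is the kind of sharing the abstract refers to.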

Implementation of Word Level Parallel Processing Unfolding Algorithm using VHDL

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8099.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 664-667

Keyword(s):

Parallel Processing ◽

Impulse Response ◽

Integrated Circuit ◽

High Speed ◽

Finite Impulse Response ◽

Critical Path ◽

Experimental Result ◽

Infinite Impulse Response ◽

Path Delay ◽

Iir Filter

The aim of this paper is to apply the unfolding algorithm to FIR (Finite Impulse Response) and IIR (Infinite Impulse Response) filters and to compare the results with the original filters and with parallel-processing filter architectures. The FIR and IIR filters are implemented in VHDL (Very High Speed Integrated Circuit Hardware Description Language). In this paper, 2-parallel and 3-parallel processing versions of the FIR and IIR filters are implemented, and the FIR and IIR filters are also implemented with unfolding factors 2 and 3 using VHDL. The simulation targets an Artix-7 series FPGA (device xc7a200tfbg676, speed grade -1) using Vivado 2016.3. The implemented design runs on a 1200 kHz clock, whereas the parallel inputs are generated on a 3600 kHz clock. The proposed technique reduces the critical path delay in comparison with the existing literature. The experimental results also show that the 3-unfolded IIR filter is faster than the 3-parallel IIR filter.
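As a rough illustration of the unfolding transformation itself (not of the paper's VHDL design), the sketch below applies the standard textbook rule: for an unfolding factor J, every edge U → V carrying w delays becomes J edges U_i → V_((i + w) mod J), each carrying floor((i + w) / J) delays. The graph encoding and the node names `A` and `M` are assumptions chosen for the example.

```python
def unfold(edges, J):
    """Unfold a data-flow graph by a factor J.

    edges: list of (src, dst, delays) tuples of the original graph.
    Returns the edges of the unfolded graph; node copies are named (name, i)
    for i in 0..J-1.
    """
    unfolded = []
    for u, v, w in edges:
        for i in range(J):
            unfolded.append(((u, i), (v, (i + w) % J), (i + w) // J))
    return unfolded


# First-order IIR recursion y[n] = a * y[n-1] + x[n], modelled with an adder A
# and a multiplier M: A feeds M through one delay, M feeds A directly.
iir_edges = [("A", "M", 1), ("M", "A", 0)]

for edge in unfold(iir_edges, 3):
    print(edge)
```

A useful sanity check is that unfolding preserves the total number of delays: the three copies of the A → M edge carry one delay between them, just like the original edge.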

Design and Implementation of a Farrow-Interpolator-Based Digital Front-End in LTE Receivers for Carrier Aggregation

Electronics ◽

10.3390/electronics10030231 ◽

2021 ◽

Vol 10 (3) ◽

pp. 231

Author(s):

Chester Sungchung Park ◽

Sunwoo Kim ◽

Jooho Wang ◽

Sungkyung Park

Keyword(s):

Integrated Circuit ◽

Building Block ◽

Orthogonal Frequency Division Multiplexing ◽

Critical Path ◽

Phase Error ◽

System Level ◽

Comb Filter ◽

Carrier Aggregation ◽

Path Delay ◽

Front End

A digital front-end decimation chain based on both a Farrow interpolator for fractional sample-rate conversion and a digital mixer is proposed in order to comply with the Long-Term Evolution (LTE) standards in radio receivers with ten frequency modes. Design requirement specifications for adjacent channel selectivity, inband blockers, and narrowband blockers are all satisfied, so the proposed digital front-end is 3GPP-compliant. Furthermore, the proposed digital front-end addresses carrier aggregation in the standards via appropriate frequency translations. The digital front-end has a cascaded integrator comb filter prior to the Farrow interpolator and also has a per-carrier carrier aggregation filter and channel selection filter following the digital mixer. A Farrow interpolator with integrate-and-dump circuitry controlled by a condition signal is proposed, along with a digital mixer with periodic reset to prevent phase error accumulation. From the standpoint of design methodology, three models are developed for the overall digital front-end, namely, functional models, cycle-accurate models, and bit-accurate models. Performance is verified by means of the cycle-accurate model and subsequently, by means of a special C++ class, the bitwidths are minimized in a methodical manner for area minimization. For system-level performance verification, the orthogonal frequency division multiplexing receiver is also modeled. The critical path delay of each building block is analyzed and the spectral-domain view is obtained for each building block of the digital front-end circuitry. The proposed digital front-end circuitry is simulated and designed, and is both synthesized in a 180 nm CMOS application-specific integrated circuit technology and implemented on the Xilinx XC6VLX550T field-programmable gate array (Xilinx, San Jose, CA, USA).
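The cascaded integrator comb (CIC) stage mentioned in the abstract can be sketched in a few lines. This is a generic textbook CIC decimator with a differential delay of one, not the authors' front-end; the function name `cic_decimate` and the parameter values are assumptions for illustration. N integrators run at the input rate, the signal is downsampled by R, and N comb (difference) stages run at the output rate.

```python
def cic_decimate(samples, R, N=1):
    """Decimate integer samples by R with an N-stage CIC filter."""
    x = list(samples)
    for _ in range(N):                 # integrator stages at the input rate
        acc, y = 0, []
        for v in x:
            acc += v
            y.append(acc)
        x = y
    x = x[R - 1::R]                    # keep every R-th sample
    for _ in range(N):                 # comb stages at the output rate
        prev, y = 0, []
        for v in x:
            y.append(v - prev)
            prev = v
        x = y
    return x


# A constant input settles to the DC gain R**N once the filter has filled.
print(cic_decimate([1] * 12, R=4, N=2))   # → [10, 16, 16]
```

The DC gain of R**N is the reason bit-widths grow inside a CIC stage, which connects to the bit-width minimization step the abstract describes.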

Using Participatory Design to Design an eHealth Well-Being Program

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18147250 ◽

2021 ◽

Vol 18 (14) ◽

pp. 7250

Author(s):

Yannick van Hierden ◽

Timo Dietrich ◽

Sharyn Rundle-Thiele

Keyword(s):

Best Practices ◽

Health Professionals ◽

Participatory Design ◽

Online Community ◽

Well Being ◽

Proof Of Concept ◽

End User ◽

Ehealth Intervention ◽

Pilot Intervention ◽

User Friendly

In recent years, the relevance of eHealth interventions has become increasingly evident. However, a sequential procedural application to cocreating eHealth interventions is currently lacking. This paper demonstrates the implementation of a participatory design (PD) process to inform the design of an eHealth intervention aiming to enhance well-being. PD sessions were conducted with 57 people across four sessions. Within PD sessions participants experienced prototype activities, provided feedback and designed program interventions. A 5-week eHealth well-being intervention focusing on lifestyle, habits, physical activity, and meditation was proposed. The program is suggested to be delivered through online workshops and online community interaction. A five-step PD process emerged; namely, (1) collecting best practices, (2) participatory discovery, (3) initial proof-of-concept, (4) participatory prototyping, and (5) pilot intervention proof-of-concept finalisation. Health professionals, behaviour change practitioners and program planners can adopt this process to ensure end-user cocreation using the five-step process. The five-step PD process may help to create user-friendly programs.

Power Side-Channel Analysis of RNS GLV ECC Using Machine and Deep Learning Algorithms

ACM Transactions on Internet Technology ◽

10.1145/3423555 ◽

2021 ◽

Vol 21 (3) ◽

pp. 1-20

Author(s):

Mohamad Ali Mehrabi ◽

Naila Mukhtar ◽

Alireza Jolfaei

Keyword(s):

Deep Learning ◽

Elliptic Curve ◽

Smart Cities ◽

Information Leakage ◽

Side Channel ◽

Side Channel Attacks ◽

Public Key Cryptosystems ◽

Elliptic Curve Cryptosystems ◽

Hardware Implementations ◽

Substantial Progress

Many Internet of Things applications in smart cities use elliptic-curve cryptosystems due to their efficiency compared to other well-known public-key cryptosystems such as RSA. One of the important components of an elliptic-curve-based cryptosystem is the elliptic-curve point multiplication which has been shown to be vulnerable to various types of side-channel attacks. Recently, substantial progress has been made in applying deep learning to side-channel attacks. Conceptually, the idea is to monitor a core while it is running encryption for information leakage of a certain kind, for example, power consumption. The knowledge of the underlying encryption algorithm can be used to train a model to recognise the key used for encryption. The model is then applied to traces gathered from the crypto core in order to recover the encryption key. In this article, we propose an RNS GLV elliptic curve cryptography core which is immune to machine learning and deep learning based side-channel attacks. The experimental analysis confirms the proposed crypto core does not leak any information about the private key and therefore it is suitable for hardware implementations.

High Efficiency Generalized Parallel Counters for Look-Up Table Based FPGAs

International Journal of Reconfigurable Computing ◽

10.1155/2015/518272 ◽

2015 ◽

Vol 2015 ◽

pp. 1-16 ◽

Cited By ~ 4

Author(s):

Burhan Khurshid ◽

Roohie Naaz Mir

Keyword(s):

Power Dissipation ◽

High Speed ◽

High Efficiency ◽

Critical Path ◽

Fir Filters ◽

Path Delay ◽

Look Up Table ◽

Improved Performance ◽

Ip Cores ◽

Low Efficiency

Generalized parallel counters (GPCs) are used in constructing high-speed compressor trees. Prior work has focused on utilizing the fast carry chain and mapping the logic onto Look-Up Tables (LUTs). This mapping is not optimal in the sense that the LUT fabric is not fully utilized, which results in low-efficiency GPCs. In this work, we present a heuristic that efficiently maps the GPC logic onto the LUT fabric. We have used our heuristic on various GPCs and have achieved an improvement in efficiency ranging from 33% to 100% in most of the cases. Experimental results using Xilinx 5th-, 6th-, and 7th-generation FPGAs and Stratix IV and V devices from Altera show a considerable reduction in resource utilization and dynamic power dissipation, for almost the same critical path delay. We have also implemented GPC-based FIR filters on 7th-generation Xilinx FPGAs using our proposed heuristic and compared their performance against conventional implementations. Implementations based on our heuristic show improved performance. Comparisons are also made against filters based on integrated DSP blocks and inherent IP cores from Xilinx. The results show that the proposed heuristic provides performance that is comparable to the structures based on these specialized resources.
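For readers unfamiliar with GPCs, the following behavioural sketch shows what one computes; it says nothing about the LUT mapping heuristic, which the abstract does not detail. A GPC takes several input bits in each of a few adjacent weight columns and outputs their weighted sum as a binary number. The function name `gpc` and the list-of-columns encoding are assumptions for illustration.

```python
def gpc(columns):
    """Behavioural model of a generalized parallel counter.

    columns[i] is the list of input bits with weight 2**i.
    Returns the weighted sum as a list of output bits, least significant first.
    """
    total = sum(sum(bits) << i for i, bits in enumerate(columns))
    # The output width is fixed by the largest sum the input shape can produce.
    max_total = sum(len(bits) << i for i, bits in enumerate(columns))
    width = max(1, max_total.bit_length())
    return [(total >> k) & 1 for k in range(width)]


# Three bits of weight 1 and two bits of weight 2: the largest sum is 7,
# so three output bits suffice.
print(gpc([[1, 0, 1], [1, 1]]))   # 2 + 4 = 6 → [0, 1, 1]
```

A plain full adder is the special case of a single column with three input bits and two output bits.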

Layout-Aware Critical Path Delay Test Under Maximum Power Supply Noise Effects

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2011.2163159 ◽

2011 ◽

Vol 30 (12) ◽

pp. 1923-1934 ◽

Cited By ~ 18

Author(s):

Junxia Ma ◽

Mohammad Tehranipoor

Keyword(s):

Power Supply ◽

Critical Path ◽

Maximum Power ◽

Path Delay ◽

Power Supply Noise ◽

Delay Test ◽

Noise Effects ◽

Critical Path Delay ◽

Supply Noise ◽

Path Delay Test

Exploring Linear Structures of Critical Path Delay Faults to Reduce Test Efforts

2006 IEEE/ACM International Conference on Computer Aided Design ◽

10.1109/iccad.2006.320072 ◽

2006 ◽

Author(s):

Shun-yen Lu ◽

Pei-ying Hsieh ◽

Jing-jia Liou

Keyword(s):

Critical Path ◽

Delay Faults ◽

Path Delay ◽

Path Delay Faults ◽

Linear Structures ◽

Critical Path Delay

A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽

10.11591/ijres.v10.i1.pp1-10 ◽

2021 ◽

Vol 10 (1) ◽

pp. 1

Author(s):

Kenta Shirane ◽

Takahiro Yamamoto ◽

Hiroyuki Tomiyama

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Design Methodology ◽

Critical Path ◽

High Accuracy ◽

Path Delay ◽

Trade Off ◽

Critical Path Delay

In this paper, we present a case study on approximate multipliers for an MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-widths to the convolution layer of the MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between the approximate multiplier's area, critical path delay, and accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of a subset of the partial products, carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers while keeping the MNIST classification accuracy high.
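The idea of building a multiplier from only some of the partial products can be sketched as follows. The selection rule used here (dropping the lowest-weight columns) is a hypothetical stand-in, since the abstract does not say which partial products the authors keep; `approx_multiply` and `kept` are likewise illustrative names.

```python
def approx_multiply(a, b, kept):
    """Sum only the partial products a_i * b_j * 2**(i + j) with (i, j) in kept."""
    return sum(((a >> i) & 1) * ((b >> j) & 1) << (i + j) for i, j in kept)


WIDTH = 4
all_pairs = [(i, j) for i in range(WIDTH) for j in range(WIDTH)]

# Hypothetical selection: drop every partial product of weight below 2**2.
kept = [(i, j) for i, j in all_pairs if i + j >= 2]

print(approx_multiply(13, 11, all_pairs))   # exact product → 143
print(approx_multiply(13, 11, kept))        # approximate product → 140
```

Every dropped partial product removes one AND gate and shrinks the adder tree, which is where the area and delay savings come from; the cost is an error that is bounded by the total weight of the dropped terms.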

Novel Design of Low-Power High-Speed Hybrid Full Adder Design using Gate Diffusion Input (GDI) Technique

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l7992.1091220 ◽

2020 ◽

Vol 9 (12) ◽

pp. 323-328

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

Critical Path ◽

Circuit Simulation ◽

Full Adder ◽

Cmos Process ◽

Path Delay ◽

Process Technology ◽

Xnor Gate

VLSI technology has become one of the most significant and in-demand technologies because of characteristics such as device portability, device size, large number of features, cost, reliability, speed, and many others. Multipliers and adders play an important role in various digital systems such as computers, process controllers, and signal processors in achieving high speed and low power. Two-input XOR/XNOR gates and 2:1 multiplexer modules are used to design the hybrid full adders. The XOR/XNOR gate is the key contributor to power consumption in the full adder cell. However, this circuit increases the delay, area, and critical path delay. Hence, an optimum design of the XOR/XNOR gate is required to reduce the power consumption of the full adder cell. Six new hybrid full adder circuits are therefore proposed, based on novel full-swing XOR/XNOR gates and a new Gate Diffusion Input (GDI) design of the full adder with high-swing outputs. Speed, power consumption, power-delay product, and driving capability are the merits of each of the proposed circuits. The circuit simulation was carried out using the Cadence Virtuoso EDA tool. The simulation results are based on a 90 nm CMOS process technology model.
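At the logic level, a hybrid full adder built from an XOR/XNOR gate and 2:1 multiplexers can be modelled as below. This is a common textbook arrangement and only a sketch; it does not reproduce any of the six proposed circuits or their transistor-level GDI implementation, and the function names are assumptions.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def full_adder(a, b, cin):
    """Full adder from one XOR/XNOR pair and two 2:1 multiplexers."""
    x = a ^ b                  # XOR output
    xn = x ^ 1                 # XNOR output
    s = mux2(cin, x, xn)       # sum = a XOR b XOR cin
    cout = mux2(x, a, cin)     # carry = a when a == b, otherwise cin
    return s, cout


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, full_adder(a, b, cin))
```

Because both multiplexers are driven by the XOR/XNOR outputs, the quality of that gate dominates the cell, which is consistent with the abstract's focus on optimizing it.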
