AMROFloor: An Efficient Aging Mitigation and Resource Optimization Floorplanner for Virtual Coarse-Grained Runtime Reconfigurable FPGAs

Electronics ◽  
2022 ◽  
Vol 11 (2) ◽  
pp. 273
Zeyu Li ◽  
Zhao Huang ◽  
Quan Wang ◽  
Junjie Wang

With the rapid reduction of CMOS process size, FPGAs built with high-silicon accumulation technology are becoming more sensitive to aging effects, which reduces the reliability and service life of the device. Offline aging-aware layout planning based on stress balancing is an effective solution. However, existing methods take a long time to solve the floorplanning problem, and the resulting layout solutions occupy many on-chip resources. To this end, we propose an efficient Aging Mitigation and Resource Optimization Floorplanner (AMROFloor) for FPGAs. First, the layout solution is implemented on the Virtual Coarse-Grained Runtime Reconfigurable Architecture, which helps avoid rule constraints for placement and routing. Second, the Maximize Reconfigurable Regions Algorithm (MRRA) is proposed to quickly determine the number and size of the reconfigurable regions (RRs), reducing the solving time while ensuring an effective solution. Furthermore, the Resource Combination Algorithm (RCA) is proposed to optimize the on-chip resources, reducing the on-Chip Resource Utilization (CRU) while achieving the same aging relief effect. Experiments were simulated and implemented on a Xilinx FPGA. The results demonstrate that AMROFloor extends the Mean Time to Failure (MTTF) by 13.8% and reduces the resource overhead by 19.2% on average compared to existing aging-aware layout solutions.

2018 ◽  
Vol 27 (11) ◽  
pp. 1850180 ◽  
Xinchao Shang ◽  
Weiwei Shan ◽  
Xinning Liu

Countermeasures against side-channel attacks (SCA) have become necessary in hardware security, and the need to support multiple crypto algorithms on a single chip is increasing. We propose a reconfigurable crypto coprocessor that not only supports multiple crypto algorithms but also provides multiple effective SCA countermeasures against SPA, DPA and EMA by making use of its own reconfigurable features rather than using extra resources. The countermeasure methods include several global and encryption-flow-related countermeasures, which can also be reconfigured along with the circuit function. This coprocessor is a coarse-grained reconfigurable architecture composed of several reconfigurable modules, such as logic arithmetic, shift, modular add/subtract, permutation, S-box and modular multiplication units. The coprocessor is integrated into a system-on-chip with a 32-bit CPU and fabricated in a 0.18 μm CMOS process with a 1.8 V supply and a 100 MHz maximum frequency. Experimental results show that it can successfully resist SPA and DPA with one million power traces. As for EMA, with full countermeasures enabled, it can resist EMA with up to 1.2 million electromagnetic traces without revealing the right subkey. Thus, this reconfigurable coprocessor can provide a good solution for both supporting multiple algorithms and providing SCA resistance, with no frequency impact, negligible area overhead and small power overhead.

2013 ◽  
Vol 543 ◽  
pp. 176-179 ◽  
D.Q. Zhao ◽  
Xia Zhang ◽  
P. Liu ◽  
F. Yang ◽  
C. Lin ◽  

In this work we studied the fabrication of a monolithic bimaterial micro-cantilever resonant IR sensor with on-chip drive circuits. The effects of the high-temperature process and stress-induced performance degradation were investigated. The post-CMOS MEMS (micro-electro-mechanical system) fabrication process of this IR sensor is the focus of this paper, starting from theoretical analysis and simulation and then moving to experimental verification. The capacitive cantilever structure was fabricated by surface micromachining, and the drive circuits were prepared by a standard CMOS process. The stress introduced by MEMS films, such as the tensile silicon nitride that serves as a contact etch stop layer for the MOSFETs and a release stop layer for the MEMS structure, increases NMOS electron mobility but decreases PMOS hole mobility. Moreover, the NMOS threshold voltage (Vth) shifts and the transconductance (Gm) degrades. An additional step that selectively removes the silicon nitride capping layer and the polysilicon layer over the IC area was inserted into the standard CMOS process to lower the stress in the MOSFET channel regions. Selectively removing the silicon nitride and polysilicon before annealing can avoid 77% of the Vth shift and 86% of the Gm loss.

Electronics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 68
Woorham Bae ◽  
Sung-Yong Cho ◽  
Deog-Kyoon Jeong

This paper presents a fully integrated Peripheral Component Interconnect (PCI) Express (PCIe) Gen4 physical layer (PHY) transmitter. The prototype chip is fabricated in a 28 nm low-power CMOS process, and the active area of the proposed transmitter is 0.23 mm². To enable voltage scaling across wide operating rates from 2.5 Gb/s to 16 Gb/s, two on-chip supply regulators are included in the transmitter. At the same time, the regulators maintain the output impedance of the transmitter to meet the return loss specification of PCIe by including replica segments of the output driver and a reference resistance in the regulator loop. Three-tap finite-impulse-response (FIR) equalization is implemented, so the transmitter provides more than 9.5 dB of equalization, as required by the PCIe specification. At 16 Gb/s, the prototype chip achieves an energy efficiency of 1.93 pJ/bit, including all the interface, bias, and built-in self-test circuits.
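As a quick consistency check on the figures above, energy per bit multiplied by the data rate gives the average power implied at that rate. The sketch below is illustrative only: the function name `link_power_mw` is an assumption made for this example and does not come from the paper.

```python
def link_power_mw(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Average power in mW implied by an energy-per-bit figure at a data rate.

    Hypothetical helper for illustration; not part of the paper.
    """
    # pJ/bit * Gb/s = (1e-12 J/bit) * (1e9 bit/s) = 1e-3 W, i.e. milliwatts
    return energy_pj_per_bit * rate_gbps


# Figures quoted in the abstract: 1.93 pJ/bit at 16 Gb/s
power = link_power_mw(1.93, 16.0)
print(f"{power:.2f} mW")
```

With the abstract's numbers this comes to roughly 30.9 mW for the whole transmitter, including the interface, bias, and built-in self-test circuits.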

Cesar A. López ◽  
Animesh Agarwal ◽  
Que N. Van ◽  
Andrew G. Stephen ◽  
S. Gnanakaran

Small GTPase proteins are ubiquitous and responsible for regulating several processes related to cell growth and differentiation. Mutations that stabilize their active state can lead to uncontrolled cell proliferation and cancer. Although these proteins are well characterized at the cellular scale, the molecular mechanisms governing their functions are still poorly understood. In addition, there is limited information about the regulatory function of the cell membrane which supports their activity. Thus, we have studied the dynamics and conformations of the farnesylated KRAS4b in various membrane model systems, ranging from binary fluid mixtures to heterogeneous raft mimics. Our approach combines long time-scale coarse-grained (CG) simulations and Markov state models to dissect the membrane-supported dynamics of KRAS4b. Our simulations reveal that protein dynamics is mainly modulated by the presence of anionic lipids and to some extent by the nucleotide state (activation) of the protein. In addition, our results suggest that both the farnesyl and the polybasic hypervariable region (HVR) are responsible for its preferential partitioning within the liquid-disordered (Ld) domains in membranes, potentially enhancing the formation of membrane-driven signaling platforms.

Dennis Wolf ◽  
Andreas Engel ◽  
Tajas Ruschke ◽  
Andreas Koch ◽  
Christian Hochberger

Coarse Grained Reconfigurable Arrays (CGRAs) or Architectures are a concept for hardware accelerators based on the idea of distributing workload over Processing Elements. These processors exploit instruction level parallelism, while being energy efficient due to their simplistic internal structure. However, the incorporation into a complete computing system raises severe challenges at the hardware and software level. This article evaluates a CGRA integrated into a control engineering environment targeting a Xilinx Zynq System on Chip (SoC) in detail. Besides the actual application execution performance, the practicability of the configuration toolchain is validated. Challenges of the real-world integration are discussed and practical insights are highlighted.

2021 ◽  
Vol 64 (6) ◽  
pp. 107-116
Yakun Sophia Shao ◽  
Jason Clemons ◽  
Rangharajan Venkatesan ◽  
Brian Zimmer ◽  
Matthew Fojtik ◽  

Package-level integration using multi-chip-modules (MCMs) is a promising approach for building large-scale systems. Compared to a large monolithic die, an MCM combines many smaller chiplets into a larger system, substantially reducing fabrication and design costs. Current MCMs typically only contain a handful of coarse-grained large chiplets due to the high area, performance, and energy overheads associated with inter-chiplet communication. This work investigates and quantifies the costs and benefits of using MCMs with fine-grained chiplets for deep learning inference, an application domain with large compute and on-chip storage requirements. To evaluate the approach, we architected, implemented, fabricated, and tested Simba, a 36-chiplet prototype MCM system for deep-learning inference. Each chiplet achieves 4 TOPS peak performance, and the 36-chiplet MCM package achieves up to 128 TOPS and up to 6.1 TOPS/W. The MCM is configurable to support a flexible mapping of DNN layers to the distributed compute and storage units. To mitigate inter-chiplet communication overheads, we introduce three tiling optimizations that improve data locality. These optimizations achieve up to 16% speedup compared to the baseline layer mapping. Our evaluation shows that Simba can process 1988 images/s running ResNet-50 with a batch size of one, delivering an inference latency of 0.50 ms.
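The throughput and latency figures quoted above are consistent with each other if batches are processed strictly one after another. The short sketch below checks that arithmetic; the helper `latency_ms` and the no-overlap assumption are ours, not the paper's.

```python
def latency_ms(images_per_second: float, batch_size: int = 1) -> float:
    """Per-batch latency in milliseconds.

    Assumes batches are processed back to back with no pipelining overlap,
    so latency is simply batch_size / throughput. Hypothetical helper.
    """
    return 1000.0 * batch_size / images_per_second


# Figures quoted in the abstract: 1988 images/s at batch size one
print(f"{latency_ms(1988):.2f} ms")
```

Under that assumption, 1988 images/s at batch size one corresponds to about 0.50 ms per image, matching the reported latency.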

2013 ◽  
Vol 303-306 ◽  
pp. 2284-2288
Fang Yan ◽  
Yu An Tan

The world is increasingly awash in unstructured data. Object-based data de-duplication is currently the most advanced method and an effective solution for detecting duplicate data. We developed an energy-saving policy for conventional disk-based RAID systems. Based on the characteristics of object-based data de-duplication, we introduce object layout strategies for unstructured data applications: disk accesses are concentrated on a subset of the disks for long periods, which makes it possible to schedule the other disks into standby or shutdown mode. Our proposed methods reduce the energy consumption of de-duplication storage systems.
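The abstract does not say how duplicate objects are detected. One common approach, shown here purely as an assumption, is to fingerprint each object's payload with a cryptographic hash and store a payload only the first time its fingerprint is seen. The class `DedupStore` and its methods are hypothetical names for this sketch and are not taken from the paper.

```python
import hashlib


class DedupStore:
    """Toy in-memory object store that keeps one copy of each distinct payload.

    Illustrative only: content hashing is an assumed detection method,
    not the scheme described in the paper.
    """

    def __init__(self):
        self._objects = {}  # fingerprint -> payload bytes
        self._names = {}    # object name -> fingerprint

    def put(self, name, payload):
        """Store an object; return True if its payload was already present."""
        fingerprint = hashlib.sha256(payload).hexdigest()
        duplicate = fingerprint in self._objects
        if not duplicate:
            self._objects[fingerprint] = payload
        self._names[name] = fingerprint
        return duplicate

    def get(self, name):
        """Return the payload stored under the given object name."""
        return self._objects[self._names[name]]

    def unique_count(self):
        """Number of distinct payloads physically stored."""
        return len(self._objects)


store = DedupStore()
store.put("report-v1", b"quarterly numbers")
store.put("report-copy", b"quarterly numbers")  # same payload, stored once
print(store.unique_count())
```

Because each distinct payload is stored only once, a layout policy has fewer physical objects to place across disks.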

2020 ◽  
Amanda C. Northrop ◽  
Vanessa Avalone ◽  
Aaron M. Ellison ◽  
Bryan A. Ballif ◽  
Nicholas J. Gotelli

Incremental increases in a driver variable, such as nutrients or detritus, can trigger abrupt shifts in aquatic ecosystems. Once these ecosystems change state, a simple reduction in the driver variable may not return them to their original state. Because of the long time scales involved, we still have a poor understanding of the dynamics of ecosystem recovery after a state change. A model system for understanding ecosystem recovery is the aquatic microecosystem that inhabits the cup-shaped leaves of the pitcher plant Sarracenia purpurea. With enrichment of organic matter, this system flips within 1 to 3 days from an oxygen-rich state to an oxygen-poor (hypoxic) state. In a replicated greenhouse experiment, we enriched pitcher plant leaves at different rates with bovine serum albumin (BSA), a molecular substitute for detritus. Changes in dissolved oxygen ([O2]) and undigested BSA concentration were monitored during enrichment and recovery phases. At low enrichment rates, ecosystems showed a substantial lag in the recovery of [O2] (clockwise hysteresis). At intermediate enrichment rates, [O2] tracked the levels of undigested BSA with the same profile during the enrichment and recovery phases (no hysteresis). At high enrichment rates, we observed a novel response: changes in [O2] were proportionally larger during the recovery phase than during the enrichment phase (counter-clockwise hysteresis). These experiments demonstrate that detrital enrichment rate can modulate a diversity of hysteretic responses in a single aquatic ecosystem. With counter-clockwise hysteresis, rapid reduction of a driver variable following high enrichment rates may be a viable restoration strategy.
