Similarity-Aware Architecture/Compiler Co-Designed Context-Reduction Framework for Modulo-Scheduled CGRA

Electronics ◽  
2021 ◽  
Vol 10 (18) ◽  
pp. 2210
Author(s):  
Zhongyuan Zhao ◽  
Weiguang Sheng ◽  
Jinchao Li ◽  
Pengfei Ye ◽  
Qin Wang ◽  
...  

Modulo-scheduled coarse-grained reconfigurable array (CGRA) processors have shown their potential for exploiting loop-level parallelism at high energy efficiency. However, these CGRAs need frequent reconfiguration during execution, which incurs large area and power overhead for context memory and context fetching. To tackle this challenge, this paper uses an architecture/compiler co-designed method for context reduction. From the architecture perspective, we carefully partition the context into several subsections and, whenever a new context is fetched, fetch only the subsections that differ from the previous context word. We package each differing subsection with an opcode and index value to form a context-fetching primitive (CFP) and explore the hardware design space by providing centralized and distributed CFP-fetching CGRAs to support this CFP-based context-fetching scheme. From the software side, we develop a similarity-aware tuning algorithm and integrate it into state-of-the-art modulo scheduling and memory access conflict optimization algorithms. The whole compilation flow efficiently improves the similarity between contexts in each PE, reducing both context-fetching latency and context footprint. Experimental results show that our HW/SW co-designed framework improves area efficiency and energy efficiency by up to 34% and 21%, respectively, with only 2% performance overhead.
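
The subsection-level fetching idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the 32-bit context word, the four 8-bit subsections, the `"UPDATE"` opcode, and the function names are all assumptions chosen for illustration. The sketch only shows the general pattern of comparing a new context word against the previous one and emitting one (opcode, index, value) primitive per subsection that changed.

```python
from typing import List, Tuple

# Hypothetical layout (assumption): a 32-bit context word split into
# four 8-bit subsections. The abstract does not give the real layout.
SUBSECTION_BITS = 8
NUM_SUBSECTIONS = 4
MASK = (1 << SUBSECTION_BITS) - 1

# A context-fetching primitive modeled as (opcode, subsection index, value).
CFP = Tuple[str, int, int]


def split_context(word: int) -> List[int]:
    """Split a context word into subsections, least significant first."""
    return [(word >> (i * SUBSECTION_BITS)) & MASK for i in range(NUM_SUBSECTIONS)]


def diff_to_cfps(prev: int, new: int) -> List[CFP]:
    """Emit one primitive for each subsection that differs from the previous word."""
    pairs = zip(split_context(prev), split_context(new))
    return [("UPDATE", i, n) for i, (p, n) in enumerate(pairs) if p != n]


def apply_cfps(prev: int, cfps: List[CFP]) -> int:
    """Rebuild the new context word by patching the previous one."""
    word = prev
    for _opcode, index, value in cfps:
        shift = index * SUBSECTION_BITS
        word = (word & ~(MASK << shift)) | (value << shift)
    return word


prev_word = 0x11223344
new_word = 0x11AA3344
cfps = diff_to_cfps(prev_word, new_word)
print(cfps)
print(hex(apply_cfps(prev_word, cfps)))
```

In this toy case only one of the four subsections changes, so a single primitive is fetched instead of the whole word. The more similar consecutive contexts are, the fewer primitives are needed, which is the property the similarity-aware compilation flow is described as improving.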

Author(s):  
Siva Sankara Phani T. et al.

Coarse-Grained Reconfigurable Architectures (CGRAs) are an effective solution for speeding up compute-intensive applications because of their high energy efficiency and flexibility. Mapping loops onto CGRAs in a timely manner has been one of the hardest problems in this area. Modulo scheduling (MS) has proven productive for implementing loops on CGRAs. However, current MS algorithms still struggle to map large and irregular circuits to CGRAs within a reasonable compilation time when computational and routing resources are limited. This is mainly due to a lack of awareness of the major mapping constraints and a time-consuming approach to solving the temporal and spatial mapping using CGRA buffer resources. This work aims to improve the performance and compilation robustness of the CGRA modulo scheduling algorithm. The CGRA MS problem is divided into a temporal problem and a spatial problem, and the mechanisms between the two are reorganized. We present a detailed, systematic mapping flow that addresses the temporal mapping problem with a powerful buffer algorithm and efficient handling of connection and computation constraints. We create a fast and stable algorithm for spatial mapping with a retransmission and rearrangement mechanism. Our MS algorithm can map loops to CGRAs with higher performance and shorter compilation time. The results show that, given the same compilation budget, our mapping algorithm achieves a better compilation rate. Performance improves by 5% to 14% over the standard CGRA mapping algorithms available.


Author(s):  
C. C. Ahn ◽  
S. Karnes ◽  
M. Lvovsky ◽  
C. M. Garland ◽  
H. A. Atwater ◽  
...  

The bane of CCD imaging systems for transmission electron microscopy at intermediate and high voltages has been their relatively poor modulation transfer function (MTF), or line pair resolution. The problem originates primarily with the phosphor screen. On the one hand, screens should be thick so that as many incident electrons as possible are converted to photons, yielding a high detective quantum efficiency (DQE). The MTF diminishes as a function of scintillator thickness, however, and to some extent as a function of fluorescence within the scintillator substrates. Fan has noted that the use of a thin layer of phosphor beneath a self-supporting 2 μm thick Al substrate might provide the most appropriate compromise for high DQE and MTF in transmission electron microscopes which operate at higher voltages. Monte Carlo simulations of high energy electron trajectories reveal that only slight beam broadening occurs within this thickness of Al film. Consequently, the MTF is limited predominantly by broadening within the thin phosphor underlayer. There are difficulties, however, in the practical implementation of this design, associated mostly with the mechanical stability of the Al support film.


Author(s):  
Xiaoyan Wang ◽  
Jinmei Du ◽  
Changhai Xu

Activated peroxide systems are formed by adding so-called bleach activators to an aqueous solution of hydrogen peroxide. They were developed in the 1970s for use in domestic laundry because of their high energy efficiency and were introduced to the textile industry at the beginning of the 21st century as an approach to overcoming the extensive energy consumption of bleaching. In activated peroxide systems, bleach activators undergo perhydrolysis to generate more kinetically active peracids that enable bleaching under milder conditions, while hydrolysis of the bleach activators and decomposition of the peracids may occur as side reactions that weaken the bleaching efficiency. This mini-review aims to summarize these competitive reactions in activated peroxide systems and their influence on bleaching performance.


2020 ◽  
Vol 639 ◽  
pp. A80
Author(s):  
Xiao-Na Sun ◽  
Rui-Zhi Yang ◽  
Yun-Feng Liang ◽  
Fang-Kun Peng ◽  
Hai-Ming Zhang ◽  
...  

We report the detection of a high-energy γ-ray signal towards the young star-forming region W40. Using 10-yr Pass 8 data from the Fermi Large Area Telescope (Fermi-LAT), we extracted an extended γ-ray excess region with a significance of ~18σ. The radiation has a spectrum with a photon index of 2.49 ± 0.01. The spatial correlation with the ionized gas content favors a hadronic origin of the γ-ray emission. The total cosmic-ray (CR) proton energy in the γ-ray production region is estimated to be of the order of 10^47 erg. However, this could be a small fraction of the total energy released in CRs by local accelerators, presumably massive stars, over the lifetime of the system. If so, W40, together with earlier detections of γ-rays from the Cygnus cocoon, Westerlund 1, Westerlund 2, NGC 3603, and 30 Dor C, supports the hypothesis that young star clusters are effective CR factories. The unique aspect of this result is that the γ-ray emission is detected, for the first time, from a stellar cluster itself, rather than from the surrounding “cocoons”.


Molecules ◽  
2021 ◽  
Vol 26 (13) ◽  
pp. 3932
Author(s):  
Jie Song ◽  
Qing Ye ◽  
Kun Wang ◽  
Zhiyuan Guo ◽  
Meiling Dou

The development of highly efficient stacks is critical for the widespread application of proton exchange membrane fuel cells (PEMFCs) in transportation and stationary power plants. Currently, the favorable operating conditions of PEMFCs are a single cell voltage between 0.65 and 0.7 V, corresponding to an energy efficiency lower than 57%. In the long term, PEMFCs need to be operated at higher voltage to increase the energy efficiency and thus improve the fuel economy for transportation and stationary applications. Herein, a PEMFC single cell was investigated to demonstrate its capability to work with voltage and energy efficiency higher than 0.8 V and 65%, respectively. It was demonstrated that the PEMFC suffered significant performance degradation after 64 h of operation. The cell voltage declined by more than 13% at a current density of 1000 mA cm⁻², due to electrode de-activation. The high operating potential of the cathode leads to corrosion of the carbon support and then causes detachment of Pt nanoparticles, resulting in significant Pt agglomeration. The catalytic surface area of the cathode Pt available for oxygen reduction is thus reduced, and the cell performance decreases. Therefore, an electrochemically stable Pt catalyst is highly desirable for efficient PEMFCs operated at cell voltages higher than 0.8 V.
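
The abstract does not say how its energy-efficiency figures are derived from the cell voltage. One reading that reproduces the quoted pairing (0.7 V with about 57%, 0.8 V with about 65%) is to divide the cell voltage by the standard reversible potential of a hydrogen/oxygen cell, 1.229 V. The Python sketch below works through that arithmetic. The choice of 1.229 V as the reference and the function name are assumptions made here for illustration, not something stated in the source.

```python
# Assumption: efficiency is taken as cell voltage divided by the standard
# reversible potential of the H2/O2 cell at 25 °C (1.229 V). The abstract
# does not state which reference voltage the authors used.
REVERSIBLE_POTENTIAL_V = 1.229


def voltage_efficiency(cell_voltage_v: float) -> float:
    """Return cell voltage as a fraction of the assumed reversible potential."""
    return cell_voltage_v / REVERSIBLE_POTENTIAL_V


for cell_voltage in (0.65, 0.70, 0.80):
    efficiency = voltage_efficiency(cell_voltage)
    print(f"{cell_voltage:.2f} V -> {efficiency:.1%}")
```

Under this assumption, 0.7 V lands just under 57% and 0.8 V just over 65%, consistent with the thresholds quoted in the abstract. A different reference voltage, such as one based on the heating value of hydrogen, would give different percentages.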


1990 ◽  
Vol 43 (5) ◽  
pp. 583
Author(s):  
GL Price

Recent developments in the growth of semiconductor thin films are reviewed. The emphasis is on growth by molecular beam epitaxy (MBE). Results obtained by reflection high energy electron diffraction (RHEED) are employed to describe the different kinds of growth processes and the types of materials which can be constructed. MBE is routinely capable of heterostructure growth to atomic precision with a wide range of materials including III-V, IV, and II-VI semiconductors, metals, ceramics such as high Tc materials, and organics. As the growth proceeds in ultra high vacuum, MBE can take advantage of surface science techniques such as Auger, RHEED and SIMS. RHEED is the essential in-situ probe since the final crystal quality is strongly dependent on the surface reconstruction during growth. RHEED can also be used to calibrate the growth rate, monitor growth kinetics, and distinguish between various growth modes. A major new area is lattice mismatched growth, where attempts are being made to construct heterostructures between materials of different lattice constants such as GaAs on Si. Also described are the new techniques of migration enhanced epitaxy and tilted superlattice growth. Finally, some comments are given on the means of preparing large area, thin samples for analysis by other techniques from MBE grown films using capping, etching and liftoff.


Author(s):  
Dennis Wolf ◽  
Andreas Engel ◽  
Tajas Ruschke ◽  
Andreas Koch ◽  
Christian Hochberger

Coarse Grained Reconfigurable Arrays (CGRAs) or Architectures are a concept for hardware accelerators based on the idea of distributing workload over Processing Elements. These processors exploit instruction level parallelism, while being energy efficient due to their simplistic internal structure. However, the incorporation into a complete computing system raises severe challenges at the hardware and software level. This article evaluates in detail a CGRA integrated into a control engineering environment targeting a Xilinx Zynq System on Chip (SoC). Besides the actual application execution performance, the practicability of the configuration toolchain is validated. Challenges of the real-world integration are discussed and practical insights are highlighted.

