Power Efficiency and Performance with ORNL's Cray XK7 Titan

Author(s):  
Jim Rogers
2010 ◽  
Vol 2010 ◽  
pp. 1-7 ◽  
Author(s):  
D. Y. C. Lie

RFIC integration has seen dramatic progress since the early 1990s. For example, Si-based single-chip products for GSM, WLAN, Bluetooth, and DECT applications have become commercially available. However, RF power amplifiers (PAs) and switches tend to remain off-chip in the context of single-chip CMOS/BiCMOS transceiver ICs for handset applications. More recently, several WLAN/Bluetooth vendors have successfully integrated less demanding PAs onto the transceivers. This paper will focus on single-chip RF-system-on-a-chip (i.e., “RF-SoC”) implementations that include a high-power PA. An analysis of all tradeoffs inherent to integrating higher power PAs is provided. The analysis includes the development cost, time-to-market, power efficiency, yield, reliability, and performance issues. Recent design trends on highly integrated CMOS WiFi transceivers in the literature will be briefly reviewed with emphasis on the RF-SoC product design tradeoffs impacted by the choice between integrated versus external PAs.


2020 ◽  
Vol 13 (2) ◽  
pp. 821-838 ◽  
Author(s):  
Conor G. Bolas ◽  
Valerio Ferracci ◽  
Andrew D. Robinson ◽  
Mohammed I. Mead ◽  
Mohd Shahrul Mohd Nadzir ◽  
...  

Abstract. The iDirac is a new instrument to measure selected hydrocarbons in the remote atmosphere. A robust design is central to its specifications, with portability, power efficiency, low gas consumption and autonomy as the other driving factors in the instrument development. The iDirac is a dual-column isothermal oven gas chromatograph with photoionisation detection (GC-PID). The instrument is designed and built in-house. It features a modular design, with the novel use of open-source technology for accurate instrument control. Currently configured to measure biogenic isoprene, the system is suitable for a range of compounds. For isoprene measurements in the field, the instrument precision (relative standard deviation) is ±10 %, with a limit of detection down to 38 pmol mol−1 (or ppt). The instrument was first tested in the field in 2015 during a ground-based campaign, and has since shown itself suitable for deployment in a variety of environments and platforms. This paper describes the instrument design, operation and performance based on laboratory tests in a controlled environment as well as during deployments in forests in Malaysian Borneo and central England.


Author(s):  
Mahadevan Suryakumar ◽  
Lu-Vong T. Phan ◽  
Mathew Ma ◽  
Wajahat Ahmed

The rapid growth in processor power has created numerous packaging challenges for high-performance processors. The average power consumed by a processor is the sum of dynamic and leakage power. The dynamic power is proportional to V^2, while the leakage current (and therefore leakage power) is proportional to V^b, where V is the voltage and b>1 for modern processes. This means lowering voltage reduces energy consumed per clock cycle but also reduces the maximum frequency at which the processor can operate. Since reducing voltage reduces power faster than it does frequency, integrating more cores into the processor would result in better performance/power efficiency but would generate more memory accesses, driving a need for larger cache and high-speed signaling [1]. In addition, the design goal of a unified package pinout for both single-core and multicore product flavors adds a further constraint on creating a cost-effective package solution for both market segments. This paper discusses the design strategy and performance of a dual-die package that optimizes package performance for cost.
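The voltage-scaling argument in this abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes the textbook dynamic-power model P = C·V^2·f, a leakage term k·V^b with illustrative values for k and b, frequency that scales linearly with voltage, and perfectly parallel work with no memory bottleneck. All names and constants below are hypothetical and normalized.

```python
def dynamic_power(v, f, c=1.0):
    """Dynamic power under the assumed model P = C * V^2 * f (normalized units)."""
    return c * v**2 * f


def leakage_power(v, b=2.0, k=0.1):
    """Leakage power modeled as k * V^b with b > 1; k and b are illustrative only."""
    return k * v**b


def total_power(v, f, cores=1, b=2.0, k=0.1):
    """Total power of `cores` identical cores, each at voltage v and frequency f."""
    return cores * (dynamic_power(v, f) + leakage_power(v, b, k))


def throughput(f, cores=1):
    """Idealized throughput: assumes perfectly parallel work and no memory stalls."""
    return cores * f


# Baseline: one core at nominal voltage and frequency (both normalized to 1.0).
single_perf = throughput(1.0, cores=1)
single_power = total_power(1.0, 1.0, cores=1)

# Two cores at 80% voltage; frequency is ASSUMED to drop linearly to 80% as well.
dual_perf = throughput(0.8, cores=2)
dual_power = total_power(0.8, 0.8, cores=2)

single_eff = single_perf / single_power
dual_eff = dual_perf / dual_power

print(f"single core: perf={single_perf:.2f} power={single_power:.3f} perf/W={single_eff:.3f}")
print(f"dual core:   perf={dual_perf:.2f} power={dual_power:.3f} perf/W={dual_eff:.3f}")
```

Under these assumptions the two-core configuration delivers more throughput per unit of power than the single core at nominal voltage. The model ignores the extra memory traffic and the packaging constraints that the abstract identifies as the real costs of going multicore.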


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5916
Author(s):  
Diego Romano ◽  
Marco Lapegna

Image coregistration for InSAR processing is a time-consuming procedure that is usually run in batch mode. With the availability of low-energy GPU accelerators, processing at the edge is now a promising prospect. Starting from the identification of the most computationally intensive kernels in existing algorithms, we decomposed the cross-correlation problem from a multilevel point of view, intending to design and implement an efficient GPU-parallel algorithm for multiple settings, including edge computing. We analyzed the accuracy and performance of the proposed algorithm, also considering power efficiency, and its applicability to the identified settings. Results show that a significant speedup of InSAR processing is possible by exploiting GPU computing in different scenarios with no loss of accuracy, also enabling onboard processing using SoC hardware.


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2629
Author(s):  
Kun-Che Ho ◽  
Yi-Hua Liu ◽  
Song-Pei Ye ◽  
Guan-Jhu Chen ◽  
Yu-Shan Cheng

The battery storage system (BSS) is one of the key components in many modern power applications, such as in renewable energy systems and electric vehicles. However, charge imbalance among batteries is very common in BSSs, which may impair the power efficiency, reliability, and safety. Hence, various battery equalization methods have been proposed in the literature. Among these techniques, switched-capacitor (SC)-based battery equalizers (BEs) have attracted much attention due to their low cost, small size, and controllability. In this paper, seven types of SC-based BEs are studied, including conventional, double-tiered, modularized, chain structure types I and II, series-parallel, and single SC-based BEs. Mathematical models that describe the charge–discharge behaviors are first derived. Next, a statistical analysis based on MATLAB simulation is carried out to compare the performance of these seven BEs. Finally, a summary of the circuit design complexity, balancing speed, and practical implementation options for these seven topologies is provided.


2021 ◽  
Vol 2141 (1) ◽  
pp. 012001
Author(s):  
Zih-Chun Dai

Abstract. Roller worm gear drives have been widely adopted in numerous industrial applications, such as robot joint reducers and heavy-duty production lines. This study aims to improve the performance of a roller gear drive by using an iterative optimization scheme to refine the tooth profile of the hourglass worm gear in the drive. A dedicated design of the variable-pitch slot on the hourglass worm gear can improve the power efficiency of the roller gear drive by substantially increasing the contact ratio. The results show that the roller gear drive is a better mechanism for high-reduction-ratio reducers. The CAD design and performance analysis of a roller gear drive in SolidWorks provide engineers with an optimization methodology.


2021 ◽  
Vol 14 (3) ◽  
pp. 1-33
Author(s):  
Enrico Reggiani ◽  
Emanuele Del Sozzo ◽  
Davide Conficconi ◽  
Giuseppe Natale ◽  
Carlo Moroni ◽  
...  

Stencil-based algorithms are a relevant class of computational kernels in high-performance systems, as they appear in a plethora of fields, from image processing to seismic simulations, from numerical methods to physical modeling. Among the various incarnations of stencil-based computations, Iterative Stencil Loops (ISLs) and Convolutional Neural Networks (CNNs) are two well-known examples of kernels belonging to the stencil class. ISLs apply the same stencil several times until convergence, while CNN layers leverage stencils to extract features from an image. The computationally intensive nature of ISLs, CNNs, and stencil-based workloads in general requires implementations that are efficient in terms of both throughput and power. In this context, FPGAs are ideal candidates for such workloads, as they allow designers to build architectures tailored to the regular computational pattern of stencils. Moreover, the ever-growing need for performance leads FPGA-based architectures to scale to multiple devices to benefit from distributed acceleration. For this reason, we propose a library of HDL components to effectively compute ISLs and CNN inference on FPGA, along with a scalable multi-FPGA architecture based on custom PCB interconnects. Our solution eases the design flow and guarantees both scalability and performance competitive with state-of-the-art works.

