Dual Die Package Design Strategy and Performance

The rapid growth in power consumption has presented numerous packaging challenges for high-performance processors. The average power consumed by a processor is the sum of dynamic and leakage power. The dynamic power is proportional to V^2, while the leakage current (and therefore leakage power) is proportional to V^b, where V is the voltage and b>1 for modern processes. This means that lowering the voltage reduces the energy consumed per clock cycle but also reduces the maximum frequency at which the processor can operate. Since reducing voltage reduces power faster than it reduces frequency, integrating more cores into the processor would result in better performance/power efficiency but would generate more memory accesses, driving a need for larger caches and high-speed signaling [1]. In addition, the design goal of creating a unified package pinout for both single-core and multicore product flavors adds a further constraint on creating a cost-effective package solution for both market segments. This paper discusses the design strategy and performance of a dual-die package that optimizes package performance for cost.
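The voltage argument in this abstract can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions: clock frequency is assumed to scale linearly with voltage, throughput is assumed to scale perfectly with core count, and the exponent `b` and the leakage share `leak_share` are hypothetical values, not figures from the paper.

```python
# Toy scaling model. All constants are illustrative assumptions.

def core_power(v, b=3.0, leak_share=0.3):
    """Power of one core at normalized voltage v, relative to v = 1.

    Dynamic power is taken as proportional to V^2 * f with f assumed
    proportional to V; leakage power is taken as proportional to V^b.
    """
    dynamic = (1.0 - leak_share) * v ** 2 * v
    leakage = leak_share * v ** b
    return dynamic + leakage


def chip_estimate(cores, v):
    """Return (relative throughput, relative power) for a multicore chip.

    Assumes throughput scales with core count times frequency, which
    ignores memory-access bottlenecks mentioned in the abstract.
    """
    throughput = cores * v
    power = cores * core_power(v)
    return throughput, power


if __name__ == "__main__":
    print("1 core  at v=1.0:", chip_estimate(1, 1.0))
    print("2 cores at v=0.6:", chip_estimate(2, 0.6))
```

Under these assumed numbers, two cores at 0.6 of the nominal voltage deliver about 1.2 times the throughput of one core at nominal voltage while drawing roughly 0.43 times the power.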


High-speed high-resolution laser diode-based photoacoustic microscopy for in vivo microvasculature imaging

Visual Computing for Industry Biomedicine and Art ◽

10.1186/s42492-020-00067-5 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Xiufeng Li ◽

Victor T C Tsang ◽

Lei Kang ◽

Yan Zhang ◽

Terence T W Wong

Keyword(s):

High Resolution ◽

High Speed ◽

High Performance ◽

Continuous Wave ◽

Signal To Noise Ratio ◽

Cost Effective ◽

High Signal ◽

Photoacoustic Microscopy ◽

Mouse Ear

Laser diodes (LDs) have been considered as cost-effective and compact excitation sources to overcome the requirement of costly and bulky pulsed laser sources that are commonly used in photoacoustic microscopy (PAM). However, the spatial resolution and/or imaging speed of previously reported LD-based PAM systems have not been optimized simultaneously. In this paper, we developed a high-speed and high-resolution LD-based PAM system using a continuous wave LD, operating in a pulsed mode with a repetition rate of 30 kHz, as an excitation source. A hybrid scanning mechanism that synchronizes a one-dimensional galvanometer mirror and a two-dimensional motorized stage is applied to achieve a fast imaging capability without signal averaging, owing to the high signal-to-noise ratio. By optimizing the optical system, a high lateral resolution of 4.8 μm has been achieved. In vivo microvasculature imaging of a mouse ear has been demonstrated to show the high performance of our LD-based PAM system.


VLSI ARCHITECTURE OF PARALLEL MULTIPLIER– ACCUMULATOR BASED ON RADIX-2 MODIFIED BOOTH ALGORITHM

International Journal of Electronics and Electrical Engineering ◽

10.47893/ijeee.2012.1009 ◽

2012 ◽

pp. 40-46

Author(s):

Mr.M.V. Sathish ◽

Mrs. Sailaja

Keyword(s):

Signal Processing ◽

High Speed ◽

High Performance ◽

Vlsi Architecture ◽

Clock Frequency ◽

Parallel Multiplier ◽

Hybrid Type ◽

Standard Design ◽

Overall Performance ◽

And Performance

This paper proposes a new architecture of multiplier-and-accumulator (MAC) for high-speed arithmetic. By combining multiplication with accumulation and devising a hybrid type of carry save adder (CSA), the performance was improved. Since the accumulator, which has the largest delay in the MAC, was merged into the CSA, the overall performance was elevated. The proposed CSA tree uses a 1’s-complement-based radix-2 modified Booth’s algorithm (MBA) and has a modified array for the sign extension in order to increase the bit density of the operands. The proposed MAC showed superior properties to the standard design in many ways, with performance twice that of the previous research at a similar clock frequency. We expect that the proposed MAC can be adapted to various fields requiring high performance, such as signal processing.
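For readers unfamiliar with Booth recoding, the following is a minimal software sketch of a multiply-accumulate step using plain radix-2 Booth recoding on two's-complement operands. It is not the architecture proposed in the paper (which is 1’s-complement-based and merges the accumulator into a hybrid CSA tree); the function name and the `bits` parameter are hypothetical and chosen only for illustration.

```python
def booth_radix2_mac(acc, x, y, bits=8):
    """Return acc + x * y, recoding the multiplier y with radix-2 Booth.

    x and y are signed integers; y must fit in `bits`-bit two's complement.
    Each Booth digit is y[i-1] - y[i], which is -1, 0, or +1, so every
    step adds, subtracts, or skips the shifted multiplicand.
    """
    y_bits = y & ((1 << bits) - 1)  # two's-complement bit pattern of y
    prev = 0                        # implicit bit y[-1] = 0
    for i in range(bits):
        cur = (y_bits >> i) & 1
        digit = prev - cur          # Booth digit in {-1, 0, +1}
        acc += digit * (x << i)     # accumulate the partial product
        prev = cur
    return acc


if __name__ == "__main__":
    print(booth_radix2_mac(10, 7, -3))
```

Folding the running sum `acc` into the same loop that adds partial products mirrors, very loosely, the idea of merging accumulation into the multiplier's adder tree rather than performing it as a separate final step.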


Cost-effective flow table designs for high-speed routers: architecture and performance evaluation

IEEE Transactions on Computers ◽

10.1109/tc.2002.1032627 ◽

2002 ◽

Vol 37 (9) ◽

pp. 1089-1099 ◽

Cited By ~ 10

Author(s):

Jun Xu ◽

M. Singhal

Keyword(s):

Performance Evaluation ◽

High Speed ◽

Cost Effective ◽

Flow Table ◽

And Performance


Design and Performance Analysis of 1-Bit FinFET Full Adder Cells for Subthreshold Region at 16 nm Process Technology

Journal of Nanomaterials ◽

10.1155/2015/726175 ◽

2015 ◽

Vol 2015 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

‘Aqilah binti Abdul Tahrim ◽

Huei Chaeng Chin ◽

Cheng Siong Lim ◽

Michael Loong Peng Tan

Keyword(s):

High Speed ◽

Average Power ◽

Full Adder ◽

Propagation Delay ◽

Oxide Semiconductor ◽

Process Technology ◽

Subthreshold Region ◽

And Performance ◽

Power Delay Product ◽

20 Nm

The scaling of the conventional 2D-planar metal-oxide semiconductor field-effect transistor (MOSFET) is now approaching its limit as technology has moved below the 20 nm process node. A new nonplanar device architecture called FinFET was invented to overcome this problem by allowing transistors to be scaled down into the sub-20 nm region. In this work, the FinFET structure is implemented in 1-bit full adder transistors to investigate its performance and energy efficiency in the subthreshold region for cell designs of Complementary MOS (CMOS), Complementary Pass-Transistor Logic (CPL), Transmission Gate (TG), and Hybrid CMOS (HCMOS). The performance of the 1-bit FinFET-based full adder in 16-nm technology is benchmarked against a conventional MOSFET-based full adder. The Predictive Technology Model (PTM) and Berkeley Short-channel IGFET Model-Common Multi-Gate (BSIM-CMG) 16 nm low power libraries are used. Propagation delay, average power dissipation, power-delay-product (PDP), and energy-delay-product (EDP) are analysed for all four types of full adder cell designs of both FETs. The 1-bit FinFET-based full adder shows a great reduction in all four performance metrics. A reduction in propagation delay, PDP, and EDP is evident in the 1-bit FinFET-based full adder of CPL, giving the best overall performance due to its high-speed performance and good current driving capabilities.
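The two derived metrics named in this abstract follow directly from average power and propagation delay: PDP is power times delay, and EDP is PDP times delay. A minimal sketch, with hypothetical function names and input values that are not taken from the paper:

```python
def power_delay_product(avg_power_w, delay_s):
    """PDP in joules: average power (W) times propagation delay (s)."""
    return avg_power_w * delay_s


def energy_delay_product(avg_power_w, delay_s):
    """EDP in joule-seconds: PDP times propagation delay (s)."""
    return power_delay_product(avg_power_w, delay_s) * delay_s


if __name__ == "__main__":
    # Hypothetical cell: 1 microwatt average power, 2 ns delay.
    print(power_delay_product(1e-6, 2e-9))
    print(energy_delay_product(1e-6, 2e-9))
```

Because EDP weights delay twice, it favors faster cells more strongly than PDP does when two designs consume similar energy per operation.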


Development of High Performance Moisture Separator Reheater

ASME 2009 Power Conference ◽

10.1115/power2009-81092 ◽

2009 ◽

Cited By ~ 2

Author(s):

Issaku Fujita ◽

Kotaro Machii ◽

Teruaki Sakata

Keyword(s):

High Speed ◽

Nuclear Power ◽

High Performance ◽

Power Plants ◽

Tube Bundle ◽

Separation Performance ◽

Heating Steam ◽

Severe Erosion ◽

Moisture Separator ◽

And Performance

Moisture Separator Reheaters (MSRs) of nuclear power plants, especially the first-generation type (which entered commercial operation between 1970 and 1982), have suffered from various problems such as severe erosion, deterioration of moisture separation performance, and drain subcooling. To solve these problems and improve performance, an improved MSR was developed. In the new MSR, a high-performance SS439 stainless steel round-type tube bundle was applied, in which the heating steam distribution is optimized by an orifice plate in order to minimize drain subcooling. Based on a CFD approach, the cycle steam distribution was optimized and the application of FAC-resistant material to the internal parts of the MSRs was determined. As a result, the pressure drop was reduced by 0.6% against the HP turbine exhaust pressure. Moisture separation performance was improved by the latest chevron-type separator. In such a separator, reverse pressure is caused locally at the drainage area because a remarkable longitudinal pressure distribution is formed by the high-speed steam flow in the manifold. A new moisture separation structure was therefore developed in consideration of the influence of this reverse pressure on the separator performance.


Design and Analysis of Rotary Positive Displacement Mechanism for Oil-Less Compression

ASME 2010 10th Biennial Conference on Engineering Systems Design and Analysis, Volume 4 ◽

10.1115/esda2010-24665 ◽

2010 ◽

Author(s):

Holger Roser

Keyword(s):

High Speed ◽

High Performance ◽

Effective Means ◽

Cost Effective ◽

Industrial Applications ◽

Two Phase ◽

Major Drawback ◽

Displacement Mechanism ◽

Flow Losses ◽

Positive Displacement

In this paper, a simple positive displacement mechanism is investigated, which comprises two counter-rotating meshing rotors within a casing. Although considered for various applications more than a century ago, the basic geometry of this mechanism has not been further explored or adapted to modern gas compressor technology. As a fully balanced rotational mechanism operating at uniform angular velocity, potential applications range from pumps to expanders, from slow large displacement to high-speed devices; nonetheless, this research focuses on high-performance oil-less gas compressors as an ideal application. During one complete cycle, the main rotor compresses and discharges the fluid, whilst the secondary rotor seals the compression chamber. Important features of this mechanism are the circular profiles of the rotors, the potential to accommodate large ports for reduced flow losses, and ease of cooling. The simple geometry facilitates a cost-effective means of achieving tight operating clearances between rotors and casing for enhanced sealing without the need for liquid lubricants such as oil. This study and preliminary tests indicate that pressure ratios suitable for standard industrial applications can be obtained over a broad speed range, whilst minimizing friction and flow losses, a major drawback of current technologies. Moreover, two-phase compression and injection of liquids prior to compression have been studied and identified as a means to further improve efficiency and cooling.


Combinational Counters: A Low Overhead Approach to Address DPA Attacks

Journal of Circuits System and Computers ◽

10.1142/s0218126620500978 ◽

2019 ◽

Vol 29 (06) ◽

pp. 2050097

Author(s):

Ghobad Zarrinchian ◽

Morteza Saheb Zamani

Keyword(s):

Low Power ◽

Power Analysis ◽

High Performance ◽

Clock Cycle ◽

Experimental Results ◽

Differential Power Analysis ◽

Cryptographic Algorithms ◽

Encryption Algorithms ◽

And Performance

Differential Power Analysis (DPA) attacks are known as viable and practical techniques to break the security of cryptographic algorithms. In this type of attack, an adversary extracts the encryption key based on the correlation between the power consumed by the hardware running encryption algorithms and the processed data. To address DPA attacks in the hardware layer, various techniques have been proposed thus far. However, current techniques generally impose high performance overhead. In particular, the power overhead is a serious issue which may limit the applicability of current techniques in power-constrained applications. In this paper, combinational counters are explored as a way to address DPA attacks. By randomizing the consumed power in each clock cycle of the circuit operation, these counters can enhance the resistance of the cryptographic cores against DPA attacks with low power overhead as well as zero timing overhead. Experimental results for an AES S-Box module in 45 nm technology reveal that the proposed technique is capable of achieving a higher level of security in comparison to two other approaches while keeping the power and performance overhead at the same or a lower level.


Development of the “High Pressure Repair Dome” system for in-situ high performance repair of aeronautic structures

MATEC Web of Conferences ◽

10.1051/matecconf/201818804004 ◽

2018 ◽

Vol 188 ◽

pp. 04004

Author(s):

Nicola Gallo ◽

Silvio Pappadá ◽

Umberto Raganato ◽

Stefano Corvaglia

Keyword(s):

High Pressure ◽

High Performance ◽

Stress Transfer ◽

Cost Effective ◽

Reliable Technique ◽

Structural Repair ◽

Aerospace Applications ◽

And Performance ◽

Curing Cycle

As the use of composites for high-performance structures in aerospace applications is constantly increasing, together with the complexity and scale of such structures, increasing effort is being devoted to the development of advanced techniques for the structural repair of composites. Mechanical loads and environmental conditions often cause damage to composites. If material damage is not extensive, structural repair is the most cost-effective solution. Composite patches can be mechanically fastened, adhesively bonded or co-cured. The bonding or co-curing process provides enhanced stress transfer mechanisms, joint efficiencies and aerodynamic performance. In this paper an innovative and reliable technique to repair damaged composite aeronautical components, named High Pressure Repair Dome (HPRD), is shown. The innovative aspect of this solution is the possibility to bond or co-cure a composite prepreg patch under a pressurized dome, thus using a prepreg compatible with the composite structure. HPRD was developed to allow in-situ repair of full-scale structures, with the possibility of accurate control of the parameters of the curing cycle. The advantages and performance of the HPRD approach will be discussed and compared with traditional techniques, describing the results achieved and the ongoing activity toward the full industrialization of this system.


Impact of Modern Virtualization Methods on Timing Precision and Performance of High-Speed Applications

Future Internet ◽

10.3390/fi11080179 ◽

2019 ◽

Vol 11 (8) ◽

pp. 179 ◽

Cited By ~ 1

Author(s):

Veronika Kirova ◽

Kirill Karpov ◽

Eduard Siemens ◽

Irina Zander ◽

Oksana Vasylenko ◽

...

Keyword(s):

Virtual Environments ◽

Virtual Environment ◽

High Speed ◽

High Performance ◽

Estimation Accuracy ◽

Network Applications ◽

High Speed Network ◽

Timing Precision ◽

And Performance ◽

The Impact

The presented work is the result of extended research and analysis on the precision of timing methods, their efficiency in different virtual environments, and the impact of timing precision on the performance of high-speed network applications. We investigated how timer hardware is shared among heavily CPU- and I/O-bound tasks on a virtualized OS as well as on a bare OS. By replacing the invoked timing methods within a well-known application for estimation of available path bandwidth, we provide an analysis of their impact on estimation accuracy. We show that timer overhead and precision are crucial for high-performance network applications, and that the use of low-precision timing methods, e.g., with the delays and overheads introduced by virtualization, results in the degradation of the virtual environment. Furthermore, in this paper, we provide confirmation that, by using the methods we developed specifically for both precise timing operations and AvB estimation, it is possible to overcome the inefficiency of standard time-related operations and the overhead that comes with virtualization. The impacts of negative virtualization factors were investigated in five different environments to identify the optimal virtual environment for high-speed network applications.
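As a rough illustration of what "timer overhead" and "timing precision" mean in practice, the sketch below measures both with the Python standard library. It is not the timing method developed by the authors; the function names and sample counts are hypothetical, and results will vary with the OS, hardware, and any virtualization layer.

```python
import time


def timer_overhead_ns(samples=1000):
    """Smallest observed gap between two back-to-back clock reads, in ns."""
    gaps = []
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        gaps.append(t1 - t0)
    return min(gaps)


def sleep_overshoot_ns(requested_ns, samples=20):
    """Difference between actual and requested sleep time per call, in ns.

    Values may occasionally be negative on platforms with a coarse clock.
    """
    overshoots = []
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        time.sleep(requested_ns / 1e9)
        elapsed = time.perf_counter_ns() - t0
        overshoots.append(elapsed - requested_ns)
    return overshoots


if __name__ == "__main__":
    print("clock read overhead (ns):", timer_overhead_ns())
    print("overshoot of 100 us sleeps (ns):", sleep_overshoot_ns(100_000, 5))
```

Running the same script on bare metal and inside a virtual machine gives a quick, informal view of how much a given environment inflates both numbers.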


An Efficient Hardware Implementation of Residual Data Binarization in HEVC CABAC Encoder

Electronics ◽

10.3390/electronics9040684 ◽

2020 ◽

Vol 9 (4) ◽

pp. 684

Author(s):

Dinh-Lam Tran ◽

Xuan-Tu Tran ◽

Duy-Hieu Bui ◽

Cong-Kha Pham

Keyword(s):

Power Consumption ◽

Power Efficiency ◽

High Performance ◽

Hardware Implementation ◽

Video Quality ◽

Clock Cycle ◽

Work Load ◽

Video Data ◽

Low Area ◽

Binary Arithmetic

HEVC-standardized encoders employ CABAC (context-based adaptive binary arithmetic coding) to achieve the high compression ratios and video quality that support modern real-time high-quality video services. The binarizer is one of three main blocks in a CABAC architecture, where binary symbols (bins) are generated to feed the binary arithmetic encoder (BAE). Residual video data accounts for an average of 75% of the CABAC’s workload, so the performance of its binarization contributes significantly to the overall performance of the whole CABAC design. This paper proposes an efficient hardware implementation of a binarizer for CABAC that focuses on low area cost and low power consumption while still providing enough bins for high-throughput CABAC. On average, the proposed design can process up to 3.5 residual syntax elements (SEs) per clock cycle at a maximum frequency of 500 MHz, with an area cost of 9.45 Kgates (6.41 Kgates for the binarizer core) and power consumption of 0.239 mW (0.184 mW for the binarizer core) in NanGate 45 nm technology. Our proposal achieves a high overhead-efficiency of 1.293 Mbins/Kgate/mW, much better than other related high-performance designs. In addition, our design achieves a high power-efficiency of 8288 Mbins/mW, which is an important factor for handheld applications.
