A 18.4M Triangles/s 122.6 mW Tile Co-Processor for Embedded GPU Systems

2013 ◽  
Vol 462-463 ◽  
pp. 1050-1054
Author(s):  
Jia Wang ◽  
Tao Sun ◽  
Li Zhou ◽  
Yuan Zhi Zhang ◽  
Yuan Yuan Gao

This paper presents an efficient and accurate tile co-processor architecture which can be used in the tile based rendering systems. The design involves two key components, the vertex processing unit and the triangle tiling unit. The former part is used to get the vertices transformed, clipped and projected to generate the triangle list which located in the view frustum while the latter one reads in the triangle data and determines the tile list which indicates tiles that each triangle covers. A modified Bounding BOX (BBOX) test pipeline and a mask screening technology for different overlap types is proposed and employed in the design in order to get faster triangle binning with lower power consumption. The proposed architecture works at the frequency of 270 MHz, gains 18.4 M triangles tiling/sec with a power consumption less than 122.6 mW. The chip is implemented in 0.13 um CMOS technology and consumes 2.5 x 2.5 mm2 totally.

2013 ◽  
Vol 22 (10) ◽  
pp. 1340025
Author(s):  
TENG WANG ◽  
LEI ZHAO ◽  
ZI-YI HU ◽  
ZHENG XIE ◽  
XIN-AN WANG

In this paper, a novel decomposition approach and VLSI implementation of the chroma interpolator with great hardware reuse and no multipliers for H.264 encoders are proposed. First, the characteristic of the chroma interpolation is analyzed to obtain an optimized decomposition scheme, with which the chroma interpolation can be realized with arithmetic elements (AEs) which are comprised of only adders. Four types of AEs are developed and a pipelining hardware design is proposed to conduct the chroma interpolation with great hardware reuse. The proposed design was prototyped within a Xilinx Virtex6 XC6VLX240T FPGA with a clock frequency as high as 245 MHz. The proposed design was also synthesized with SMIC 130 nm CMOS technology with a clock frequency of 200 MHz, which could support a real-time HDTV application with less hardware cost and lower power consumption.


2014 ◽  
Vol 2014 ◽  
pp. 1-14 ◽  
Author(s):  
Fenghui Yao ◽  
Mohamed Saleh Zein-Sabatto ◽  
Guifeng Shao ◽  
Mohammad Bodruzzaman ◽  
Mohan Malkani

Quantum-dot cellular automata (QCA) is an attractive nanotechnology with the potential alterative to CMOS technology. QCA provides an interesting paradigm for faster speed, smaller size, and lower power consumption in comparison to transistor-based technology, in both communication and computation. This paper describes the design of a 4-bit multifunction nanosensor data processor (NSDP). The functions of NSDP contain (i) sending the preprocessed raw data to high-level processor, (ii) counting the number of the active majority gates, and (iii) generating the approximate sigmoid function. The whole system is designed and simulated with several different input data.


Author(s):  
G.S. Tripathi ◽  
Shiv Prakash Arya ◽  
Rajan Mishra

Performance of adiabatic carry look ahead adder using dynamic CMOS are studied and compared with Adiabatic carry look ahead adder using Pass Transistor. adiabatic carry look ahead adder using pass transistor has higher delay and lower power consumption while adiabatic carry look ahead adder using dynamic cmos logic has lower power dissipation and higher speed. adiabatic carry look ahead adder using dynamic cmos are design using 180 nm cmos technology and compared power dissipation and delay with respect to supply voltage and frequency. simulation result show that power dissipation of carry look ahead adder using dynamic cmos has higher performance comparison adiabatic CLA using pass transistor. simulation result show that adiabatic CLA using dynamic cmos reduce the power consumption 45% and delay reduce to 70% comparison to adiabatic CLA using pass transistor.


2014 ◽  
Vol 23 (05) ◽  
pp. 1450072 ◽  
Author(s):  
SOMAYEH ADIBIFARD ◽  
SEYYED HASSAN MOUSAVI ◽  
SOHEYL ZIABAKHSH ◽  
MUSTAPHA C. E. YAGOUB

A novel 1/4-rate clock phase detector (PD) structure for phase locked loop (PLL)-based clock and data recovery (CDR) is proposed. In this topology, the retimed data is generated within the circuit and no extra circuit is required. Furthermore, the error and reference signals are independent of delay time through gates and thus, no extra replica circuit is needed to compensate such delay. Designed in a 0.18-μm CMOS technology, the proposed 10 Gb/s PD consumes 30 mA from a 1.8 V supply, resulting in a lower power consumption for high-speed applications compared to conventional topologies.


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 563
Author(s):  
Jorge Pérez-Bailón ◽  
Belén Calvo ◽  
Nicolás Medrano

This paper presents a new approach based on the use of a Current Steering (CS) technique for the design of fully integrated Gm–C Low Pass Filters (LPF) with sub-Hz to kHz tunable cut-off frequencies and an enhanced power-area-dynamic range trade-off. The proposed approach has been experimentally validated by two different first-order single-ended LPFs designed in a 0.18 µm CMOS technology powered by a 1.0 V single supply: a folded-OTA based LPF and a mirrored-OTA based LPF. The first one exhibits a constant power consumption of 180 nW at 100 nA bias current with an active area of 0.00135 mm2 and a tunable cutoff frequency that spans over 4 orders of magnitude (~100 mHz–152 Hz @ CL = 50 pF) preserving dynamic figures greater than 78 dB. The second one exhibits a power consumption of 1.75 µW at 500 nA with an active area of 0.0137 mm2 and a tunable cutoff frequency that spans over 5 orders of magnitude (~80 mHz–~1.2 kHz @ CL = 50 pF) preserving a dynamic range greater than 73 dB. Compared with previously reported filters, this proposal is a competitive solution while satisfying the low-voltage low-power on-chip constraints, becoming a preferable choice for general-purpose reconfigurable front-end sensor interfaces.


Author(s):  
Philipp Ritter

Abstract Next-generation automotive radar sensors are increasingly becoming sensitive to cost and size, which will leverage monolithically integrated radar system-on-Chips (SoC). This article discusses the challenges and the opportunities of the integration of the millimeter-wave frontend along with the digital backend. A 76–81 GHz radar SoC is presented as an evaluation vehicle for an automotive, fully depleted silicon-over-insulator 22 nm CMOS technology. It features a digitally controlled oscillator, 2-millimeter-wave transmit channels and receive channels, an analog base-band with analog-to-digital conversion as well as a digital signal processing unit with on-chip memory. The radar SoC evaluation chip is packaged and flip-chip mounted to a high frequency printed circuit board for functional demonstration and performance evaluation.


2013 ◽  
Vol 2013 ◽  
pp. 1-11
Author(s):  
A. K. Pandey ◽  
R. A. Mishra ◽  
R. K. Nagaria

We proposed footless domino logic buffer circuit. It minimizes redundant switching at the dynamic and the output nodes. The proposed circuit avoids propagation of precharge pulse to the output node and allows the dynamic node which saves power consumption. Simulation is done using 0.18 µm CMOS technology. We have calculated the power consumption, delay, and power delay product of the proposed circuit and compared the results with the existing circuits for different logic function, loading condition, clock frequency, temperature, and power supply. Our proposed circuit reduces power consumption and power delay product as compared to the existing circuits.


2009 ◽  
Vol 18 (03) ◽  
pp. 487-495 ◽  
Author(s):  
VINCENZO STORNELLI ◽  
GIUSEPPE FERRI ◽  
KING PACE

This work presents a single chip integrated pulse generator-modulator to be utilized in a short range wireless radio sensors remote control applications. The circuit, which can generate single pulses, modulated in BPSK, OOK, PAM, and also PPM, has been developed in a standard CMOS technology (AMS 0.35 μm). Typical pulse duration is about 1 ns while pulse repetition frequency is until 200 MHz (5 ns "chip" time). The operating supply voltage is ± 2.5 V, while the whole power consumption is about 15 mW. Post-layout parametric and corner analyses have confirmed the theoretical expectations.


2002 ◽  
Vol 11 (01) ◽  
pp. 51-55
Author(s):  
ROBERT C. CHANG ◽  
L.-C. HSU ◽  
M.-C. SUN

A novel low-power and high-speed D flip-flop is presented in this letter. The flip-flop consists of a single low-power latch, which is controlled by a positive narrow pulse. Hence, fewer transistors are used and lower power consumption is achieved. HSPICE simulation results show that power dissipation of the proposed D flip-flop has been reduced up to 76%. The operating frequency of the flip-flop is also greatly increased.


2013 ◽  
Vol 6 (2) ◽  
pp. 109-113 ◽  
Author(s):  
Andrea Malignaggi ◽  
Amin Hamidian ◽  
Georg Boeck

The present paper presents a fully differential 60 GHz four stages low-noise amplifier for wireless applications. The amplifier has been optimized for low-noise, high-gain, and low-power consumption, and implemented in a 90 nm low-power CMOS technology. Matching and common-mode rejection networks have been realized using shielded coplanar transmission lines. The amplifier achieves a peak small-signal gain of 21.3 dB and an average noise figure of 5.4 dB along with power consumption of 30 mW and occupying only 0.38 mm2pads included. The detailed design procedure and the achieved measurement results are presented in this work.


Sign in / Sign up

Export Citation Format

Share Document