A NOVEL DECOMPOSITION APPROACH AND VLSI IMPLEMENTATION OF CHROMA INTERPOLATOR FOR H.264 ENCODERS

In this paper, a novel decomposition approach and VLSI implementation of a multiplier-free chroma interpolator with extensive hardware reuse for H.264 encoders are proposed. First, the characteristics of chroma interpolation are analyzed to obtain an optimized decomposition scheme, with which the chroma interpolation can be realized with arithmetic elements (AEs) composed only of adders. Four types of AEs are developed and a pipelined hardware design is proposed to conduct the chroma interpolation with extensive hardware reuse. The proposed design was prototyped on a Xilinx Virtex6 XC6VLX240T FPGA with a clock frequency as high as 245 MHz. The proposed design was also synthesized with SMIC 130 nm CMOS technology with a clock frequency of 200 MHz, which could support a real-time HDTV application with less hardware cost and lower power consumption.
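The abstract does not spell out the decomposition itself. As a rough illustration only, the Python sketch below takes the bilinear chroma-sample formula defined by the H.264 standard and rearranges it so that it needs only additions and shifts. The rearrangement and the names `mul_by_frac` and `chroma_shift_add` are assumptions made for this example; they are not the authors' AE design.

```python
def chroma_ref(a, b, c, d, dx, dy):
    """H.264 chroma sample from neighbours a, b (top row), c, d (bottom row).

    dx and dy are the fractional offsets in eighths of a sample (0..7).
    """
    return ((8 - dx) * (8 - dy) * a + dx * (8 - dy) * b
            + (8 - dx) * dy * c + dx * dy * d + 32) >> 6


def mul_by_frac(k, v):
    """Compute k * v for 0 <= k <= 7 using only shifts and adds (hypothetical helper)."""
    acc = 0
    if k & 1:
        acc += v
    if k & 2:
        acc += v << 1
    if k & 4:
        acc += v << 2
    return acc


def chroma_shift_add(a, b, c, d, dx, dy):
    """Same result as chroma_ref, written without any multiplication."""
    top = (a << 3) + mul_by_frac(dx, b - a)   # (8 - dx) * a + dx * b
    bot = (c << 3) + mul_by_frac(dx, d - c)   # (8 - dx) * c + dx * d
    # (8 - dy) * top + dy * bot, then round and scale by 1/64
    return ((top << 3) + mul_by_frac(dy, bot - top) + 32) >> 6
```

Separating the horizontal and vertical passes means the same small adder structure is used three times per output sample, which is one way hardware reuse could arise; the paper's actual scheme may differ.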

Static Switching Dynamic Buffer Circuit

Journal of Engineering ◽ 10.1155/2013/646214 ◽ 2013 ◽ Vol 2013 ◽ pp. 1-11

Author(s): A. K. Pandey ◽ R. A. Mishra ◽ R. K. Nagaria

Keyword(s): Power Consumption ◽ Power Supply ◽ Loading Condition ◽ CMOS Technology ◽ Clock Frequency ◽ Logic Function ◽ Output Node ◽ Switching Dynamic ◽ Power Delay Product ◽ Frequency Temperature

We propose a footless domino logic buffer circuit. It minimizes redundant switching at the dynamic and the output nodes. The proposed circuit avoids propagation of the precharge pulse to the output node and allows the dynamic node which saves power consumption. Simulation is done using 0.18 µm CMOS technology. We have calculated the power consumption, delay, and power delay product of the proposed circuit and compared the results with existing circuits for different logic functions, loading conditions, clock frequencies, temperatures, and power supplies. The proposed circuit reduces power consumption and power delay product compared to the existing circuits.

A Novel Highly Linear Voltage-To-Time Converter (VTC) Circuit for Time-Based Analog-To-Digital Converters (ADC) Using Body Biasing

Electronics ◽ 10.3390/electronics9122033 ◽ 2020 ◽ Vol 9 (12) ◽ pp. 2033

Author(s): Ahmed Elgreatly ◽ Ahmed Dessouki ◽ Hassan Mostafa ◽ Rania Abdalla ◽ El-sayed El-Rabaie

Keyword(s): Power Consumption ◽ Software Defined Radio ◽ Dynamic Range ◽ Supply Voltage ◽ CMOS Technology ◽ The Body ◽ Analog To Digital Converters ◽ Clock Frequency ◽ Analog To Digital ◽ Operation Speed

The time-based analog-to-digital converter is considered a crucial part of software-defined radio receiver design because it outperforms other analog-to-digital converters in terms of operation speed, input dynamic range and power consumption. In this paper, two novel voltage-to-time converters are proposed in which the input voltage signal is connected to the body terminal of the starving transistor rather than its gate terminal. These novel converters exhibit better linearity, which is analytically proven in this paper. The maximum linearity error is reduced to 0.4%. In addition, the input dynamic range of these converters is increased to 800 mV for a supply voltage of 1.2 V by using industrial hardware-calibrated TSMC 65 nm CMOS technology. These novel designs consist of only a single inverter stage, which reduces the layout area and the power consumption. The overall power consumption is 18 μW for the first proposed circuit and 15 μW for the second proposed circuit. The novel converter circuits have a resolution of 5 bits and operate at a maximum clock frequency of 500 MHz.

A COMPUTATIONAL-RAM (C-RAM) ARCHITECTURE FOR REAL-TIME MESH-BASED VIDEO MOTION TRACKING PART 2: MOTION COMPENSATION

Journal of Circuits System and Computers ◽ 10.1142/s0218126604001933 ◽ 2004 ◽ Vol 13 (06) ◽ pp. 1217-1231

Author(s): MOHAMMED SAYED ◽ WAEL BADAWY

Keyword(s): Real Time ◽ Reference Frame ◽ Motion Compensation ◽ Motion Tracking ◽ CMOS Technology ◽ Video Frame ◽ Clock Frequency ◽ Output Data ◽ Current Frame ◽ The Core

This paper presents a new Computational-RAM (C-RAM) architecture for real-time mesh-based video motion tracking. In Part 1, the motion estimation part of the proposed architecture is presented. Here in Part 2, a new C-RAM mesh-based motion compensation architecture is presented. The input data to the architecture are the mesh-node motion vectors and the reference frame, and the output data is the compensated (i.e., predicted) frame. The architecture uses the affine transformation for warping the deformed patches in the reference frame into the undeformed patches in the current frame. The architecture computes the affine parameters using a multiplication-free algorithm. The reference and current frames are stored in embedded S-RAMs generated with Virage™ Memory Compiler. The proposed motion compensation architecture has been prototyped, simulated and synthesized using the TSMC 0.18 μm CMOS technology. Using a 100 MHz clock frequency, the proposed architecture processes one CIF video frame (i.e., 352×288 pixels) in 0.59 ms, which means it can process up to 1694 frames per second. The core area of the proposed motion compensation architecture is 28.04 mm² and it consumes 31.15 mW.
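The throughput figures in this abstract follow from simple arithmetic on the quoted frame time and clock frequency. The short Python sketch below reproduces them; the function name `frame_stats` is hypothetical, and the cycles-per-frame and pixels-per-cycle values are derived here rather than quoted from the paper.

```python
import math


def frame_stats(frame_time_ms, clock_mhz, width=352, height=288):
    """Return (whole frames per second, clock cycles per frame, pixels per cycle)."""
    fps = math.floor(1000.0 / frame_time_ms)          # complete frames in one second
    cycles = round(frame_time_ms * clock_mhz * 1000)  # ms * MHz * 1000 = cycles
    pixels = width * height                           # CIF frame: 352 x 288
    return fps, cycles, pixels / cycles


fps, cycles, px_per_cycle = frame_stats(0.59, 100)
print(fps, cycles, round(px_per_cycle, 2))
```

Under these figures a CIF frame takes about 59,000 cycles, so the architecture handles more than one pixel per clock cycle on average.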

Nanosensor Data Processor in Quantum-Dot Cellular Automata

Journal of Nanotechnology ◽ 10.1155/2014/259869 ◽ 2014 ◽ Vol 2014 ◽ pp. 1-14 ◽ Cited By ~ 13

Author(s): Fenghui Yao ◽ Mohamed Saleh Zein-Sabatto ◽ Guifeng Shao ◽ Mohammad Bodruzzaman ◽ Mohan Malkani

Keyword(s): Cellular Automata ◽ Quantum Dot ◽ Power Consumption ◽ Input Data ◽ CMOS Technology ◽ Sigmoid Function ◽ Lower Power ◽ Quantum Dot Cellular Automata ◽ Data Processor ◽ High Level

Quantum-dot cellular automata (QCA) is an attractive nanotechnology and a potential alternative to CMOS technology. QCA provides an interesting paradigm for faster speed, smaller size, and lower power consumption in comparison to transistor-based technology, in both communication and computation. This paper describes the design of a 4-bit multifunction nanosensor data processor (NSDP). The functions of the NSDP include (i) sending the preprocessed raw data to a high-level processor, (ii) counting the number of active majority gates, and (iii) generating the approximate sigmoid function. The whole system is designed and simulated with several different input data.

A 18.4M Triangles/s 122.6 mW Tile Co-Processor for Embedded GPU Systems

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.462-463.1050 ◽ 2013 ◽ Vol 462-463 ◽ pp. 1050-1054

Author(s): Jia Wang ◽ Tao Sun ◽ Li Zhou ◽ Yuan Zhi Zhang ◽ Yuan Yuan Gao

Keyword(s): Power Consumption ◽ CMOS Technology ◽ Processing Unit ◽ Processor Architecture ◽ Lower Power ◽ Bounding Box ◽ Screening Technology ◽ Embedded GPU

This paper presents an efficient and accurate tile co-processor architecture which can be used in tile-based rendering systems. The design involves two key components, the vertex processing unit and the triangle tiling unit. The former transforms, clips and projects the vertices to generate the list of triangles located in the view frustum, while the latter reads in the triangle data and determines the tile list, which indicates the tiles that each triangle covers. A modified Bounding BOX (BBOX) test pipeline and a mask screening technology for different overlap types are proposed and employed in the design in order to get faster triangle binning with lower power consumption. The proposed architecture works at a frequency of 270 MHz and achieves 18.4 M triangles tiling/sec with a power consumption of less than 122.6 mW. The chip is implemented in 0.13 μm CMOS technology and occupies 2.5 × 2.5 mm² in total.
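To make the binning step concrete, the Python sketch below shows a plain (unmodified) bounding-box test that lists the tiles a triangle's bounding box touches. It is a generic illustration, not the paper's modified BBOX pipeline or its mask screening; the function name `bbox_tiles`, the 16-pixel tile size, the 640×480 screen and the integer pixel coordinates are all assumptions.

```python
def bbox_tiles(tri, tile=16, width=640, height=480):
    """Return (tile_x, tile_y) pairs overlapped by the triangle's bounding box.

    tri is a list of three (x, y) integer pixel coordinates.
    """
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    # Clamp the bounding box to the screen.
    x0, x1 = max(min(xs), 0), min(max(xs), width - 1)
    y0, y1 = max(min(ys), 0), min(max(ys), height - 1)
    if x0 > x1 or y0 > y1:
        return []  # bounding box lies entirely off screen
    tx0, tx1 = x0 // tile, x1 // tile
    ty0, ty1 = y0 // tile, y1 // tile
    return [(tx, ty) for ty in range(ty0, ty1 + 1) for tx in range(tx0, tx1 + 1)]


print(bbox_tiles([(10, 10), (20, 10), (10, 20)]))
```

A bounding box is conservative: for the triangle above it reports tile (1, 1), which the triangle itself does not reach. That over-reporting is the kind of case a further screening step would need to filter out.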

A Digital Linear-Switching Hybrid Power Amplifier for Envelope Tracking Hybrid Supply Modulators

Journal of Circuits System and Computers ◽ 10.1142/s0218126617501626 ◽ 2017 ◽ Vol 26 (10) ◽ pp. 1750162

Author(s): Atefeh Salimi ◽ Rasoul Dehghani ◽ Abdolreza Nabavi

Keyword(s): Power Consumption ◽ Power Amplifier ◽ CMOS Technology ◽ Clock Frequency ◽ Envelope Tracking ◽ Power Efficient ◽ Class AB ◽ Digital To Analog Converters ◽ Digitally Controlled ◽ Digital Block

A novel envelope modulator for envelope tracking RF power amplifier (PA) is presented in this paper. The proposed modulator consists of a parallel combination of analog class AB and digitally controlled hybrid PAs. The analog and digital class AB PAs are effective in both reducing the clock frequency and also static power dissipation, thus improving the efficiency of the modulator. On the other hand, lower clock frequencies result in simpler and more power-efficient digital to analog converters required in the architecture. The modulator digital block is evaluated with a 45 nm CMOS technology. The overall power consumption of the digital block is around 76 mW at 800 MHz clock frequency. As an application, the designed digital block is incorporated in a complete envelope modulator architecture. The overall efficiency of the modulator, including the digital block power consumption, is around 80.7% at an average 32 dBm output power for a 5 MHz input signal.

Fast-Transient-Response Low-Voltage Integrated, Interleaved DC–DC Converter for Implantable Devices

Journal of Circuits System and Computers ◽ 10.1142/s0218126620500139 ◽ 2019 ◽ Vol 29 (01) ◽ pp. 2050013

Author(s): Najmeh Cheraghi Shirazi ◽ Abumoslem Jannesari ◽ Pooya Torkzadeh

Keyword(s): Power Consumption ◽ Transient Response ◽ Output Voltage ◽ Low Voltage ◽ Charge Pump ◽ Input Voltage ◽ CMOS Technology ◽ Clock Frequency ◽ Voltage Ripple ◽ Body Biasing

A new self-start-up switched-capacitor charge pump is proposed for low-power, low-voltage and battery-less implantable applications. To minimize output voltage ripple and improve transient response, an interleaving regulation technique is applied to a multi-stage Cross-Coupled Charge Pump (CCCP) circuit. It splits the power flow in a time-sequenced manner. Three case studies are designed and investigated with a body-biasing technique using auxiliary transistors: the four-stage Two-Branch CCCP (TBCCCP), the two-cell four-stage Interleaved Two-Branch CCCP (ITBCCCP2) and the four-cell four-stage Interleaved Two-Branch CCCP (ITBCCCP4). A multi-phase nonoverlap clock generator circuit with the body-biasing technique is also proposed, which can operate at voltages as low as the CCCP circuits. The proposed circuits are designed with an input voltage as low as 300 to 400 mV and a 20 MHz clock frequency for a 1 pF load capacitance. Among the three designs, ITBCCCP4 has the lowest ramp-up time (41.6% faster), output voltage ripple (29% less) and power consumption (19% less). The Figure-Of-Merit (FOM) of ITBCCCP4 is the highest of the three designs. For a 400 mV input voltage, ITBCCCP4 has a 98.3% pumping efficiency within 11.6 μs, while having a maximum voltage ripple of 0.1% and a power consumption as low as 2.7 nW. The FOM is 0.66 for this circuit. The designed circuits are implemented in 180-nm standard CMOS technology with an effective chip area of [Formula: see text][Formula: see text][Formula: see text]m for TBCCCP, [Formula: see text][Formula: see text][Formula: see text]m for ITBCCCP2 and [Formula: see text][Formula: see text][Formula: see text]m for ITBCCCP4.

VLSI IMPLEMENTATION OF AN EFFICIENT ASIC ARCHITECTURE FOR REAL-TIME ROTATION OF DIGITAL IMAGES

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001495000213 ◽ 1995 ◽ Vol 09 (02) ◽ pp. 449-462 ◽ Cited By ~ 7

Author(s): INDRADEEP GHOSH ◽ BANDANA MAJUMDAR

Keyword(s): Real Time ◽ Architectural Design ◽ VLSI Implementation ◽ Gray Level ◽ Clock Frequency ◽ Design Rules ◽ Silicon Area ◽ VLSI Technology ◽ Processing Techniques ◽ High Processing

This paper describes the design and the VLSI implementation of a novel architecture that performs image rotation in real time. In order to improve throughput, we divide an image-frame into a number of windows. The rotation of each window-center as well as the final displacement of individual pixels within a window is then calculated. A CORDIC-based scheme is used to compute the displacement of a pixel. Our architectural design is incorporated into a chip that has been laid out using VTI (VLSI Technology Inc.) tools obeying the 1.5 μm SCMOS design rules. The chip owes its high processing capability to a combination of pipelining and parallel-processing techniques. For a clock frequency greater than 10.6 MHz, we can perform the rotation of a 512×512 gray-level digital image at the rate of 30 frames per second. The chip utilizes around 35,000 transistors and has an estimated silicon area of 211 mils×276 mils.
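For readers unfamiliar with CORDIC, the Python sketch below shows the standard rotation-mode iteration, in which a point is rotated through a sequence of fixed micro-angles whose tangents are powers of two. It is a generic floating-point illustration, not the chip's datapath; the iteration count of 16 and the name `cordic_rotate` are assumptions, and hardware would replace the multiplications by `2.0 ** -i` with right shifts on fixed-point values.

```python
import math

ITERATIONS = 16  # assumed; more iterations give a smaller angle error
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; GAIN is the accumulated stretch.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_rotate(x, y, theta):
    """Rotate (x, y) counterclockwise by theta radians (|theta| up to about 1.74)."""
    z = theta  # remaining angle still to be rotated
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, y / GAIN


xr, yr = cordic_rotate(1.0, 0.0, math.pi / 6)
print(round(xr, 3), round(yr, 3))
```

The displacement of a pixel is then the rotated position minus the original one, for example `(xr - 1.0, yr - 0.0)` for the point above.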

Design of (2, 1, N) Parallel Convolutional Encodes for VLSI

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.321-324.2822 ◽ 2013 ◽ Vol 321-324 ◽ pp. 2822-2827 ◽ Cited By ~ 1

Author(s): Mao Qiang Duan ◽ Xiao Li Huang

Keyword(s): Low Power ◽ Power Dissipation ◽ High Speed ◽ Hardware Design ◽ CMOS Technology ◽ Shift Register ◽ VLSI Implementation ◽ Parallel Method ◽ Parallel Circuits ◽ Low Power Dissipation

Convolutional encoders need higher computing speed and lower power dissipation. In this paper, we present a parallel method for convolutional encoders with SMIC 0.35 μm CMOS technology; the hardware design and VLSI implementation of this algorithm are also presented. Using this method, parallel circuit structures can be easily designed, which offer higher computing speed and lower power dissipation than the traditional serial shift-register structure for convolutional encoders.
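The serial shift-register baseline that the abstract compares against can be sketched in a few lines of Python. This is a generic rate-1/2 encoder (two output bits per input bit), not the paper's parallel circuit; the generator polynomials 7 and 5 (octal), the 3-bit register length and the name `conv_encode` are assumptions chosen because they form a common textbook example.

```python
def conv_encode(bits, g1=0b111, g2=0b101, k=3):
    """Serial rate-1/2 convolutional encoder.

    bits: iterable of 0/1 input bits.
    g1, g2: generator polynomials as bit masks over the k-bit register
            (bit 0 = current input, bit k-1 = oldest stored bit).
    Returns the interleaved output bit list [out1, out2, out1, out2, ...].
    """
    state = 0
    out = []
    mask = (1 << k) - 1
    for b in bits:
        # Shift the new bit into the register, dropping the oldest bit.
        state = ((state << 1) | b) & mask
        # Each output is the parity (XOR) of the register bits selected by a generator.
        out.append(bin(state & g1).count("1") & 1)
        out.append(bin(state & g2).count("1") & 1)
    return out


print(conv_encode([1, 0, 1, 1]))
```

Because each output pair depends only on the current input and the stored bits, several consecutive output pairs can in principle be computed in the same clock cycle from a wider register, which is the general idea behind parallel encoder structures.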

A COMPUTATIONAL-RAM (C-RAM) ARCHITECTURE FOR REAL-TIME MESH-BASED VIDEO MOTION TRACKING PART 1: MOTION ESTIMATION

Journal of Circuits System and Computers ◽ 10.1142/s0218126604001921 ◽ 2004 ◽ Vol 13 (06) ◽ pp. 1203-1215

Author(s): MOHAMMED SAYED ◽ WAEL BADAWY

Keyword(s): Motion Estimation ◽ Real Time ◽ Motion Tracking ◽ CMOS Technology ◽ Block Matching ◽ Video Frame ◽ Clock Frequency ◽ Motion Vectors ◽ Mesh Structure ◽ Block Matching Algorithm

This paper presents a new Computational-RAM (C-RAM) architecture for real-time mesh-based video motion tracking. The motion tracking consists of two operations: mesh-based motion estimation and compensation. The proposed motion estimation architecture is presented in Part 1 and the proposed motion compensation architecture is presented in Part 2. The motion estimation architecture stores two frames and computes motion vectors for a regular triangular mesh structure as defined by MPEG-4 Part 2. The motion estimation architecture uses the block-matching algorithm (BMA) to estimate the vertical and horizontal motion vectors for each mesh node. Parallel and pipelined implementations have been used to overcome the huge computational requirements of the motion estimation process. The two frames are stored in embedded S-RAMs generated with Virage™ Memory Compiler. The proposed motion estimation architecture has been prototyped, simulated and synthesized using the TSMC 0.18 μm CMOS technology. At a 100 MHz clock frequency, the proposed architecture processes one CIF video frame (i.e., 352×288 pixels) in 1.48 ms, which means it can process up to 675 frames per second. The core area of the proposed motion estimation architecture is 24.58 mm² and it consumes 46.26 mW.
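As a minimal illustration of block matching, the Python sketch below performs an exhaustive search that minimizes the sum of absolute differences (SAD) between a block in the current frame and candidate blocks in the reference frame. The SAD cost, the full-search strategy, the block and search sizes, and the names `sad` and `block_match` are assumptions for this example; the abstract does not state which matching criterion or search pattern the architecture uses.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def block_match(cur, ref, cx, cy, n=4, search=2):
    """Return the motion vector (dx, dy) of the best match within +/- search pixels.

    cur, ref: frames as lists of rows of integer pixel values.
    (cx, cy): top-left corner of the block in the current frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

Every candidate offset is evaluated independently of the others, which is why this kind of search lends itself to the parallel and pipelined implementation the abstract describes.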
