scholarly journals CADRE: A Low-power, Low-EMI DSP Architecture for Digital Mobile Phones

VLSI Design ◽  
2001 ◽  
Vol 12 (3) ◽  
pp. 333-348 ◽  
Author(s):  
Mike Lewis ◽  
Linda Brackenbury

Current mobile phone applications demand high performance from the DSP, and future generations are likely to require even greater throughput. However, it is important to balance these processing demands against the requirement of low power consumption for extended battery lifetime. A novel low-power digital signal processor (DSP) architecture CADRE (Configurable Asynchronous DSP for Reduced Energy) addresses these requirements through a multi-level power reduction strategy. A parallel architecture and configurable compressed instruction set meets the throughput requirements without excessive program memory bandwidth, while a large register file reduces the cost of data accesses. Sign-magnitude representation is used for data, to reduce switching activity within the datapath. Asynchronous design gives fine-grained activity control without the complexities of clock gating, and gives low electromagnetic interference. Finally, the operational model of the target application allows for a reduced interrupt structure, simplifying processor design by avoiding the need for exact exceptions.

Author(s):  
Saurabh J Shewale ◽  
Sonal A Shirsath

This paper presents a comparative study of Complementary MOSFET (CMOS) full adder circuits. Our approach is based on hybrid design full adder circuits combined in a single unit. Full adder circuit is an essential component for designing of various digital systems. It is used for different applications such as Digital signal processor, microcontroller, microprocessor and data processing units (DSP). In most of these systems the adder lies in the critical path that determines the overall speed of the system. Full adder is mainly used in VLSI devices like microprocessor for computational purposes. The proposed full adder cell has low power consumption, better area efficiency. Recently, there have been massive research interests in this area due to the growing need for low-power and high-performance computing systems. Our aim is to design and compare the full adder circuit in various technologies and compare their power capacity. By using the hybrid structure of NMOS and PMOS, we have implemented the circuit of full adder.


2005 ◽  
Vol 15 (02) ◽  
pp. 459-476
Author(s):  
C. PATRICK YUE ◽  
JAEJIN PARK ◽  
RUIFENG SUN ◽  
L. RICK CARLEY ◽  
FRANK O'MAHONY

This paper presents the low-power circuit techniques suitable for high-speed digital parallel interfaces each operating at over 10 Gbps. One potential application for such high-performance I/Os is the interface between the channel IC and the magnetic read head in future compact hard disk systems. First, a crosstalk cancellation technique using a novel data encoding scheme is introduced to suppress electromagnetic interference (EMI) generated by the adjacent parallel I/Os . This technique is implemented utilizing a novel 8-4-PAM signaling with a data look-ahead algorithm. The key circuit components in the high-speed interface transceiver including the receive sampler, the phase interpolator, and the transmitter output driver are described in detail. Designed in a 0.13-μm digital CMOS process, the transceiver consumes 310 mW per 10-Gps channel from a I-V supply based on simulation results. Next, a 20-Gbps continuous-time adaptive passive equalizer utilizing on-chip lumped RLC components is described. Passive equalizers offer the advantages of higher bandwidth and lower power consumption compared with conventional designs using active filter. A low-power, continuous-time servo loop is designed to automatically adjust the equalizer frequency response for the optimal gain compensation. The equalizer not only adapts to different channel characteristics, but also accommodates temperature and process variations. Implemented in a 0.25-μm, 1P6M BiCMOS process, the equalizer can compensate up to 20 dB of loss at 10 GHz while only consumes 32 mW from a 2.5-V supply.


Electronics ◽  
2018 ◽  
Vol 7 (10) ◽  
pp. 241 ◽  
Author(s):  
Arthur Rosa ◽  
Matheus Silva ◽  
Marcos Campos ◽  
Renato Santana ◽  
Welbert Rodrigues ◽  
...  

In this work, a new real-time Simulation method is designed for nonlinear control techniques applied to power converters. We propose two different implementations: in the first one (Single Hardware in The Loop: SHIL), both model and control laws are inserted in the same Digital Signal Processor (DSP), and in the second approach (Double Hardware in The Loop: DHIL), the equations are loaded in different embedded systems. With this methodology, linear and nonlinear control techniques can be designed and compared in a quick and cheap real-time realization of the proposed systems, ideal for both students and engineers who are interested in learning and validating converters performance. The methodology can be applied to buck, boost, buck-boost, flyback, SEPIC and 3-phase AC-DC boost converters showing that the new and high performance embedded systems can evaluate distinct nonlinear controllers. The approach is done using matlab-simulink over commodity Texas Instruments Digital Signal Processors (TI-DSPs). The main purpose is to demonstrate the feasibility of proposed real-time implementations without using expensive HIL systems such as Opal-RT and Typhoon-HL.


Electronics ◽  
2018 ◽  
Vol 7 (10) ◽  
pp. 243 ◽  
Author(s):  
Padmanabhan Balasubramanian ◽  
Douglas Maskell ◽  
Nikos Mastorakis

Adder is an important datapath unit of a general-purpose microprocessor or a digital signal processor. In the nanoelectronics era, the design of an adder that is modular and which can withstand variations in process, voltage and temperature are of interest. In this context, this article presents a new robust early output asynchronous block carry lookahead adder (BCLA) with redundant carry logic (BCLARC) that has a reduced power-cycle time product (PCTP) and is a low power design. The proposed asynchronous BCLARC is implemented using the delay-insensitive dual-rail code and adheres to the 4-phase return-to-zero (RTZ) and the 4-phase return-to-one (RTO) handshaking. Many existing asynchronous ripple-carry adders (RCAs), carry lookahead adders (CLAs) and carry select adders (CSLAs) were implemented alongside to perform a comparison based on a 32/28 nm complementary metal-oxide-semiconductor (CMOS) technology. The 32-bit addition was considered for an example. For implementation using the delay-insensitive dual-rail code and subject to the 4-phase RTZ handshaking (4-phase RTO handshaking), the proposed BCLARC which is robust and of early output type achieves: (i) 8% (5.7%) reduction in PCTP compared to the optimum RCA, (ii) 14.9% (15.5%) reduction in PCTP compared to the optimum BCLARC, and (iii) 26% (25.5%) reduction in PCTP compared to the optimum CSLA.


2004 ◽  
Vol 12 (02) ◽  
pp. 149-174 ◽  
Author(s):  
KILSEOK CHO ◽  
ALAN D. GEORGE ◽  
RAJ SUBRAMANIYAN ◽  
KEONWOOK KIM

Matched-field processing (MFP) localizes sources more accurately than plane-wave beamforming by employing full-wave acoustic propagation models for the cluttered ocean environment. The minimum variance distortionless response MFP (MVDR–MFP) algorithm incorporates the MVDR technique into the MFP algorithm to enhance beamforming performance. Such an adaptive MFP algorithm involves intensive computational and memory requirements due to its complex acoustic model and environmental adaptation. The real-time implementation of adaptive MFP algorithms for large surveillance areas presents a serious computational challenge where high-performance embedded computing and parallel processing may be required to meet real-time constraints. In this paper, three parallel algorithms based on domain decomposition techniques are presented for the MVDR–MFP algorithm on distributed array systems. The parallel performance factors in terms of execution times, communication times, parallel efficiencies, and memory capacities are examined on three potential distributed systems including two types of digital signal processor arrays and a cluster of personal computers. The performance results demonstrate that these parallel algorithms provide a feasible solution for real-time, scalable, and cost-effective adaptive beamforming on embedded, distributed array systems.


2018 ◽  
Vol 246 ◽  
pp. 03044 ◽  
Author(s):  
Guozhao Zeng ◽  
Xiao Hu ◽  
Yueyue Chen

Convolutional Neural Networks (CNNs) have become the most advanced algorithms for deep learning. They are widely used in image processing, object detection and automatic translation. As the demand for CNNs continues to increase, the platforms on which they are deployed continue to expand. As an excellent low-power, high-performance, embedded solution, Digital Signal Processor (DSP) is used frequently in many key areas. This paper attempts to deploy the CNN to Texas Instruments (TI)’s TMS320C6678 multi-core DSP and optimize the main operations (convolution) to accommodate the DSP structure. The efficiency of the improved convolution operation has increased by tens of times.


Sign in / Sign up

Export Citation Format

Share Document