BLOCK FLOATING POINT FFT IMPLEMENTATION FOR DMT xDSL SYSTEMS

2004 ◽  
Vol 13 (05) ◽  
pp. 1147-1164
Author(s):  
Th. ZAHARIADIS ◽  
S. APOSTOLACOS ◽  
I. GRAMMATIKAKIS ◽  
D. MEXIS ◽  
N. ZERVOS ◽  
...  

The development of multiple Discrete Multitone (DMT) Digital Subscriber Line (DSL) flavors on a single platform can benefit considerably by a programmable architecture, which feature Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), especially when fast prototyping is targeted. However, the flexibility assumed to be offered by algorithmic partitioning does not automatically and proportionally simplify the digital signal processing algorithms, unless the effects of overflow/saturation in intermediate processing stages are carefully studied. The effects of overflow/saturation in intermediate stages is very critical throughout the design process, since the operations involved are nonlinear in nature and affect the most significant bits of the computational process. This paper presents an efficient soft-core implementation of a Block Floating Point FFT (BLFP) algorithm, designed for a Very high-speed DSL (VDSL) DMT systems and for the full variety of other xDSL DMT flavors, as the latter demand an extended dynamic range to achieve performance that may otherwise be only warranted by costly floating-point chip implementations.

Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 884
Author(s):  
Stefano Rossi ◽  
Enrico Boni

Methods of increasing complexity are currently being proposed for ultrasound (US) echographic signal processing. Graphics Processing Unit (GPU) resources allowing massive exploitation of parallel computing are ideal candidates for these tasks. Many high-performance US instruments, including open scanners like ULA-OP 256, have an architecture based only on Field-Programmable Gate Arrays (FPGAs) and/or Digital Signal Processors (DSPs). This paper proposes the implementation of the embedded NVIDIA Jetson Xavier AGX module on board ULA-OP 256. The system architecture was revised to allow the introduction of a new Peripheral Component Interconnect Express (PCIe) communication channel, while maintaining backward compatibility with all other embedded computing resources already on board. Moreover, the Input/Output (I/O) peripherals of the module make the ultrasound system independent, freeing the user from the need to use an external controlling PC.


2014 ◽  
Vol 550 ◽  
pp. 126-136
Author(s):  
N. Ramya Rani

:Floating point arithmetic plays a major role in scientific and embedded computing applications. But the performance of field programmable gate arrays (FPGAs) used for floating point applications is poor due to the complexity of floating point arithmetic. The implementation of floating point units on FPGAs consumes a large amount of resources and that leads to the development of embedded floating point units in FPGAs. Embedded applications like multimedia, communication and DSP algorithms use floating point arithmetic in processing graphics, Fourier transformation, coding, etc. In this paper, methodologies are presented for the implementation of embedded floating point units on FPGA. The work is focused with the aim of achieving high speed of computations and to reduce the power for evaluating expressions. An application that demands high performance floating point computation can achieve better speed and density by incorporating embedded floating point units. Additionally this paper describes a comparative study of the design of single precision and double precision pipelined floating point arithmetic units for evaluating expressions. The modules are designed using VHDL simulation in Xilinx software and implemented on VIRTEX and SPARTAN FPGAs.


Author(s):  
M. Parvathi ◽  
N. Vasantha ◽  
K. Satya Prasad

One of the important block of BIST controller is LFSR and the speed with which BIST operates depends on LFSR systems design. There are methods in implementing LFSR using field programmable gate arrays (FPGAs) or digital signal processors (DSPs). BIST controller system speed is then limited to FPGAs and DSPs, which may influence other parameters such as overall area, maximum current, limit and power dissipation. This paper proposes a technique to achieve an efficient BIST controller by redesigning LFSR using GDI based D flip-flops that resulted with low area and low current capabilities. This paper presents three different techniques for implementing flip-flops for an efficient LFSR so that the layout area will be minimized as well as the maximum current drawn will be lower.


2017 ◽  
Vol 2017 ◽  
pp. 1-8
Author(s):  
Satheesh Bojja Venkatakrishnan ◽  
Elias A. Alwan ◽  
John L. Volakis

Typical radio frequency (RF) digital beamformers can be highly complex. In addition to a suitable antenna array, they require numerous receiver chains, demodulators, data converter arrays, and digital signal processors. To recover and reconstruct the received signal, synchronization is required since the analog-to-digital converters (ADCs), digital-to-analog converters (DACs), field programmable gate arrays (FPGAs), and local oscillators are all clocked at different frequencies. In this article, we present a clock synchronization topology for a multichannel on-site coding receiver (OSCR) using the FPGA as a master clock to drive all RF blocks. This approach reduces synchronization errors by a factor of 8, when compared to conventional digital beamformer.


2010 ◽  
Vol 2010 ◽  
pp. 1-10 ◽  
Author(s):  
Diego F. Sánchez ◽  
Daniel M. Muñoz ◽  
Carlos H. Llanos ◽  
José M. Motta

Hardware acceleration in high performance computer systems has a particular interest for many engineering and scientific applications in which a large number of arithmetic operations and transcendental functions must be computed. In this paper a hardware architecture for computing direct kinematics of robot manipulators with 5 degrees of freedom (5D.o.f) using floating-point arithmetic is presented for 32, 43, and 64 bit-width representations and it is implemented in Field Programmable Gate Arrays (FPGAs). The proposed architecture has been developed using several floating-point libraries for arithmetic and transcendental functions operators, allowing the designer to select (pre-synthesis) a suitable bit-width representation according to the accuracy and dynamic range, as well as the area, elapsed time and power consumption requirements of the application. Synthesis results demonstrate the effectiveness and high performance of the implemented cores on commercial FPGAs. Simulation results have been addressed in order to compute the Mean Square Error (MSE), using the Matlab as statistical estimator, validating the correct behavior of the implemented cores. Additionally, the processing time of the hardware architecture was compared with the same formulation implemented in software, using the PowerPC (FPGA embedded processor), demonstrating that the hardware architecture speeds-up by factor of 1298 the software implementation.


Author(s):  
Shalina Percy Delicia Figuli ◽  
Peter Figuli ◽  
Alberto Sonnino ◽  
Juergen Becker

In the past three decades, Field Programmable Gate Arrays (FPGAs) have emerged to be the backbone of digital signal processing, especially in high-speed communication systems. However, today, these devices are clocked below 1GHz and improvement in performance stays a big challenge on all abstraction layers, right from system architecture down to physical technology. Far and wide, myriad number of researches are done on methodologies and techniques which can deliver higher throughput with lower operating frequencies. Towards this projected objective, in this paper an efficient modulation technique like Quadrature Amplitude Modulation (QAM) along with mixed time and frequency domain approach and Forward Error Correction (FEC) technique have been utilized to employ a generic scalable FPGA based QAM transmitter with filter parallelization being executed in mixed domain. The system developed in this paper achieves an effective throughput of 12.8Gb/s for 256-QAM with 16 parallel inputs having an operating frequency of 201.25MHz, while a 18.7Gb/s effective throughput is realized with 32 parallel inputs at 146MHz. Thereby, it paves down a promising methodology for applications where having higher clock frequencies is a hard limit.


2013 ◽  
Vol 457-458 ◽  
pp. 731-735
Author(s):  
Yong Yi Huang ◽  
Jian Feng Zhou

Digital signal processing is the processing of digitized discrete-time samp-led signals. Processing is done by general-purpose computers or by digital circuits such as ASICs, field-programmable gate arrays or specialized digital signal processors. Information science focuses on understanding problems from the perspective of the stakeholders involved and then applying information and other technologies as needed. The definition of multiple pseudofames for subspaces with integer translation is proposed. The notion of a generalized multiresolution structure (GMS) of is also introduced. The construction of a generalized multiresolution structure of Paley-Wiener subspaces of is investigated. The pyramid decomposition scheme is derived based on a generalized multiresolution structure.


2021 ◽  
pp. 542-561
Author(s):  
Stevan Berber

In this chapter, the practical aspects of the design of digital discrete communication systems, primarily digital signal processors and field -programmable gate arrays, are analysed. The systems are presented at the level of block schematics, to address the main issues in their design and discuss the advantages and disadvantages of various designs in digital technology. Designs using quadriphase-shift keying and quadrature amplitude modulation are presented separately. The operation of each system is explained in terms of the theoretical structure of the system, which allows a clear understanding of the relationship between the theoretical model of the system and its practical design. The structures of the first, second, and third generation of discrete transceiver designs are presented and discussed.


2007 ◽  
Vol 16 (02) ◽  
pp. 267-286 ◽  
Author(s):  
ALEXANDER SKAVANTZOS ◽  
MOHAMMAD ABDALLAH ◽  
THANOS STOURAITIS

The Residue Number System (RNS) is an integer system appropriate for implementing fast digital signal processors. It can be used for supporting high-speed arithmetic by operating in parallel channels without need for exchanging information among the channels. In this paper, two novel RNS are proposed. First, a new RNS system based on the modulus set {2n+1, 2n - 1, 2n + 1, 2n + 2(n+1)/2 + 1, 2n - 2(n+1)/2 + 1}, n odd, is developed, along with an efficient implementation of its residue-to-weighted converter. The new RNS is a balanced five-modulus system, appropriate for large dynamic ranges. The proposed residue-to-binary converter is fast and hardware efficient and is based on a one's complement multi-operand adder that adds operands of size only 80% of the size dictated by the system's dynamic range. Second, a new class of multi-modulus RNS systems is proposed. These systems are based on sets consisting of two groups of moduli with the modulus product within one group being of the form 2a(2b - 1), while the modulus product within the other group is of the form 2c - 1. Their RNS-to-weighted converters are based on efficient combinations of the Chinese Remainder Theorem and Mixed Radix Conversion decoding techniques. Systems based on four, five, and seven moduli are constructed and analyzed. The new systems allow efficient implementations for their RNS-to-weighted decoders, imply fast and balanced RNS arithmetic, and may achieve large dynamic ranges. The presented residue-to-weighted converters for these systems rely on simple mod (2x - 1) hardware, which can be easily implemented as one's complement hardware.


Sign in / Sign up

Export Citation Format

Share Document