digital signal Latest Research Papers

Rethinking Embedded Blocks for Machine Learning Applications

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3491234 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-30

Author(s):

Seyedramin Rasoulinezhad ◽

Esther Roorda ◽

Steve Wilton ◽

Philip H. W. Leong ◽

David Boland

Keyword(s):

Machine Learning ◽

Signal Processing ◽

Digital Signal ◽

Problem Formulation ◽

Coarse Grained ◽

Quantitative Methodology ◽

Source Codes ◽

Recent Emergence ◽

Embedded Blocks ◽

Improved Performance

The underlying goal of FPGA architecture research is to devise flexible substrates that implement a wide variety of circuits efficiently. Contemporary FPGA architectures have been optimized to support networking, signal processing, and image processing applications through high-precision digital signal processing (DSP) blocks. The recent emergence of machine learning has created a new set of demands characterized by: (1) higher computational density and (2) low-precision arithmetic requirements. With the goal of exploring this new design space in a methodical manner, we first propose a problem formulation involving computing nested loops over multiply-accumulate (MAC) operations, which covers many basic linear algebra primitives and standard deep neural network (DNN) kernels. A quantitative methodology for deriving efficient coarse-grained compute block architectures from benchmarks is then proposed together with a family of new embedded blocks, called MLBlocks. An MLBlock instance includes several multiply-accumulate units connected via flexible routing, where each configuration performs a few parallel dot-products in a systolic array fashion. This architecture is parameterized with support for different data movements, reuse, and precisions, utilizing a columnar arrangement that is compatible with existing FPGA architectures. On synthetic benchmarks, we demonstrate that for 8-bit arithmetic, MLBlocks offer 6× improved performance over the commercial Xilinx DSP48E2 architecture with smaller area and delay; and for time-multiplexed 16-bit arithmetic, they achieve 2× higher performance per area with the same area and frequency. All source code and data, along with documents to reproduce all the results in this article, are available at http://github.com/raminrasoulinezhad/MLBlocks.
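The "nested loops over MAC operations" formulation mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the paper or its repository; the function names `mac` and `matmul_mac` are hypothetical, and a plain matrix product stands in for the linear algebra primitives the abstract refers to.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: returns acc + a * b."""
    return acc + a * b


def matmul_mac(A, B):
    """Matrix product written as three nested loops over MAC operations.

    Hypothetical illustration only: each innermost loop is a dot product,
    which is the kind of work a MAC-based compute block would perform.
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):          # loop over output rows
        for j in range(cols):      # loop over output columns
            acc = 0
            for k in range(inner): # dot product as a chain of MACs
                acc = mac(acc, A[i][k], B[k][j])
            C[i][j] = acc
    return C


if __name__ == "__main__":
    print(matmul_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Convolutions and matrix-vector products fit the same pattern with different loop bounds and index expressions, which is presumably why a single formulation can cover many kernels.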

Approximate Constant-Coefficient Multiplication Using Hybrid Binary-Unary Computing for FPGAs

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3494570 ◽

2022 ◽

Vol 15 (3) ◽

pp. 1-25

Author(s):

S. Rasoul Faraji ◽

Pierre Abillama ◽

Kia Bazargan

Keyword(s):

Discrete Cosine Transform ◽

Video Processing ◽

High Speed ◽

Low Cost ◽

Digital Signal ◽

Logic Gates ◽

Cosine Transform ◽

Real Time Processing ◽

Coefficient Multiplier ◽

Encoding Method

Multipliers are used in virtually all Digital Signal Processing (DSP) applications such as image and video processing. Multiplier efficiency has a direct impact on the overall performance of such applications, especially when real-time processing is needed, as in 4K video processing, or where hardware resources are limited, as in mobile and IoT devices. We propose a novel, low-cost, low-energy, and high-speed approximate constant coefficient multiplier (CCM) using a hybrid binary-unary encoding method. The proposed method implements a CCM using simple routing networks with no logic gates in the unary domain, which results in more efficient multipliers on average compared to Xilinx LogiCORE IP CCMs and table-based KCM CCMs (FloPoCo). We evaluate the proposed multipliers on a 2-D discrete cosine transform (DCT) algorithm as a common DSP module. Post-routing FPGA results show that the proposed multipliers can improve the {area, area × delay, power consumption, and energy-delay product} of a 2-D discrete cosine transform on average by {30%, 33%, 30%, 31%}. Moreover, the throughput of the proposed 2-D discrete cosine transform is on average 5% higher than that of the binary architecture implemented using table-based KCM CCMs. We show that our method has fewer routability issues compared to binary implementations when implementing a DCT core.
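For readers unfamiliar with constant-coefficient multiplication, the sketch below shows the conventional binary shift-and-add form and one simple way to approximate it by dropping low-order coefficient bits. This is not the hybrid binary-unary method the abstract proposes; the function names and the truncation scheme are assumptions chosen only to make the accuracy-versus-cost trade-off concrete.

```python
def ccm_shift_add(x, coeff):
    """Multiply x by a non-negative integer constant using shifts and adds."""
    result = 0
    shift = 0
    while coeff:
        if coeff & 1:              # this coefficient bit contributes x << shift
            result += x << shift
        coeff >>= 1
        shift += 1
    return result


def approx_ccm(x, coeff, kept_bits):
    """Approximate CCM (hypothetical scheme): keep only the top kept_bits
    bits of the coefficient, so fewer shifted terms need to be added."""
    drop = max(coeff.bit_length() - kept_bits, 0)
    truncated = (coeff >> drop) << drop
    return ccm_shift_add(x, truncated)


if __name__ == "__main__":
    x, coeff = 100, 181            # 181 is roughly 256 * cos(pi / 4)
    exact = ccm_shift_add(x, coeff)
    approx = approx_ccm(x, coeff, kept_bits=4)
    print(exact, approx, abs(exact - approx) / exact)
```

Because the coefficient is fixed at design time, the set of shifted terms is known in advance, which is what lets hardware implementations replace a general multiplier with fixed wiring and a few adders.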

xDNN: Inference for Deep Convolutional Neural Networks

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3473334 ◽

2022 ◽

Vol 15 (2) ◽

pp. 1-29

Author(s):

Paolo D'Alberto ◽

Victor Wu ◽

Aaron Ng ◽

Rahul Nimaiyar ◽

Elliott Delaye ◽

...

Keyword(s):

Neural Networks ◽

Power Efficiency ◽

Digital Signal ◽

FPGA Design ◽

Deep Convolutional Neural Networks ◽

Parametric Function ◽

Field Programmable ◽

Scale Down ◽

On Chip ◽

Numerical Precision

We present xDNN, an end-to-end system for deep-learning inference based on a family of specialized hardware processors synthesized on Field-Programmable Gate Arrays (FPGAs) and targeting Convolutional Neural Networks (CNNs). We present a design optimized for low latency, high throughput, and high compute efficiency with no batching. The design is scalable and is a parametric function of the number of multiply-accumulate units, the on-chip memory hierarchy, and the numerical precision. The design can be scaled down to produce a processor for embedded devices, replicated to produce more cores for larger devices, or resized to optimize efficiency. On a Xilinx Virtex UltraScale+ VU13P FPGA, we achieve 800 MHz, which is close to the Digital Signal Processing maximum frequency, and above 80% efficiency of on-chip compute resources. On top of our processor family, we present a runtime system enabling the execution of different networks for different input sizes (i.e., from 224×224 to 2048×1024). We present a compiler that reads CNNs from native frameworks (i.e., MXNet, Caffe, Keras, and TensorFlow), optimizes them, generates code, and provides performance estimates. The compiler combines quantization information from the native environment and optimizations to feed the runtime with code as efficient as any hardware expert could write. We present tools partitioning a CNN into subgraphs for the division of work between CPU cores and FPGAs. Notice that the software will not change when or if the FPGA design becomes an ASIC, making our work vertical and not just a proof-of-concept FPGA project. We show experimental results for accuracy, latency, and power for several networks. In summary, we can achieve up to 4 times higher throughput and 3 times better power efficiency than GPUs, and up to 20 times higher throughput than the latest CPUs. To our knowledge, we provide solutions faster than any previous FPGA-based solutions and comparable to any other top-of-the-shelf solutions.

DSP TMS320C6678 Based SHVC Encoder Implementation and its Optimization

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e6656.0110522 ◽

2022 ◽

Vol 10 (5) ◽

pp. 24-31

Author(s):

Ibtissem Wali ◽

Amina Kessentini ◽

Mohamed Ali Ben Ayed ◽

Nouri Masmoudi ◽

...

Keyword(s):

Real Time ◽

Performance Optimization ◽

Digital Signal ◽

Experimental Tests ◽

Video Encoding ◽

Performance Achievement ◽

Time Encoding ◽

Promising Solution ◽

Signal Processors ◽

Programmable Processors

The newest programmable processor technologies, such as multicore Digital Signal Processors (DSPs), offer a promising solution for overcoming the complexity of real-time video encoding. In this paper, the SHVC video encoder was implemented on a single core of the eight-core TMS320C6678 DSP for Common Intermediate Format (CIF) input video sequences (352x288). Performance optimization of the SHVC encoder reached up to 41% compared to its reference software, enabling a real-time implementation of the SHVC encoder for CIF input video sequences. The proposed SHVC implementation was evaluated with different quantization parameters (QP). Several experimental tests confirmed the real-time encoding performance on the TMS320C6678.

Redefining and Validating Digital Biomarkers as Fluid, Dynamic Multi-Dimensional Digital Signal Patterns

Frontiers in Digital Health ◽

10.3389/fdgth.2021.751629 ◽

2022 ◽

Vol 3 ◽

Author(s):

Rhoda Au ◽

Vijaya B. Kolachalama ◽

Ioannis C. H. Paschalidis

Keyword(s):

Data Science ◽

Fluid Dynamic ◽

Digital Signal ◽

The United States ◽

FDA Approval ◽

United States Food ◽

Current Cycle ◽

Health Quality ◽

Definition Of ◽

Digital Biomarkers

“Digital biomarker” is a term broadly and indiscriminately applied and often limited in its conceptualization to mimic well-established biomarkers as defined and approved by regulatory agencies such as the United States Food and Drug Administration (FDA). There is a practical urgency to revisit the definition of a digital biomarker and expand it beyond current methods of identification and validation. Restricting the promise of digital technologies within the realm of currently defined biomarkers creates a missed opportunity. A whole new field of prognostic and early diagnostic digital biomarkers driven by data science and artificial intelligence can break the current cycle of high healthcare costs and low health quality that is being driven by today's chronic disease detection and treatment approaches. This new class of digital biomarkers will be dynamic and require developing new FDA approval pathways and next-generation gold standards.

DSP Processor-in-the-Loop Tests Based on Automatic Code Generation

Inventions ◽

10.3390/inventions7010012 ◽

2022 ◽

Vol 7 (1) ◽

pp. 12

Author(s):

Qi Zhang ◽

Wenhui Pei

Keyword(s):

Code Generation ◽

Pulse Width Modulation ◽

Digital Signal ◽

Automatic Generation ◽

Automatic Code Generation ◽

Development Cycle ◽

Control System Model ◽

Automatic Code ◽

DSP Control ◽

Generation Technology

Digital signal processing (DSP) processor-in-the-loop tests based on automatic code generation technology are studied. First, the idea of model-based design is introduced, and the principle and method of automatic embedded code generation are analyzed, taking the automatic code generation of a DSP control algorithm for pulse width modulation (PWM) output as an example. Then, the control system model is established in MATLAB/Simulink. After verifying the model through simulation, the target board platform is established with a DSP as the core processor, and the automatically generated code is tested using processor-in-the-loop (PIL) testing. The results show that the technology greatly shortens the development cycle of the project, improves the robustness and consistency of the control code, and can be widely used in the complex algorithm development process of the controller, from intelligent design and modeling to implementation.
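As background for the PWM example named in the abstract, the sketch below models a generic up-counting PWM timer in software: a duty cycle is converted to a compare value, and the output is high while the counter is below it. It is a simplified illustration under that assumption, not the paper's Simulink model or generated code, and all names are hypothetical.

```python
import math


def pwm_compare_value(duty, period_counts):
    """Convert a duty cycle in [0, 1] to a timer compare value (rounded)."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return math.floor(duty * period_counts + 0.5)


def pwm_waveform(duty, period_counts):
    """One PWM period of an up-counting timer: high while counter < compare."""
    compare = pwm_compare_value(duty, period_counts)
    return [1 if counter < compare else 0 for counter in range(period_counts)]


if __name__ == "__main__":
    print(pwm_waveform(0.25, 8))
```

A software model like this is the kind of reference against which output from a target processor could be compared sample by sample in a processor-in-the-loop setup.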

Approximator: A Software Tool for Automatic Generation of Approximate Arithmetic Circuits

Computers ◽

10.3390/computers11010011 ◽

2022 ◽

Vol 11 (1) ◽

pp. 11

Author(s):

Padmanabhan Balasubramanian ◽

Raunaq Nayar ◽

Okkar Min ◽

Douglas L. Maskell

Keyword(s):

Software Tool ◽

Digital Signal ◽

Automatic Generation ◽

Arithmetic Circuits ◽

Design Flow ◽

Attractive Alternative ◽

Design Environment ◽

Verilog HDL ◽

Arithmetic Circuit ◽

Practical Applications

Approximate arithmetic circuits are an attractive alternative to accurate arithmetic circuits because they have significantly reduced delay, area, and power, albeit at the cost of some loss in accuracy. By keeping errors due to approximate computation within acceptable limits, approximate arithmetic circuits can be used for various practical applications such as digital signal processing, digital filtering, low-power graphics processing, neuromorphic computing, and hardware realization of neural networks for artificial intelligence and machine learning. The degree of approximation that can be incorporated into an approximate arithmetic circuit tends to vary depending on the error resiliency of the target application. Given this, the manual coding of approximate arithmetic circuits corresponding to different degrees of approximation in a hardware description language (HDL) may be a cumbersome and time-consuming process, more so when the circuit is big. Therefore, a software tool that can automatically generate approximate arithmetic circuits of any size corresponding to a desired accuracy would not only aid the design flow but also help to improve a designer’s productivity by speeding up the circuit/system development. In this context, this paper presents ‘Approximator’, which is a software tool developed to automatically generate approximate arithmetic circuits based on a user’s specification. Approximator can automatically generate Verilog HDL code for approximate adders and multipliers of any size based on the novel approximate arithmetic circuit architectures proposed by us. The Verilog HDL code output by Approximator can be used for synthesis in an FPGA or ASIC (standard cell based) design environment. Additionally, the tool can perform error and accuracy analyses of approximate arithmetic circuits. The salient features of the tool are illustrated through some example screenshots captured during different stages of the tool use. Approximator has been made open-access on GitHub for the benefit of the research community, and the tool documentation is provided for the user’s reference.
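The trade-off between degree of approximation and error can be made concrete with a small behavioral model. The sketch below uses a simplified lower-part OR style adder (the low `k` bits are OR-ed instead of added, with no carry into the upper part) together with an exhaustive error measurement. This is a generic textbook-style approximation chosen for illustration, not one of the architectures generated by Approximator, and the function names are hypothetical.

```python
def approx_add_lower_or(a, b, k):
    """Approximate a + b: OR the low k bits, add the remaining bits exactly.

    Simplified model: no carry is passed from the low part to the high part.
    """
    mask = (1 << k) - 1
    low = (a | b) & mask                 # cheap approximate lower part
    high = ((a >> k) + (b >> k)) << k    # exact upper part
    return high | low


def mean_abs_error(width, k):
    """Mean absolute error over all pairs of unsigned width-bit operands."""
    n = 1 << width
    total = sum(abs((a + b) - approx_add_lower_or(a, b, k))
                for a in range(n) for b in range(n))
    return total / (n * n)


if __name__ == "__main__":
    print(approx_add_lower_or(7, 5, 2), 7 + 5)
    for k in range(4):
        print(k, mean_abs_error(4, k))
```

Increasing `k` makes more of the adder approximate, so the measured error grows while the exact (and more expensive) part of the circuit shrinks.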

Information Processing Methods of Electronic Warfare Events Based on Communication Technology

Security and Communication Networks ◽

10.1155/2022/9309710 ◽

2022 ◽

Vol 2022 ◽

pp. 1-11

Author(s):

Hongyan Mao

Keyword(s):

Communication Technology ◽

Digital Signal ◽

Recall Rate ◽

Complex Signal ◽

Electronic Warfare ◽

Communication Signals ◽

Electronic Countermeasure ◽

Signal Environment ◽

Modulation Methods ◽

Accuracy And Stability

Traditional electronic countermeasure incident intelligence processing suffers from low accuracy, low stability, and long processing times. A method of electronic countermeasure incident intelligence processing based on communication technology is proposed. First, an integrated digital signal receiver is used to identify the various modulation methods present in a complex signal environment, which facilitates the processing and transmission of communication signals. Then, an electronic countermeasure intelligence processing framework is established with Esper as the core, and the situation data flows to the processing conclusion through a Redis cache in the PROTOBUF interactive format, realizing the intelligent processing of electronic countermeasure incidents. The experimental results show that the method proposed in this paper increases the recall rate by 5 to 20% compared with other methods. This method has high accuracy and stability for electronic countermeasure incident intelligence processing and can effectively shorten the processing time.

Applied Digital Signal Processing

Handbook of Experimental Structural Dynamics ◽

10.1007/978-1-4939-6503-8_6-2 ◽

2022 ◽

pp. 1-81

Author(s):

Robert B. Randall ◽

Jerome Antoni ◽

Pietro Borghesani

Keyword(s):

Signal Processing ◽

Digital Signal Processing ◽

Digital Signal

Digital signal processing–assisted radio-over-fiber

10.1016/b978-0-12-821627-9.00010-3 ◽

2022 ◽

pp. 103-132

Author(s):

Xiang Liu

Keyword(s):

Signal Processing ◽

Digital Signal Processing ◽

Digital Signal ◽

Radio Over Fiber
