Rethinking Embedded Blocks for Machine Learning Applications

2022 · Vol 15 (1) · pp. 1-30
Author(s):  
Seyedramin Rasoulinezhad ◽  
Esther Roorda ◽  
Steve Wilton ◽  
Philip H. W. Leong ◽  
David Boland

The underlying goal of FPGA architecture research is to devise flexible substrates that implement a wide variety of circuits efficiently. Contemporary FPGA architectures have been optimized to support networking, signal processing, and image processing applications through high-precision digital signal processing (DSP) blocks. The recent emergence of machine learning has created a new set of demands characterized by: (1) higher computational density and (2) low-precision arithmetic requirements. With the goal of exploring this new design space in a methodical manner, we first propose a problem formulation involving computing nested loops over multiply-accumulate (MAC) operations, which covers many basic linear algebra primitives and standard deep neural network (DNN) kernels. A quantitative methodology for deriving efficient coarse-grained compute block architectures from benchmarks is then proposed, together with a family of new embedded blocks called MLBlocks. An MLBlock instance comprises several multiply-accumulate units connected via flexible routing, where each configuration performs a few parallel dot products in a systolic array fashion. This architecture is parameterized with support for different data movements, reuse, and precisions, utilizing a columnar arrangement that is compatible with existing FPGA architectures. On synthetic benchmarks, we demonstrate that for 8-bit arithmetic, MLBlocks offer 6× improved performance over the commercial Xilinx DSP48E2 architecture with smaller area and delay; and for time-multiplexed 16-bit arithmetic, they achieve 2× higher performance per area at the same frequency. All source code and data, along with documentation to reproduce the results in this article, are available at http://github.com/raminrasoulinezhad/MLBlocks .
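The nested-loop MAC formulation referred to above can be illustrated with a short sketch. The matrix-multiply kernel below is an assumed example of the problem class the abstract describes (nested loops whose innermost body is a multiply-accumulate), not the paper's MLBlock hardware description:

```python
# Illustrative sketch: many linear-algebra and DNN kernels reduce to
# nested loops over multiply-accumulate (MAC) operations.
# A matrix multiply C = A @ B expressed as three nested loops of MACs:

def matmul_mac(A, B):
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):              # output row
        for j in range(m):          # output column
            acc = 0
            for p in range(k):      # dot-product (reduction) loop
                acc += A[i][p] * B[p][j]   # one MAC operation
            C[i][j] = acc
    return C
```

Each (i, j) iteration is an independent dot product over the reduction loop; a block in the MLBlocks style would map several such dot products onto parallel MAC chains, with the data movement and reuse pattern set by its configuration.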

2019
Author(s):  
Tiago Tavares

This hands-on workshop covers essential techniques for digital signal processing and machine learning. Participants will use the Python libraries librosa and scikit-learn to build an automatic audio classification system. The workshop approaches theoretical aspects through explorations of toy problems, and then discusses practical issues in building scientific applications in the field.
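The pipeline shape such a system typically follows (extract features from a signal, then classify) can be sketched in a few lines. The toy features and nearest-centroid classifier below are assumptions for illustration only; the workshop itself uses librosa for feature extraction and scikit-learn for classification:

```python
# Toy sketch of an audio-classification pipeline: extract simple
# features from a signal, then label it by nearest class centroid.
# (Stands in for the librosa + scikit-learn pipeline; stdlib only.)
import math

def features(signal):
    """Two toy features: zero-crossing rate and RMS energy."""
    zcr = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0) / len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return (zcr, rms)

def nearest_centroid(train, query):
    """train: {label: [feature tuples]}; returns the label whose
    mean feature vector (centroid) is closest to the query."""
    def centroid(vecs):
        return tuple(sum(v[i] for v in vecs) / len(vecs)
                     for i in range(len(vecs[0])))
    cents = {label: centroid(vs) for label, vs in train.items()}
    return min(cents, key=lambda lab: math.dist(cents[lab], query))
```

In the real system, `features` would be replaced by e.g. MFCCs from librosa, and `nearest_centroid` by a scikit-learn classifier, but the train-then-predict structure is the same.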

