Multi core processor for QR decomposition based on FPGA

A hardware design of a multicore 32-bit processor is implemented to achieve low-latency, high-throughput QR decomposition (QRD) based on two algorithms: Gram-Schmidt (GS) and Givens rotation (GR). The first core computes the orthogonal matrices using the Gram-Schmidt algorithm, and the second core computes the upper triangular matrices using the Givens rotation algorithm. The design achieves a throughput of 50M QRD/s for 4 × 4 matrices at a clock frequency of 200 MHz.
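The split described in this abstract (Q from Gram-Schmidt, R from Givens rotations) can be illustrated in software. The following is a minimal floating-point sketch, not the authors' hardware design: the function names `gram_schmidt_q` and `givens_r` are hypothetical, the input is assumed to be a real, invertible square matrix, and the sign normalization of R is an added assumption so that the two independently computed factors agree.

```python
import math

def gram_schmidt_q(a):
    """Return Q (as a list of rows) with orthonormal columns, via modified Gram-Schmidt."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    for v in cols:
        w = v[:]
        # Subtract the projections onto the columns already orthonormalized.
        for q in q_cols:
            proj = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        q_cols.append([wi / norm for wi in w])
    return [[q_cols[j][i] for j in range(n)] for i in range(n)]

def givens_r(a):
    """Return upper triangular R by zeroing sub-diagonal entries with Givens rotations."""
    n = len(a)
    r = [row[:] for row in a]
    for j in range(n):
        # Work upward from the bottom row, rotating rows i-1 and i to zero r[i][j].
        for i in range(n - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue
            c, s = x / h, y / h
            for k in range(n):
                top, bot = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * top + s * bot
                r[i][k] = -s * top + c * bot
    # Assumption: force a non-negative diagonal so R matches the Gram-Schmidt Q.
    for i in range(n):
        if r[i][i] < 0:
            r[i] = [-v for v in r[i]]
    return r

a = [[4.0, 1.0, 2.0, 0.0],
     [1.0, 3.0, 0.0, 1.0],
     [2.0, 0.0, 5.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]
q = gram_schmidt_q(a)
r = givens_r(a)
```

For an invertible matrix the QR factorization with a positive diagonal in R is unique, which is why the product of `q` and `r` reproduces `a` up to rounding even though the two factors are computed separately. For scale, the quoted 50M QRD/s at 200 MHz corresponds to one decomposition completing every 4 clock cycles on average.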

Design and implementation of a low-latency, high-throughput sorted QR decomposition circuit for MIMO communications

2016 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) ◽

10.1109/apccas.2016.7803953 ◽

2016 ◽

Cited By ~ 4

Author(s):

Wei-Yang Chen ◽

Daniel Guenther ◽

Chung-An Shen ◽

Gerd Ascheid

Keyword(s):

High Throughput ◽

Qr Decomposition ◽

Low Latency ◽

Mimo Communications ◽

Design And Implementation

Functions Preserving Orthogonality of Hermitian Matrices

Journal of Mathematics Research ◽

10.5539/jmr.v13n4p77 ◽

2021 ◽

Vol 13 (4) ◽

pp. 77

Author(s):

Meili Liu ◽

Liwei Wang ◽

Chun-Te Lee ◽

Jeng-Eng Lin

Keyword(s):

Symmetric Matrices ◽

Hermitian Matrices ◽

Orthogonal Matrices ◽

Triangular Matrices ◽

Upper Triangular Matrices

Inspired by results on functions that preserve orthogonality of full matrices, upper triangular matrices, and symmetric matrices, we complete this line of work by finding special orthogonal matrices that satisfy the conditions for orthogonality-preserving functions. We give a characterization of functions preserving orthogonality of Hermitian matrices.

Architectural Optimizations for a High-Throughput Sorted QR Decomposition Circuit in MIMO Communication Systems

Journal of Circuits System and Computers ◽

10.1142/s0218126618502201 ◽

2018 ◽

Vol 27 (14) ◽

pp. 1850220 ◽

Cited By ~ 1

Author(s):

Wei-Yang Chen ◽

Chung-An Shen

Keyword(s):

High Throughput ◽

Communication Systems ◽

Multiple Input Multiple Output ◽

Vlsi Architecture ◽

Qr Decomposition ◽

Channel Matrix ◽

Mimo Communication ◽

Givens Rotation ◽

The Matrix ◽

Input Multiple Output

This paper presents the VLSI architecture of a low-latency and high-throughput sorted-QR decomposition (SQRD) engine for multiple-input multiple-output (MIMO) communication systems. To achieve a high processing throughput, the proposed design is based on a novel pipelined Givens rotation (GR) structure comprising multi-dimension COordinate Rotation DIgital Computer (CORDIC) (MD-CORDIC) processing elements (PEs). Moreover, this design delivers the vector norm and conducts the sorting operation as a by-product of the vectoring operation in the execution flow of the CORDIC process. Therefore, excessive overheads for norm calculation and sorting are avoided, so the latency is greatly reduced and the throughput is enhanced. In addition, the proposed SQRD engine operates directly on the complex-valued channel matrix to avoid the matrix augmentation caused by the real-valued decomposition of the channel matrix. This design has been synthesized, placed, and routed, and the post-layout estimation results show that the processing throughput of the proposed SQRD architecture achieves an approximately 2× improvement compared to prior art.
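To illustrate why the vector norm comes out of the vectoring operation as a by-product, here is a minimal sketch of a two-dimensional, real-valued CORDIC in vectoring mode. It is a simplification and not the MD-CORDIC processing element of the paper: the function name `cordic_vectoring`, the iteration count, the floating-point arithmetic, and the restriction to inputs with x >= 0 are all assumptions made for illustration.

```python
import math

def cordic_vectoring(x, y, iterations=16):
    """Drive y toward 0 with shift-and-add style micro-rotations.

    Assumes x >= 0. Returns (norm, angle): the length of the input
    vector and its angle in radians.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        step = 2.0 ** -i
        # Rotate clockwise when y is positive, counter-clockwise otherwise.
        d = -1.0 if y > 0 else 1.0
        x, y = x - d * y * step, y + d * x * step
        angle -= d * math.atan(step)
        # Each micro-rotation stretches the vector by sqrt(1 + 2^(-2i)).
        gain *= math.sqrt(1.0 + step * step)
    # Once y is ~0, x holds the input norm scaled by the accumulated gain.
    return x / gain, angle

norm, angle = cordic_vectoring(3.0, 4.0)
```

Once the rotations have pushed `y` to approximately zero, the remaining `x` component is the input norm times a gain that depends only on the iteration count, so the norm needed for column sorting is available without a separate square-root computation.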
