Multi core processor for QR decomposition based on FPGA

2018 ◽  
Vol 7 (4) ◽  
pp. 2100
Author(s):  
Safaa S. Omran ◽  
Ahmed K. Abdul-abbas

A hardware design of a multicore 32-bit processor is implemented to achieve low-latency, high-throughput QR decomposition (QRD) based on two algorithms: Gram-Schmidt (GS) and Givens rotation (GR). The first core computes the orthogonal matrices using the Gram-Schmidt algorithm, and the second core computes the upper triangular matrices using the Givens rotation algorithm. The design achieves a throughput of 50 M QRD/s for 4 × 4 matrices at a clock frequency of 200 MHz.
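As an illustration of the split described in this abstract, the following is a minimal software sketch, not the authors' hardware design: one routine builds the orthogonal factor Q by Gram-Schmidt and another builds the upper triangular factor R by Givens rotations. The function names `gram_schmidt_q` and `givens_r`, the floating-point arithmetic, and the sample matrix are all assumptions made for the example.

```python
import math

def gram_schmidt_q(a):
    """Return Q (as a list of rows) with orthonormal columns, via modified Gram-Schmidt."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]  # columns of A
    q_cols = []
    for v in cols:
        v = v[:]
        for q in q_cols:
            # Remove the component of v along each previously found direction.
            proj = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(x * x for x in v))  # assumes A has full rank
        q_cols.append([x / norm for x in v])
    return [[q_cols[j][i] for j in range(n)] for i in range(n)]

def givens_r(a):
    """Return upper triangular R by zeroing sub-diagonal entries with Givens rotations."""
    n = len(a)
    r = [row[:] for row in a]
    for j in range(n):
        for i in range(n - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # entry is already zero, nothing to rotate
            c, s = x / h, y / h
            for k in range(n):
                top, bot = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * top + s * bot
                r[i][k] = -s * top + c * bot
    return r

# Hypothetical 4 x 4 test matrix (symmetric, strictly diagonally dominant).
A = [[4.0, 1.0, 2.0, 0.0],
     [1.0, 3.0, 0.0, 1.0],
     [2.0, 0.0, 5.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]
Q = gram_schmidt_q(A)
R = givens_r(A)
```

The two routines agree on the sign convention only when the diagonal of R comes out positive, which holds for this sample matrix because its determinant is positive; the abstract does not say how the two cores reconcile signs in general. On the quoted figures, 200 MHz divided by 50 M QRD/s corresponds to one decomposition completing every 4 clock cycles on average.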

2021 ◽  
Vol 13 (4) ◽  
pp. 77
Author(s):  
Meili Liu ◽  
Liwei Wang ◽  
Chun-Te Lee ◽  
Jeng-Eng Lin

Inspired by results on functions that preserve orthogonality of full matrices, upper triangular matrices, and symmetric matrices, we complete this line of work by finding special orthogonal matrices that satisfy the conditions of orthogonality-preserving functions. We give a characterization of functions preserving orthogonality of Hermitian matrices.


2018 ◽  
Vol 27 (14) ◽  
pp. 1850220 ◽  
Author(s):  
Wei-Yang Chen ◽  
Chung-An Shen

This paper presents the VLSI architecture of a low-latency, high-throughput sorted-QR decomposition (SQRD) engine for multiple-input multiple-output (MIMO) communication systems. To achieve a high processing throughput, the proposed design is built on a novel pipelined Givens rotation (GR) structure comprising multi-dimensional COordinate Rotation DIgital Computer (MD-CORDIC) processing elements (PEs). Moreover, the design delivers the vector norm and performs the sorting operation as a by-product of the vectoring operation in the execution flow of the CORDIC process. Excessive overheads for norm calculation and sorting are therefore avoided, so latency is greatly reduced and throughput is enhanced. In addition, the proposed SQRD engine operates directly on the complex-valued channel matrix, avoiding the matrix augmentation caused by the real-valued decomposition of the channel matrix. The design has been synthesized, placed, and routed, and post-layout estimation results show that the processing throughput of the proposed SQRD architecture is approximately 2× that of the prior art.
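The remark that the vector norm falls out of the vectoring operation can be illustrated with a minimal two-dimensional CORDIC sketch. This is a floating-point illustration under stated assumptions, not the MD-CORDIC processing element of the paper: the function name `cordic_vectoring`, the iteration count, and the use of floats in place of fixed-point shifts are all hypothetical choices for the example.

```python
import math

def cordic_vectoring(x, y, iterations=24):
    """Rotate (x, y) onto the positive x-axis with shift-add style micro-rotations.

    Returns (norm, angle): the vector length and the accumulated rotation angle.
    """
    angle = 0.0
    if x < 0:
        # Pre-rotate by 180 degrees so the vector lies in the convergence range.
        angle = math.pi if y >= 0 else -math.pi
        x, y = -x, -y
    gain = 1.0
    for i in range(iterations):
        t = 2.0 ** -i          # in hardware this factor is a right shift by i bits
        d = -1.0 if y > 0 else 1.0
        x, y = x - d * y * t, y + d * x * t
        angle -= d * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)  # each micro-rotation stretches the vector
    # After the iterations y is close to 0 and x holds gain * norm.
    return x / gain, angle

norm, angle = cordic_vectoring(3.0, 4.0)
```

The same loop that finds the rotation angle for a Givens rotation leaves the scaled norm in `x`, so no separate square-root computation is needed; in a sorted QR decomposition that norm is the quantity by which columns are ordered.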

