This letter presents a low-latency, high-throughput, hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems. In contrast to the method of extending the complex matrix to a real-valued model and then applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC) that performs the QRD directly in the complex domain and then converts the complex result to its real counterpart. The proposed scheme greatly improves processing parallelism and curtails the nullification and sorting procedures. We also design the corresponding pipelined hardware architecture of the MMSE-SQRD, based on a highly parallel Givens rotation structure with the CORDIC algorithm, for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55 nm CMOS technology, achieving a throughput of up to 50M QRD/s and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared with previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.
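For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal floating-point sketch, not the authors' hardware design. `sorted_mmse_qrd` computes a sorted MMSE QRD of the extended matrix [H; sigma·I] using modified Gram-Schmidt (swapped in here for brevity; the letter uses Givens rotations), and `cordic_vectoring` shows the shift-and-add CORDIC vectoring iteration that a Givens rotation unit could build on. The function names, the noise parameter `sigma`, and the iteration count are assumptions made for illustration.

```python
import math
import numpy as np


def sorted_mmse_qrd(H, sigma):
    """Reference sorted MMSE QRD (modified Gram-Schmidt, floating point).

    Returns Q, R, perm with H_ext[:, perm] == Q @ R, where
    H_ext = [H; sigma * I] is the MMSE-extended channel matrix.
    Assumes sigma > 0, so H_ext has full column rank.
    """
    n_tx = H.shape[1]
    Q = np.vstack([H, sigma * np.eye(n_tx)]).astype(complex)
    R = np.zeros((n_tx, n_tx), dtype=complex)
    perm = np.arange(n_tx)
    for i in range(n_tx):
        # Sorting: move the remaining column with the smallest norm to position i.
        norms = np.sum(np.abs(Q[:, i:]) ** 2, axis=0)
        k = i + int(np.argmin(norms))
        Q[:, [i, k]] = Q[:, [k, i]]
        R[:, [i, k]] = R[:, [k, i]]
        perm[[i, k]] = perm[[k, i]]
        # Nulling: normalize column i, then remove its component from later columns.
        R[i, i] = np.linalg.norm(Q[:, i])
        Q[:, i] /= R[i, i]
        for j in range(i + 1, n_tx):
            R[i, j] = np.vdot(Q[:, i], Q[:, j])  # vdot conjugates its first argument
            Q[:, j] -= R[i, j] * Q[:, i]
    return Q, R, perm


def cordic_vectoring(x, y, iterations=16):
    """CORDIC vectoring mode: rotate (x, y) onto the positive x-axis.

    Returns (magnitude, angle in radians). Each micro-rotation needs only
    shifts and adds in hardware; floats are used here for clarity.
    """
    angle = 0.0
    if x < 0:
        # Pre-rotate by 180 degrees so the input lies in the convergence range.
        angle = math.pi if y >= 0 else -math.pi
        x, y = -x, -y
    gain = 1.0
    for i in range(iterations):
        d = -1.0 if y > 0 else 1.0  # choose the direction that drives y toward zero
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        angle -= d * math.atan(2.0 ** -i)
        gain *= math.sqrt(1.0 + 4.0 ** -i)  # accumulated CORDIC scale factor
    return x / gain, angle
```

Calling `sorted_mmse_qrd(H, sigma)` on a 4×4 complex `H` yields an 8×4 `Q` with orthonormal columns and a 4×4 upper-triangular `R`; `perm` records the detection order chosen by the sorting step.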
Lu SUN
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS), University of Chinese Academy of Sciences (UCAS)
Bin WU
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS)
Tianchun YE
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Lu SUN, Bin WU, Tianchun YE, "Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors" in IEICE TRANSACTIONS on Fundamentals,
vol. E104-A, no. 4, pp. 762-767, April 2021, doi: 10.1587/transfun.2020EAL2076.
Abstract: In this letter, a low latency, high throughput and hardware efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to real model and thereafter applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on coordinate rotation digital computer (CORDIC) which performs the QRD in complex domain directly and then converting the complex result to its real counterpart. The proposed scheme can greatly improve the processing parallelism and curtail the nullification and sorting procedures. Besides, we also design the corresponding pipelined hardware architecture of the MMSE-SQRD based on highly parallel Givens rotation structure with CORDIC algorithm for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55nm CMOS technology achieving up to 50M QRD/s throughput and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared to the previous works, the proposed design achieves the highest normalized throughput efficiency and lowest processing latency.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2020EAL2076/_p
@ARTICLE{e104-a_4_762,
author={Lu SUN and Bin WU and Tianchun YE},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors},
year={2021},
volume={E104-A},
number={4},
pages={762--767},
abstract={In this letter, a low latency, high throughput and hardware efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to real model and thereafter applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on coordinate rotation digital computer (CORDIC) which performs the QRD in complex domain directly and then converting the complex result to its real counterpart. The proposed scheme can greatly improve the processing parallelism and curtail the nullification and sorting procedures. Besides, we also design the corresponding pipelined hardware architecture of the MMSE-SQRD based on highly parallel Givens rotation structure with CORDIC algorithm for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55nm CMOS technology achieving up to 50M QRD/s throughput and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared to the previous works, the proposed design achieves the highest normalized throughput efficiency and lowest processing latency.},
keywords={},
doi={10.1587/transfun.2020EAL2076},
ISSN={1745-1337},
month={April},
}
TY - JOUR
TI - Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 762
EP - 767
AU - Lu SUN
AU - Bin WU
AU - Tianchun YE
PY - 2021
DO - 10.1587/transfun.2020EAL2076
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E104-A
IS - 4
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - April 2021
AB - In this letter, a low latency, high throughput and hardware efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to real model and thereafter applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on coordinate rotation digital computer (CORDIC) which performs the QRD in complex domain directly and then converting the complex result to its real counterpart. The proposed scheme can greatly improve the processing parallelism and curtail the nullification and sorting procedures. Besides, we also design the corresponding pipelined hardware architecture of the MMSE-SQRD based on highly parallel Givens rotation structure with CORDIC algorithm for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55nm CMOS technology achieving up to 50M QRD/s throughput and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared to the previous works, the proposed design achieves the highest normalized throughput efficiency and lowest processing latency.
ER -