In this letter, we present a novel single-precision floating-point multiply-accumulator (FNA-MAC) to achieve lower hardware resource usage, reduced computing latency, and improved computing accuracy for continuous dot product operations. By further fusing the normalization and alignment in the traditional FMA algorithm, the proposed architecture eliminates the first N-1 normalization and rounding operations for an N-point dot product, and preserves the precision of interim results in a significant bit size that is twice that of the traditional methods. The normalization and rounding of the final result are processed at the cost of one additional multiply-add operation. The simulation results show that the improvement in computational accuracy is significant. Meanwhile, compared with a recently published FMA design, the proposed FNA-MAC reduces the slice look-up table/flip-flop resource and the computing latency by 18% and 33.3%, respectively.
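The accuracy argument in the abstract (skip the first N-1 rounding steps, round only the final result) can be illustrated in software. The following is a minimal sketch, not the authors' hardware architecture: the function names are hypothetical, Python's binary64 float is assumed as a stand-in for the wider interim format, and the baseline only approximates a fused multiply-add because the sum is rounded to binary64 before it is rounded to binary32.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_round_each_step(a, b):
    """Baseline: round the accumulator to single precision after every multiply-add."""
    acc = 0.0
    for x, y in zip(a, b):
        # x * y is exact in binary64 when x and y are binary32 values
        # (24 + 24 significand bits <= 53); the sum is then rounded to binary32.
        acc = to_f32(acc + x * y)
    return acc


def dot_round_once(a, b):
    """Deferred rounding: keep a wide accumulator and round only the final result."""
    acc = 0.0  # binary64 is an assumed stand-in for the wider interim format
    for x, y in zip(a, b):
        acc += x * y
    return to_f32(acc)


def exact_dot(a, b):
    """Reference value computed with exact rational arithmetic."""
    return sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))


# Inputs chosen so that each small term is lost when rounding at every step.
a = [1.0] + [2.0 ** -24] * 4
b = [1.0] * 5

reference = exact_dot(a, b)
print("error, rounding every step:", float(abs(Fraction(dot_round_each_step(a, b)) - reference)))
print("error, rounding once:      ", float(abs(Fraction(dot_round_once(a, b)) - reference)))
```

In this constructed input, each small term falls exactly halfway between two single-precision values when added to 1.0, so per-step rounding discards it, while the deferred version accumulates all four terms before the single final rounding.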
Min YUAN
Zhejiang University
Qianjian XING
Zhejiang University
Zhenguo MA
Zhejiang University
Feng YU
Zhejiang University
Yingke XU
Zhejiang University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Min YUAN, Qianjian XING, Zhenguo MA, Feng YU, Yingke XU, "A Fused Continuous Floating-Point MAC on FPGA" in IEICE TRANSACTIONS on Fundamentals,
vol. E101-A, no. 9, pp. 1594-1598, September 2018, doi: 10.1587/transfun.E101.A.1594.
Abstract: In this letter, we present a novel single-precision floating-point multiply-accumulator (FNA-MAC) to achieve lower hardware resource usage, reduced computing latency, and improved computing accuracy for continuous dot product operations. By further fusing the normalization and alignment in the traditional FMA algorithm, the proposed architecture eliminates the first N-1 normalization and rounding operations for an N-point dot product, and preserves the precision of interim results in a significant bit size that is twice that of the traditional methods. The normalization and rounding of the final result are processed at the cost of one additional multiply-add operation. The simulation results show that the improvement in computational accuracy is significant. Meanwhile, compared with a recently published FMA design, the proposed FNA-MAC reduces the slice look-up table/flip-flop resource and the computing latency by 18% and 33.3%, respectively.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E101.A.1594/_p
@ARTICLE{e101-a_9_1594,
author={Min YUAN and Qianjian XING and Zhenguo MA and Feng YU and Yingke XU},
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Fused Continuous Floating-Point MAC on FPGA},
year={2018},
volume={E101-A},
number={9},
pages={1594-1598},
abstract={In this letter, we present a novel single-precision floating-point multiply-accumulator (FNA-MAC) to achieve lower hardware resource usage, reduced computing latency, and improved computing accuracy for continuous dot product operations. By further fusing the normalization and alignment in the traditional FMA algorithm, the proposed architecture eliminates the first N-1 normalization and rounding operations for an N-point dot product, and preserves the precision of interim results in a significant bit size that is twice that of the traditional methods. The normalization and rounding of the final result are processed at the cost of one additional multiply-add operation. The simulation results show that the improvement in computational accuracy is significant. Meanwhile, compared with a recently published FMA design, the proposed FNA-MAC reduces the slice look-up table/flip-flop resource and the computing latency by 18% and 33.3%, respectively.},
keywords={},
doi={10.1587/transfun.E101.A.1594},
ISSN={1745-1337},
month={September},}
TY - JOUR
TI - A Fused Continuous Floating-Point MAC on FPGA
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1594
EP - 1598
AU - Min YUAN
AU - Qianjian XING
AU - Zhenguo MA
AU - Feng YU
AU - Yingke XU
PY - 2018
DO - 10.1587/transfun.E101.A.1594
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E101-A
IS - 9
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - September 2018
AB - In this letter, we present a novel single-precision floating-point multiply-accumulator (FNA-MAC) to achieve lower hardware resource usage, reduced computing latency, and improved computing accuracy for continuous dot product operations. By further fusing the normalization and alignment in the traditional FMA algorithm, the proposed architecture eliminates the first N-1 normalization and rounding operations for an N-point dot product, and preserves the precision of interim results in a significant bit size that is twice that of the traditional methods. The normalization and rounding of the final result are processed at the cost of one additional multiply-add operation. The simulation results show that the improvement in computational accuracy is significant. Meanwhile, compared with a recently published FMA design, the proposed FNA-MAC reduces the slice look-up table/flip-flop resource and the computing latency by 18% and 33.3%, respectively.
ER -