The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
The approximate logarithmic multiplier proposed by Mitchell is an efficient alternative for dense multiplication and multiply-accumulate operations in applications such as image processing and real-time robotics. It offers small area and high energy efficiency, and it suits applications that do not necessarily require high accuracy. However, its maximum error of 11.1% makes it hard to deploy in applications that require relatively high accuracy. This paper proposes a novel operand decomposition (OD) method that decomposes one multiplication into the sum of multiple approximate logarithmic multiplications, substantially reducing the error of the Mitchell multiplier while taking full advantage of its area savings. Based on the proposed OD method, this paper also proposes an accuracy-reconfigurable multiply-accumulate (MAC) unit that provides multiple reconfigurable accuracies with high parallelism. Compared to a MAC unit built from accurate multipliers, the area is reduced to less than half, which improves hardware parallelism while satisfying the accuracy required in various scenarios. The experimental results show the applicability of the proposed MAC unit to image smoothing and to a robot localization and mapping application. We have also designed a prototype processor that integrates the minimum functionality of this MAC unit as a vector accelerator, and we have implemented software-level accuracy reconfiguration in the form of an instruction set extension. We experimentally confirmed the correct operation of the proposed vector accelerator, which provides different degrees of accuracy and parallelism at the software level.
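To make the 11.1% figure concrete, the following is a minimal sketch of Mitchell's classic approximation for unsigned integers. It is not the operand decomposition method or the MAC unit from the paper; the function names `mitchell_mul` and `max_relative_error` are hypothetical and chosen only for illustration. Each operand is written as 2^k (1 + x) with 0 <= x < 1, log2 is approximated by k + x, the two approximate logarithms are added, and the antilogarithm is taken with the same linear approximation.

```python
from fractions import Fraction


def mitchell_mul(a: int, b: int) -> int:
    """Approximate a * b for non-negative integers with Mitchell's method."""
    if a == 0 or b == 0:
        return 0
    k1 = a.bit_length() - 1          # position of the leading one of a
    k2 = b.bit_length() - 1          # position of the leading one of b
    f1 = a - (1 << k1)               # a = 2**k1 * (1 + x1), f1 = x1 * 2**k1
    f2 = b - (1 << k2)               # b = 2**k2 * (1 + x2), f2 = x2 * 2**k2
    s = (f1 << k2) + (f2 << k1)      # (x1 + x2) scaled by 2**(k1 + k2)
    one = 1 << (k1 + k2)             # 1.0 in the same fixed-point scale
    if s < one:                      # x1 + x2 < 1: 2**(k1+k2) * (1 + x1 + x2)
        return one + s
    return 2 * s                     # x1 + x2 >= 1: 2**(k1+k2+1) * (x1 + x2)


def max_relative_error(bits: int = 8) -> Fraction:
    """Exhaustively find the worst relative error over all nonzero operands."""
    worst = Fraction(0)
    for a in range(1, 1 << bits):
        for b in range(1, 1 << bits):
            exact = a * b
            worst = max(worst, Fraction(exact - mitchell_mul(a, b), exact))
    return worst


if __name__ == "__main__":
    print(mitchell_mul(3, 3))        # 8, while the exact product is 9
    print(float(max_relative_error(8)))
```

The worst case occurs when both fractional parts equal 0.5 (for example 3 x 3), where the approximation returns 8/9 of the exact product; that shortfall of 1/9 is the 11.1% maximum error quoted above. The approximation never exceeds the exact product, so the error is one-sided.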
Lingxiao HOU
Nagoya University
Yutaka MASUDA
Nagoya University
Tohru ISHIHARA
Nagoya University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Lingxiao HOU, Yutaka MASUDA, Tohru ISHIHARA, "An Accuracy Reconfigurable Vector Accelerator based on Approximate Logarithmic Multipliers for Energy-Efficient Computing" in IEICE TRANSACTIONS on Fundamentals,
vol. E106-A, no. 3, pp. 532-541, March 2023, doi: 10.1587/transfun.2022VLP0005.
Abstract: The approximate logarithmic multiplier proposed by Mitchell provides an efficient alternative for processing dense multiplication or multiply-accumulate operations in applications such as image processing and real-time robotics. It offers the advantages of small area, high energy efficiency and is suitable for applications that do not necessarily achieve high accuracy. However, its maximum error of 11.1% makes it challenging to deploy in applications requiring relatively high accuracy. This paper proposes a novel operand decomposition method (OD) that decomposes one multiplication into the sum of multiple approximate logarithmic multiplications to widely reduce Mitchell multiplier errors while taking full advantage of its area savings. Based on the proposed OD method, this paper also proposes an accuracy reconfigurable multiply-accumulate (MAC) unit that provides multiple reconfigurable accuracies with high parallelism. Compared to a MAC unit consisting of accurate multipliers, the area is significantly reduced to less than half, improving the hardware parallelism while satisfying the required accuracy for various scenarios. The experimental results show the excellent applicability of our proposed MAC unit in image smoothing and robot localization and mapping application. We have also designed a prototype processor that integrates the minimum functionality of this MAC unit as a vector accelerator and have implemented a software-level accuracy reconfiguration in the form of an instruction set extension. We experimentally confirmed the correct operation of the proposed vector accelerator, which provides the different degrees of accuracy and parallelism at the software level.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2022VLP0005/_p
@ARTICLE{e106-a_3_532,
author={Lingxiao HOU and Yutaka MASUDA and Tohru ISHIHARA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={An Accuracy Reconfigurable Vector Accelerator based on Approximate Logarithmic Multipliers for Energy-Efficient Computing},
year={2023},
volume={E106-A},
number={3},
pages={532-541},
abstract={The approximate logarithmic multiplier proposed by Mitchell provides an efficient alternative for processing dense multiplication or multiply-accumulate operations in applications such as image processing and real-time robotics. It offers the advantages of small area, high energy efficiency and is suitable for applications that do not necessarily achieve high accuracy. However, its maximum error of 11.1% makes it challenging to deploy in applications requiring relatively high accuracy. This paper proposes a novel operand decomposition method (OD) that decomposes one multiplication into the sum of multiple approximate logarithmic multiplications to widely reduce Mitchell multiplier errors while taking full advantage of its area savings. Based on the proposed OD method, this paper also proposes an accuracy reconfigurable multiply-accumulate (MAC) unit that provides multiple reconfigurable accuracies with high parallelism. Compared to a MAC unit consisting of accurate multipliers, the area is significantly reduced to less than half, improving the hardware parallelism while satisfying the required accuracy for various scenarios. The experimental results show the excellent applicability of our proposed MAC unit in image smoothing and robot localization and mapping application. We have also designed a prototype processor that integrates the minimum functionality of this MAC unit as a vector accelerator and have implemented a software-level accuracy reconfiguration in the form of an instruction set extension. We experimentally confirmed the correct operation of the proposed vector accelerator, which provides the different degrees of accuracy and parallelism at the software level.},
keywords={},
doi={10.1587/transfun.2022VLP0005},
ISSN={1745-1337},
month={March},}
TY - JOUR
TI - An Accuracy Reconfigurable Vector Accelerator based on Approximate Logarithmic Multipliers for Energy-Efficient Computing
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 532
EP - 541
AU - Lingxiao HOU
AU - Yutaka MASUDA
AU - Tohru ISHIHARA
PY - 2023
DO - 10.1587/transfun.2022VLP0005
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E106-A
IS - 3
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - March 2023
AB - The approximate logarithmic multiplier proposed by Mitchell provides an efficient alternative for processing dense multiplication or multiply-accumulate operations in applications such as image processing and real-time robotics. It offers the advantages of small area, high energy efficiency and is suitable for applications that do not necessarily achieve high accuracy. However, its maximum error of 11.1% makes it challenging to deploy in applications requiring relatively high accuracy. This paper proposes a novel operand decomposition method (OD) that decomposes one multiplication into the sum of multiple approximate logarithmic multiplications to widely reduce Mitchell multiplier errors while taking full advantage of its area savings. Based on the proposed OD method, this paper also proposes an accuracy reconfigurable multiply-accumulate (MAC) unit that provides multiple reconfigurable accuracies with high parallelism. Compared to a MAC unit consisting of accurate multipliers, the area is significantly reduced to less than half, improving the hardware parallelism while satisfying the required accuracy for various scenarios. The experimental results show the excellent applicability of our proposed MAC unit in image smoothing and robot localization and mapping application. We have also designed a prototype processor that integrates the minimum functionality of this MAC unit as a vector accelerator and have implemented a software-level accuracy reconfiguration in the form of an instruction set extension. We experimentally confirmed the correct operation of the proposed vector accelerator, which provides the different degrees of accuracy and parallelism at the software level.
ER -