Jiabao GAO
Fudan University
Yuchen YAO
Fudan University
Zhengjie LI
Fudan University
Jinmei LAI
Fudan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jiabao GAO, Yuchen YAO, Zhengjie LI, Jinmei LAI, "FCA-BNN: Flexible and Configurable Accelerator for Binarized Neural Networks on FPGA" in IEICE TRANSACTIONS on Information and Systems,
vol. E104-D, no. 8, pp. 1367-1377, August 2021, doi: 10.1587/transinf.2021EDP7054.
Abstract: A series of Binarized Neural Networks (BNNs) achieve acceptable accuracy in image classification tasks and deliver excellent performance on field-programmable gate arrays (FPGAs). Nevertheless, we observe that existing BNN designs are quite time-consuming when changing the target BNN and accelerating a new BNN. Therefore, this paper presents FCA-BNN, a flexible and configurable accelerator that employs a layer-level configurable technique to execute each layer of the target BNN seamlessly. First, to save resources and improve energy efficiency, hardware-oriented optimal formulas are introduced to design an energy-efficient computing array for different sizes of padded-convolution and fully-connected layers. Moreover, to accelerate target BNNs efficiently, we exploit an analytical model to explore the optimal design parameters for FCA-BNN. Finally, our proposed mapping flow changes the target network by entering its order, and accelerates a new network by compiling and loading the corresponding instructions, without generating or loading a bitstream. Evaluations on three major BNN structures show that the differences between the inference accuracy of FCA-BNN and that of a GPU are just 0.07%, 0.31% and 0.4% for LFC, VGG-like and Cifar-10 AlexNet, respectively. Furthermore, our energy-efficiency results reach 0.8× those of existing customized FPGA accelerators for LFC and 2.6× for VGG-like. For Cifar-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than CPU and GPU, respectively. To the best of our knowledge, FCA-BNN is the most efficient design for changing the target BNN and accelerating a new BNN, while maintaining competitive performance.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7054/_p
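The accelerator described in the abstract targets binarized layers, where multiply-accumulate is replaced by XNOR and popcount. As context for readers unfamiliar with BNNs, here is a minimal Python sketch of that arithmetic; it is an illustration of the general BNN technique, not the paper's hardware design, and all names are illustrative.

```python
import numpy as np

def binarize(x):
    # Map real values to {+1, -1} via the sign function used by most BNNs.
    return np.where(x >= 0, 1, -1).astype(np.int8)

def binary_dot(a_bits, w_bits):
    # With values bit-encoded (1 -> +1, 0 -> -1), a +1/-1 dot product
    # reduces to XNOR + popcount: dot = 2 * popcount(XNOR(a, w)) - n,
    # since each matching bit contributes +1 and each mismatch -1.
    n = a_bits.size
    xnor = ~(a_bits ^ w_bits) & 1
    return 2 * int(xnor.sum()) - n

# Cross-check against the plain +1/-1 dot product.
rng = np.random.default_rng(0)
a = binarize(rng.standard_normal(64))
w = binarize(rng.standard_normal(64))
a_bits = (a > 0).astype(np.uint8)
w_bits = (w > 0).astype(np.uint8)
assert binary_dot(a_bits, w_bits) == int(a @ w)
```

On an FPGA this mapping is what makes BNNs attractive: the XNOR and popcount over packed bit-vectors cost far less logic and energy than fixed- or floating-point multipliers.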
@ARTICLE{e104-d_8_1367,
author={Jiabao GAO and Yuchen YAO and Zhengjie LI and Jinmei LAI},
journal={IEICE TRANSACTIONS on Information and Systems},
title={FCA-BNN: Flexible and Configurable Accelerator for Binarized Neural Networks on FPGA},
year={2021},
volume={E104-D},
number={8},
pages={1367-1377},
keywords={},
doi={10.1587/transinf.2021EDP7054},
ISSN={1745-1361},
month={August},}
TY - JOUR
TI - FCA-BNN: Flexible and Configurable Accelerator for Binarized Neural Networks on FPGA
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 1367
EP - 1377
AU - Jiabao GAO
AU - Yuchen YAO
AU - Zhengjie LI
AU - Jinmei LAI
PY - 2021
DO - 10.1587/transinf.2021EDP7054
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E104-D
IS - 8
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - August 2021
ER -