This letter presents a novel technique for fast inference of binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements of the max-pooling operation are binary. In this structure, if any of the input elements is +1, the pooling result can be produced immediately; the proposed technique eliminates the computations involved in obtaining the remaining input elements, thereby effectively reducing the inference time. The proposed technique reduces the inference time by up to 34.11% while maintaining the classification accuracy.
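The early-exit idea described in the abstract can be illustrated with a short sketch. The Python snippet below is a minimal illustration only, not the authors' implementation: it assumes a 2x2 pooling window, models the binary convolution as a plain dot product followed by sign binarization, and uses illustrative names (binarized_activation, pool_binary_early_exit) and a threshold parameter that do not come from the paper. Because the pooling inputs are restricted to +1/-1, the window maximum is known to be +1 as soon as one element evaluates to +1, so the remaining elements need not be computed.

# A minimal sketch (not the authors' implementation) of early-exit max pooling
# over binarized activations: the pooled value is +1 as soon as any window
# element is +1, so the work for the remaining elements is skipped.
# binarized_activation, pool_binary_early_exit, and the threshold parameter
# are illustrative assumptions, not names or details from the paper.

import numpy as np


def binarized_activation(x_patch, w, threshold=0.0):
    # Hypothetical per-position block: a convolution modelled as a plain dot
    # product, followed by sign binarization to +1/-1.
    return 1 if float(np.dot(x_patch.ravel(), w.ravel())) >= threshold else -1


def pool_binary_early_exit(window_patches, w, threshold=0.0):
    # Max pooling over one window of binarized activations. Each element is
    # computed lazily; the loop stops at the first +1 (early exit).
    for patch in window_patches:
        if binarized_activation(patch, w, threshold) == 1:
            return 1  # early exit: remaining window elements are never computed
    return -1  # every element in the window was -1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.sign(rng.standard_normal((3, 3)))  # binary 3x3 kernel
    window = [np.sign(rng.standard_normal((3, 3))) for _ in range(4)]  # 2x2 pooling window
    print(pool_binary_early_exit(window, w))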
Ji-Hoon SHIN
Korea Aerospace University
Tae-Hwan KIM
Korea Aerospace University
Ji-Hoon SHIN, Tae-Hwan KIM, "Fast Inference of Binarized Convolutional Neural Networks Exploiting Max Pooling with Modified Block Structure" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 3, pp. 706-710, March 2020, doi: 10.1587/transinf.2019EDL8165.
Abstract: This letter presents a novel technique to achieve a fast inference of the binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements for the max-pooling operation are binary. In this structure, if any of the input elements is +1, the result of the pooling can be produced immediately; the proposed technique eliminates such computations that are involved to obtain the remaining input elements, so as to reduce the inference time effectively. The proposed technique reduces the inference time by up to 34.11%, while maintaining the classification accuracy.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDL8165/_p
@ARTICLE{e103-d_3_706,
author={Ji-Hoon SHIN and Tae-Hwan KIM},
journal={IEICE TRANSACTIONS on Information},
title={Fast Inference of Binarized Convolutional Neural Networks Exploiting Max Pooling with Modified Block Structure},
year={2020},
volume={E103-D},
number={3},
pages={706--710},
abstract={This letter presents a novel technique to achieve a fast inference of the binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements for the max-pooling operation are binary. In this structure, if any of the input elements is +1, the result of the pooling can be produced immediately; the proposed technique eliminates such computations that are involved to obtain the remaining input elements, so as to reduce the inference time effectively. The proposed technique reduces the inference time by up to 34.11%, while maintaining the classification accuracy.},
keywords={},
doi={10.1587/transinf.2019EDL8165},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Fast Inference of Binarized Convolutional Neural Networks Exploiting Max Pooling with Modified Block Structure
T2 - IEICE TRANSACTIONS on Information
SP - 706
EP - 710
AU - Ji-Hoon SHIN
AU - Tae-Hwan KIM
PY - 2020
DO - 10.1587/transinf.2019EDL8165
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2020
AB - This letter presents a novel technique to achieve a fast inference of the binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements for the max-pooling operation are binary. In this structure, if any of the input elements is +1, the result of the pooling can be produced immediately; the proposed technique eliminates such computations that are involved to obtain the remaining input elements, so as to reduce the inference time effectively. The proposed technique reduces the inference time by up to 34.11%, while maintaining the classification accuracy.
ER -