The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
LSTM networks have been shown to deliver superior performance in facial expression recognition from video sequences. Because a single-layer LSTM has limited representational capacity, a hierarchical attention model with an enhanced feature branch is proposed. The new network architecture consists of the conventional VGG-16-FACE with an enhanced feature branch, followed by a cross-layer LSTM. The VGG-16-FACE with the enhanced branch extracts spatial features, while the cross-layer LSTM extracts the temporal relations between different frames of the video. The proposed method is evaluated on public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
Ying TONG
Nanjing Institute of Technology
Rui CHEN
Nanjing Institute of Technology
Ruiyu LIANG
Nanjing Institute of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ying TONG, Rui CHEN, Ruiyu LIANG, "Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 11, pp. 2403-2406, November 2020, doi: 10.1587/transinf.2020EDL8065.
Abstract: LSTM network have shown to outperform in facial expression recognition of video sequence. In view of limited representation ability of single-layer LSTM, a hierarchical attention model with enhanced feature branch is proposed. This new network architecture consists of traditional VGG-16-FACE with enhanced feature branch followed by a cross-layer LSTM. The VGG-16-FACE with enhanced branch extracts the spatial features as well as the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on the public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8065/_p
@ARTICLE{e103-d_11_2403,
author={Ying TONG and Rui CHEN and Ruiyu LIANG},
journal={IEICE TRANSACTIONS on Information},
title={Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM},
year={2020},
volume={E103-D},
number={11},
pages={2403--2406},
abstract={LSTM network have shown to outperform in facial expression recognition of video sequence. In view of limited representation ability of single-layer LSTM, a hierarchical attention model with enhanced feature branch is proposed. This new network architecture consists of traditional VGG-16-FACE with enhanced feature branch followed by a cross-layer LSTM. The VGG-16-FACE with enhanced branch extracts the spatial features as well as the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on the public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.},
keywords={},
doi={10.1587/transinf.2020EDL8065},
ISSN={1745-1361},
month={November},}
TY  - JOUR
TI  - Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM
T2  - IEICE TRANSACTIONS on Information
SP  - 2403
EP  - 2406
AU  - Ying TONG
AU  - Rui CHEN
AU  - Ruiyu LIANG
PY  - 2020
DO  - 10.1587/transinf.2020EDL8065
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 11
JA  - IEICE TRANSACTIONS on Information
Y1  - November 2020
AB  - LSTM network have shown to outperform in facial expression recognition of video sequence. In view of limited representation ability of single-layer LSTM, a hierarchical attention model with enhanced feature branch is proposed. This new network architecture consists of traditional VGG-16-FACE with enhanced feature branch followed by a cross-layer LSTM. The VGG-16-FACE with enhanced branch extracts the spatial features as well as the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on the public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
ER  - 