Recent top-performing object detectors usually rely on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches are highly efficient but sacrifice some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
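The abstract names channel attention and position attention but does not give their equations. The following is only a minimal sketch of the general idea, not the authors' module. It assumes a dot-product affinity, softmax normalization, and a weighted residual sum. The function names and the weights `alpha` and `beta` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain (n x k) @ (k x m) product on nested lists."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def channel_attention(feat):
    """feat is C x N (channels x flattened positions); returns a C x C map."""
    return [softmax(r) for r in matmul(feat, transpose(feat))]


def position_attention(feat):
    """Returns an N x N map relating every position to every other position."""
    return [softmax(r) for r in matmul(transpose(feat), feat)]


def self_attend(feat, alpha=0.5, beta=0.5):
    """Assumed residual form: feat + alpha * channel term + beta * position term."""
    ch = matmul(channel_attention(feat), feat)                # mixes channels
    pos = matmul(feat, transpose(position_attention(feat)))   # mixes positions
    return [
        [x + alpha * c + beta * p for x, c, p in zip(f_row, c_row, p_row)]
        for f_row, c_row, p_row in zip(feat, ch, pos)
    ]


if __name__ == "__main__":
    # Toy feature map: 2 channels, 3 spatial positions.
    toy = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0]]
    for row in self_attend(toy):
        print(row)
```

Every row of each attention map sums to 1, so each enhanced feature is the original value plus weighted averages taken over channels and over positions. Setting `alpha` and `beta` to 0 returns the input unchanged.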
Xinyu ZHU
Fudan University
Jun ZHANG
Fudan University
Gengsheng CHEN
Fudan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Xinyu ZHU, Jun ZHANG, Gengsheng CHEN, "ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection" in IEICE TRANSACTIONS on Information and Systems, vol. E103-D, no. 3, pp. 648-659, March 2020, doi: 10.1587/transinf.2019EDP7164.
Abstract: Recent top-performing object detectors usually depend on a two-stage approach, which benefits from its region proposal and refining practice but suffers from low detection speed. By contrast, one-stage approaches have the advantage of high efficiency but sacrifice accuracy to some extent. In this paper, we propose a novel single-shot object detection network which inherits the merits of both. Motivated by the idea of semantic enrichment of the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response which augments low-level convolutional features with strong class-specific semantic information. The so-called self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7164/_p
@ARTICLE{e103-d_3_648,
author={Xinyu ZHU and Jun ZHANG and Gengsheng CHEN},
journal={IEICE TRANSACTIONS on Information and Systems},
title={ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection},
year={2020},
volume={E103-D},
number={3},
pages={648--659},
abstract={Recent top-performing object detectors usually depend on a two-stage approach, which benefits from its region proposal and refining practice but suffers from low detection speed. By contrast, one-stage approaches have the advantage of high efficiency but sacrifice accuracy to some extent. In this paper, we propose a novel single-shot object detection network which inherits the merits of both. Motivated by the idea of semantic enrichment of the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response which augments low-level convolutional features with strong class-specific semantic information. The so-called self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.},
keywords={},
doi={10.1587/transinf.2019EDP7164},
ISSN={1745-1361},
month={March},}
TY  - JOUR
TI  - ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection
T2  - IEICE TRANSACTIONS on Information and Systems
SP  - 648
EP  - 659
AU  - Xinyu ZHU
AU  - Jun ZHANG
AU  - Gengsheng CHEN
PY  - 2020
DO  - 10.1587/transinf.2019EDP7164
JO  - IEICE TRANSACTIONS on Information and Systems
SN  - 1745-1361
VL  - E103-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information and Systems
Y1  - March 2020
AB  - Recent top-performing object detectors usually depend on a two-stage approach, which benefits from its region proposal and refining practice but suffers from low detection speed. By contrast, one-stage approaches have the advantage of high efficiency but sacrifice accuracy to some extent. In this paper, we propose a novel single-shot object detection network which inherits the merits of both. Motivated by the idea of semantic enrichment of the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response which augments low-level convolutional features with strong class-specific semantic information. The so-called self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
ER  -