The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations (e.g., some numerals may appear as "XNUMX").
Longjiao ZHAO
Nagoya University
Yu WANG
Hitotsubashi University
Jien KATO
Ritsumeikan University
Yoshiharu ISHIKAWA
Nagoya University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Longjiao ZHAO, Yu WANG, Jien KATO, Yoshiharu ISHIKAWA, "Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 1069-1080, May 2023, doi: 10.1587/transinf.2022EDP7163.
Abstract: Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
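The abstract's core idea, comparing local convolutional features of two images directly rather than pooling them into global descriptors, can be illustrated with a minimal sketch. The paper's exact LST definitions are not reproduced here; the function below merely computes pairwise cosine similarities between the flattened local features of a query and a gallery feature map, which is one plausible building block for such a tensor. All names and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def local_similarity_tensor(fq, fg):
    """Pairwise cosine similarities between local convolutional features.

    fq: query feature map flattened to (Hq*Wq, C) local descriptors.
    fg: gallery feature map flattened to (Hg*Wg, C).
    Returns an (Hq*Wq, Hg*Wg) matrix; reshaping it to (Hq, Wq, Hg, Wg)
    preserves the spatial layout of both images, so spatial relationships
    between local regions remain available to a downstream similarity CNN.
    """
    fq = fq / (np.linalg.norm(fq, axis=1, keepdims=True) + 1e-8)
    fg = fg / (np.linalg.norm(fg, axis=1, keepdims=True) + 1e-8)
    return fq @ fg.T

# Toy example: 7x7 feature maps with 512 channels, typical of a CNN backbone.
rng = np.random.default_rng(0)
q = rng.standard_normal((49, 512))
g = rng.standard_normal((49, 512))
S = local_similarity_tensor(q, g)
print(S.shape)  # (49, 49)
```

Unlike a pooled global descriptor, every entry of `S` keeps track of which query region matched which gallery region, which is what makes a local-similarity approach plausible under occlusion: occluded regions degrade only their own rows, not the whole representation.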
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7163/_p
@ARTICLE{e106-d_5_1069,
author={Longjiao ZHAO and Yu WANG and Jien KATO and Yoshiharu ISHIKAWA},
journal={IEICE TRANSACTIONS on Information},
title={Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval},
year={2023},
volume={E106-D},
number={5},
pages={1069-1080},
abstract={Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.},
keywords={},
doi={10.1587/transinf.2022EDP7163},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval
T2 - IEICE TRANSACTIONS on Information
SP - 1069
EP - 1080
AU - Longjiao ZHAO
AU - Yu WANG
AU - Jien KATO
AU - Yoshiharu ISHIKAWA
PY - 2023
DO - 10.1587/transinf.2022EDP7163
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
ER -