Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
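The abstract outlines a three-channel design: modality-specific image and text channels, a graph-based fusion channel for modality-shared representations, and an integration module feeding hash codes. As a rough illustration only, here is a minimal PyTorch sketch of that idea. The paper's actual backbones, graph construction, integration module, loss functions, and dimensions are not given on this page, so every name and size below (MFGNSketch, SimpleGraphLayer, hidden=512, the cosine-similarity graph, n_bits=32) is an assumption for illustration, not the authors' implementation.

# A minimal, self-contained sketch of the three-channel idea the abstract
# describes. All layer sizes, the graph definition, and the "integration"
# step are assumptions; the paper's actual design is not given on this page.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGraphLayer(nn.Module):
    """One graph-convolution step, A_hat @ X @ W (Kipf-style GCN; assumed)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # adj: row-normalized (batch, batch) affinity over fused features
        return F.relu(self.linear(adj @ x))

class MFGNSketch(nn.Module):
    def __init__(self, img_dim=4096, txt_dim=1386, hidden=512, n_bits=32):
        super().__init__()
        # Modality-specific channels (plain MLPs stand in for real backbones).
        self.img_net = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.txt_net = nn.Sequential(nn.Linear(txt_dim, hidden), nn.ReLU())
        # Modality-fusion channel: a graph layer over fused features.
        self.graph = SimpleGraphLayer(2 * hidden, hidden)
        # Hash layers: tanh gives relaxed codes; sign() binarizes at test time.
        self.img_hash = nn.Sequential(nn.Linear(2 * hidden, n_bits), nn.Tanh())
        self.txt_hash = nn.Sequential(nn.Linear(2 * hidden, n_bits), nn.Tanh())

    def forward(self, img, txt):
        f_img, f_txt = self.img_net(img), self.txt_net(txt)
        fused = torch.cat([f_img, f_txt], dim=1)
        # Cosine-similarity graph over the batch (one plausible choice).
        z = F.normalize(fused, dim=1)
        adj = F.softmax(z @ z.T, dim=1)      # row-normalize as attention
        shared = self.graph(fused, adj)      # modality-shared representation
        # "Integration": concatenate modality-specific and shared features.
        code_img = self.img_hash(torch.cat([f_img, shared], dim=1))
        code_txt = self.txt_hash(torch.cat([f_txt, shared], dim=1))
        return code_img, code_txt

# Usage: relaxed codes during training, binary codes for retrieval.
model = MFGNSketch()
img = torch.randn(8, 4096)   # e.g. CNN image features
txt = torch.randn(8, 1386)   # e.g. bag-of-words text vectors
c_img, c_txt = model(img, txt)
b_img, b_txt = torch.sign(c_img), torch.sign(c_txt)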
Fei WU
Nanjing University of Posts and Telecommunications
Shuaishuai LI
Nanjing University of Posts and Telecommunications
Guangchuan PENG
Nanjing University of Posts and Telecommunications
Yongheng MA
Nanjing University of Posts and Telecommunications
Xiao-Yuan JING
Wuhan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Fei WU, Shuaishuai LI, Guangchuan PENG, Yongheng MA, Xiao-Yuan JING, "Modality-Fused Graph Network for Cross-Modal Retrieval" in IEICE TRANSACTIONS on Information and Systems,
vol. E106-D, no. 5, pp. 1094-1097, May 2023, doi: 10.1587/transinf.2022EDL8069.
Abstract: Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDL8069/_p
@ARTICLE{e106-d_5_1094,
author={Fei WU and Shuaishuai LI and Guangchuan PENG and Yongheng MA and Xiao-Yuan JING},
journal={IEICE TRANSACTIONS on Information and Systems},
title={Modality-Fused Graph Network for Cross-Modal Retrieval},
year={2023},
volume={E106-D},
number={5},
pages={1094-1097},
abstract={Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.},
keywords={},
doi={10.1587/transinf.2022EDL8069},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Modality-Fused Graph Network for Cross-Modal Retrieval
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 1094
EP - 1097
AU - Fei WU
AU - Shuaishuai LI
AU - Guangchuan PENG
AU - Yongheng MA
AU - Xiao-Yuan JING
PY - 2023
DO - 10.1587/transinf.2022EDL8069
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - May 2023
AB - Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
ER -