The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Song-level feature summarization is fundamental for the browsing, retrieval, and indexing of digital music archives. This study proposes a deep neural network model, CQTXNet, for extracting song-level feature summaries for cover song identification. CQTXNet incorporates depth-wise separable convolution, residual network connections, and attention models to extend previous approaches. An experimental evaluation of the proposed CQTXNet was performed on two publicly available cover song datasets by varying the number of network layers and the type of attention modules.
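As a rough illustration of why depth-wise separable convolution is attractive in Xception-style networks, the sketch below compares the weight count of a standard convolution with that of its depth-wise separable counterpart. The layer sizes (`k`, `c_in`, `c_out`) and function names are hypothetical and chosen only for illustration; they are not taken from the paper and do not describe CQTXNet's actual configuration.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Number of weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Number of weights in a depth-wise separable convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, assumed for illustration only
k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For these assumed sizes the separable layer needs roughly an order of magnitude fewer weights, which is the general motivation for this factorization.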
Jinsoo SEO
Gangneung-Wonju National University
Junghyun KIM
Electronics and Telecommunications Research Institute
Hyemi KIM
Electronics and Telecommunications Research Institute
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jinsoo SEO, Junghyun KIM, Hyemi KIM, "CQTXNet: A Modified Xception Network with Attention Modules for Cover Song Identification" in IEICE TRANSACTIONS on Information, vol. E107-D, no. 1, pp. 49-52, January 2024, doi: 10.1587/transinf.2023MUL0003.
Abstract: Song-level feature summarization is fundamental for the browsing, retrieval, and indexing of digital music archives. This study proposes a deep neural network model, CQTXNet, for extracting song-level feature summary for cover song identification. CQTXNet incorporates depth-wise separable convolution, residual network connections, and attention models to extend previous approaches. An experimental evaluation of the proposed CQTXNet was performed on two publicly available cover song datasets by varying the number of network layers and the type of attention modules.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2023MUL0003/_p
@ARTICLE{e107-d_1_49,
author={Jinsoo SEO and Junghyun KIM and Hyemi KIM},
journal={IEICE TRANSACTIONS on Information},
title={CQTXNet: A Modified Xception Network with Attention Modules for Cover Song Identification},
year={2024},
volume={E107-D},
number={1},
pages={49-52},
abstract={Song-level feature summarization is fundamental for the browsing, retrieval, and indexing of digital music archives. This study proposes a deep neural network model, CQTXNet, for extracting song-level feature summary for cover song identification. CQTXNet incorporates depth-wise separable convolution, residual network connections, and attention models to extend previous approaches. An experimental evaluation of the proposed CQTXNet was performed on two publicly available cover song datasets by varying the number of network layers and the type of attention modules.},
keywords={},
doi={10.1587/transinf.2023MUL0003},
ISSN={1745-1361},
month={January},}
TY - JOUR
TI - CQTXNet: A Modified Xception Network with Attention Modules for Cover Song Identification
T2 - IEICE TRANSACTIONS on Information
SP - 49
EP - 52
AU - Jinsoo SEO
AU - Junghyun KIM
AU - Hyemi KIM
PY - 2024
DO - 10.1587/transinf.2023MUL0003
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E107-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 2024
AB - Song-level feature summarization is fundamental for the browsing, retrieval, and indexing of digital music archives. This study proposes a deep neural network model, CQTXNet, for extracting song-level feature summary for cover song identification. CQTXNet incorporates depth-wise separable convolution, residual network connections, and attention models to extend previous approaches. An experimental evaluation of the proposed CQTXNet was performed on two publicly available cover song datasets by varying the number of network layers and the type of attention modules.
ER -