The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
This study evaluates the effects of several non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method was proposed as a blind, non-learning, and lightweight BWE approach. Other non-learning BWE methods have also been developed in recent years. For ASV evaluations, most of the data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train state-of-the-art ASV systems, such as those based on i-vectors, d-vectors, and x-vectors. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches on x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) of x-vector-based ASV systems when the mismatches were present. We also examined the relationship between objective measurements and EERs. The N-BWE method produced the lowest EERs on both ASV systems and obtained the lowest RMS-LSD value and the highest STOI score.
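The abstract reports results as equal error rates, so a minimal sketch of how an EER can be computed from verification scores may help. This is not the paper's scoring tool: the function name `equal_error_rate`, the threshold sweep over observed scores, and the toy score lists are all assumptions made for illustration. The EER is the operating point where the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected).

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of genuine (same-speaker) trials.
    nontarget_scores: scores of impostor (different-speaker) trials.
    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    Hypothetical helper for illustration only.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: impostor trials scoring at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data FAR and FRR rarely match exactly,
            # so report their average at the closest point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores (made up): one genuine trial overlaps the impostor range.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine, impostor)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With the toy scores, one of four genuine trials is rejected and one of four impostor trials is accepted at the chosen threshold, so both error rates are 25%. A lower EER indicates better verification performance, which is the sense in which the abstract compares BWE methods.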
Ryota KAMINISHI
Tokyo Metropolitan University
Haruna MIYAMOTO
Tokyo Metropolitan University
Sayaka SHIOTA
Tokyo Metropolitan University
Hitoshi KIYA
Tokyo Metropolitan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryota KAMINISHI, Haruna MIYAMOTO, Sayaka SHIOTA, Hitoshi KIYA, "Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 1, pp. 42-49, January 2020, doi: 10.1587/transinf.2019MUP0008.
Abstract: This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019MUP0008/_p
@ARTICLE{e103-d_1_42,
author={Ryota KAMINISHI and Haruna MIYAMOTO and Sayaka SHIOTA and Hitoshi KIYA},
journal={IEICE TRANSACTIONS on Information},
title={Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification},
year={2020},
volume={E103-D},
number={1},
pages={42-49},
abstract={This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.},
keywords={},
doi={10.1587/transinf.2019MUP0008},
ISSN={1745-1361},
month={January},
}
TY  - JOUR
TI  - Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification
T2  - IEICE TRANSACTIONS on Information
SP  - 42
EP  - 49
AU  - Ryota KAMINISHI
AU  - Haruna MIYAMOTO
AU  - Sayaka SHIOTA
AU  - Hitoshi KIYA
PY  - 2020
DO  - 10.1587/transinf.2019MUP0008
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 1
JA  - IEICE TRANSACTIONS on Information
Y1  - January 2020
AB  - This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.
ER  -