This paper addresses a single-channel speech enhancement method that uses the mean and variance of the logarithmic noise power spectra. A key issue for any single-channel speech enhancement algorithm is determining the trade-off point between spectral distortion and residual noise, which requires accurate discrimination between speech spectral components and noise components. Conventional methods set the trade-off point with experimentally obtained parameters; as a result, spectral discrimination is inadequate and the enhanced speech is degraded by spectral distortion or residual noise. A criterion for determining the point is therefore necessary. The proposed method determines the trade-off point between spectral distortion and residual noise level by discriminating between speech spectral components and noise components on the basis of statistical criteria. The discrimination is performed by hypothesis testing that uses the means and variances of the logarithmic power spectra. The discriminated components are divided into speech-dominant and noise-dominant spectral components. For speech-dominant components, spectral subtraction is performed to minimize spectral distortion; for noise-dominant components, attenuation is performed to reduce the noise level. The method's performance is confirmed in terms of waveform, spectrogram, noise reduction level, and a speech recognition task. The noise reduction level and speech recognition rate both improve, showing that the method effectively reduces musical noise and improves the quality of the enhanced speech.
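The per-bin decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the test statistic, significance level, noise estimate, or attenuation gain, so everything below is an assumption made for illustration: a one-sided test against the noise log-power mean plus `z` standard deviations, a noise power estimate of `exp(mean)`, and a fixed gain of -20 dB. The function names `noise_log_stats` and `enhance_frame` are hypothetical and do not come from the paper.

```python
import numpy as np

def noise_log_stats(noise_power_frames, floor=1e-12):
    """Mean and standard deviation of the log power spectrum, per frequency bin.

    noise_power_frames: array of shape (frames, bins) taken from noise-only frames.
    """
    log_power = np.log(np.maximum(np.asarray(noise_power_frames, dtype=float), floor))
    return log_power.mean(axis=0), log_power.std(axis=0)

def enhance_frame(noisy_power, log_mean, log_std, z=1.645, atten_db=-20.0, floor=1e-12):
    """Classify each bin of one frame, then subtract or attenuate.

    Assumed test (not from the paper): a bin is speech-dominant when its log
    power exceeds log_mean + z * log_std, i.e. the "noise only" hypothesis is
    rejected in a one-sided test.
    """
    noisy_power = np.maximum(np.asarray(noisy_power, dtype=float), floor)
    speech_dominant = np.log(noisy_power) > log_mean + z * log_std

    # Speech-dominant bins: plain power spectral subtraction, floored to stay positive.
    # Assumed noise power estimate: exp(mean of the log power).
    subtracted = np.maximum(noisy_power - np.exp(log_mean), floor)

    # Noise-dominant bins: fixed attenuation (assumed gain, in dB of power).
    attenuated = noisy_power * 10.0 ** (atten_db / 10.0)

    return np.where(speech_dominant, subtracted, attenuated), speech_dominant

if __name__ == "__main__":
    # Two noise-only frames, two bins; log power is 0 and 2 in each bin.
    noise_frames = np.exp(np.array([[0.0, 0.0], [2.0, 2.0]]))
    mu, sd = noise_log_stats(noise_frames)
    enhanced, mask = enhance_frame(np.exp(np.array([4.0, 1.0])), mu, sd)
    print(mask, enhanced)
```

In this sketch the threshold follows from the noise statistics and a chosen significance level rather than from a hand-tuned subtraction factor, which is the kind of criterion the abstract argues for. The specific values of `z` and `atten_db` remain illustrative choices.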
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hidetoshi NAKASHIMA, Yoshifumi CHISAKI, Tsuyoshi USAGAWA, Masanao EBATA, "Spectral Subtraction Based on Statistical Criteria of the Spectral Distribution" in IEICE TRANSACTIONS on Fundamentals,
vol. E85-A, no. 10, pp. 2283-2292, October 2002.
Abstract: This paper addresses the single channel speech enhancement method which utilizes the mean value and variance of the logarithmic noise power spectra. An important issue for single channel speech enhancement algorithm is to determine the trade-off point for the spectral distortion and residual noise. Thus the accurate discrimination between speech spectral and noise components is required. The conventional methods determine the trade-off point using parameters obtained experimentally. As a result spectral discrimination is not adequate. And the enhanced speech is deteriorated by spectral distortion or residual noise. Therefore, a criteria to determine the point is necessary. The proposed method determines the trade-off point of spectral distortion and residual noise level by discrimination between speech spectral and noise components based on statistical criteria. The spectral discrimination is performed using hypothesis testing that utilizes means and variances of the logarithmic power spectra. The discriminated spectral components are divided into speech-dominant spectral components and noise-dominant ones. For the speech-dominant ones, spectral subtraction is performed to minimize the spectral distortion. For the noise-dominant ones, attenuation is performed to reduce the noise level. The performance of the method is confirmed in terms of waveform, spectrogram, noise reduction level and speech recognition task. As a result, the noise reduction level and speech recognition rate are improved so that the method reduces the musical noise effectively and improves the enhanced speech quality.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e85-a_10_2283/_p
@ARTICLE{e85-a_10_2283,
author={Hidetoshi NAKASHIMA and Yoshifumi CHISAKI and Tsuyoshi USAGAWA and Masanao EBATA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Spectral Subtraction Based on Statistical Criteria of the Spectral Distribution},
year={2002},
volume={E85-A},
number={10},
pages={2283-2292},
abstract={This paper addresses the single channel speech enhancement method which utilizes the mean value and variance of the logarithmic noise power spectra. An important issue for single channel speech enhancement algorithm is to determine the trade-off point for the spectral distortion and residual noise. Thus the accurate discrimination between speech spectral and noise components is required. The conventional methods determine the trade-off point using parameters obtained experimentally. As a result spectral discrimination is not adequate. And the enhanced speech is deteriorated by spectral distortion or residual noise. Therefore, a criteria to determine the point is necessary. The proposed method determines the trade-off point of spectral distortion and residual noise level by discrimination between speech spectral and noise components based on statistical criteria. The spectral discrimination is performed using hypothesis testing that utilizes means and variances of the logarithmic power spectra. The discriminated spectral components are divided into speech-dominant spectral components and noise-dominant ones. For the speech-dominant ones, spectral subtraction is performed to minimize the spectral distortion. For the noise-dominant ones, attenuation is performed to reduce the noise level. The performance of the method is confirmed in terms of waveform, spectrogram, noise reduction level and speech recognition task. As a result, the noise reduction level and speech recognition rate are improved so that the method reduces the musical noise effectively and improves the enhanced speech quality.},
keywords={},
doi={},
ISSN={},
month={October},}
TY - JOUR
TI - Spectral Subtraction Based on Statistical Criteria of the Spectral Distribution
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 2283
EP - 2292
AU - Hidetoshi NAKASHIMA
AU - Yoshifumi CHISAKI
AU - Tsuyoshi USAGAWA
AU - Masanao EBATA
PY - 2002
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E85-A
IS - 10
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - October 2002
AB - This paper addresses the single channel speech enhancement method which utilizes the mean value and variance of the logarithmic noise power spectra. An important issue for single channel speech enhancement algorithm is to determine the trade-off point for the spectral distortion and residual noise. Thus the accurate discrimination between speech spectral and noise components is required. The conventional methods determine the trade-off point using parameters obtained experimentally. As a result spectral discrimination is not adequate. And the enhanced speech is deteriorated by spectral distortion or residual noise. Therefore, a criteria to determine the point is necessary. The proposed method determines the trade-off point of spectral distortion and residual noise level by discrimination between speech spectral and noise components based on statistical criteria. The spectral discrimination is performed using hypothesis testing that utilizes means and variances of the logarithmic power spectra. The discriminated spectral components are divided into speech-dominant spectral components and noise-dominant ones. For the speech-dominant ones, spectral subtraction is performed to minimize the spectral distortion. For the noise-dominant ones, attenuation is performed to reduce the noise level. The performance of the method is confirmed in terms of waveform, spectrogram, noise reduction level and speech recognition task. As a result, the noise reduction level and speech recognition rate are improved so that the method reduces the musical noise effectively and improves the enhanced speech quality.
ER -