This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. The paper shows that nonlinear subtraction processing with complementary beamforming yields a kind of spectral subtraction without the need for speech pause detection. The optimization algorithm for the directivity pattern is also described. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations under both stationary and nonstationary noise conditions. Compared with the optimized conventional delay-and-sum (DS) array, the results show that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20% better in word recognition rate when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5% better in word recognition rate under nonstationary noise conditions. These improvements are equal to or greater than those of the conventional spectral subtraction method cascaded with the DS array.
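The nonlinear subtraction idea summarized in the abstract can be illustrated with a minimal single-frequency sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-microphone uniform linear array, the spacing and frequency, the function names (`steering`, `beam`, `null_steered_pair`, `complementary_power`), and the simple choice of weights in which both beamformers have unit gain toward the target while only one has a null toward the noise. The paper's own directivity optimization algorithm is not reproduced here.

```python
import cmath
import math


def steering(theta_deg, n_mics=2, spacing_m=0.05, freq_hz=1000.0, c=343.0):
    """Far-field, narrowband steering vector of a uniform linear array.

    All geometry values are illustrative assumptions.
    """
    phase = 2 * math.pi * freq_hz * spacing_m * math.sin(math.radians(theta_deg)) / c
    return [cmath.exp(-1j * phase * m) for m in range(n_mics)]


def beam(weights, x):
    """Beamformer output w^H x for one frequency bin."""
    return sum(w.conjugate() * xi for w, xi in zip(weights, x))


def null_steered_pair(noise_deg):
    """Hypothetical two-microphone weight pair (g, h).

    g is a plain delay-and-sum beamformer steered to 0 degrees.
    h also has unit gain at 0 degrees but a null at noise_deg,
    so the product of the two gains vanishes in the noise direction.
    noise_deg must differ from 0, otherwise no such h exists.
    """
    e = steering(noise_deg)[1]
    w2 = 1 / (1 - e)   # solves w1 + w2 = 1 and w1 + w2 * e = 0
    w1 = 1 - w2
    g = [0.5 + 0j, 0.5 + 0j]
    h = [w1.conjugate(), w2.conjugate()]
    return g, h


def complementary_power(g, h, x):
    """Power of the sum output minus power of the difference output.

    |zg + zh|^2 - |zg - zh|^2 = 4 * Re(zg * conj(zh)); the result is
    scaled by 1/4 and floored at zero, as in spectral subtraction.
    """
    zg, zh = beam(g, x), beam(h, x)
    p_sum = abs(zg + zh) ** 2
    p_diff = abs(zg - zh) ** 2
    return max(p_sum - p_diff, 0.0) / 4.0


g, h = null_steered_pair(60.0)
s, n = 2 + 1j, 1 - 3j
target_only = [s * a for a in steering(0.0)]
noise_only = [n * a for a in steering(60.0)]
print(complementary_power(g, h, target_only))  # close to |s|^2 = 5
print(complementary_power(g, h, noise_only))   # close to 0
```

In this toy setup the subtraction needs no separate noise estimate from speech pauses, because the difference output acts as the noise reference in every frame. When target and noise are present together, a cross term between them remains in a single frame; it averages toward zero only if the two signals are uncorrelated.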
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hiroshi SARUWATARI, Shoji KAJITA, Kazuya TAKEDA, Fumitada ITAKURA, "Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming," in IEICE TRANSACTIONS on Fundamentals, vol. E82-A, no. 8, pp. 1501-1510, August 1999.
Abstract: This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. The paper shows that nonlinear subtraction processing with complementary beamforming yields a kind of spectral subtraction without the need for speech pause detection. The optimization algorithm for the directivity pattern is also described. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations under both stationary and nonstationary noise conditions. Compared with the optimized conventional delay-and-sum (DS) array, the results show that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20% better in word recognition rate when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5% better in word recognition rate under nonstationary noise conditions. These improvements are equal to or greater than those of the conventional spectral subtraction method cascaded with the DS array.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e82-a_8_1501/_p
@ARTICLE{e82-a_8_1501,
author={Hiroshi SARUWATARI and Shoji KAJITA and Kazuya TAKEDA and Fumitada ITAKURA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming},
year={1999},
volume={E82-A},
number={8},
pages={1501--1510},
abstract={This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. The paper shows that nonlinear subtraction processing with complementary beamforming yields a kind of spectral subtraction without the need for speech pause detection. The optimization algorithm for the directivity pattern is also described. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations under both stationary and nonstationary noise conditions. Compared with the optimized conventional delay-and-sum (DS) array, the results show that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20\% better in word recognition rate when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5\% better in word recognition rate under nonstationary noise conditions. These improvements are equal to or greater than those of the conventional spectral subtraction method cascaded with the DS array.},
month={August},
}
TY  - JOUR
TI  - Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 1501
EP  - 1510
AU  - Hiroshi SARUWATARI
AU  - Shoji KAJITA
AU  - Kazuya TAKEDA
AU  - Fumitada ITAKURA
PY  - 1999
JO  - IEICE TRANSACTIONS on Fundamentals
VL  - E82-A
IS  - 8
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - August 1999
AB  - This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. The paper shows that nonlinear subtraction processing with complementary beamforming yields a kind of spectral subtraction without the need for speech pause detection. The optimization algorithm for the directivity pattern is also described. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations under both stationary and nonstationary noise conditions. Compared with the optimized conventional delay-and-sum (DS) array, the results show that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20% better in word recognition rate when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5% better in word recognition rate under nonstationary noise conditions. These improvements are equal to or greater than those of the conventional spectral subtraction method cascaded with the DS array.
ER  - 