Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and the input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.
Copyright Notice
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Koichi SHINODA, "Acoustic Model Adaptation for Speech Recognition" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 9, pp. 2348-2362, September 2010, doi: 10.1587/transinf.E93.D.2348.
Abstract: Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.2348/_p
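For orientation, the three adaptation techniques named in the abstract can be sketched by their standard mean-update formulas. This is a generic, textbook-style summary rather than the paper's own formulation, and the notation (\mu_0, \tau, \gamma_t, A, b, \bar{\mu}, e_k, w_k) is not taken from the paper. MAP estimation interpolates each Gaussian mean between its prior (speaker-independent) value and the statistics of the adaptation data,

\hat{\mu} = \frac{\tau \mu_0 + \sum_t \gamma_t x_t}{\tau + \sum_t \gamma_t},

where x_t are the adaptation frames, \gamma_t the occupancy probabilities of the Gaussian, and \tau the prior weight. MLLR instead shares an affine transform across a cluster of Gaussians, \hat{\mu} = A\mu + b, with A and b estimated by maximum likelihood, so a few utterances can update many parameters at once. Eigenvoice adaptation constrains the speaker-dependent mean supervector to a low-dimensional subspace, \hat{\mu} = \bar{\mu} + \sum_k w_k e_k, so that only the weights w_k of the eigenvoices e_k need to be estimated.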
@ARTICLE{e93-d_9_2348,
author={Koichi SHINODA},
journal={IEICE TRANSACTIONS on Information},
title={Acoustic Model Adaptation for Speech Recognition},
year={2010},
volume={E93-D},
number={9},
pages={2348-2362},
abstract={Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.},
keywords={},
doi={10.1587/transinf.E93.D.2348},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - Acoustic Model Adaptation for Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 2348
EP - 2362
AU - Koichi SHINODA
PY - 2010
DO - 10.1587/transinf.E93.D.2348
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2010
AB - Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.
ER -