The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations; for example, some numerals may appear as "XNUMX".
Learning semantic representations of translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in the translation context with neural networks, which are weak at capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture that explicitly learns structural syntax information in the translation context, thereby improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences using a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to jointly learn the syntax-based context representation and translation prediction. To verify its effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach achieves a substantial and significant improvement over several baseline systems.
Kehai CHEN
Harbin Institute of Technology
Tiejun ZHAO
Harbin Institute of Technology
Muyun YANG
Harbin Institute of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kehai CHEN, Tiejun ZHAO, Muyun YANG, "Syntax-Based Context Representation for Statistical Machine Translation" in IEICE TRANSACTIONS on Information, vol. E101-D, no. 12, pp. 3226-3237, December 2018, doi: 10.1587/transinf.2018EDP7209.
Abstract: Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7209/_p
@ARTICLE{e101-d_12_3226,
author={Kehai CHEN and Tiejun ZHAO and Muyun YANG},
journal={IEICE TRANSACTIONS on Information},
title={Syntax-Based Context Representation for Statistical Machine Translation},
year={2018},
volume={E101-D},
number={12},
pages={3226-3237},
abstract={Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.},
keywords={},
doi={10.1587/transinf.2018EDP7209},
ISSN={1745-1361},
month={December},}
TY - JOUR
TI - Syntax-Based Context Representation for Statistical Machine Translation
T2 - IEICE TRANSACTIONS on Information
SP - 3226
EP - 3237
AU - Kehai CHEN
AU - Tiejun ZHAO
AU - Muyun YANG
PY - 2018
DO - 10.1587/transinf.2018EDP7209
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2018
AB - Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.
ER -