The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Collocation is ubiquitous in language, and accurate recognition and extraction of collocations matters for many natural language processing tasks. Collocations range from simple bigram collocations to collocation frames, that is, distant multi-gram collocations. Collocation frames have so far received little attention. Oriented toward translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with a method based on distributional semantics, introducing collocation patterns and integrating several state-of-the-art association measures. From the bigram collocations extracted this way, we obtain the longest collocation frames by applying the recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction, in both precision and recall. In collocation frame extraction it performs even better, with precision similar to that of its bigram collocation extraction.
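The two stages named in the abstract (scoring bigram collocations with an association measure, then growing them into longer frames) can be illustrated with a minimal Python sketch. This is not the authors' method: it assumes pointwise mutual information (PMI) as the only association measure, omits the distributional-semantics component and the collocation patterns, and stands in for the recursive and linguistic rules with a simple merge of bigrams that share a word. The function names `pmi_scores` and `longest_frame`, and the `window` and `min_count` parameters, are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=3, min_count=2):
    """Score ordered word pairs that co-occur within `window` tokens by PMI.

    Simplification (assumption): pair and word probabilities are estimated
    from separate totals, which is enough to rank candidate pairs.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue  # drop rare pairs, whose PMI is unreliable
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


def longest_frame(tokens, collocations, window=3):
    """Merge accepted bigram collocations that share a token position and
    return the largest merged group, in sentence order.

    Positions are linked with a small union-find structure; this merge rule
    is an illustrative stand-in, not the paper's linguistic rules.
    """
    parent = list(range(len(tokens)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, left in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if (left, tokens[j]) in collocations:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(tokens)):
        groups.setdefault(find(i), []).append(i)

    best = max(groups.values(), key=len, default=[])
    return [tokens[i] for i in best] if len(best) > 1 else []
```

In this sketch, a set of pairs accepted from `pmi_scores` (for example, those above a chosen threshold) is passed to `longest_frame` together with a tokenized sentence; non-adjacent words such as "make ... quick decision" end up in one frame when each is linked to a shared word.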
Xiaoxia LIU
Dalian University of Technology
Degen HUANG
Dalian University of Technology
Zhangzhi YIN
Dalian University of Technology
Fuji REN
Tokushima University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Xiaoxia LIU, Degen HUANG, Zhangzhi YIN, Fuji REN, "Recognition of Collocation Frames from Sentences" in IEICE TRANSACTIONS on Information, vol. E102-D, no. 3, pp. 620-627, March 2019, doi: 10.1587/transinf.2018EDP7255.
Abstract: Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7255/_p
@ARTICLE{e102-d_3_620,
author={Xiaoxia LIU and Degen HUANG and Zhangzhi YIN and Fuji REN},
journal={IEICE TRANSACTIONS on Information},
title={Recognition of Collocation Frames from Sentences},
year={2019},
volume={E102-D},
number={3},
pages={620-627},
abstract={Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.},
keywords={},
doi={10.1587/transinf.2018EDP7255},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Recognition of Collocation Frames from Sentences
T2 - IEICE TRANSACTIONS on Information
SP - 620
EP - 627
AU - Xiaoxia LIU
AU - Degen HUANG
AU - Zhangzhi YIN
AU - Fuji REN
PY - 2019
DO - 10.1587/transinf.2018EDP7255
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E102-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2019
AB - Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.
ER -