The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary based on instantiation features of rule internal wildcards, the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
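The abstract names three steps: sliding-window rule application, a filter based on how rule wildcards are instantiated, and conflict resolution among overlapping extractions by weighted confidence. The following Python sketch only illustrates that general flow and is not the authors' implementation. Everything in it is an assumption made for illustration: the rule format (a fixed-width token pattern where `None` is a one-token wildcard), the `wildcard_score` function standing in for a trained classifier, the `BODY_WORDS` vocabulary, the rule weights, the threshold, the greedy overlap strategy, and the use of pre-segmented English tokens in place of Thai text.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Hypothetical vocabulary used by the toy scorer below; not from the paper.
BODY_WORDS = {"left", "right", "arm", "leg", "head"}


@dataclass(frozen=True)
class Extraction:
    start: int         # index of the first covered token (inclusive)
    end: int           # index just past the last covered token (exclusive)
    rule: str          # name of the rule that fired
    confidence: float  # rule weight times the toy wildcard score


# A rule is (name, pattern, weight); None in a pattern is a one-token wildcard.
Rule = Tuple[str, Sequence[Optional[str]], float]


def wildcard_score(fillers: Sequence[str]) -> float:
    """Toy stand-in for a trained classifier: the fraction of wildcard
    fillers that appear in BODY_WORDS."""
    if not fillers:
        return 1.0
    return sum(f in BODY_WORDS for f in fillers) / len(fillers)


def apply_rule(tokens: Sequence[str], rule: Rule) -> List[Extraction]:
    """Slide a window of the pattern's width over the tokens and emit one
    candidate extraction at every position where the pattern matches."""
    name, pattern, weight = rule
    width = len(pattern)
    found = []
    for i in range(len(tokens) - width + 1):
        window = tokens[i:i + width]
        if all(p is None or p == t for p, t in zip(pattern, window)):
            fillers = [t for p, t in zip(pattern, window) if p is None]
            score = weight * wildcard_score(fillers)
            found.append(Extraction(i, i + width, name, score))
    return found


def resolve_overlaps(candidates: Sequence[Extraction]) -> List[Extraction]:
    """Greedy conflict resolution: visit candidates from highest to lowest
    confidence and keep one only if it overlaps nothing already kept."""
    kept: List[Extraction] = []
    for c in sorted(candidates, key=lambda e: e.confidence, reverse=True):
        if all(c.end <= k.start or c.start >= k.end for k in kept):
            kept.append(c)
    return sorted(kept, key=lambda e: e.start)


def extract(tokens: Sequence[str], rules: Sequence[Rule],
            threshold: float = 0.5) -> List[Extraction]:
    """Apply every rule, drop low-confidence candidates, resolve overlaps."""
    candidates = [e for r in rules for e in apply_rule(tokens, r)]
    confident = [e for e in candidates if e.confidence >= threshold]
    return resolve_overlaps(confident)


if __name__ == "__main__":
    tokens = ["pain", "in", "left", "arm", "since", "monday"]
    rules = [
        ("pain_location", ("pain", "in", None, None), 1.0),
        ("loose_location", ("in", None, None, None), 0.6),
        ("arm_mention", (None, "arm"), 0.8),
    ]
    for e in extract(tokens, rules):
        print(e)
```

In this toy run, the `loose_location` candidate reaches past the phrase into "since" and falls below the threshold, while `arm_mention` passes the threshold but loses to the higher-confidence, overlapping `pain_location` candidate. A real system would replace `wildcard_score` with a learned model and learn the rules from tagged training phrases, as the abstract describes.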
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Peerasak INTARAPAIBOON, Ekawit NANTAJEEWARAWAT, Thanaruk THEERAMUNKONG, "Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries" in IEICE TRANSACTIONS on Information,
vol. E94-D, no. 3, pp. 465-478, March 2011, doi: 10.1587/transinf.E94.D.465.
Abstract: Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary based on instantiation features of rule internal wildcards, the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E94.D.465/_p
@ARTICLE{e94-d_3_465,
author={Peerasak INTARAPAIBOON and Ekawit NANTAJEEWARAWAT and Thanaruk THEERAMUNKONG},
journal={IEICE TRANSACTIONS on Information},
title={Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries},
year={2011},
volume={E94-D},
number={3},
pages={465-478},
abstract={Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary based on instantiation features of rule internal wildcards, the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.},
keywords={},
doi={10.1587/transinf.E94.D.465},
ISSN={1745-1361},
month={March},
}
TY  - JOUR
TI  - Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
T2  - IEICE TRANSACTIONS on Information
SP  - 465
EP  - 478
AU  - Peerasak INTARAPAIBOON
AU  - Ekawit NANTAJEEWARAWAT
AU  - Thanaruk THEERAMUNKONG
PY  - 2011
DO  - 10.1587/transinf.E94.D.465
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E94-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2011
AB  - Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary based on instantiation features of rule internal wildcards, the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
ER  - 