The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is known mainly to be effective for structured data; however, as with other machine-learning models, it is difficult to train one so that it returns the correct output value (called the “prediction value”) for every input value (called the “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, the attribute values that lead to system malfunctions (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. Developing the input filter requires specifying the filtering condition for the attribute values that lead to a system malfunction. To meet that need, we propose a method that formally verifies a DTEM and, if the verification finds an attribute value that leads to a failure, extracts the range in which such attribute values exist. The proposed method can comprehensively extract the ranges in which attribute values leading to a failure exist; therefore, an input filter built from those ranges can prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the scalability of the method and show that the number and depth of the decision trees are important factors that determine its applicability.
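The idea of extracting violation ranges and turning them into an input filter can be illustrated with a small sketch. This is not the verification procedure of the paper, which the abstract does not detail; it is a brute-force enumeration under stated assumptions: trees are encoded as nested tuples, the ensemble prediction is the sum of the tree outputs, and the property to check is "prediction <= 6.0". All names (`leaves`, `violation_ranges`, `make_filter`) and all numbers are hypothetical.

```python
from itertools import product

# Assumed encoding: a tree is a leaf value (float) or a tuple
# (feature, threshold, left, right); inputs with x[feature] <= threshold go left.
# A box maps each feature to a half-open interval (low, high], i.e. low < x <= high.

def leaves(tree, box):
    """Yield (box, value) for every leaf reachable inside the given box."""
    if not isinstance(tree, tuple):
        yield box, tree
        return
    f, t, left, right = tree
    lo, hi = box[f]
    if lo < t:  # some x in the box satisfies x <= t
        yield from leaves(left, {**box, f: (lo, min(hi, t))})
    if t < hi:  # some x in the box satisfies x > t
        yield from leaves(right, {**box, f: (max(lo, t), hi)})

def intersect(a, b):
    """Intersect two boxes; return None when the intersection is empty."""
    out = {}
    for f in a:
        lo, hi = max(a[f][0], b[f][0]), min(a[f][1], b[f][1])
        if lo >= hi:
            return None
        out[f] = (lo, hi)
    return out

def violation_ranges(trees, domain, is_violation):
    """Enumerate every combination of one leaf per tree and keep violating boxes."""
    per_tree = [list(leaves(tree, domain)) for tree in trees]
    found = []
    for combo in product(*per_tree):
        box = domain
        for leaf_box, _ in combo:
            box = intersect(box, leaf_box)
            if box is None:
                break
        if box is None:
            continue  # these leaves cannot be reached by the same input
        prediction = sum(value for _, value in combo)  # assumed: sum of tree outputs
        if is_violation(prediction):
            found.append((box, prediction))
    return found

def make_filter(ranges):
    """Build an input filter that rejects any input inside a violation range."""
    def rejects(x):
        return any(all(lo < x[f] <= hi for f, (lo, hi) in box.items())
                   for box, _ in ranges)
    return rejects

# Hypothetical two-tree ensemble over features 0 and 1.
tree_a = (0, 5.0, 1.0, 3.0)
tree_b = (1, 2.0, 0.5, (0, 8.0, 1.0, 4.0))
domain = {0: (0.0, 10.0), 1: (0.0, 5.0)}

ranges = violation_ranges([tree_a, tree_b], domain, lambda p: p > 6.0)
rejects = make_filter(ranges)
print(ranges)
print(rejects({0: 9.0, 1: 3.0}), rejects({0: 9.0, 1: 1.0}))
```

The enumeration visits one leaf per tree in every combination, so its cost grows with the product of the leaf counts. That is consistent with the observation that the number and depth of the trees govern applicability, although the paper's own cost analysis may differ from this sketch.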
Naoto SATO
Hitachi, Ltd.
Hironobu KURUMA
Hitachi, Ltd.
Yuichiroh NAKAGAWA
Hitachi, Ltd.
Hideto OGAWA
Hitachi, Ltd.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Naoto SATO, Hironobu KURUMA, Yuichiroh NAKAGAWA, Hideto OGAWA, "Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 2, pp. 363-378, February 2020, doi: 10.1587/transinf.2019EDP7120.
Abstract: As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called “prediction value”) for any input value (called “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect attribute values that lead to malfunctions of the system (failures) during development and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute values that lead to the malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, if the verification finds an attribute value leading to a failure, extracting the range in which such an attribute value exists. The proposed method can comprehensively extract the range in which the attribute value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated its scalability and show that the number and depth of decision trees are important factors that determine the applicability of the proposed method.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7120/_p
@ARTICLE{e103-d_2_363,
author={Naoto SATO and Hironobu KURUMA and Yuichiroh NAKAGAWA and Hideto OGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges},
year={2020},
volume={E103-D},
number={2},
pages={363-378},
keywords={},
doi={10.1587/transinf.2019EDP7120},
ISSN={1745-1361},
month={February},
}
TY - JOUR
TI - Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges
T2 - IEICE TRANSACTIONS on Information
SP - 363
EP - 378
AU - Naoto SATO
AU - Hironobu KURUMA
AU - Yuichiroh NAKAGAWA
AU - Hideto OGAWA
PY - 2020
DO - 10.1587/transinf.2019EDP7120
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2020
ER -