The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks inherit a number of drawbacks. Their lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently. This may be acceptable in simulated tasks, but it restricts their use in many real-world applications. Finally, their generalization capability, that is, the ability to determine that a situation is similar to one encountered previously, is low. We present a method to combine external knowledge and interpretable reinforcement learning. We derive a rule-based variant of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate learning in multiple domains such as trading and visual navigation. The resulting agent provides substantial gains in training speed and performance over deep Q-learning (DQN) and deep deterministic policy gradients (DDPG), and improves stability over proximal policy optimization (PPO).
Nicolas BOUGIE
Sokendai, The Graduate University for Advanced Studies, National Institute of Informatics
Ryutaro ICHISE
Sokendai, The Graduate University for Advanced Studies, National Institute of Informatics
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Nicolas BOUGIE, Ryutaro ICHISE, "Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 10, pp. 2143-2153, October 2020, doi: 10.1587/transinf.2019EDP7170.
Abstract: Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks inherit a number of drawbacks. Lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently. This may be suitable in simulated tasks, but restricts their use to many real-world applications. Finally, their generalization capability is low, the ability to determine that a situation is similar to one encountered previously. We present a method to combine external knowledge and interpretable reinforcement learning. We derive a rule-based variant version of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate the learning in multiple domains such as trading or visual navigation. The resulting agent provides substantial gains in training speed and performance over deep q-learning (DQN), deep deterministic policy gradients (DDPG), and improves stability over proximal policy optimization (PPO).
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7170/_p
@ARTICLE{e103-d_10_2143,
author={Nicolas BOUGIE and Ryutaro ICHISE},
journal={IEICE TRANSACTIONS on Information},
title={Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge},
year={2020},
volume={E103-D},
number={10},
pages={2143-2153},
abstract={Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks inherit a number of drawbacks. Lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently. This may be suitable in simulated tasks, but restricts their use to many real-world applications. Finally, their generalization capability is low, the ability to determine that a situation is similar to one encountered previously. We present a method to combine external knowledge and interpretable reinforcement learning. We derive a rule-based variant version of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate the learning in multiple domains such as trading or visual navigation. The resulting agent provides substantial gains in training speed and performance over deep q-learning (DQN), deep deterministic policy gradients (DDPG), and improves stability over proximal policy optimization (PPO).},
keywords={},
doi={10.1587/transinf.2019EDP7170},
ISSN={1745-1361},
month={October}
}
TY - JOUR
TI - Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge
T2 - IEICE TRANSACTIONS on Information
SP - 2143
EP - 2153
AU - Nicolas BOUGIE
AU - Ryutaro ICHISE
PY - 2020
DO - 10.1587/transinf.2019EDP7170
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2020
AB - Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks inherit a number of drawbacks. Lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently. This may be suitable in simulated tasks, but restricts their use to many real-world applications. Finally, their generalization capability is low, the ability to determine that a situation is similar to one encountered previously. We present a method to combine external knowledge and interpretable reinforcement learning. We derive a rule-based variant version of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate the learning in multiple domains such as trading or visual navigation. The resulting agent provides substantial gains in training speed and performance over deep q-learning (DQN), deep deterministic policy gradients (DDPG), and improves stability over proximal policy optimization (PPO).
ER -