Reinforcement Learning Parameterization: Softmax Between Exploration and Exploitation
K. Macek
Czech Technical University in Prague
Abstract
Control of dynamic systems is a complex task because of changing conditions, nonlinear dependencies, and time delays. One tool for online optimization of control parameters is reinforcement learning. This paper deals with its application to the optimization of PID parameters and examines the most appropriate parameterization of the softmax selection mechanism.
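For readers unfamiliar with softmax selection, the following is a minimal sketch of the standard Boltzmann form, P(a) = exp(Q(a)/T) / sum_b exp(Q(b)/T), where the temperature T sets the balance between exploration (large T, nearly uniform choice) and exploitation (small T, nearly greedy choice). The abstract does not state the exact formulation used in the paper, so this form, the function names, and the value estimates below are illustrative assumptions and not the author's implementation.

```python
import math
import random


def softmax_probabilities(values, temperature):
    """Return Boltzmann (softmax) selection probabilities for value estimates."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_select(values, temperature, rng=random):
    """Draw one index at random according to the softmax probabilities."""
    probs = softmax_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


# Hypothetical value estimates for three candidate PID gain settings
# (assumed for illustration; not taken from the paper).
q_values = [1.0, 2.0, 3.0]

explore = softmax_probabilities(q_values, temperature=100.0)  # close to uniform
exploit = softmax_probabilities(q_values, temperature=0.1)    # close to greedy
chosen = softmax_select(q_values, temperature=1.0)            # index 0, 1, or 2
```

With a high temperature all three candidates are chosen almost equally often, while a low temperature concentrates nearly all probability on the candidate with the highest estimate. Choosing T, or a schedule for it, is the kind of parameterization question the abstract refers to.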
Session
Robust and Adaptive Control (Poster)
Reference
Macek, K.: Reinforcement Learning Parameterization: Softmax Between Exploration and Exploitation. Editors: Fikar, M., Kvasnica, M., In Proceedings of the 17th International Conference on Process Control ’09, Štrbské Pleso, Slovakia, 438–442, 2009
BibTeX
@inProceedings{pc09-068,
  author    = {Macek, K.},
  title     = {Reinforcement Learning Parameterization: Softmax Between Exploration and Exploitation},
  booktitle = {Proceedings of the 17th International Conference on Process Control '09},
  year      = {2009},
  pages     = {438--442},
  editor    = {Fikar, M. and Kvasnica, M.},
  address   = {Štrbské Pleso, Slovakia},
  publisher = {Slovak University of Technology in Bratislava},
  url       = {http://www.kirp.chtf.stuba.sk/pc09/data/papers/068.pdf}}