Abstract
This article studies adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed. Starting from an initial admissible control policy, the algorithm iteratively finds near-optimal policies for the adaptive optimal stationary control problem directly from input/state data, without explicitly identifying any system matrices. Under mild conditions, the solutions given by the algorithm are proved to converge to a small neighborhood of the optimal solution with probability one. Application of the algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
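To make the policy-iteration idea concrete, the following is a minimal model-based sketch for a scalar system, not the article's data-driven algorithm. It assumes a hypothetical model dx = (a x + b u) dt + (c x + d u) dw with running cost q x^2 + r u^2 and a linear policy u = -k x; the additive noise is left out for simplicity, and the system parameters are assumed known, whereas the article works from input/state data only. All names (`policy_iteration`, `a`, `b`, `c`, `d`, `q`, `r`, `k0`) are illustrative.

```python
def policy_iteration(a, b, c, d, q, r, k0, iters=20):
    """Model-based policy iteration for the assumed scalar system
    dx = (a x + b u) dt + (c x + d u) dw, cost q x^2 + r u^2, policy u = -k x.

    Returns a list of (p, k) pairs: the value coefficient of the evaluated
    policy and the improved gain derived from it.
    """
    k = k0
    history = []
    for _ in range(iters):
        # Mean-square growth rate of x^2 under u = -k x; the policy is
        # admissible (mean-square stabilizing) only if this is negative.
        rate = 2.0 * (a - b * k) + (c - d * k) ** 2
        if rate >= 0:
            raise ValueError("gain is not mean-square stabilizing")
        # Policy evaluation: solve rate * p + q + r k^2 = 0 for p.
        p = (q + r * k * k) / (-rate)
        # Policy improvement: minimize the Hamiltonian over u.
        k = (b + c * d) * p / (r + d * d * p)
        history.append((p, k))
    return history


if __name__ == "__main__":
    # Hypothetical parameters; k0 = 3.0 is stabilizing for this choice.
    for p, k in policy_iteration(1.0, 1.0, 0.5, 0.2, 1.0, 1.0, k0=3.0, iters=5):
        print(p, k)
```

Each pass evaluates the current gain by solving a scalar generalized Lyapunov equation and then improves the gain. The off-policy method described in the abstract replaces the model-based evaluation step with a least-squares estimate built from measured data.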
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 2383-2390 |
Number of pages | 8 |
Journal | IEEE Transactions on Automatic Control |
Volume | 68 |
Issue number | 4 |
State | Published - Apr 1 2023 |
Keywords
- Adaptive optimal control
- data-driven control
- policy iteration
- reinforcement learning
- robustness
- stochastic control
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Control and Systems Engineering
- Computer Science Applications