Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems

Research output: Contribution to journal › Article › peer-review


This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noise, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed. Starting from an initial admissible control policy, the algorithm iteratively finds near-optimal policies for the adaptive optimal stationary control problem directly from input/state data, without explicitly identifying any system matrices. Under mild conditions, the solutions given by the proposed algorithm are proved to converge to a small neighborhood of the optimal solution with probability one. Applying the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
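To make the policy-iteration idea in the abstract concrete, the following is a minimal sketch of classical model-based policy iteration for a scalar system. It is not the article's optimistic least-squares-based algorithm: it assumes the model is known, works in one dimension, ignores additive noise, and includes only state-multiplicative noise. The dynamics dx = (a x + b u) dt + c x dw, the cost weights q and r, and all function names are assumptions chosen for illustration.

```python
import math


def evaluate_policy(k, a, b, c, q, r):
    """Policy evaluation for u = -k x (illustrative scalar case).

    Solves the scalar generalized Lyapunov equation
        (2 (a - b k) + c^2) p + q + r k^2 = 0
    for the cost weight p of the value function V(x) = p x^2.
    """
    decay = 2.0 * (b * k - a) - c * c
    if decay <= 0.0:
        # The closed loop is not mean-square stable, so k is not admissible.
        raise ValueError("gain k is not admissible")
    return (q + r * k * k) / decay


def policy_iteration(k0, a, b, c, q, r, iters=20):
    """Alternate policy evaluation and policy improvement from gain k0."""
    k = k0
    p = None
    for _ in range(iters):
        p = evaluate_policy(k, a, b, c, q, r)  # evaluate the current policy
        k = b * p / r                          # improve: minimize the Hamiltonian
    return k, p


def riccati_solution(a, b, c, q, r):
    """Positive root of (2 a + c^2) p - (b^2 / r) p^2 + q = 0, for comparison."""
    alpha = 2.0 * a + c * c
    beta = b * b / r
    return (alpha + math.sqrt(alpha * alpha + 4.0 * beta * q)) / (2.0 * beta)
```

For instance, calling `policy_iteration(3.0, 1.0, 1.0, 0.5, 1.0, 1.0)` starts from the admissible gain k0 = 3, and the result can be compared against `riccati_solution(1.0, 1.0, 0.5, 1.0, 1.0)`. The data-driven method described in the abstract replaces the model-based evaluation step with estimates built from input/state data, which this sketch does not attempt.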

Original language: English (US)
Pages (from-to): 2383-2390
Number of pages: 8
Journal: IEEE Transactions on Automatic Control
Issue number: 4
State: Published - Apr 1 2023


Keywords

  • Adaptive optimal control
  • data-driven control
  • policy iteration
  • reinforcement learning
  • robustness
  • stochastic control

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Computer Science Applications

