Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter studies the robustness of reinforcement learning for discrete-time linear stochastic systems with multiplicative noise evolving in continuous state and action spaces. Policy iteration is one of the most popular methods in reinforcement learning, yet its robustness for the stochastic linear quadratic regulator (LQR) problem with multiplicative noise has remained a longstanding open problem. A solution in the spirit of input-to-state stability is given, guaranteeing that the solutions generated by the policy iteration algorithm remain bounded and enter a small neighborhood of the optimal solution whenever the error at each iteration is bounded and small. In addition, a novel off-policy, multiple-trajectory, optimistic least-squares policy iteration algorithm is proposed to learn a near-optimal solution of the stochastic LQR problem directly from online input/state data, without explicitly identifying the system matrices. The efficacy of the proposed algorithm is supported by rigorous convergence analysis and numerical results on a second-order example.
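
For context, the sketch below illustrates the model-based (exact) policy iteration that the data-driven, off-policy algorithm described in the abstract approximates from input/state data. It assumes known system matrices, independent zero-mean multiplicative noise channels on the state and input, and a mean-square stabilizing initial gain; all matrices, noise variances, and the initial gain are illustrative assumptions, not values from the chapter.

```python
import numpy as np


def policy_evaluation(A, B, Ai, a_var, Bj, b_var, Q, R, K):
    """Solve the generalized Lyapunov equation for the cost matrix P of u = -K x."""
    n = A.shape[0]
    F = A - B @ K                      # nominal closed-loop matrix
    T = np.kron(F.T, F.T)              # deterministic part of the closed-loop operator
    for M, v in zip(Ai, a_var):        # noise channels entering through the state
        T += v * np.kron(M.T, M.T)
    for M, v in zip(Bj, b_var):        # noise channels entering through the input
        MK = M @ K
        T += v * np.kron(MK.T, MK.T)
    Qc = Q + K.T @ R @ K               # per-stage cost under the current policy
    vecP = np.linalg.solve(np.eye(n * n) - T, Qc.reshape(-1, order="F"))
    P = vecP.reshape((n, n), order="F")
    return (P + P.T) / 2               # enforce symmetry against round-off


def policy_improvement(A, B, Bj, b_var, R, P):
    """Greedy gain K = (R + B'PB + sum_j var_j Bj'PBj)^(-1) B'PA."""
    G = R + B.T @ P @ B
    for M, v in zip(Bj, b_var):
        G += v * M.T @ P @ M
    return np.linalg.solve(G, B.T @ P @ A)


def policy_iteration(A, B, Ai, a_var, Bj, b_var, Q, R, K0, tol=1e-10, max_iter=50):
    """Alternate evaluation and improvement from a mean-square stabilizing gain K0."""
    K = K0
    for _ in range(max_iter):
        P = policy_evaluation(A, B, Ai, a_var, Bj, b_var, Q, R, K)
        K_next = policy_improvement(A, B, Bj, b_var, R, P)
        if np.linalg.norm(K_next - K) < tol:
            return K_next, P
        K = K_next
    return K, P


# Illustrative second-order system (hypothetical values for demonstration only).
A = np.array([[0.7, 0.2], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
Ai, a_var = [np.array([[0.0, 0.1], [0.0, 0.0]])], [0.01]
Bj, b_var = [np.array([[0.0], [0.1]])], [0.01]
Q, R = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))                  # mean-square stabilizing here since A is Schur stable

K_star, P_star = policy_iteration(A, B, Ai, a_var, Bj, b_var, Q, R, K0)
print("near-optimal gain:\n", K_star)
```

The robustness question studied in the chapter concerns this recursion when each evaluation/improvement step is carried out only approximately (e.g., from finite data): the input-to-state stability style result bounds how far the resulting gains can drift from the optimal one as a function of the per-iteration error.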

Original language: English (US)
Title of host publication: Lecture Notes in Control and Information Sciences
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 249-277
Number of pages: 29
DOIs
State: Published - 2022

Publication series

Name: Lecture Notes in Control and Information Sciences
Volume: 488
ISSN (Print): 0170-8643
ISSN (Electronic): 1610-7411

ASJC Scopus subject areas

  • Library and Information Sciences
