Bridging the Gap Between Reinforcement Learning and Nonlinear Output-Feedback Control

Weinan Gao, Zhong Ping Jiang, Tianyou Chai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The primary objective of this paper is to bridge the gap between reinforcement learning (RL) and nonlinear output-feedback control by developing a novel solution to direct adaptive optimal control with guaranteed closed-loop stability. More specifically, for a broad class of nonlinear affine discrete-time systems with limited output measurements, we integrate RL and advanced nonlinear control methods to devise high-fidelity direct adaptive optimal controllers from data. Under the condition of uniform observability, our original learning-based control solution begins with the reconstruction of the system state from the retrospective input and output data, akin to a deadbeat observer. Subsequently, we propose value iteration algorithms to facilitate the learning of optimal output-feedback control policies and value functions, leveraging measured input and output data. To ensure feasibility and reliability in practice, we provide rigorous convergence proofs for the proposed learning algorithms, along with the stability analysis for the closed-loop system. Simulation results are presented to showcase the effectiveness of the developed methodologies, demonstrating their capability to handle the output-feedback adaptive optimal control problems of general nonlinear affine systems.
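
The state-reconstruction step mentioned in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's method: the paper treats nonlinear affine systems and learns from data, whereas the code below assumes a known linear model x_{k+1} = A x_k + B u_k, y_k = C x_k. The matrices A, B, C and the helper names reconstruct_state and demo are hypothetical, chosen only for illustration. The sketch shows the deadbeat idea: for an observable system, a finite window of past inputs and outputs determines the current state exactly.

```python
import numpy as np


def reconstruct_state(A, B, C, us, ys):
    """Recover x_k from the N most recent inputs and outputs.

    us[i] = u_{k-N+i} and ys[i] = y_{k-N+i} for i = 0..N-1.
    Assumes a known linear model (an illustrative assumption).
    """
    N = len(us)
    p = C.shape[0]
    # Stacked observability matrix with block rows C A^i, i = 0..N-1.
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(N)])
    # Subtract the input-driven part of each output so that only the
    # free response C A^i x_{k-N} remains.
    free = []
    for i in range(N):
        forced = np.zeros(p)
        for j in range(i):
            forced = forced + C @ np.linalg.matrix_power(A, i - 1 - j) @ B @ us[j]
        free.append(ys[i] - forced)
    # Solve O x_{k-N} = free response for the oldest state in the window.
    x_old, *_ = np.linalg.lstsq(O, np.concatenate(free), rcond=None)
    # Propagate that state forward through the recorded inputs to reach x_k.
    x = x_old
    for u in us:
        x = A @ x + B @ u
    return x


def demo():
    # Hypothetical observable second-order system with one input, one output.
    A = np.array([[0.9, 0.2], [-0.1, 0.8]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    rng = np.random.default_rng(0)
    x = np.array([1.0, -0.5])
    us, ys = [], []
    for _ in range(2):  # window length N = n = 2
        u = rng.standard_normal(1)
        us.append(u)
        ys.append(C @ x)
        x = A @ x + B @ u
    return x, reconstruct_state(A, B, C, us, ys)


if __name__ == "__main__":
    x_true, x_hat = demo()
    print("true state:         ", x_true)
    print("reconstructed state:", x_hat)
```

In this linear setting a window of length equal to the state dimension suffices when the observability matrix has full column rank. The uniform observability condition in the abstract plays the analogous role for the nonlinear case.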

Original language: English (US)
Title of host publication: Proceedings of the 43rd Chinese Control Conference, CCC 2024
Editors: Jing Na, Jian Sun
Publisher: IEEE Computer Society
Pages: 2425-2431
Number of pages: 7
ISBN (Electronic): 9789887581581
State: Published - 2024
Event: 43rd Chinese Control Conference, CCC 2024 - Kunming, China
Duration: Jul 28 2024 to Jul 31 2024

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 43rd Chinese Control Conference, CCC 2024
Country/Territory: China
City: Kunming
Period: 7/28/24 to 7/31/24

Keywords

  • Adaptive optimal control
  • nonlinear systems
  • output-feedback
  • reinforcement learning

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Applied Mathematics
  • Modeling and Simulation
