Abstract
This article studies the adaptive optimal control problem for continuous-time nonlinear systems described by differential equations. A key strategy is to exploit the value iteration (VI) method, originally proposed by Bellman in 1957 as a fundamental tool for solving dynamic programming problems. However, previous VI methods have been devoted exclusively to Markov decision processes and discrete-time dynamical systems. In this article, we aim to fill this gap by developing a new continuous-time VI method that can be applied to both adaptive and nonadaptive optimal control problems for continuous-time systems described by differential equations. Like traditional VI, the continuous-time VI algorithm retains the desirable feature that no initial admissible control policy needs to be known. As a direct application of the proposed VI method, a new class of adaptive optimal controllers is obtained for nonlinear systems with completely unknown dynamics. A learning-based control algorithm is proposed to show how robust optimal controllers can be learned directly from real-time data. Finally, two examples are given to illustrate the efficacy of the proposed methodology.
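For readers unfamiliar with the classical method the abstract builds on, the following is a minimal sketch of traditional discrete-time value iteration for a scalar linear-quadratic problem. It is not the continuous-time algorithm proposed in the article. The system `x[k+1] = a*x[k] + b*u[k]`, the cost `q*x^2 + r*u^2`, the function name, and all numerical values are assumptions chosen for illustration. The point it illustrates is that the iteration starts from a zero value function, so no initial stabilizing policy is needed, even though the open-loop system here is unstable.

```python
import math


def riccati_value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Classical discrete-time value iteration for a scalar LQ problem.

    The value function is V_k(x) = p_k * x**2. Starting from p_0 = 0,
    each step applies the Bellman update
        p_{k+1} = q + a^2 p_k - (a b p_k)^2 / (r + b^2 p_k).
    """
    p = 0.0  # V_0 = 0: no initial admissible (stabilizing) policy is assumed
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    gain = a * b * p / (r + b * b * p)  # greedy feedback law u = -gain * x
    return p, gain


# Illustrative (assumed) parameters: a = 2 makes the open-loop system unstable.
a, b, q, r = 2.0, 1.0, 1.0, 1.0
p_star, k_star = riccati_value_iteration(a, b, q, r)
closed_loop = a - b * k_star  # closed-loop pole under the learned gain
print(p_star, k_star, closed_loop)
```

For these assumed parameters the fixed point satisfies p^2 - 4p - 1 = 0, so the iteration should approach p = 2 + sqrt(5), and the resulting closed-loop pole lies inside the unit circle.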
Original language | English (US) |
---|---|
Pages (from-to) | 2781-2790 |
Number of pages | 10 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 33 |
Issue number | 7 |
State | Published - Jul 1 2022 |
Keywords
- Adaptive optimal control
- Adaptive systems
- Dynamical systems
- Heuristic algorithms
- Linear systems
- Mathematical model
- Nonlinear systems
- Optimal control
- Value iteration (VI)
ASJC Scopus subject areas
- Software
- Artificial Intelligence
- Computer Networks and Communications
- Computer Science Applications