Abstract
This paper studies data-driven, learning-based methods for the finite-horizon optimal control of linear time-varying discrete-time systems. First, a novel finite-horizon policy iteration (PI) method for such systems is presented, and its connections with existing infinite-horizon PI methods are discussed. Then, data-driven off-policy PI and value iteration (VI) algorithms are derived to find approximate optimal controllers when the system dynamics are completely unknown. Under mild conditions, the proposed data-driven off-policy algorithms converge to the optimal solution. Finally, the effectiveness and feasibility of the developed methods are validated on a practical example of spacecraft attitude control.
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 73-84 |
Number of pages | 12 |
Journal | Control Theory and Technology |
Volume | 17 |
Issue number | 1 |
State | Published - Feb 1 2019 |
Keywords
- Optimal control
- adaptive dynamic programming
- policy iteration (PI)
- time-varying system
- value iteration (VI)
ASJC Scopus subject areas
- Control and Systems Engineering
- Signal Processing
- Information Systems
- Modeling and Simulation
- Aerospace Engineering
- Control and Optimization
- Electrical and Electronic Engineering