In this paper, we introduce a learning approach for the controller structure in coalitional model predictive control (MPC) schemes. In this context, the local control entities can dynamically operate in a decentralized manner or assemble into groups of controllers that coordinate their control actions, i.e., coalitions. Such a control strategy aims to maximize system performance while reducing the coordination and computation burden. We pose a multi-armed bandit problem in which the arms are the possible controller structures and the player acts as a supervisory layer that can periodically change the composition of the coalitions. The goal is to use real-time observations to progressively learn the controller structure that best suits the needs of the system. A heuristic learning algorithm and illustrative results are provided.
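To make the bandit formulation concrete, the following is a minimal sketch in which each arm is a candidate controller structure and a supervisory layer periodically picks one based on observed costs. It uses a generic epsilon-greedy rule, which is an assumption made here for illustration and not necessarily the heuristic algorithm of the paper. The structure labels, the `TRUE_COST` table, and the `simulated_cost` function are hypothetical stand-ins for running the coalitional MPC scheme over one supervision period and measuring a cost that combines performance and coordination effort.

```python
import random

# Hypothetical candidate controller structures (the bandit arms).
STRUCTURES = ["decentralized", "two_coalitions", "grand_coalition"]

# Hypothetical mean cost per structure (performance plus coordination burden).
# In a real scheme this would be measured from the plant, not known in advance.
TRUE_COST = {"decentralized": 5.0, "two_coalitions": 3.0, "grand_coalition": 4.0}


def simulated_cost(structure, rng):
    """Stand-in for one supervision period: returns a noisy observed cost."""
    return TRUE_COST[structure] + rng.gauss(0.0, 0.5)


def epsilon_greedy_structure_selection(structures, observe_cost, rounds,
                                       epsilon=0.1, seed=0):
    """Learn which structure has the lowest average observed cost.

    Generic epsilon-greedy bandit (an illustrative choice, not the paper's
    heuristic): try every arm once, then explore with probability epsilon
    and otherwise pick the arm with the lowest running mean cost.
    """
    rng = random.Random(seed)
    counts = {s: 0 for s in structures}
    mean_cost = {s: 0.0 for s in structures}

    for _ in range(rounds):
        untried = [s for s in structures if counts[s] == 0]
        if untried:
            choice = untried[0]                      # initial sweep over arms
        elif rng.random() < epsilon:
            choice = rng.choice(structures)          # explore
        else:
            choice = min(structures, key=lambda s: mean_cost[s])  # exploit

        cost = observe_cost(choice, rng)
        counts[choice] += 1
        # Incremental update of the running mean cost for the chosen arm.
        mean_cost[choice] += (cost - mean_cost[choice]) / counts[choice]

    best = min(structures, key=lambda s: mean_cost[s])
    return best, mean_cost, counts


if __name__ == "__main__":
    best, mean_cost, counts = epsilon_greedy_structure_selection(
        STRUCTURES, simulated_cost, rounds=500)
    print("selected structure:", best)
    print("times each structure was applied:", counts)
```

In this sketch the supervisory layer only needs a scalar cost per period, so the same loop would apply if `simulated_cost` were replaced by a measurement from the closed-loop system; the exploration rate `epsilon` trades off trying alternative structures against staying with the best one found so far.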
- Coalitional model predictive control
- Multi-armed bandits
ASJC Scopus subject areas
- Control and Systems Engineering