Abstract
Through vehicle-to-vehicle (V2V) communication, both human-driven and autonomous vehicles can actively exchange data, such as velocities and bumper-to-bumper distances. Using the shared data, control laws with improved performance can be designed for connected and autonomous vehicles (CAVs). In this article, taking into account human-vehicle interaction and heterogeneous driver behavior, an adaptive optimal control design method is proposed for a mixed platoon consisting of multiple preceding human-driven vehicles and one CAV at the tail. It is shown that, by using reinforcement learning and adaptive dynamic programming techniques, a near-optimal controller can be learned from real-time data for the CAV with V2V communication, without precise knowledge of the car-following parameters of any driver in the platoon. The proposed method allows the CAV controller to adapt to different platoon dynamics caused by the unknown and heterogeneous driver-dependent parameters. To improve safety during the learning process, our off-policy learning algorithm leverages both historical data and data collected in real time, which considerably reduces the learning time. The effectiveness and efficiency of the proposed method are demonstrated by rigorous proofs and microscopic traffic simulations.
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 5267-5277 |
Number of pages | 11 |
Journal | IEEE Transactions on Cybernetics |
Volume | 52 |
Issue number | 6 |
DOIs | |
State | Published - Jun 1 2022 |
Keywords
- Adaptive dynamic programming (ADP)
- Autonomous vehicles
- Connected vehicles
- Time-delay system
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Information Systems
- Human-Computer Interaction
- Computer Science Applications
- Electrical and Electronic Engineering