TY - GEN
T1 - Simulation to scaled city: Zero-shot policy transfer for traffic control via autonomous vehicles
T2 - 10th ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS 2019, part of the 2019 CPS-IoT Week
AU - Jang, Kathy
AU - Vinitsky, Eugene
AU - Chalaki, Behdad
AU - Remer, Ben
AU - Beaver, Logan
AU - Malikopoulos, Andreas A.
AU - Bayen, Alexandre
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/4/16
Y1 - 2019/4/16
N2 - Using deep reinforcement learning, we successfully train a set of two autonomous vehicles to lead a fleet of vehicles onto a roundabout and then transfer this policy from simulation to a scaled city without fine-tuning. We use Flow, a library for deep reinforcement learning in microsimulators, to train two policies: (1) a policy with noise injected into the state and action space and (2) a policy without any injected noise. In simulation, the autonomous vehicles learn an emergent metering behavior under both policies that allows smooth merging. We then directly transfer each policy, without any tuning, to the University of Delaware’s Scaled Smart City (UDSSC), a 1:25-scale testbed for connected and automated vehicles. We characterize the performance of the transferred policies by how thoroughly the ramp-metering behavior is reproduced in UDSSC. We show that the noise-free policy results in severe slowdowns and only occasionally exhibits acceptable metering behavior, whereas the noise-injected policy consistently exhibits acceptable metering behavior, indicating that the injected noise aids zero-shot policy transfer. Finally, the transferred noise-injected policy reduces average travel time by 5% and maximum travel time by 22% in the UDSSC. Videos of the proposed self-learning controllers can be found at https://sites.google.com/view/iccps-policy-transfer.
AB - Using deep reinforcement learning, we successfully train a set of two autonomous vehicles to lead a fleet of vehicles onto a roundabout and then transfer this policy from simulation to a scaled city without fine-tuning. We use Flow, a library for deep reinforcement learning in microsimulators, to train two policies: (1) a policy with noise injected into the state and action space and (2) a policy without any injected noise. In simulation, the autonomous vehicles learn an emergent metering behavior under both policies that allows smooth merging. We then directly transfer each policy, without any tuning, to the University of Delaware’s Scaled Smart City (UDSSC), a 1:25-scale testbed for connected and automated vehicles. We characterize the performance of the transferred policies by how thoroughly the ramp-metering behavior is reproduced in UDSSC. We show that the noise-free policy results in severe slowdowns and only occasionally exhibits acceptable metering behavior, whereas the noise-injected policy consistently exhibits acceptable metering behavior, indicating that the injected noise aids zero-shot policy transfer. Finally, the transferred noise-injected policy reduces average travel time by 5% and maximum travel time by 22% in the UDSSC. Videos of the proposed self-learning controllers can be found at https://sites.google.com/view/iccps-policy-transfer.
KW - Autonomous vehicles
KW - Control theory
KW - Cyber-physical systems
KW - Deep learning
KW - Policy Transfer
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85066617039&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066617039&partnerID=8YFLogxK
U2 - 10.1145/3302509.3313784
DO - 10.1145/3302509.3313784
M3 - Conference contribution
AN - SCOPUS:85066617039
T3 - ICCPS 2019 - Proceedings of the 2019 ACM/IEEE International Conference on Cyber-Physical Systems
SP - 291
EP - 301
BT - ICCPS 2019 - Proceedings of the 2019 ACM/IEEE International Conference on Cyber-Physical Systems
A2 - Ramachandran, Gowri Sankar
A2 - Ortiz, Jorge
PB - Association for Computing Machinery, Inc
Y2 - 16 April 2019 through 18 April 2019
ER -