Ramp Metering for a Distant Downstream Bottleneck Using Reinforcement Learning with Value Function Approximation

Yue Zhou, Kaan Ozbay, Pushkin Kachroo, Fan Zuo

Research output: Contribution to journal › Article › peer-review

Abstract

Ramp metering for a bottleneck located far downstream of the ramp is more challenging than for a bottleneck near the ramp. Under a conventional linear feedback-type ramp metering strategy, by the time metered traffic from the ramp arrives at the distant downstream bottleneck, the state of the bottleneck may have changed significantly from when it was sampled to compute the metering rate, because the traffic takes considerable time to traverse the long distance between the ramp and the bottleneck. These time-delay effects can cause significant stability issues. Previous studies have mainly compensated for the time-delay effects by incorporating predictors of traffic flow evolution into the control systems. This paper presents an alternative approach. The problem of ramp metering for a distant downstream bottleneck is formulated as a Q-learning problem, in which an intelligent ramp meter agent learns a nonlinear optimal ramp metering policy such that the capacity of the distant downstream bottleneck is fully utilized but not exceeded, since exceeding it would cause congestion. The learned policy is in pure feedback form: only the current state of the environment is needed to determine the optimal metering rate for the current time. No prediction is needed, because anticipation of traffic flow evolution has been instilled into the nonlinear feedback policy through learning. To deal with the heavy computational cost associated with the multidimensional continuous state space, the action-value function is approximated by an artificial neural network rather than a lookup table. The paper explains the mechanism and development of the approximate value function and how learning of its parameters is integrated into the Q-learning process. In experiments, the learned ramp metering policy demonstrated effectiveness, benign stability, and some robustness to demand uncertainties.
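As a rough illustration of the idea described in the abstract (Q-learning in which a neural network, rather than a lookup table, stores the action values), the following minimal sketch shows one semi-gradient Q-learning update. It is not the paper's implementation: the class `TinyQNetwork`, the single tanh hidden layer, the discrete set of metering-rate actions, the state encoding, and all hyperparameters are assumptions chosen only to keep the example small.

```python
import math
import random


class TinyQNetwork:
    """Hypothetical one-hidden-layer network: state vector -> one Q-value per action."""

    def __init__(self, n_inputs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def hidden(self, state):
        # tanh activations of the hidden layer
        return [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                for row, b in zip(self.w1, self.b1)]

    def q_values(self, state):
        h = self.hidden(state)
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(self.w2, self.b2)]

    def update(self, state, action, target, lr):
        """One semi-gradient step moving Q(state, action) toward a fixed target."""
        h = self.hidden(state)
        q_sa = sum(w * hj for w, hj in zip(self.w2[action], h)) + self.b2[action]
        error = target - q_sa
        # Gradient of Q(state, action) w.r.t. each hidden pre-activation,
        # computed before any weight is changed.
        grad_z = [self.w2[action][j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for j in range(len(h)):
            self.w2[action][j] += lr * error * h[j]
            for i in range(len(state)):
                self.w1[j][i] += lr * error * grad_z[j] * state[i]
            self.b1[j] += lr * error * grad_z[j]
        self.b2[action] += lr * error
        return error


def q_learning_step(net, state, action, reward, next_state, gamma=0.9, lr=0.05):
    """Q-learning target: reward plus discounted best Q-value in the next state."""
    target = reward + gamma * max(net.q_values(next_state))
    return net.update(state, action, target, lr)


def greedy_action(net, state):
    """Pure feedback policy: pick the action with the largest Q-value now."""
    q = net.q_values(state)
    return max(range(len(q)), key=q.__getitem__)
```

In a ramp-metering setting, `state` might hold normalized traffic measurements along the freeway and each action index might correspond to one candidate metering rate; both mappings are hypothetical here. Once trained, `greedy_action` needs only the current state, which mirrors the pure feedback form of the learned policy described above.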

Original language: English (US)
Article number: 8813467
Journal: Journal of Advanced Transportation
Volume: 2020
State: Published - 2020

ASJC Scopus subject areas

  • Automotive Engineering
  • Economics and Econometrics
  • Mechanical Engineering
  • Computer Science Applications
  • Strategy and Management
