Deep learning methods are at the forefront of techniques used to perform complex control in autonomous vehicles (AVs). Such methods are vulnerable to nuanced types of adversarial attacks that can have severe safety implications. In particular, backdoors are an emerging kind of adversarial attack on deep neural networks (DNNs), in which an attacker injects a secret backdoor into a DNN that is activated in the presence of well-designed triggers; this threat necessitates systematic exploration to enable the study of effective defenses. In this paper, we learn an adversarial distribution for trigger samples via reinforcement learning, with the objective of minimizing the difference between the adversarial and genuine distributions. This bypasses many detection algorithms that are designed around the difference between adversarial and genuine input samples. Specifically, the difference between the two distributions is evaluated by the Jensen-Shannon (JS) divergence. The adversarial samples drawn from the learned adversarial distribution are used to manipulate benign models in two complex traffic control systems. Our results show that our method renders the backdoor attack stealthy while overriding the benign control objectives and potentially causing vehicle collisions.
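The abstract measures the gap between the adversarial and genuine distributions with the JS divergence. As an illustrative sketch (the function name and discrete-distribution setting are our own, not the paper's formulation), this quantity can be computed as the average KL divergence of each distribution to their mixture:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions.

    JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), where m = (p + q) / 2.
    Ranges from 0 (identical distributions) to log(2) (disjoint supports).
    """
    # Small epsilon avoids log(0); renormalize so inputs sum to 1.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical distributions yield ~0; disjoint ones approach log(2) ~= 0.693.
print(js_divergence([0.5, 0.5], [0.5, 0.5]))
print(js_divergence([1.0, 0.0], [0.0, 1.0]))
```

Minimizing this divergence drives trigger samples toward statistical indistinguishability from genuine inputs, which is what defeats detectors that key on distributional differences.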