Emerging applications impose diverse Quality of Service (QoS) requirements on the network. Traffic Engineering (TE) plays an important role in QoS provisioning by carefully selecting routing paths and adjusting the traffic split ratios across those paths. To accommodate the diverse QoS requirements of traffic flows under network dynamics, TE typically recomputes an optimal routing strategy periodically and updates a large number of forwarding entries, which introduces considerable network management overhead. In this paper, we propose QoS-RL, a Reinforcement Learning (RL)-based TE solution for QoS provisioning and load balancing that incurs low management overhead and little service disruption during routing updates. Given the traffic matrices that represent the traffic demands of high- and low-priority flows, QoS-RL intelligently selects and updates only a few destination-based forwarding entries to satisfy the QoS requirements of high-priority traffic, while maintaining good load balancing performance by rerouting a small portion of low-priority traffic. Extensive simulation results on four real-world network topologies demonstrate that QoS-RL provides at least 95.5% of optimal end-to-end delay performance on average for high-priority flows, and also achieves above 90% of optimal load balancing performance in most cases, by updating only 10% of destination-based forwarding entries.