This paper presents a novel reinforcement learning (RL) based discrete-time closed-loop control methodology for switch-mode, pulse-width-modulated (PWM) power electronic converters. The method achieves optimal closed-loop output regulation by using measured data to approximate the system dynamics, which removes the need for prior knowledge of the plant model. The underlying RL algorithm then computes the optimal feedback controller in a manner akin to the design of a linear quadratic regulator (LQR), which involves the iterative solution of an algebraic Riccati equation (ARE). The methodology is implemented on both buck and boost converters, and its robustness to load and line variations is tested. A Type-III compensator is also designed so that its performance can be compared with that of the proposed controller. Simulation results are provided to verify the effectiveness and examine the limitations of the proposed control strategy.
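To illustrate what the iterative solution of a discrete-time ARE looks like in the LQR setting mentioned above, the following is a minimal sketch and not the paper's algorithm. It assumes a known linear model (A, B), whereas the proposed method approximates the dynamics from measured data. The function name `lqr_riccati_iteration`, the example matrices, and the weights Q and R are hypothetical choices made only for illustration.

```python
import numpy as np


def lqr_riccati_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate the discrete-time Riccati recursion until P stops changing.

    Assumes a known model x[k+1] = A x[k] + B u[k] and the quadratic cost
    sum(x'Qx + u'Ru). Returns the cost matrix P and the gain K in u = -K x.
    """
    P = Q.copy()
    for _ in range(max_iter):
        # Gain that is optimal with respect to the current estimate of P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update, written compactly as P <- Q + A' P (A - B K).
        P_next = Q + A.T @ P @ (A - B @ K)
        # Symmetrize to suppress round-off drift.
        P_next = 0.5 * (P_next + P_next.T)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Recompute the gain from the final P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


# Hypothetical two-state example (a sampled double integrator),
# not a converter model taken from the paper.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
Q = np.eye(2)          # assumed state weight
R = np.array([[1.0]])  # assumed input weight

P, K = lqr_riccati_iteration(A, B, Q, R)
print("feedback gain K =", K)
print("closed-loop pole magnitudes =", np.abs(np.linalg.eigvals(A - B @ K)))
```

In a data-driven variant, the quantities that depend on A and B here would instead be estimated from measured state and input trajectories; the fixed-point structure of the iteration is the part this sketch is meant to convey.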