This paper studies the stochastic optimal control problem with additive and multiplicative noise via reinforcement learning (RL) and approximate/adaptive dynamic programming (ADP). Using Itô calculus, a policy iteration algorithm is derived in the presence of both additive and multiplicative noise. It is shown that the expectation of the approximated cost matrix is guaranteed to converge to the solution of a certain algebraic Riccati equation that gives rise to the optimal cost value. Furthermore, the covariance of the approximated cost matrix can be reduced by increasing the length of the time interval between two consecutive iterations. Finally, the efficiency of the proposed ADP methodology is illustrated by a numerical example.