In this work, we propose a two-stage video coding framework as an extension of our previous one-stage framework. The two-stage framework uses two different dictionaries. Specifically, the first stage directly finds the sparse representation of a block over a self-adaptive dictionary consisting of all possible inter-prediction candidates, by solving an L0-norm minimization problem using orthogonal least squares (OLS). The second stage codes the residual using an altered DCT dictionary that is orthonormalized with respect to the subspace spanned by the first-stage atoms. The transition from the first stage to the second stage is determined adaptively, based on the estimated residual reduction per bit. We further propose a complete context-adaptive entropy coder to efficiently code the locations and coefficients of the chosen first-stage atoms. Simulation results show that the proposed coder significantly improves the RD performance over our previous one-stage coder. More importantly, the two-stage coder, using a fixed block size and inter-prediction only, outperforms the H.264 coder (x264) and is competitive with the HEVC reference coder (HM) over a large rate range.
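
To make the first-stage atom selection concrete, the following is a minimal sketch of generic greedy OLS in Python with NumPy. It is an illustration under stated assumptions, not the coder's actual implementation: the function name `ols_sparse_code`, the fixed atom budget `max_atoms`, and the tolerance `tol` are hypothetical, and the columns of `D` stand in for vectorized inter-prediction candidates. The adaptive stopping rule based on residual reduction per bit is not modeled here.

```python
import numpy as np

def ols_sparse_code(D, y, max_atoms, tol=1e-10):
    """Greedy OLS sketch: choose columns of D one at a time so that each
    choice gives the largest drop in the least-squares residual of y.
    (Hypothetical helper; names and stopping rule are illustrative.)"""
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    # P holds every atom projected onto the orthogonal complement
    # of the span of the atoms chosen so far.
    P = D.copy()
    chosen = []
    for _ in range(max_atoms):
        norms_sq = np.sum(P * P, axis=0)
        valid = norms_sq > tol          # skip atoms already in the span
        if chosen:
            valid[chosen] = False
        if not valid.any():
            break
        corr = P.T @ residual
        gain = np.zeros(D.shape[1])
        # Squared residual reduction obtained by adding each candidate.
        gain[valid] = corr[valid] ** 2 / norms_sq[valid]
        k = int(np.argmax(gain))
        if gain[k] <= tol:
            break
        q = P[:, k] / np.sqrt(norms_sq[k])   # new orthonormal direction
        chosen.append(k)
        residual = residual - q * (q @ residual)
        P = P - np.outer(q, q @ P)           # deflate remaining atoms
    if chosen:
        # Coefficients with respect to the original (non-orthogonal) atoms.
        coeffs, *_ = np.linalg.lstsq(D[:, chosen], y, rcond=None)
    else:
        coeffs = np.zeros(0)
    return chosen, coeffs, residual
```

On return, `residual` is orthogonal to every chosen atom, so it is the signal that a second-stage dictionary restricted to the complementary subspace would need to represent.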