Hidden Markov models (HMMs) have been widely studied and applied for decades. The standard supervised learning method for HMMs is maximum likelihood estimation (MLE), which maximizes the joint probability of the training data. However, the most natural training criterion would be to find the parameters that directly minimize the error rate on a given training set. In this article, we propose a novel learning method that minimizes the frame-wise number of incorrectly decoded labels. To do this, we construct a smooth function that is arbitrarily close to the exact frame error rate and minimize it directly using a gradient-based optimization algorithm. The proposed approach is intuitive and simple. We applied our method to the task of chord recognition in music, and the results show that it performs better than both MLE and minimum classification error training.
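
To make the idea of a smooth stand-in for the frame error rate concrete, the following is a minimal sketch and not necessarily the construction used in this article. It assumes a discrete-emission HMM, frame-wise decoding by maximum posterior, and a temperature-scaled softmax as the smoothing device; the function names and the `sharpness` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def posteriors(pi, A, B, obs):
    """Scaled forward-backward; returns gamma[t, i] = P(state_t = i | obs)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()  # rescale to avoid underflow
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()    # any positive rescaling is fine here
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

def exact_frame_error(gamma, labels):
    """Fraction of frames whose max-posterior state differs from the label."""
    return float(np.mean(gamma.argmax(axis=1) != labels))

def smooth_frame_error(gamma, labels, sharpness):
    """Assumed surrogate: replace the hard argmax with a softmax over
    sharpness * log(gamma), then average the mass put on wrong states."""
    logits = sharpness * np.log(gamma + 1e-300)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    soft = np.exp(logits)
    soft /= soft.sum(axis=1, keepdims=True)
    return float(1.0 - soft[np.arange(len(labels)), labels].mean())

# Toy two-state model with made-up parameters, for illustration only.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
obs = np.array([0, 0, 1, 1, 0])
labels = np.array([0, 0, 1, 0, 0])

gamma = posteriors(pi, A, B, obs)
for s in (1.0, 10.0, 200.0):
    print(s, smooth_frame_error(gamma, labels, s), exact_frame_error(gamma, labels))
```

In this sketch the surrogate is differentiable in the posteriors, and as `sharpness` grows it approaches the exact frame error rate whenever each frame's posterior has a unique maximum. Minimizing it would additionally require gradients with respect to the HMM parameters, for example via automatic differentiation, which is not shown here.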