TY - GEN
T1 - Feature adapted convolutional neural networks for downbeat tracking
AU - Durand, Simon
AU - Bello, Juan P.
AU - David, Bertrand
AU - Richard, Gael
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/5/18
Y1 - 2016/5/18
AB - We define a novel system for the automatic estimation of downbeat positions from audio music signals. New rhythm and melodic features are introduced and feature adapted convolutional neural networks are used to take advantage of their specificity. Indeed, invariance to melody transposition, chroma data augmentation and length-specific rhythmic patterns prove to be useful to learn downbeat likelihood. After the data is segmented in tatums, complementary features related to melody, rhythm and harmony are extracted and the likelihood of a tatum being at a downbeat position is computed with the aforementioned neural networks. The downbeat sequence is then extracted with a flexible temporal hidden Markov model. We then show the efficiency and robustness of our approach with a comparative evaluation conducted on 9 datasets.
KW - Convolutional Neural Networks
KW - Downbeat Tracking
KW - Music Information Retrieval
KW - Music Signal Processing
UR - http://www.scopus.com/inward/record.url?scp=84973367430&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973367430&partnerID=8YFLogxK
DO - 10.1109/ICASSP.2016.7471684
M3 - Conference contribution
AN - SCOPUS:84973367430
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 296
EP - 300
BT - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Y2 - 20 March 2016 through 25 March 2016
ER -