Analysis of common design choices in deep learning systems for downbeat tracking

Magdalena Fuentes, Brian McFee, Hélène C. Crayencour, Slim Essid, Juan P. Bello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Downbeat tracking consists of annotating a piece of musical audio with the estimated position of the first beat of each bar. In recent years, increasing attention has been paid to applying deep learning models to this task, and various architectures have been proposed, leading to a significant improvement in accuracy. However, there are few insights about the role of the various design choices and the delicate interactions between them. In this paper we offer a systematic investigation of the impact of widely adopted variants. We study the effects of the temporal granularity of the input representation (i.e., beat-level vs. tatum-level) and the encoding of the network's outputs. We also investigate the potential of convolutional-recurrent networks, which have not been explored in previous downbeat tracking systems. To this end, we take a state-of-the-art recurrent neural network and introduce those variants into it, while keeping the training data, network learning parameters and post-processing stages fixed. We find that temporal granularity has a significant impact on performance, and we analyze its interaction with the encoding of the network's outputs.
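To make the beat-level vs. tatum-level distinction concrete, the sketch below builds a finer grid by splitting each beat interval evenly and then marks which grid positions coincide with a downbeat. It is an illustration only and is not taken from the paper: the helper names (`subdivide_beats`, `downbeat_targets`), the even subdivision, the `subdivisions` parameter, and the binary 0/1 target encoding are all assumptions made for this example.

```python
def subdivide_beats(beat_times, subdivisions=4):
    """Split each beat interval into equal parts (assumed tatum grid)."""
    grid = []
    for start, end in zip(beat_times, beat_times[1:]):
        step = (end - start) / subdivisions
        grid.extend(start + k * step for k in range(subdivisions))
    if beat_times:
        grid.append(beat_times[-1])  # keep the final beat itself
    return grid


def downbeat_targets(grid, downbeat_times, tol=1e-6):
    """Assumed binary encoding: 1 where a grid point is a downbeat, else 0."""
    return [
        1 if any(abs(t - d) <= tol for d in downbeat_times) else 0
        for t in grid
    ]


# Hypothetical annotations, in seconds.
beats = [0.0, 0.5, 1.0]
downbeats = [0.0, 1.0]

beat_level = downbeat_targets(beats, downbeats)
tatum_grid = subdivide_beats(beats, subdivisions=2)
tatum_level = downbeat_targets(tatum_grid, downbeats)
```

With the same annotations, the tatum-level sequence is longer and its positive labels are sparser than in the beat-level sequence, which is one way granularity can interact with how the outputs are encoded.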

Original language: English (US)
Title of host publication: Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018
Editors: Emilia Gomez, Xiao Hu, Eric Humphrey, Emmanouil Benetos
Publisher: International Society for Music Information Retrieval
Pages: 106-112
Number of pages: 7
ISBN (Electronic): 9782954035123
State: Published - 2018
Event: 19th International Society for Music Information Retrieval Conference, ISMIR 2018 - Paris, France
Duration: Sep 23 2018 → Sep 27 2018

Publication series

Name: Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018

Conference

Conference: 19th International Society for Music Information Retrieval Conference, ISMIR 2018
Country: France
City: Paris
Period: 9/23/18 → 9/27/18

ASJC Scopus subject areas

  • Music
  • Information Systems

  • Cite this

    Fuentes, M., McFee, B., Crayencour, H. C., Essid, S., & Bello, J. P. (2018). Analysis of common design choices in deep learning systems for downbeat tracking. In E. Gomez, X. Hu, E. Humphrey, & E. Benetos (Eds.), Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018 (pp. 106-112). (Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018). International Society for Music Information Retrieval.