Increasing drum transcription vocabulary using data synthesis

Mark Cartwright, Juan Pablo Bello

Research output: Contribution to conference › Paper

Abstract

Current datasets for automatic drum transcription (ADT) are small and limited due to the tedious task of annotating onset events. While some of these datasets contain large vocabularies of percussive instrument classes (e.g. ~20 classes), many of these classes occur very infrequently in the data. This paucity of data makes it difficult to train models that support such large vocabularies. Therefore, data-driven drum transcription models often focus on a small number of percussive instrument classes (e.g. 3 classes). In this paper, we propose to support large-vocabulary drum transcription by generating a large synthetic dataset (210,000 eight-second examples) of audio examples for which we have ground-truth transcriptions. Using this synthetic dataset along with existing drum transcription datasets, we train convolutional-recurrent neural networks (CRNNs) in a multi-task framework to support large-vocabulary ADT. We find that training on the synthetic and real-music drum transcription datasets together improves performance not only on large-vocabulary ADT, but also on beat/downbeat detection and small-vocabulary ADT.
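The core idea of the data synthesis step can be sketched as follows: mix one-shot percussion samples into an audio buffer at randomly sampled onset times, keeping the (onset time, class) pairs as the ground-truth transcription. This is a minimal illustrative sketch, not the paper's actual pipeline; the sample rate, number of hits per example, and the decaying-noise "one-shots" are all assumptions made here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
SR = 22050       # assumed sample rate (not stated in the abstract)
DUR = 8.0        # eight-second examples, as described in the paper
N_CLASSES = 20   # ~20 percussive instrument classes

def synth_example(one_shots, n_hits=32):
    """Mix random one-shot drum samples into an 8 s buffer and return
    the audio plus its ground-truth (onset_time, class) transcription."""
    audio = np.zeros(int(SR * DUR))
    events = []
    for _ in range(n_hits):
        cls = int(rng.integers(N_CLASSES))   # sample classes uniformly, so
        onset = rng.uniform(0, DUR - 0.5)    # rare classes get good coverage
        sample = one_shots[cls]
        start = int(onset * SR)
        end = min(start + len(sample), len(audio))
        audio[start:end] += sample[: end - start]
        events.append((onset, cls))
    return audio, sorted(events)

# Toy one-shot bank: decaying noise bursts stand in for real drum samples.
one_shots = [rng.normal(size=2048) * np.exp(-np.linspace(0, 8, 2048))
             for _ in range(N_CLASSES)]
audio, events = synth_example(one_shots)
```

Because the class of every hit is sampled rather than taken from real performances, the synthetic data can over-represent the rare classes that real annotated datasets lack, which is what makes large-vocabulary training feasible.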

Original language: English (US)
Pages: 72-79
Number of pages: 8
State: Published - 2018
Event: 21st International Conference on Digital Audio Effects, DAFx 2018 - Aveiro, Portugal
Duration: Sep 4 2018 - Sep 8 2018

Conference

Conference: 21st International Conference on Digital Audio Effects, DAFx 2018
Country: Portugal
City: Aveiro
Period: 9/4/18 - 9/8/18

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Computer Science Applications
  • Signal Processing
  • Music


Cite this

Cartwright, M., & Bello, J. P. (2018). Increasing drum transcription vocabulary using data synthesis. 72-79. Paper presented at 21st International Conference on Digital Audio Effects, DAFx 2018, Aveiro, Portugal.