Sambaset: A dataset of historical samba de enredo recordings for computational music analysis

Lucas S. Maia, Magdalena Fuentes, Luiz W.P. Biscainho, Martín Rocamora, Slim Essid

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution


In the last few years, several datasets have been released to meet the requirements of "hungry" yet promising data-driven approaches in music technology research. Since, for historical reasons, most investigations conducted in the field still revolve around music of the so-called "Western" tradition, the corresponding data, methodology and conclusions carry a strong cultural bias. Music of non-"Western" background, whenever present, is usually underrepresented, poorly labeled, or even mislabeled, the exception being projects that aim at specifically describing such music. In this paper we present SAMBASET, a dataset of Brazilian samba music that contains over 40 hours of historical and modern samba de enredo commercial recordings. To the best of our knowledge, this is the first dataset of this genre. We describe the collection of metadata (e.g., artist, composer, release date) and outline our semi-automatic approach to the challenging task of annotating beats in this large dataset, which includes the assessment of the performance of state-of-the-art beat tracking algorithms for this specific case. Finally, we present a study on tempo and beat tracking that illustrates SAMBASET's value, and we comment on other tasks for which it could be used.

Original language: English (US)
Title of host publication: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019
Editors: Arthur Flexer, Geoffroy Peeters, Julian Urbano, Anja Volk
Publisher: International Society for Music Information Retrieval
Number of pages: 8
ISBN (Electronic): 9781732729919
State: Published - 2019
Event: 20th International Society for Music Information Retrieval Conference, ISMIR 2019 - Delft, Netherlands
Duration: Nov 4 2019 to Nov 8 2019

ASJC Scopus subject areas

  • Music
  • Information Systems

