Data driven and discriminative projections for large-scale cover song identification

Eric J. Humphrey, Oriol Nieto, Juan P. Bello

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

The predominant approach to computing document similarity in web-scale applications proceeds by encoding task-specific invariance in a vectorized representation, such that the relationship between items can be computed efficiently by a simple scoring function, e.g., Euclidean distance. Here, we improve upon previous work in large-scale cover song identification by using data-driven projections at different time scales to capture local features and embed summary vectors into a semantically organized space. We achieve this by projecting 2D-Fourier Magnitude Coefficients (2D-FMCs) of beat-chroma patches into a sparse, high-dimensional representation which, due to the shift invariance properties of the Fourier Transform, is similar in principle to convolutional sparse coding. After aggregating these local beat-chroma projections, we apply supervised dimensionality reduction to recover an embedding where distance is useful for cover song retrieval. Evaluating on the Million Song Dataset, we find our method outperforms the current state of the art overall, and significantly so for top-k metrics, which indicate improved usability.
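The shift invariance that the abstract attributes to the Fourier Transform can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function name `fmc2d`, the 12 x 16 patch size, and the random data are assumptions made for illustration only. It shows that the 2D Fourier magnitude of a patch is unchanged by circular shifts along the pitch axis (a transposition) or the beat axis (a time offset), so the Euclidean distance between the two representations is effectively zero.

```python
import numpy as np


def fmc2d(patch):
    """Return the 2D Fourier magnitude coefficients of a 2D patch.

    Hypothetical helper: takes the magnitude of the 2D DFT and discards phase.
    """
    return np.abs(np.fft.fft2(patch))


# Synthetic stand-in for a beat-chroma patch:
# 12 pitch classes x 16 beats (the patch size is an assumption).
rng = np.random.default_rng(0)
patch = rng.random((12, 16))

# Circularly shift by 3 pitch classes (a key transposition)
# and by 5 beats (a time offset within the patch).
shifted = np.roll(patch, shift=(3, 5), axis=(0, 1))

# The raw patches differ, but their 2D Fourier magnitudes match,
# because a circular shift only changes the phase of the DFT.
raw_equal = np.allclose(patch, shifted)
same = np.allclose(fmc2d(patch), fmc2d(shifted))

# Euclidean distance between the flattened magnitude representations.
dist = np.linalg.norm(fmc2d(patch).ravel() - fmc2d(shifted).ravel())
```

The invariance is exact only for circular shifts. That matches transposition on the chroma axis, but a real time offset brings new beats into the patch, so the invariance along the beat axis is only approximate in practice.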

Original language: English (US)
Title of host publication: Proceedings of the 14th International Society for Music Information Retrieval Conference, ISMIR 2013
Editors: Alceu de Souza Britto, Fabien Gouyon, Simon Dixon
Publisher: International Society for Music Information Retrieval
Pages: 149-154
Number of pages: 6
ISBN (Electronic): 9780615900650
State: Published - 2013
Event: 14th International Society for Music Information Retrieval Conference, ISMIR 2013 - Curitiba, Brazil
Duration: Nov 4 2013 → Nov 8 2013

Publication series

Name: Proceedings of the 14th International Society for Music Information Retrieval Conference, ISMIR 2013

Conference

Conference: 14th International Society for Music Information Retrieval Conference, ISMIR 2013
Country: Brazil
City: Curitiba
Period: 11/4/13 → 11/8/13

ASJC Scopus subject areas

  • Music
  • Information Systems


  • Cite this

    Humphrey, E. J., Nieto, O., & Bello, J. P. (2013). Data driven and discriminative projections for large-scale cover song identification. In A. D. S. Britto, F. Gouyon, & S. Dixon (Eds.), Proceedings of the 14th International Society for Music Information Retrieval Conference, ISMIR 2013 (pp. 149-154). (Proceedings of the 14th International Society for Music Information Retrieval Conference, ISMIR 2013). International Society for Music Information Retrieval.