Abstract
In this paper, we aim to raise awareness of the limitations of the F-measure when evaluating the quality of the boundaries found in the automatic segmentation of music. We present and discuss the results of several experiments in which subjects listened to musical excerpts containing boundary indications and rated the quality of those boundaries. The boundaries were carefully generated from state-of-the-art segmentation algorithms as well as from human-annotated data. The results show that humans tend to give more weight to the precision component of the F-measure than to the recall component, which makes the classical F-measure less perceptually informative than currently assumed. Based on these results, we discuss the potential of an alternative evaluation, still based on the F-measure, that emphasizes precision over recall, making section boundary evaluation more expressive and reliable.
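To make the idea of emphasizing precision over recall concrete, the following is a minimal sketch using the standard weighted F-measure (F-beta), where a weight `beta` below 1 favors precision. The abstract does not state which weighting the authors propose, so the function name `f_beta`, the value `beta = 0.5`, and the precision/recall figures are illustrative assumptions, not the paper's method or results.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard weighted F-measure; beta < 1 weights precision more than recall."""
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta must be > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Two hypothetical boundary detectors with mirrored behavior (made-up numbers).
high_precision = (0.8, 0.4)  # few boundaries reported, most of them correct
high_recall = (0.4, 0.8)     # many boundaries reported, many of them spurious

# The classical F-measure (beta = 1) cannot tell the two apart.
f1_a = f_beta(*high_precision, beta=1.0)
f1_b = f_beta(*high_recall, beta=1.0)

# A precision-weighted variant (beta = 0.5, an assumed value) ranks them differently.
fw_a = f_beta(*high_precision, beta=0.5)
fw_b = f_beta(*high_recall, beta=0.5)
```

With `beta = 1` both hypothetical detectors receive the same score, whereas with `beta = 0.5` the high-precision detector scores higher, which is the direction of preference the listening experiments suggest.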
Original language | English (US) |
---|---|
Pages | 265-270 |
Number of pages | 6 |
State | Published - 2014 |
Event | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 - Taipei, Taiwan, Province of China; Duration: Oct 27 2014 → Oct 31 2014 |
Conference
Conference | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 |
---|---|
Country/Territory | Taiwan, Province of China |
City | Taipei |
Period | 10/27/14 → 10/31/14 |
ASJC Scopus subject areas
- Music
- Information Systems