TY - GEN
T1 - Deep salience representations for F0 estimation in polyphonic music
AU - Bittner, Rachel M.
AU - McFee, Brian
AU - Salamon, Justin
AU - Li, Peter
AU - Bello, Juan P.
N1 - Publisher Copyright:
© 2019 Rachel M. Bittner, Brian McFee, Justin Salamon, Peter Li, Juan P. Bello.
PY - 2017
Y1 - 2017
N2 - Estimating fundamental frequencies in polyphonic music remains a notoriously difficult task in Music Information Retrieval. While other tasks, such as beat tracking and chord recognition, have seen improvement with the application of deep learning models, little work has been done to apply deep learning methods to fundamental-frequency-related tasks, including multi-f0 and melody tracking, primarily due to the scarce availability of labeled data. In this work, we describe a fully convolutional neural network for learning salience representations for estimating fundamental frequencies, trained using a large, semi-automatically generated f0 dataset. We demonstrate the effectiveness of our model for learning salience representations for both multi-f0 and melody tracking in polyphonic audio, and show that our models achieve state-of-the-art performance on several multi-f0 and melody datasets. We conclude with directions for future research.
AB - Estimating fundamental frequencies in polyphonic music remains a notoriously difficult task in Music Information Retrieval. While other tasks, such as beat tracking and chord recognition, have seen improvement with the application of deep learning models, little work has been done to apply deep learning methods to fundamental-frequency-related tasks, including multi-f0 and melody tracking, primarily due to the scarce availability of labeled data. In this work, we describe a fully convolutional neural network for learning salience representations for estimating fundamental frequencies, trained using a large, semi-automatically generated f0 dataset. We demonstrate the effectiveness of our model for learning salience representations for both multi-f0 and melody tracking in polyphonic audio, and show that our models achieve state-of-the-art performance on several multi-f0 and melody datasets. We conclude with directions for future research.
UR - http://www.scopus.com/inward/record.url?scp=85069924285&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069924285&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85069924285
T3 - Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017
SP - 63
EP - 70
BT - Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017
A2 - Cunningham, Sally Jo
A2 - Duan, Zhiyao
A2 - Hu, Xiao
A2 - Turnbull, Douglas
PB - International Society for Music Information Retrieval
T2 - 18th International Society for Music Information Retrieval Conference, ISMIR 2017
Y2 - 23 October 2017 through 27 October 2017
ER -