TY - JOUR
T1 - Open-Source Practices for Music Signal Processing Research
T2 - Recommendations for Transparent, Sustainable, and Reproducible Audio Research
AU - McFee, Brian
AU - Kim, Jong Wook
AU - Cartwright, Mark
AU - Salamon, Justin
AU - Bittner, Rachel M.
AU - Bello, Juan Pablo
N1 - Funding Information:
Juan Pablo Bello (jpbello@nyu.edu) received his B.Eng. degree in electronics in 1998 from the Universidad Simón Bolívar in Caracas, Venezuela, and in 2003 he received his Ph.D. degree in electronic engineering from Queen Mary University of London. He is a professor of music technology and computer science and engineering at New York University. His expertise is in digital signal processing, machine listening, and music information retrieval, topics that he teaches and on which he has published more than 100 papers and articles in books, journals, and conference proceedings. He is the director of the Music and Audio Research Lab, where he leads research on music informatics. His work has been supported by public and private institutions in Venezuela, the United Kingdom, and the United States, including Frontier and CAREER Awards from the National Science Foundation and a Fulbright scholar grant for multidisciplinary studies in France. He is a Senior Member of the IEEE.
Publisher Copyright:
© 1991-2012 IEEE.
PY - 2019/1
Y1 - 2019/1
AB - In the early years of music information retrieval (MIR), research problems were often centered around conceptually simple tasks, and methods were evaluated on small, idealized data sets. A canonical example of this is genre recognition (i.e., "Which one of n genres describes this song?"), which was often evaluated on the GTZAN data set (1,000 musical excerpts balanced across ten genres) [1]. As task definitions were simple, so too were signal analysis pipelines, which often derived from methods for speech processing and recognition and typically consisted of simple methods for feature extraction, statistical modeling, and evaluation. When describing a research system, the expected level of detail was superficial: it was sufficient to state, e.g., the number of mel-frequency cepstral coefficients used, the statistical model (e.g., a Gaussian mixture model), the choice of data set, and the evaluation criteria, without stating the underlying software dependencies or implementation details. Because of an increased abundance of methods, the proliferation of software toolkits, the explosion of machine learning, and a focus shift toward more realistic problem settings, modern research systems are substantially more complex than their predecessors. Modern MIR researchers must pay careful attention to detail when processing metadata, implementing evaluation criteria, and disseminating results.
UR - http://www.scopus.com/inward/record.url?scp=85059779386&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059779386&partnerID=8YFLogxK
U2 - 10.1109/MSP.2018.2875349
DO - 10.1109/MSP.2018.2875349
M3 - Article
AN - SCOPUS:85059779386
VL - 36
SP - 128
EP - 137
JO - IEEE Signal Processing Magazine
JF - IEEE Signal Processing Magazine
SN - 1053-5888
IS - 1
M1 - 8588406
ER -