TY - JOUR
T1 - Statistical considerations for crowdsourced perceptual ratings of human speech productions
AU - Fernández, Daniel
AU - Harel, Daphna
AU - Ipeirotis, Panos
AU - McAllister, Tara
N1 - Funding Information:
This work was supported by the Moore-Sloan Data Science Environment grant from New York University, the New York University Research Challenge Fund Program, the National Institutes of Health (R03DC012883), and Marsden grant E2987-3648 from the Royal Society of New Zealand.
Publisher Copyright:
© 2018 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2019/6/11
Y1 - 2019/6/11
N2 - Crowdsourcing has become a major tool for scholarly research since its introduction to the academic sphere in 2008. However, unlike in traditional laboratory settings, it is nearly impossible to control the conditions under which workers on crowdsourcing platforms complete tasks. In the study of communication disorders, crowdsourcing has provided a novel solution for collecting perceptual ratings of human speech production. Such ratings allow researchers to gauge whether a treatment yields meaningful change in how human listeners perceive disordered speech. This paper explores statistical considerations for crowdsourced data, with a specific focus on the collection of perceptual ratings of human speech productions. Random effects models are applied to crowdsourced perceptual ratings collected on both continuous and binary scales. A simulation study is conducted to test the reliability of the proposed models under differing numbers of workers and tasks. Finally, this methodology is applied to a data set from the study of communication disorders.
AB - Crowdsourcing has become a major tool for scholarly research since its introduction to the academic sphere in 2008. However, unlike in traditional laboratory settings, it is nearly impossible to control the conditions under which workers on crowdsourcing platforms complete tasks. In the study of communication disorders, crowdsourcing has provided a novel solution for collecting perceptual ratings of human speech production. Such ratings allow researchers to gauge whether a treatment yields meaningful change in how human listeners perceive disordered speech. This paper explores statistical considerations for crowdsourced data, with a specific focus on the collection of perceptual ratings of human speech productions. Random effects models are applied to crowdsourced perceptual ratings collected on both continuous and binary scales. A simulation study is conducted to test the reliability of the proposed models under differing numbers of workers and tasks. Finally, this methodology is applied to a data set from the study of communication disorders.
KW - Amazon Mechanical Turk
KW - reliability
KW - communication disorders
KW - crowdsourcing
KW - random effects models
KW - validity
UR - http://www.scopus.com/inward/record.url?scp=85057322802&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057322802&partnerID=8YFLogxK
U2 - 10.1080/02664763.2018.1547692
DO - 10.1080/02664763.2018.1547692
M3 - Article
AN - SCOPUS:85057322802
VL - 46
SP - 1364
EP - 1384
JO - Journal of Applied Statistics
JF - Journal of Applied Statistics
SN - 0266-4763
IS - 8
ER -