Statistical considerations for crowdsourced perceptual ratings of human speech productions

Daniel Fernández, Daphna Harel, Panos Ipeirotis, Tara McAllister

Research output: Contribution to journal › Article › peer-review

Abstract

Crowdsourcing has become a major tool for scholarly research since its introduction to the academic sphere in 2008. However, unlike in traditional laboratory settings, it is nearly impossible to control the conditions under which workers on crowdsourcing platforms complete tasks. In the study of communication disorders, crowdsourcing has provided a novel solution to the collection of perceptual ratings of human speech production. Such ratings allow researchers to gauge whether a treatment yields meaningful change in how human listeners perceive disordered speech. This paper explores some statistical considerations of crowdsourced data, with specific focus on collecting perceptual ratings of human speech productions. Random effects models are applied to crowdsourced perceptual ratings collected on both continuous and binary scales. A simulation study is conducted to test the reliability of the proposed models under differing numbers of workers and tasks. Finally, this methodology is applied to a data set from the study of communication disorders.
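To make the modeling approach concrete, here is a minimal sketch of the kind of crossed random-effects model the abstract describes, assuming Python with statsmodels; the design sizes, variance values, and column names (worker, task, rating) are illustrative assumptions rather than values or code from the paper.

```python
# Hypothetical sketch (not the authors' code): continuous crowdsourced
# ratings modeled as rating = mu + worker effect + task effect + noise.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_workers, n_tasks = 30, 40          # illustrative sizes, not from the paper

# Fully crossed design: every worker rates every task (speech sample).
workers = np.repeat(np.arange(n_workers), n_tasks)
tasks = np.tile(np.arange(n_tasks), n_workers)
u = rng.normal(0.0, 0.5, n_workers)  # worker (rater) random effects
v = rng.normal(0.0, 1.0, n_tasks)    # task (speech sample) random effects
y = 3.0 + u[workers] + v[tasks] + rng.normal(0.0, 0.8, n_workers * n_tasks)

df = pd.DataFrame({"rating": y, "worker": workers, "task": tasks, "one": 1})

# Crossed random effects fit as variance components on one dummy group.
model = sm.MixedLM.from_formula(
    "rating ~ 1",
    groups="one",
    vc_formula={"worker": "0 + C(worker)", "task": "0 + C(task)"},
    data=df,
)
result = model.fit()
print(result.summary())  # estimated worker, task, and residual variances
```

A reliability coefficient (for instance, an intraclass correlation for the speech samples) can then be formed from the estimated variance components, and rerunning the simulation over a grid of n_workers and n_tasks mirrors the kind of reliability study the abstract mentions. For binary ratings, a mixed-effects logistic model (e.g., statsmodels' BinomialBayesMixedGLM) would be the analogous sketch.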

Original language: English (US)
Pages (from-to): 1364-1384
Number of pages: 21
Journal: Journal of Applied Statistics
Volume: 46
Issue number: 8
DOIs
State: Published - Jun 11, 2019

Keywords

  • Amazon Mechanical Turk
  • reliability
  • communication disorders
  • crowdsourcing
  • random effects models
  • validity

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
