Crowdsourced perceptual ratings of voice quality in people with Parkinson’s disease before and after intensive voice and articulation therapies: Secondary outcome of a randomized controlled trial

Tara McAllister, Christopher Nightingale, Gemma Moya-Galé, Ava Kawamura, Lorraine Olson Ramig

Research output: Contribution to journal › Article › peer-review

Abstract

Purpose: Limited research has examined the suitability of crowdsourced ratings to measure treatment effects in speakers with Parkinson’s disease (PD), particularly for constructs such as voice quality. This study obtained measures of reliability and validity for crowdsourced listeners’ ratings of voice quality in speech samples from a published study. We also investigated whether aggregated listener ratings would replicate the original study’s findings of treatment effects based on the Acoustic Voice Quality Index (AVQI) measure.

Method: This study reports a secondary outcome measure of a randomized controlled trial with speakers with dysarthria associated with PD, including two active comparators (Lee Silverman Voice Treatment [LSVT LOUD] and LSVT ARTIC), an inactive comparator (untreated PD), and a healthy control group. Speech samples from three time points (pretreatment, posttreatment, and 6-month follow-up) were presented in random order for rating as “typical” or “atypical” with respect to voice quality. Untrained listeners were recruited through the Amazon Mechanical Turk crowdsourcing platform until each sample had at least 25 ratings.

Results: Intrarater reliability for tokens presented repeatedly was substantial (Cohen’s κ = .65–.70), and interrater agreement significantly exceeded chance level. There was a significant correlation of moderate magnitude between the AVQI and the proportion of listeners classifying a given sample as “typical.” Consistent with the original study, we found a significant interaction between group and time point, with the LSVT LOUD group alone showing significantly higher perceptually rated voice quality at posttreatment and follow-up relative to the pretreatment time point.

Conclusions: These results suggest that crowdsourcing can be a valid means to evaluate clinical speech samples, even for less familiar constructs such as voice quality. The findings also replicate the results of the study by Moya-Galé et al. (2022) and support their functional relevance by demonstrating that the effects of treatment measured acoustically in that study are perceptually apparent to everyday listeners.

Original language: English (US)
Pages (from-to): 1541-1562
Number of pages: 22
Journal: Journal of Speech, Language, and Hearing Research
Volume: 66
Issue number: 5
State: Published - May 2023

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Speech and Hearing
