Voice or speaker recognition is critical in a wide variety of social contexts. In this study, we investigated the contributions of acoustic, phonological, lexical, and semantic information toward voice recognition. Native English speaking participants were trained to recognize five speakers in five conditions: non-speech, Mandarin, German, pseudo-English, and English. We showed that voice recognition significantly improved as more information became available, from purely acoustic features in non-speech to additional phonological information varying in familiarity. Moreover, we found that the recognition performance is transferable between training and testing in phonologically familiar conditions (German, pseudo-English, and English), but not in unfamiliar (Mandarin) or non-speech conditions. These results provide evidence suggesting that bottom-up acoustic analysis and top-down influence from phonological processing collaboratively govern voice recognition.
ASJC Scopus subject areas