We revisit the analysis by synthesis (A × S) approach to speech recognition. In the late 1950s and 1960s, Stevens and Halle proposed a model of spoken word recognition in which candidate word representations were synthesised from brief cues in the auditory signal and analysed against the input signal in a tightly linked bottom-up/top-down fashion. While this approach failed to garner much support at the time, recent years have brought a surge of interest in Bayesian approaches to perception, and the idea of A × S has consequently gained attention, particularly in the domain of visual perception. We review the model and illustrate some data from speech perception that are well accounted for in the context of such an architecture. We focus on prediction in speech perception, an operation at the centre of the A × S algorithm. In our view, the data reviewed here, together with the current possibilities for studying online measures of speech processing using cognitive neuroscience methods, add to a provocative series of arguments for reconsidering A × S as a contender in speech recognition research, complementing currently more dominant models.
- Cognitive neuroscience
ASJC Scopus subject areas
- Experimental and Cognitive Psychology
- Language and Linguistics
- Linguistics and Language