A generalized construction of integrated speech recognition transducers

Cyril Allauzen, Mehryar Mohri, Michael Riley, Brian Roark

Research output: Contribution to journalConference articlepeer-review

Abstract

We showed in previous work that weighted finite-state transducers provide a common representation for many components of a speech recognition system and described general algorithms for combining these representations to build a single optimized and compact transducer itegrating all these components, directly mapping from HMM states to words. This approach works well for certain well-controlled input transducers, but presents some problems related to the efficiency of composition and the applicability of determinization and weight-pushing with more general transducers. We generalize our prior construction of the integrated speech recognition transducer to work with an arbitrary number of component transducers and, to a large extent, release the constraints imposed to the type of input transducers by providing more general solutions to these problems. This generalization allowed us to deal with cases where our prior optimization did not apply. Our experiments in the AT&T HMIHY 0300 task and an AT&T VoiceTone task show the efficiency of our generalized optimization technique. We report a 1.6 recognition speed-up in the HMIHY 0300 task, 1.8 speed-up in a VoiceTone task using a word-based language model, and 1.7 using a class-based model.

Original languageEnglish (US)
Pages (from-to)I761-I764
JournalICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume1
StatePublished - 2004
EventProceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada
Duration: May 17 2004May 21 2004

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'A generalized construction of integrated speech recognition transducers'. Together they form a unique fingerprint.

Cite this