TY - GEN
T1 - A weight pushing algorithm for large vocabulary speech recognition
AU - Mohri, Mehryar
AU - Riley, Michael
PY - 2001
Y1 - 2001
N2 - Weighted finite-state transducers provide a general framework for the representation of the components of speech recognition systems: language models, pronunciation dictionaries, context-dependent models, HMM-level acoustic models, and the output word or phone lattices can all be represented by weighted automata and transducers. In general, a representation is not unique and there may be different weighted transducers realizing the same mapping. In particular, even when they have exactly the same topology with the same input and output labels, two equivalent transducers may differ in the way the weights are distributed along each path. We present a weight pushing algorithm that modifies the weights of a given weighted transducer such that the transition probabilities form a stochastic distribution. This results in an equivalent transducer whose weight distribution is more suitable for pruning and speech recognition. We demonstrate substantial improvements in the speed of our recognition system on several tasks based on the use of this algorithm. We report a 45% speedup at 83% word accuracy with a simple single-pass 40,000-word vocabulary North American Business News (NAB) recognition system on the DARPA Eval '95 test set. With the same technique, we report a 550% speedup at 88% word accuracy in rescoring NAB word lattices with more accurate second-pass models. We finally report a 280% speedup at 68% word accuracy for the recognition of 100,000 first name-last name pairs.
UR - http://www.scopus.com/inward/record.url?scp=85009070232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85009070232&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85009070232
T3 - EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology
SP - 1603
EP - 1606
BT - EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology
A2 - Lindberg, Borge
A2 - Benner, Henrik
A2 - Dalsgaard, Paul
A2 - Tan, Zheng-Hua
PB - International Speech Communication Association
T2 - 7th European Conference on Speech Communication and Technology - Scandinavia, EUROSPEECH 2001
Y2 - 3 September 2001 through 7 September 2001
ER -