TY - JOUR
T1 - N-way composition of weighted finite-state transducers
AU - Allauzen, Cyril
AU - Mohri, Mehryar
PY - 2009/8
Y1 - 2009/8
N2 - Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, n-way composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers T1, T2, and T3 resulting in T, is O(|T|Q min(d(T1)d(T3), d(T2)) + |T|E), where |·|Q denotes the number of states, |·| E the number of transitions, and d(·) the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step with linear-time expected complexity in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
AB - Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, n-way composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers T1, T2, and T3 resulting in T, is O(|T|Q min(d(T1)d(T3), d(T2)) + |T|E), where |·|Q denotes the number of states, |·| E the number of transitions, and d(·) the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step with linear-time expected complexity in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
UR - http://www.scopus.com/inward/record.url?scp=68649098986&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=68649098986&partnerID=8YFLogxK
U2 - 10.1142/S0129054109006772
DO - 10.1142/S0129054109006772
M3 - Article
AN - SCOPUS:68649098986
SN - 0129-0541
VL - 20
SP - 613
EP - 627
JO - International Journal of Foundations of Computer Science
JF - International Journal of Foundations of Computer Science
IS - 4
ER -