Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, n-way composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers T1, T2, and T3 resulting in T, is O(|T|Q min(d(T1)d(T3), d(T2)) + |T|E), where |·|Q denotes the number of states, |·| E the number of transitions, and d(·) the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step with linear-time expected complexity in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
|Number of pages
|International Journal of Foundations of Computer Science
|Published - Aug 2009
ASJC Scopus subject areas
- Computer Science (miscellaneous)