TY - GEN
T1 - Distribution kernels based on moments of counts
AU - Cortes, Corinna
AU - Mohri, Mehryar
PY - 2004
Y1 - 2004
N2 - Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational kernels, to extend kernel methods to the analysis of such variable-length sequences or more generally weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification using Support Vector Machines. However, the rational kernels previously introduced do not fully encompass distributions over alternate sequences. Prior similarity measures between two weighted automata are based only on the expected counts of cooccurring subsequences and ignore similarities (or dissimilarities) in higher order moments of the distributions of these counts. In this paper, we introduce a new family of rational kernels, moment kernels, that precisely exploit this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.
UR - http://www.scopus.com/inward/record.url?scp=14344261324&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14344261324&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:14344261324
SN - 1581138385
SN - 9781581138382
T3 - Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
SP - 193
EP - 200
BT - Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
A2 - Greiner, R.
A2 - Schuurmans, D.
T2 - Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
Y2 - 4 July 2004 through 8 July 2004
ER -