TY - JOUR
T1 - Kernel methods for learning languages
AU - Kontorovich, Leonid (Aryeh)
AU - Cortes, Corinna
AU - Mohri, Mehryar
N1 - Funding Information:
Much of the work by Leonid Kontorovich was done while visiting the Hebrew University, in Jerusalem, Israel, in the summer of 2003. Many thanks to Yoram Singer for providing hosting and guidance at the Hebrew University. Thanks also to Daniel Neill and Martin Zinkevich for helpful discussions. This work was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. The research at CMU was supported in part by NSF ITR grant IIS-0205456. This publication only reflects the authors’ views.
Funding Information:
The work of Mehryar Mohri was partially funded by a Google Research Award and the New York State Office of Science, Technology and Academic Research (NYSTAR). This project was also sponsored in part by the Department of the Army Award Number W23RYX-3275-N605. The US Army Medical Research Acquisition Activity, 820 Chandler Street, Fort Detrick, MD 21702-5014 is the awarding and administering acquisition office. The content of this material does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
PY - 2008/10/17
Y1 - 2008/10/17
AB - This paper studies a novel paradigm for learning formal languages from positive and negative examples that consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane in that space. Such mappings can often be represented flexibly with string kernels, with the additional benefit of computational efficiency. The paradigm studied can thus be viewed as that of using kernel methods for learning languages. We initiate the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. We introduce a subsequence feature mapping to a Hilbert space and prove that piecewise-testable languages are linearly separable in that space. The proof makes use of combinatorial results on words relating to subsequences. We also show that the positive definite symmetric kernel associated with this embedding is a rational kernel and that it can be computed in quadratic time using general-purpose weighted automata algorithms. Our examination of the linear separability of piecewise-testable languages leads us to study the general problem of separability with other finite regular covers. We show that all languages linearly separable under a regular finite cover embedding, a generalization of the subsequence embedding we use, are regular. We give a general analysis of the use of support vector machines in combination with kernels to determine a separating hyperplane for languages, and we study the corresponding learning guarantees. Our analysis includes several additional linear separability results in abstract settings and partial characterizations for the linear separability of the family of all regular languages.
KW - Finite automata
KW - Kernels
KW - Learning automata
KW - Margin theory
KW - Piecewise-testable languages
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=51049103186&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51049103186&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2008.06.037
DO - 10.1016/j.tcs.2008.06.037
M3 - Article
AN - SCOPUS:51049103186
SN - 0304-3975
VL - 405
SP - 223
EP - 236
JO - Theoretical Computer Science
JF - Theoretical Computer Science
IS - 3
ER -
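
Note: as an illustration of the subsequence embedding described in the abstract, the sketch below computes the associated kernel directly: K(x, y) is the number of distinct common subsequences of x and y, obtained here by a straightforward dynamic program rather than the paper's weighted-automata (rational kernel) algorithm, in O(|Sigma| * |x| * |y|) time. The toy SVM usage via scikit-learn's precomputed-kernel interface, including the example strings and labels, is a hypothetical addition, not taken from the paper.

def subsequence_kernel(x: str, y: str) -> int:
    """Number of distinct common subsequences of x and y (empty string included)."""
    m, n = len(x), len(y)
    alphabet = sorted(set(x) & set(y))  # only shared symbols can occur in a common subsequence
    # last_x[a][i]: largest 1-based position i' <= i with x[i'-1] == a, or 0 if none.
    last_x = {a: [0] * (m + 1) for a in alphabet}
    last_y = {a: [0] * (n + 1) for a in alphabet}
    for a in alphabet:
        for i in range(1, m + 1):
            last_x[a][i] = i if x[i - 1] == a else last_x[a][i - 1]
        for j in range(1, n + 1):
            last_y[a][j] = j if y[j - 1] == a else last_y[a][j - 1]
    # N[i][j]: number of distinct common subsequences of x[:i] and y[:j].
    # A common subsequence ending in symbol a can always match its final a at the
    # last occurrences i' in x[:i] and j' in y[:j]; its prefix is then an arbitrary
    # common subsequence of x[:i'-1] and y[:j'-1], contributing N[i'-1][jj'-1] once.
    N = [[1] * (n + 1) for _ in range(m + 1)]  # 1 counts the empty subsequence
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            total = 1
            for a in alphabet:
                ii, jj = last_x[a][i], last_y[a][j]
                if ii and jj:
                    total += N[ii - 1][jj - 1]
            N[i][j] = total
    return N[m][n]

assert subsequence_kernel("ab", "ab") == 4  # "", "a", "b", "ab"
assert subsequence_kernel("aa", "aa") == 3  # "", "a", "aa"

# Hypothetical toy usage: classify strings by whether "ab" occurs as a
# subsequence (a piecewise-testable property), using an SVM on the
# precomputed Gram matrix; assumes scikit-learn is installed.
from sklearn.svm import SVC

train = ["ab", "aab", "axb", "ba", "bba", "b"]
labels = [1, 1, 1, 0, 0, 0]
gram = [[subsequence_kernel(s, t) for t in train] for s in train]
clf = SVC(kernel="precomputed").fit(gram, labels)
test = ["aabb", "bbb"]
gram_test = [[subsequence_kernel(s, t) for t in train] for s in test]
print(clf.predict(gram_test))  # ideally [1, 0], if the toy sample is separated

Anchoring each counted subsequence's final symbol at its rightmost occurrences gives a canonical embedding, which is what lets the recurrence count each distinct common subsequence exactly once rather than once per embedding.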