TY - JOUR
T1 - Online Non-Additive Path Learning under Full and Partial Information
AU - Cortes, Corinna
AU - Kuznetsov, Vitaly
AU - Mohri, Mehryar
AU - Rahmanian, Holakou
AU - Warmuth, Manfred K.
N1 - Funding Information:
The work of MM was partly funded by NSF CCF-1535987 and NSF IIS-1618662. Part of this work was done while MKW was at UC Santa Cruz, supported by NSF grant IIS-1619271.
Publisher Copyright:
© 2019 Proceedings of Machine Learning Research. All rights reserved.
PY - 2019
Y1 - 2019
N2 - We study the problem of online path learning with non-additive gains, which is a central problem appearing in several applications, including ensemble structured prediction. We present new online algorithms for path learning with non-additive count-based gains for the three settings of full information, semi-bandit and full bandit with very favorable regret guarantees. A key component of our algorithms is the definition and computation of an intermediate context-dependent automaton that enables us to use existing algorithms designed for additive gains. We further apply our methods to the important application of ensemble structured prediction. Finally, beyond count-based gains, we give an efficient implementation of the EXP3 algorithm for the full bandit setting with an arbitrary (non-additive) gain.
AB - We study the problem of online path learning with non-additive gains, which is a central problem appearing in several applications, including ensemble structured prediction. We present new online algorithms for path learning with non-additive count-based gains for the three settings of full information, semi-bandit and full bandit with very favorable regret guarantees. A key component of our algorithms is the definition and computation of an intermediate context-dependent automaton that enables us to use existing algorithms designed for additive gains. We further apply our methods to the important application of ensemble structured prediction. Finally, beyond count-based gains, we give an efficient implementation of the EXP3 algorithm for the full bandit setting with an arbitrary (non-additive) gain.
KW - finite-state automaton
KW - non-additive gains
KW - online learning
UR - http://www.scopus.com/inward/record.url?scp=85161215556&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161215556&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85161215556
SN - 2640-3498
VL - 98
SP - 274
EP - 299
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 30th International Conference on Algorithmic Learning Theory, ALT 2019
Y2 - 22 March 2019 through 24 March 2019
ER -