TY - GEN
T1 - Structural maxent models
AU - Cortes, Corinna
AU - Kuznetsov, Vitaly
AU - Mohri, Mehryar
AU - Syed, Umar
PY - 2015
Y1 - 2015
N2 - We present a new class of density estimation models, Structural Maxent models, with feature functions selected from a union of possibly very complex sub-families and yet benefiting from strong learning guarantees. The design of our models is based on a new principle supported by uniform convergence bounds and taking into consideration the complexity of the different sub-families composing the full set of features. We prove new data-dependent learning bounds for our models, expressed in terms of the Rademacher complexities of these sub-families. We also prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments demonstrating that its performance improves on that of existing L1-norm regularized Maxent algorithms. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models also making use of a union of possibly complex feature subfamilies. We prove a duality theorem for these models as well, which reveals their connection with existing binary and multi-class deep boosting algorithms.
AB - We present a new class of density estimation models, Structural Maxent models, with feature functions selected from a union of possibly very complex sub-families and yet benefiting from strong learning guarantees. The design of our models is based on a new principle supported by uniform convergence bounds and taking into consideration the complexity of the different sub-families composing the full set of features. We prove new data-dependent learning bounds for our models, expressed in terms of the Rademacher complexities of these sub-families. We also prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments demonstrating that its performance improves on that of existing L1-norm regularized Maxent algorithms. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models also making use of a union of possibly complex feature subfamilies. We prove a duality theorem for these models as well, which reveals their connection with existing binary and multi-class deep boosting algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84969504843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969504843&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84969504843
T3 - 32nd International Conference on Machine Learning, ICML 2015
SP - 391
EP - 399
BT - 32nd International Conference on Machine Learning, ICML 2015
A2 - Bach, Francis
A2 - Blei, David
PB - International Machine Learning Society (IMLS)
T2 - 32nd International Conference on Machine Learning, ICML 2015
Y2 - 6 July 2015 through 11 July 2015
ER -