TY - JOUR
T1 - Convex risk minimization and conditional probability estimation
AU - Telgarsky, Matus
AU - Dudík, Miroslav
AU - Schapire, Robert
N1 - Funding Information:
This work is supported by National Basic Research Program of China (973 Program) under Grant 2012CB316400 and National Natural Science Foundation of China (Grant No: 61125203, 61222207, 61233011, 90920303).
Publisher Copyright:
© 2015 M. Telgarsky, M. Dudík & R. Schapire.
PY - 2015
Y1 - 2015
N2 - This paper proves, in very general settings, that convex risk minimization is a procedure to select a unique conditional probability model determined by the classification problem. Unlike most previous work, we give results that are general enough to include cases in which no minimum exists, as occurs typically, for instance, with standard boosting algorithms. Concretely, we first show that any sequence of predictors minimizing convex risk over the source distribution will converge to this unique model when the class of predictors is linear (but potentially of infinite dimension). Secondly, we show the same result holds for empirical risk minimization whenever this class of predictors is finite dimensional, where the essential technical contribution is a norm-free generalization bound.
AB - This paper proves, in very general settings, that convex risk minimization is a procedure to select a unique conditional probability model determined by the classification problem. Unlike most previous work, we give results that are general enough to include cases in which no minimum exists, as occurs typically, for instance, with standard boosting algorithms. Concretely, we first show that any sequence of predictors minimizing convex risk over the source distribution will converge to this unique model when the class of predictors is linear (but potentially of infinite dimension). Secondly, we show the same result holds for empirical risk minimization whenever this class of predictors is finite dimensional, where the essential technical contribution is a norm-free generalization bound.
KW - Classification
KW - Conditional Probability Estimation
KW - Consistency
KW - Convex Duality
KW - Maximum Entropy
KW - Orlicz Spaces
UR - http://www.scopus.com/inward/record.url?scp=84984671856&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84984671856&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84984671856
SN - 1532-4435
VL - 40
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
IS - 2015
T2 - 28th Conference on Learning Theory, COLT 2015
Y2 - 2 July 2015 through 6 July 2015
ER -