Abstract
Most machine learning tasks are inherently multi-objective. This means that the learner has to come up with a model that performs well across a number of base objectives $L_1, \ldots, L_p$, as opposed to a single one. Since optimizing with respect to multiple objectives at the same time is often computationally expensive, the base objectives are often combined in an ensemble $\sum_{k=1}^{p} \lambda_k L_k$, thereby reducing the problem to scalar optimization. The mixture weights $\lambda_k$ are set to uniform or some other fixed distribution, based on the learner's preferences. We argue that learning with a fixed distribution on the mixture weights runs the risk of overfitting to some individual objectives and significantly harming others, despite performing well on the ensemble as a whole. Moreover, in reality, the true preferences of a learner across multiple objectives are often unknown or hard to express as a specific distribution. Instead, we propose a new framework of Agnostic Learning with Multiple Objectives (ALMO), where a model is optimized for any weights in the mixture of base objectives. We present data-dependent Rademacher complexity guarantees for learning in the ALMO framework, which are used to guide a scalable optimization algorithm and the corresponding regularization. We present convergence guarantees for this algorithm, assuming convexity of the loss functions and the underlying hypothesis space. We further implement the algorithm in a popular symbolic gradient computation framework and empirically demonstrate on a number of datasets the benefits of the ALMO framework versus learning with a fixed distribution of mixture weights.
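The contrast between a fixed mixture and weights chosen adversarially can be illustrated with a small sketch. It is not the paper's algorithm. It assumes one reading of "optimized for any weights", namely minimizing the worst-case mixture over the probability simplex. The toy objectives `(w - c_k)^2`, the values in `CENTERS`, the function names, and the update rule (gradient descent on the model, exponentiated-gradient ascent on the weights) are all hypothetical choices made for illustration.

```python
import math

# Hypothetical toy objectives (not from the paper): L_k(w) = (w - c_k)^2
# for a scalar model parameter w.
CENTERS = [0.0, 1.0, 4.0]


def losses(w):
    """Return the list of base objective values L_1(w), ..., L_p(w)."""
    return [(w - c) ** 2 for c in CENTERS]


def grad_w(w, lam):
    """Gradient with respect to w of the mixture sum_k lam_k * L_k(w)."""
    return sum(l * 2.0 * (w - c) for l, c in zip(lam, CENTERS))


def train(learn_weights, steps=20000, lr=0.01):
    """Gradient descent on w; optionally adversarial ascent on the weights.

    learn_weights=False keeps the mixture weights fixed at uniform.
    learn_weights=True moves the weights toward the currently worst
    objectives with a multiplicative (exponentiated-gradient) step, which
    keeps them on the probability simplex.
    """
    w = 0.0
    lam = [1.0 / len(CENTERS)] * len(CENTERS)
    for _ in range(steps):
        g = grad_w(w, lam)  # uses the weights from before this step
        if learn_weights:
            scaled = [l * math.exp(lr * v) for l, v in zip(lam, losses(w))]
            total = sum(scaled)
            lam = [s / total for s in scaled]
        w -= lr * g
    return w, lam


if __name__ == "__main__":
    w_fixed, _ = train(learn_weights=False)
    w_agnostic, lam = train(learn_weights=True)
    print("fixed uniform mixture: w =", w_fixed, "losses =", losses(w_fixed))
    print("adversarial weights:   w =", w_agnostic, "losses =", losses(w_agnostic))
    print("learned weights:", lam)
```

On this toy problem the uniform mixture settles at the mean of the centers, which leaves the objective centered at 4.0 with a noticeably larger loss than the others. The run with adversarial weights should instead balance the two extreme objectives and end with a lower worst-case loss, at the cost of a slightly higher uniform average.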
Original language | English (US) |
---|---|
Journal | Advances in Neural Information Processing Systems |
Volume | 2020-December |
State | Published - 2020 |
Event | 34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online; Duration: Dec 6 2020 → Dec 12 2020 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing