TY - CONF
T1 - Few-Shot Learning via Learning the Representation, Provably
AU - Du, Simon S.
AU - Hu, Wei
AU - Kakade, Sham M.
AU - Lee, Jason D.
AU - Lei, Qi
N1 - Funding Information:
SSD acknowledges support of National Science Foundation (Grant No. DMS-1638352) and the Infosys Membership. JDL acknowledges support of the ARO under MURI Award W911NF-11-1-0303, the Sloan Research Fellowship, and NSF CCF 2002272. WH is supported by NSF, ONR, Simons Foundation, Schmidt Foundation, Amazon Research, DARPA and SRC. QL is supported by NSF #2030859 and the Computing Research Association for the CIFellows Project. The authors also acknowledge the generous support of the Institute for Advanced Study on the Theoretical Machine Learning program, where SSD, WH, JDL, and QL were participants.
Publisher Copyright:
© 2021 ICLR 2021 - 9th International Conference on Learning Representations. All rights reserved.
PY - 2021
Y1 - 2021
N2 - This paper studies few-shot learning via representation learning, where one uses T source tasks with n1 data per task to learn a representation in order to reduce the sample complexity of a target task for which there is only n2 (≪ n1) data. Specifically, we focus on the setting where there exists a good common representation between source and target, and our goal is to understand how much of a sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a risk bound of O(dk/(n1T) + k/n2) on the target task for the linear representation class; here d is the ambient input dimension and k (≪ d) is the dimension of the representation. This result bypasses the Ω(1/T) barrier under the i.i.d. task assumption, and can capture the desired property that all n1T samples from source tasks can be pooled together for representation learning. We further extend this result to handle a general representation function class and obtain a similar result. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (say in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural networks, and show that representation learning can fully utilize all n1T samples from source tasks.
AB - This paper studies few-shot learning via representation learning, where one uses T source tasks with n1 data per task to learn a representation in order to reduce the sample complexity of a target task for which there is only n2 (≪ n1) data. Specifically, we focus on the setting where there exists a good common representation between source and target, and our goal is to understand how much of a sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a risk bound of O(dk/(n1T) + k/n2) on the target task for the linear representation class; here d is the ambient input dimension and k (≪ d) is the dimension of the representation. This result bypasses the Ω(1/T) barrier under the i.i.d. task assumption, and can capture the desired property that all n1T samples from source tasks can be pooled together for representation learning. We further extend this result to handle a general representation function class and obtain a similar result. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (say in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural networks, and show that representation learning can fully utilize all n1T samples from source tasks.
UR - http://www.scopus.com/inward/record.url?scp=85119334732&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119334732&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85119334732
T2 - 9th International Conference on Learning Representations, ICLR 2021
Y2 - 3 May 2021 through 7 May 2021
ER -