TY - CONF
T1 - VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
T2 - 10th International Conference on Learning Representations, ICLR 2022
AU - Bardes, Adrien
AU - Ponce, Jean
AU - LeCun, Yann
N1 - Funding Information:
Acknowledgement. Jean Ponce was supported in part by the French government under management of Agence Nationale de la Recherche as part of the "Investissements d'avenir" program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute), the Louis Vuitton/ENS Chair in Artificial Intelligence and the Inria/NYU collaboration. Adrien Bardes was supported in part by a FAIR/Prairie CIFRE PhD Fellowship. The authors wish to thank Jure Zbontar for the BYOL implementation, Stéphane Deny for useful comments on the paper, and Li Jing, Yubei Chen, Mikael Henaff, Pascal Vincent and Geoffrey Zweig for useful discussions. We thank Quentin Duval and the VISSL team for help obtaining the results of Table 2.
Publisher Copyright:
© 2022 ICLR 2022 - 10th International Conference on Learning Representations. All rights reserved.
PY - 2022
Y1 - 2022
N2 - Recent self-supervised methods for image representation learning maximize the agreement between embedding vectors produced by encoders fed with different views of the same image. The main challenge is to prevent a collapse in which the encoders produce constant or non-informative vectors. We introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with two regularization terms applied to both embeddings separately: (1) a term that maintains the variance of each embedding dimension above a threshold, and (2) a term that decorrelates each pair of variables. Unlike most other approaches to the same problem, VICReg does not require techniques such as weight sharing between the branches, batch normalization, feature-wise normalization, output quantization, stop gradient, or memory banks, and it achieves results on par with the state of the art on several downstream tasks. In addition, we show that our variance regularization term stabilizes the training of other methods and leads to performance improvements.
AB - Recent self-supervised methods for image representation learning maximize the agreement between embedding vectors produced by encoders fed with different views of the same image. The main challenge is to prevent a collapse in which the encoders produce constant or non-informative vectors. We introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with two regularization terms applied to both embeddings separately: (1) a term that maintains the variance of each embedding dimension above a threshold, and (2) a term that decorrelates each pair of variables. Unlike most other approaches to the same problem, VICReg does not require techniques such as weight sharing between the branches, batch normalization, feature-wise normalization, output quantization, stop gradient, or memory banks, and it achieves results on par with the state of the art on several downstream tasks. In addition, we show that our variance regularization term stabilizes the training of other methods and leads to performance improvements.
UR - http://www.scopus.com/inward/record.url?scp=85149669483&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149669483&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85149669483
Y2 - 25 April 2022 through 29 April 2022
ER -
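
Note: for readers implementing the method summarized in the abstract, below is a minimal PyTorch sketch of the VICReg loss, combining the invariance (mean-squared error), variance (hinge on per-dimension standard deviation), and covariance (off-diagonal penalty) terms. The function name `vicreg_loss` and argument names are illustrative; the coefficient defaults (25, 25, 1) and the threshold gamma = 1 follow values reported in the paper. This is a sketch under those assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def off_diagonal(m: torch.Tensor) -> torch.Tensor:
    """Return the off-diagonal elements of a square matrix as a flat vector."""
    n, _ = m.shape
    return m.flatten()[:-1].view(n - 1, n + 1)[:, 1:].flatten()


def vicreg_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                sim_coeff: float = 25.0, std_coeff: float = 25.0,
                cov_coeff: float = 1.0, gamma: float = 1.0,
                eps: float = 1e-4) -> torch.Tensor:
    """VICReg loss on two batches of embeddings of shape (batch, dim).

    Coefficient defaults follow the paper's reported settings; names are
    illustrative, not taken from the authors' code.
    """
    n, d = z_a.shape

    # Invariance term: mean-squared distance between the two views' embeddings.
    inv_loss = F.mse_loss(z_a, z_b)

    # Variance term: hinge keeping the std of each embedding dimension above
    # gamma, applied to each branch separately (the anti-collapse term).
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var_loss = torch.mean(F.relu(gamma - std_a)) + torch.mean(F.relu(gamma - std_b))

    # Covariance term: push the off-diagonal entries of each branch's
    # covariance matrix toward zero, decorrelating pairs of dimensions.
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (n - 1)
    cov_b = (z_b.T @ z_b) / (n - 1)
    cov_loss = off_diagonal(cov_a).pow(2).sum() / d + off_diagonal(cov_b).pow(2).sum() / d

    return sim_coeff * inv_loss + std_coeff * var_loss + cov_coeff * cov_loss
```

Usage follows the two-branch setup described in the abstract: encode two augmented views of the same batch of images into embeddings z_a and z_b, then minimize vicreg_loss(z_a, z_b). Note the variance and covariance terms are computed per branch, which is what lets VICReg avoid collapse without weight sharing, stop gradient, or memory banks.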