The Effects of Regularization and Data Augmentation are Class Dependent

Randall Balestriero, Leon Bottou, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Regularization is a fundamental technique to improve a model's generalization performance by limiting its complexity. Deep Neural Networks (DNNs), which tend to overfit their training data, rely heavily on regularizers such as Data-Augmentation (DA) or weight decay, with hyper-parameters found via structural risk minimization, i.e., cross-validation. In this study, we demonstrate that the optimal regularization hyper-parameters found by cross-validation over all classes lead to disastrous model performance on a minority of classes. For example, a ResNet-50 trained on ImageNet sees its "barn spider" test accuracy fall from 68% to 46% merely by introducing random-crop DA during training. Even more surprisingly, such unfair impact of regularization also appears when introducing uninformative regularizers such as weight decay or dropout. These results demonstrate that our search for ever-increasing generalization performance (averaged over all classes and samples) has left us with models and regularizers that silently sacrifice performance on some classes. This scenario can become dangerous when deploying a model on downstream tasks; e.g., an ImageNet pre-trained ResNet-50 deployed on iNaturalist sees its performance on class #8889 fall from 70% to 30% when random-crop DA is introduced during the ImageNet pre-training phase. These results demonstrate that finding a correct measure of a model's complexity without class-dependent preferences remains an open research question.
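The class-dependent effect described above only becomes visible when accuracy is broken down per class rather than averaged. The following sketch (not from the paper; the toy labels and model names are illustrative assumptions) shows how per-class accuracy can reveal a minority class degrading even as the average improves:

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Return an array of per-class accuracies (NaN for absent classes)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accs = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            accs[c] = (y_pred[mask] == c).mean()
    return accs

# Toy example: predictions from two hypothetical models.
y_true  = [0, 0, 0, 1, 1, 2, 2, 2]
model_a = [0, 0, 0, 1, 0, 2, 2, 1]   # e.g. trained without random-crop DA
model_b = [0, 0, 1, 1, 1, 2, 2, 2]   # e.g. trained with random-crop DA

acc_a = per_class_accuracy(y_true, model_a, 3)
acc_b = per_class_accuracy(y_true, model_b, 3)

# Model B wins on average yet loses accuracy on class 0,
# mirroring the averaging pitfall discussed in the abstract.
print(acc_a, acc_a.mean())  # class 0 at 1.0
print(acc_b, acc_b.mean())  # higher mean, class 0 degraded
```

Cross-validating on the mean of such a vector hides exactly the per-class drops the paper reports.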

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural Information Processing Systems Foundation
ISBN (Electronic): 9781713871088
State: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: Nov 28, 2022 to Dec 9, 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 11/28/22 to 12/9/22

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
