Implicit Sparse Regularization: The Impact of Depth and Early Stopping

Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K.W. Wong

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-N networks, under more realistic settings of noise and correlated designs. We show that early stopping is crucial for gradient descent to converge to a sparse model, a phenomenon that we call implicit sparse regularization. This result is in sharp contrast to known results for noiseless and uncorrelated-design cases. We characterize the impact of depth and early stopping and show that for a general depth parameter N, gradient descent with early stopping achieves minimax optimal sparse recovery with sufficiently small initialization w0 and step size η. In particular, we show that increasing depth enlarges the scale of working initialization and the early-stopping window so that this implicit sparse regularization effect is more likely to take place.
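The setup described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code. It assumes the depth-N diagonal network is parametrized elementwise as w = u^N - v^N, with every entry of u and v initialized to the same small value, and it picks the stopping iteration with an oracle that knows the true signal, which is only possible on synthetic data. The function name `fit_diagonal_network`, the problem sizes, and all hyperparameter values are illustrative choices.

```python
import numpy as np

def fit_diagonal_network(X, y, depth=3, init=0.1, step=0.01, n_iters=2000):
    """Run gradient descent on w = u**depth - v**depth; return every iterate of w.

    Assumption: this elementwise parametrization stands in for the
    depth-N diagonal linear network mentioned in the abstract.
    """
    n, p = X.shape
    u = np.full(p, init)
    v = np.full(p, init)  # u == v at the start, so w starts at exactly zero
    path = np.empty((n_iters, p))
    for t in range(n_iters):
        w = u**depth - v**depth
        grad_w = X.T @ (X @ w - y) / n  # gradient of ||Xw - y||^2 / (2n) in w
        # Chain rule through the elementwise powers; both updates use the old u, v.
        u, v = (u - step * depth * u**(depth - 1) * grad_w,
                v + step * depth * v**(depth - 1) * grad_w)
        path[t] = u**depth - v**depth
    return path

# Synthetic noisy sparse regression problem (all sizes are illustrative).
rng = np.random.default_rng(0)
n, p, s, sigma = 400, 500, 5, 0.3
w_star = np.zeros(p)
w_star[:s] = 1.0
X = rng.standard_normal((n, p))
y = X @ w_star + sigma * rng.standard_normal(n)

path = fit_diagonal_network(X, y)

# Oracle early stopping: pick the iterate closest to the true signal.
errors = np.linalg.norm(path - w_star, axis=1)
best_iter = int(np.argmin(errors))
print(f"best iterate: {best_iter}, error there: {errors[best_iter]:.3f}, "
      f"error at last iterate: {errors[-1]:.3f}")
```

In practice the true signal is unknown, so the stopping time would have to come from held-out data or from a theoretical window such as the one the paper characterizes. Varying `depth` and `init` in this sketch is one way to explore informally how the two interact.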

    Original language: English (US)
    Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
    Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
    Publisher: Neural Information Processing Systems Foundation
    Pages: 28298-28309
    Number of pages: 12
    ISBN (Electronic): 9781713845393
    State: Published - 2021
    Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
    Duration: Dec 6 2021 to Dec 14 2021

    Publication series

    Name: Advances in Neural Information Processing Systems
    Volume: 34
    ISSN (Print): 1049-5258

    Conference

    Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
    City: Virtual, Online
    Period: 12/6/21 to 12/14/21

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Information Systems
    • Signal Processing
