To Compress or Not to Compress—Self-Supervised Learning and Information Theory: A Review

Ravid Shwartz-Ziv, Yann LeCun

Research output: Contribution to journal › Review article › peer-review

Abstract

Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data. Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels. Information theory, and in particular the information bottleneck principle, has shaped the development of deep neural networks. This principle optimizes the trade-off between compression and the preservation of relevant information, providing a foundation for efficient network design in supervised contexts. However, its precise role and adaptation in self-supervised learning remain unclear. In this work, we scrutinize various self-supervised learning approaches from an information-theoretic perspective, introducing a unified framework that encapsulates the self-supervised information-theoretic learning problem. This framework accommodates multiple encoders and decoders, suggesting that all existing work on self-supervised learning can be seen as a specific instance of it. We aim to unify these approaches to better understand their underlying principles and to address a central challenge: many works present different frameworks, built on differing theories, that may appear contradictory. By weaving existing research into a cohesive narrative, we delve into contemporary self-supervised methodologies, spotlight potential research areas, and highlight inherent challenges. Moreover, we discuss how to estimate information-theoretic quantities and the empirical problems that arise in doing so. Overall, this paper provides a comprehensive review of the intersection of information theory, self-supervised learning, and deep neural networks, aiming for a better understanding through our proposed unified approach.
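
As a point of reference for the trade-off the abstract describes, the information bottleneck principle is commonly stated as the following Lagrangian, where X is the input, Y the relevance variable, Z the learned representation, and β ≥ 0 weighs compression against preserved relevant information:

\[
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
\]

In the self-supervised setting the review surveys, labels Y are unavailable; analogous objectives instead compress each view's representation while preserving the information shared between views, e.g., maximizing \( I(Z_1; Z_2) \) between the representations of two augmented views.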

Original language: English (US)
Article number: 252
Journal: Entropy
Volume: 26
Issue number: 3
DOI: 10.3390/e26030252
State: Published - Mar 2024

Keywords

  • deep neural networks
  • information theory
  • representation learning
  • self-supervised learning

ASJC Scopus subject areas

  • Information Systems
  • Mathematical Physics
  • Physics and Astronomy (miscellaneous)
  • General Physics and Astronomy
  • Electrical and Electronic Engineering
