Abstract
We describe a minimalistic and interpretable method for unsupervised representation learning that does not require data augmentation, hyperparameter tuning, or other engineering designs, yet achieves performance close to that of state-of-the-art (SOTA) self-supervised learning (SSL) methods. Our approach leverages the sparse manifold transform [21], which unifies sparse coding, manifold learning, and slow feature analysis. With a one-layer deterministic (one training epoch) sparse manifold transform, it is possible to achieve 99.3% KNN top-1 accuracy on MNIST, 81.1% on CIFAR-10, and 53.2% on CIFAR-100. With simple grayscale augmentation, the model achieves 83.2% KNN top-1 accuracy on CIFAR-10 and 57% on CIFAR-100. These results significantly narrow the gap between simplistic "white-box" methods and SOTA methods. We also provide visualizations to illustrate how an unsupervised representation transform is formed. The proposed method is closely connected to latent-embedding self-supervised methods and can be treated as the simplest form of VICReg. Though a small performance gap remains between our simple constructive model and SOTA methods, the evidence points to this as a promising direction for achieving a principled, white-box approach to unsupervised representation learning, with the potential to significantly improve learning efficiency.
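The KNN top-1 accuracy reported above evaluates a representation by classifying each test embedding with the label of its nearest training embeddings. The sketch below is illustrative only: the function name `knn_top1_accuracy` and the choice of cosine similarity are assumptions, not the paper's exact evaluation protocol.

```python
import numpy as np

def knn_top1_accuracy(train_emb, train_labels, test_emb, test_labels, k=1):
    """Top-1 accuracy of a k-nearest-neighbor classifier on embeddings."""
    # L2-normalize so that the inner product equals cosine similarity.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = test @ train.T                        # (n_test, n_train) similarities
    nn = np.argsort(-sims, axis=1)[:, :k]        # indices of the k nearest neighbors
    # Predict by majority vote among the k neighbors' labels.
    preds = np.array([np.bincount(train_labels[row]).argmax() for row in nn])
    return float((preds == test_labels).mean())
```

With `k=1` this reduces to assigning each test point the label of its single closest training point, which is the simplest form of the metric quoted in the abstract.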
| Original language | English (US) |
| --- | --- |
| State | Published - 2023 |
| Event | 11th International Conference on Learning Representations, ICLR 2023 - Kigali, Rwanda |
| Duration | May 1, 2023 → May 5, 2023 |
Conference

| Conference | 11th International Conference on Learning Representations, ICLR 2023 |
| --- | --- |
| Country/Territory | Rwanda |
| City | Kigali |
| Period | 5/1/23 → 5/5/23 |
ASJC Scopus subject areas
- Language and Linguistics
- Computer Science Applications
- Education
- Linguistics and Language