Abstract
Single-Index Models are high-dimensional regression problems with planted structure, whereby labels depend on an unknown one-dimensional projection of the input via a generic, non-linear, and potentially non-deterministic transformation. As such, they encompass a broad class of statistical inference tasks and provide a rich template for studying statistical-computational trade-offs in the high-dimensional regime. While the information-theoretic sample complexity to recover the hidden direction is linear in the dimension d, we show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require Ω(d^{k*/2}) samples, where k* is a "generative" exponent associated with the model that we explicitly characterize. Moreover, we show that this sample complexity is also sufficient, by establishing matching upper bounds via a partial-trace algorithm. Therefore, our results provide evidence of a sharp computational-to-statistical gap (under both the SQ and LDP classes) whenever k* > 2. To complete the study, we construct smooth and Lipschitz deterministic target functions with arbitrarily large generative exponents k*.
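For context, a minimal sketch of the Gaussian single-index setup and the quantities named in the abstract. The precise definition of the generative exponent written here (via label transformations T and Hermite polynomials He_k) is an assumption based on the standard formulation of such models, not a quotation from the paper:

```latex
% Single-index model: inputs are Gaussian, and the label y depends on x
% only through the one-dimensional projection <w, x> along a hidden
% unit vector w, via a possibly non-deterministic link:
\[
  x \sim \mathcal{N}(0, I_d),
  \qquad
  y \sim \mathbb{P}\big(\cdot \mid \langle w, x \rangle\big).
\]
% Assumed form of the generative exponent: the smallest Hermite degree
% that some transformation T of the label can detect in the projection,
% where He_k denotes the k-th probabilist's Hermite polynomial.
\[
  k^{*} \;=\; \min\Big\{ k \ge 1 \;:\;
    \exists\, T \ \text{with}\
    \mathbb{E}\big[\, T(y)\, \mathrm{He}_k(\langle w, x \rangle) \,\big] \neq 0
  \Big\}.
\]
% The abstract's claim: efficient SQ / low-degree algorithms need
\[
  n \;=\; \Omega\big(d^{\,k^{*}/2}\big)
\]
% samples to recover w, with a matching upper bound from a
% partial-trace algorithm.
```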
| Original language | English (US) |
|---|---|
| Pages (from-to) | 1262 |
| Number of pages | 1 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 247 |
| State | Published - 2024 |
| Event | 37th Annual Conference on Learning Theory (COLT 2024), Edmonton, Canada; Jun 30 – Jul 3, 2024 |
Keywords
- Low-Degree Polynomials
- Single-Index Models
- Statistical Queries
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability