TY - GEN
T1 - Active Linear Regression for ℓp Norms and Beyond
AU - Musco, Cameron
AU - Musco, Christopher
AU - Woodruff, David P.
AU - Yasuda, Taisuke
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector and output a near minimizer to the objective function. For ℓp norm regression for any 0 < p < ∞, we give an algorithm based on Lewis weight sampling which outputs a (1+ε)-approximate solution using just O(d/ε^2) queries to b for p ∈ (0,1), O(d/ε) queries for p ∈ (1,2), and O(d^(p/2)/ε^p) queries for p ∈ (2,∞). For p ∈ (0,2), our bounds are optimal up to logarithmic factors, thus settling the query complexity for this range of p. For p ∈ (2,∞), our dependence on d is optimal, while our dependence on ε is off by at most a single ε factor, up to logarithmic factors. Our result resolves an open question of Chen and Dereziński, who gave near optimal bounds for the ℓ1 norm, but required at least d^2/ε^2 samples for ℓp regression with p ∈ (1,2), and gave no bounds for p ∈ (2,∞) or p ∈ (0,1). We also provide the first total sensitivity upper bound for loss functions with at most degree-p polynomial growth. This improves a recent result of Tukan, Maalouf, and Feldman. By combining this with our techniques for ℓp regression, we obtain the first active regression algorithms for such loss functions, including the important cases of the Tukey and Huber losses. This answers another question of Chen and Dereziński. Our sensitivity bounds also give improvements to a variety of previous results using sensitivity sampling, including Orlicz norm subspace embeddings, robust subspace approximation, and dimension reduction for smoothed ℓp norms. Finally, our active sampling results give the first sublinear time algorithms for Kronecker product regression under every ℓp norm. Previous results required reading the entire b vector in the kernel feature space. Extended abstract; full version available at https://arxiv.org/abs/2111.04888.
AB - We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector and output a near minimizer to the objective function. For ℓp norm regression for any 0 < p < ∞, we give an algorithm based on Lewis weight sampling which outputs a (1+ε)-approximate solution using just O(d/ε^2) queries to b for p ∈ (0,1), O(d/ε) queries for p ∈ (1,2), and O(d^(p/2)/ε^p) queries for p ∈ (2,∞). For p ∈ (0,2), our bounds are optimal up to logarithmic factors, thus settling the query complexity for this range of p. For p ∈ (2,∞), our dependence on d is optimal, while our dependence on ε is off by at most a single ε factor, up to logarithmic factors. Our result resolves an open question of Chen and Dereziński, who gave near optimal bounds for the ℓ1 norm, but required at least d^2/ε^2 samples for ℓp regression with p ∈ (1,2), and gave no bounds for p ∈ (2,∞) or p ∈ (0,1). We also provide the first total sensitivity upper bound for loss functions with at most degree-p polynomial growth. This improves a recent result of Tukan, Maalouf, and Feldman. By combining this with our techniques for ℓp regression, we obtain the first active regression algorithms for such loss functions, including the important cases of the Tukey and Huber losses. This answers another question of Chen and Dereziński. Our sensitivity bounds also give improvements to a variety of previous results using sensitivity sampling, including Orlicz norm subspace embeddings, robust subspace approximation, and dimension reduction for smoothed ℓp norms. Finally, our active sampling results give the first sublinear time algorithms for Kronecker product regression under every ℓp norm. Previous results required reading the entire b vector in the kernel feature space. Extended abstract; full version available at https://arxiv.org/abs/2111.04888.
KW - active learning
KW - linear regression
UR - http://www.scopus.com/inward/record.url?scp=85141577863&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85141577863&partnerID=8YFLogxK
U2 - 10.1109/FOCS54457.2022.00076
DO - 10.1109/FOCS54457.2022.00076
M3 - Conference contribution
AN - SCOPUS:85141577863
T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
SP - 744
EP - 753
BT - Proceedings - 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science, FOCS 2022
PB - IEEE Computer Society
T2 - 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022
Y2 - 31 October 2022 through 3 November 2022
ER -