## Abstract

We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector b and output a near-minimizer of the objective function. For ℓ_p norm regression for any 0 < p < ∞, we give an algorithm based on Lewis weight sampling which outputs a (1+ε)-approximate solution using just O(d/ε²) queries to b for p ∈ (0,1), O(d/ε) queries for p ∈ (1,2), and O(d^{p/2}/ε^p) queries for p ∈ (2, ∞). For p ∈ (0,2), our bounds are optimal up to logarithmic factors, thus settling the query complexity for this range of p. For p ∈ (2, ∞), our dependence on d is optimal, while our dependence on ε is off by at most a single ε factor, up to logarithmic factors. Our result resolves an open question of Chen and Dereziński, who gave near-optimal bounds for the ℓ₁ norm, but required at least d²/ε² samples for ℓ_p regression with p ∈ (1,2), and gave no bounds for p ∈ (2, ∞) or p ∈ (0,1). We also provide the first total sensitivity upper bound for loss functions with at most degree-p polynomial growth. This improves a recent result of Tukan, Maalouf, and Feldman. By combining this with our techniques for ℓ_p regression, we obtain the first active regression algorithms for such loss functions, including the important cases of the Tukey and Huber losses. This answers another question of Chen and Dereziński. Our sensitivity bounds also give improvements to a variety of previous results using sensitivity sampling, including Orlicz norm subspace embeddings, robust subspace approximation, and dimension reduction for smoothed p-norms. Finally, our active sampling results give the first sublinear time algorithms for Kronecker product regression under every ℓ_p norm. Previous results required reading the entire vector b in the kernel feature space.¹

¹ Extended abstract; full version available at https://arxiv.org/abs/2111.04888.
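To make the sampling idea concrete, here is a minimal illustrative sketch of the p = 2 special case, where Lewis weights coincide with statistical leverage scores: rows are sampled proportionally to leverage, only the corresponding entries of b are queried, and a reweighted least-squares problem is solved on the sample. This is a toy numpy illustration of the general principle, not the paper's algorithm; all variable names and parameters (n, d, m, the noise level) are hypothetical.

```python
import numpy as np

# Toy sketch of active l2 regression via leverage-score sampling
# (for p = 2, Lewis weights reduce to leverage scores).
rng = np.random.default_rng(0)
n, d = 2000, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Leverage scores: squared row norms of an orthonormal basis for col(A);
# they sum to d (= rank of A here).
Q, _ = np.linalg.qr(A)
lev = np.sum(Q**2, axis=1)

# Sample m row indices with probability proportional to leverage,
# querying only those m entries of b, and reweight for unbiasedness.
m = 200
p = lev / lev.sum()
idx = rng.choice(n, size=m, replace=True, p=p)
w = 1.0 / np.sqrt(m * p[idx])

x_hat = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)[0]

# Compare the cost of the sampled solution to the full least-squares cost.
x_full = np.linalg.lstsq(A, b, rcond=None)[0]
ratio = np.linalg.norm(A @ x_hat - b) / np.linalg.norm(A @ x_full - b)
print(ratio)  # close to 1, despite querying only m << n entries of b
```

The reweighting by 1/sqrt(m·p_i) makes the sampled quadratic objective an unbiased estimate of the full one; for general p, the paper's Lewis weight sampling plays the role that leverage scores play here.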

Original language | English (US)
---|---
Title of host publication | Proceedings - 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science, FOCS 2022
Publisher | IEEE Computer Society
Pages | 744-753
Number of pages | 10
ISBN (Electronic) | 9781665455190
DOIs |
State | Published - 2022
Event | 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022 - Denver, United States. Duration: Oct 31 2022 → Nov 3 2022

### Publication series

Name | Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
---|---
Volume | 2022-October
ISSN (Print) | 0272-5428

### Conference

Conference | 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022
---|---
Country/Territory | United States
City | Denver
Period | 10/31/22 → 11/3/22

## Keywords

- active learning
- linear regression

## ASJC Scopus subject areas

- General Computer Science

## Fingerprint

Dive into the research topics of 'Active Linear Regression for ℓ_p Norms and Beyond'. Together they form a unique fingerprint.