Private Sequential Hypothesis Testing for Statisticians: Privacy, Error Rates, and Sample Size

Wanrong Zhang, Yajun Mei, Rachel Cummings

Research output: Contribution to journal › Conference article › peer-review

Abstract

The sequential hypothesis testing problem is a class of statistical analyses where the sample size is not fixed in advance. Instead, the decision process takes in new observations sequentially and makes real-time decisions for testing an alternative hypothesis against a null hypothesis until some stopping criterion is satisfied. In many common applications of sequential hypothesis testing, the data can be highly sensitive and may require privacy protection; for example, sequential hypothesis testing is used in clinical trials, where doctors sequentially collect data from patients and must determine when to stop recruiting patients and whether the treatment is effective. The field of differential privacy has been developed to offer data analysis tools with strong privacy guarantees, and has been commonly applied to machine learning and statistical tasks. In this work, we study the sequential hypothesis testing problem under a slight variant of differential privacy, known as Rényi differential privacy. We present a new private algorithm based on Wald's Sequential Probability Ratio Test (SPRT) that also gives strong theoretical privacy guarantees. We provide theoretical analysis of statistical performance, measured by Type I and Type II errors as well as the expected sample size. We also empirically validate our theoretical results on several synthetic databases, showing that our algorithms also perform well in practice. Unlike previous work in private hypothesis testing, which focused only on the classical fixed-sample setting, our results in the sequential setting allow a conclusion to be reached much earlier, thus saving the cost of collecting additional samples.
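For readers unfamiliar with the test the abstract builds on, the following is a minimal sketch of the classical, non-private Wald SPRT for two simple Gaussian hypotheses. It is not the private algorithm from the paper and adds no noise or privacy mechanism. The function name `sprt_gaussian`, its parameters, and the choice of Gaussian data with known variance are assumptions made for illustration; the stopping thresholds use Wald's standard approximations in terms of the target Type I error `alpha` and Type II error `beta`.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma, alpha, beta):
    """Classical (non-private) Wald SPRT, illustrative only.

    Tests H0: mean = mu0 against H1: mean = mu1 for Gaussian data
    with known standard deviation sigma. Returns (decision, n_used).
    """
    # Wald's approximate thresholds on the cumulative log-likelihood ratio.
    upper = math.log((1 - beta) / alpha)  # crossing above: reject H0
    lower = math.log(beta / (1 - alpha))  # crossing below: accept H0

    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio of N(mu1, sigma^2) vs N(mu0, sigma^2) at x.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    # The stream ended before either threshold was crossed.
    return "undecided", n


if __name__ == "__main__":
    # Hypothetical data stream sitting exactly at the alternative mean.
    print(sprt_gaussian([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0,
                        alpha=0.05, beta=0.05))
```

The sample size here is a random quantity determined by the data: the loop stops as soon as the evidence crosses either threshold, which is the source of the sample savings the abstract describes relative to a fixed-sample test.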

Original language: English (US)
Pages (from-to): 11356-11373
Number of pages: 18
Journal: Proceedings of Machine Learning Research
Volume: 151
State: Published - 2022
Event: 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022 - Virtual, Online, Spain
Duration: Mar 28 2022 - Mar 30 2022

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
