Minimax estimation of the L1 distance

Jiantao Jiao, Yanjun Han, Tsachy Weissman

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the problem of estimating the L1 distance between two discrete probability measures P and Q from empirical data in a non-asymptotic, large-alphabet setting. When Q is known and one obtains n samples from P, we show that for every Q, the minimax rate-optimal estimator with n samples achieves performance comparable to that of the maximum likelihood estimator with n ln n samples. When both P and Q are unknown, we construct minimax rate-optimal estimators whose worst-case performance is essentially that of the known-Q case with Q uniform, implying that a uniform Q is essentially the most difficult case. The effective sample size enlargement phenomenon, identified by Jiao et al., holds both in the known-Q case for every Q and in the unknown-Q case. However, the construction of optimal estimators for |P - Q|_1 requires new techniques and insights beyond the approximation-based method of functional estimation of Jiao et al.
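For intuition, here is a minimal sketch (illustrative, not from the paper) of the naive plug-in maximum likelihood estimator that the abstract benchmarks against: estimate P and Q by their empirical frequencies and plug them into the L1 functional. The function name and the example setup are assumptions for illustration only; the paper's minimax rate-optimal construction is substantially more involved and is not reproduced here.

```python
import numpy as np

def plugin_l1_distance(samples_p, samples_q, alphabet_size):
    """Plug-in (MLE) estimate of the L1 distance |P - Q|_1 between two
    discrete distributions on {0, ..., alphabet_size - 1}, computed from
    i.i.d. samples of each. This is the baseline estimator, not the
    minimax rate-optimal one constructed in the paper."""
    # Empirical frequencies are the maximum likelihood estimates of P and Q.
    p_hat = np.bincount(samples_p, minlength=alphabet_size) / len(samples_p)
    q_hat = np.bincount(samples_q, minlength=alphabet_size) / len(samples_q)
    # Plug the empirical measures into the functional |. - .|_1.
    return np.abs(p_hat - q_hat).sum()

# Hypothetical example in the large-alphabet regime the abstract considers
# (n much smaller than S ln S): P uniform on S symbols, Q a point mass at 0.
rng = np.random.default_rng(0)
S, n = 10_000, 2_000
samples_p = rng.integers(0, S, size=n)   # n samples from P = uniform
samples_q = np.zeros(n, dtype=np.int64)  # n samples from Q = delta_0
print(plugin_l1_distance(samples_p, samples_q, S))  # true |P - Q|_1 = 2 - 2/S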

Original language: English (US)
Article number: 8379458
Pages (from-to): 6672-6706
Number of pages: 35
Journal: IEEE Transactions on Information Theory
Volume: 64
Issue number: 10
State: Published - Oct 2018

Keywords

  • Divergence estimation
  • functional estimation
  • high-dimensional statistics
  • multivariate approximation theory
  • optimal classification error
  • total variation distance

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
