Conditional expectation estimation through attributable components

Esteban G. Tabak, Giulio Trigila

Research output: Contribution to journal › Article › peer-review

Abstract

A general methodology is proposed for the explanation of variability in a quantity of interest x in terms of covariates z = (z1, ..., zL). It provides the conditional mean x(z) as a sum of components, where each component is represented as a product of non-parametric one-dimensional functions of each covariate zl that are computed through an alternating projection procedure. Both x and the zl can be real or categorical variables; in addition, some or all values of each zl can be unknown, providing a general framework for multi-clustering, classification and covariate imputation in the presence of confounding factors. The procedure can be considered as a preconditioning step for the more general determination of the full conditional distribution ρ(x|z) through a data-driven optimal-transport barycenter problem. In particular, just iterating the procedure once yields the second order structure (i.e. the covariance) of ρ(x|z). The methodology is illustrated through examples that include the explanation of variability of ground temperature across the continental United States and the prediction of book preference among potential readers.
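The idea of a component built as a product of one-dimensional functions, fitted by alternating updates, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single component, exactly two categorical covariates, fully observed data, and plain least squares with no smoothing or regularization. The function name `fit_rank_one` and all of its parameters are hypothetical, chosen only for this illustration.

```python
def fit_rank_one(x, z1, z2, n1, n2, sweeps=20):
    """Fit x_i ~ u[z1_i] * v[z2_i] by alternating least squares.

    Assumptions (illustrative only): z1 and z2 are integer category
    labels in range(n1) and range(n2), and every value is observed.
    """
    u = [1.0] * n1
    v = [1.0] * n2
    for _ in range(sweeps):
        # With v held fixed, each u[a] solves a one-dimensional
        # least-squares problem over the samples whose z1 equals a.
        num, den = [0.0] * n1, [0.0] * n1
        for xi, a, b in zip(x, z1, z2):
            num[a] += xi * v[b]
            den[a] += v[b] ** 2
        u = [n / d if d > 0 else 0.0 for n, d in zip(num, den)]

        # With u held fixed, update each v[b] in the same way.
        num, den = [0.0] * n2, [0.0] * n2
        for xi, a, b in zip(x, z1, z2):
            num[b] += xi * u[a]
            den[b] += u[a] ** 2
        v = [n / d if d > 0 else 0.0 for n, d in zip(num, den)]
    return u, v


def predict(u, v, a, b):
    """Estimated conditional mean of x given z1 = a and z2 = b."""
    return u[a] * v[b]


# Usage on synthetic, noise-free data laid out on a full grid of categories.
u_true, v_true = [1.0, 2.0, -1.5], [0.5, 3.0]
z1 = [a for a in range(3) for b in range(2)]
z2 = [b for a in range(3) for b in range(2)]
x = [u_true[a] * v_true[b] for a, b in zip(z1, z2)]
u, v = fit_rank_one(x, z1, z2, n1=3, n2=2)
```

Only the product u[a] * v[b] is identifiable here; the individual factors are determined up to a reciprocal scaling. One possible extension, again an assumption rather than something stated in the abstract, would be to fit further components to the residuals and sum them.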

Original language: English (US)
Pages (from-to): 727-754
Number of pages: 28
Journal: Information and Inference
Volume: 7
Issue number: 4
State: Published - Dec 11 2018

Keywords

  • Conditional density estimation
  • Optimal transport
  • Principal component analysis

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Analysis
  • Applied Mathematics
  • Statistics and Probability
  • Numerical Analysis
