Abstract
A methodology is proposed to estimate, from samples, the probability density of a random variable x conditional on the values of a set of covariates {z_l}. The methodology relies on a data-driven formulation of the Wasserstein barycenter, posed as a minimax problem in terms of the conditional map carrying each sample point to the barycenter and a potential characterizing the inverse of this map. This minimax problem is solved by alternating between a flow that develops the map in time and the maximization of the potential through an alternate projection procedure. The dependence on the covariates {z_l} is formulated in terms of convex combinations, so that it can be applied to variables of nearly any type, including real, categorical and distributional. The methodology is illustrated through numerical examples on synthetic and real data. The real-world example is meteorological: forecasting the temperature distribution at a given location as a function of time, and estimating the joint distribution of the highest and lowest daily temperatures at a location as a function of the date.
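To make the barycenter idea concrete, the following is a minimal sketch and not the minimax procedure of the paper. It assumes a one-dimensional variable x and a single categorical covariate with two values, a special case in which the Wasserstein-2 barycenter has a closed form: its quantile function is the convex combination of the group quantile functions. The function names (`barycenter_quantiles`, `to_barycenter`), the synthetic Gaussian data, and the quantile grid are all illustrative assumptions.

```python
import numpy as np

def barycenter_quantiles(groups, weights, levels):
    """Quantiles of the 1-D W2 barycenter: convex combination of group quantiles."""
    q = np.stack([np.quantile(g, levels) for g in groups])  # shape (K, L)
    return weights @ q                                       # shape (L,)

def to_barycenter(x, group, levels, bary_q):
    """Carry samples x of one group to the barycenter by matching quantile levels."""
    group_q = np.quantile(group, levels)
    u = np.interp(x, group_q, levels)    # approximate CDF level of x within its group
    return np.interp(u, levels, bary_q)  # barycenter quantile at that level

# Synthetic data (assumed): x | z=0 ~ N(-2, 1), x | z=1 ~ N(3, 2).
rng = np.random.default_rng(0)
groups = [rng.normal(-2.0, 1.0, 5000), rng.normal(3.0, 2.0, 5000)]
weights = np.array([0.5, 0.5])           # convex combination weights
levels = np.linspace(0.01, 0.99, 99)     # quantile grid; tails beyond it are clamped

bary_q = barycenter_quantiles(groups, weights, levels)
mapped = [to_barycenter(g, g, levels, bary_q) for g in groups]
pooled = np.concatenate(mapped)
```

After the mapping, both groups share approximately the same distribution, so the variability attributable to the covariate has been removed. For equal-weight Gaussians the barycenter is again Gaussian, with mean and standard deviation equal to the averages of the group values. Inverting the monotone map for a chosen covariate value would return samples of x conditional on that value.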
Original language | English (US) |
---|---|
Pages (from-to) | 665-688 |
Number of pages | 24 |
Journal | Machine Learning |
Volume | 109 |
Issue number | 4 |
State | Published - Apr 1 2020 |
Keywords
- Conditional density estimation
- Confounding factors
- Explanation of variability
- Optimal transport
- Sampling
- Uncertainty quantification
- Wasserstein barycenter
ASJC Scopus subject areas
- Software
- Artificial Intelligence