TY - GEN
T1 - A gradual, semi-discrete approach to generative network training via explicit Wasserstein minimization
AU - Chen, Yucheng
AU - Telgarsky, Matus
AU - Zhang, Chao
AU - Bailey, Bolton
AU - Hsu, Daniel
AU - Peng, Jian
N1 - Publisher Copyright:
© 2019 by the Author(s).
PY - 2019
Y1 - 2019
N2 - This paper provides a simple procedure to fit generative networks to target distributions, with the goal of a small Wasserstein distance (or other optimal transport cost). The approach is based on two principles: (a) if the source randomness of the network is a continuous distribution (the "semi-discrete" setting), then the Wasserstein distance is realized by a deterministic optimal transport mapping; (b) given an optimal transport mapping between a generator network and a target distribution, the Wasserstein distance may be decreased via a regression between the generated data and the mapped target points. The procedure here therefore alternates these two steps, forming an optimal transport and regressing against it, gradually adjusting the generator network towards the target distribution. Mathematically, this approach is shown to minimize the Wasserstein distance to both the empirical target distribution, and also its underlying population counterpart. Empirically, good performance is demonstrated on the training and testing sets of the MNIST and Thin-8 data. The paper closes with a discussion of the unsuitability of the Wasserstein distance for certain tasks, as has been identified in prior work (Arora et al., 2017; Huang et al., 2017).
AB - This paper provides a simple procedure to fit generative networks to target distributions, with the goal of a small Wasserstein distance (or other optimal transport cost). The approach is based on two principles: (a) if the source randomness of the network is a continuous distribution (the "semi-discrete" setting), then the Wasserstein distance is realized by a deterministic optimal transport mapping; (b) given an optimal transport mapping between a generator network and a target distribution, the Wasserstein distance may be decreased via a regression between the generated data and the mapped target points. The procedure here therefore alternates these two steps, forming an optimal transport and regressing against it, gradually adjusting the generator network towards the target distribution. Mathematically, this approach is shown to minimize the Wasserstein distance to both the empirical target distribution, and also its underlying population counterpart. Empirically, good performance is demonstrated on the training and testing sets of the MNIST and Thin-8 data. The paper closes with a discussion of the unsuitability of the Wasserstein distance for certain tasks, as has been identified in prior work (Arora et al., 2017; Huang et al., 2017).
UR - http://www.scopus.com/inward/record.url?scp=85078001884&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078001884&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85078001884
T3 - 36th International Conference on Machine Learning, ICML 2019
SP - 1845
EP - 1858
BT - 36th International Conference on Machine Learning, ICML 2019
PB - International Machine Learning Society (IMLS)
T2 - 36th International Conference on Machine Learning, ICML 2019
Y2 - 9 June 2019 through 15 June 2019
ER -
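
The abstract's alternating procedure (form an optimal transport plan, then regress the generator against the mapped target points) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy 2-D target, the batch-wise Hungarian-assignment approximation to the optimal transport map, the generator architecture, and all hyperparameters are assumptions made purely for illustration.

# Minimal sketch of the alternating OT/regression procedure described
# in the abstract. NOT the authors' code: the toy target, assignment-based
# OT approximation, network, and hyperparameters are illustrative choices.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

torch.manual_seed(0)

# Toy empirical target distribution: 2-D points (stand-in for MNIST/Thin-8).
target = torch.randn(512, 2) * 0.5 + torch.tensor([2.0, -1.0])

# Small generator mapping continuous source noise to data space.
G = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

for step in range(200):
    # (a) Form a transport plan: with equal-size batches, the optimal
    # coupling under squared-Euclidean cost is a permutation, found here
    # with the Hungarian algorithm on the pairwise cost matrix.
    z = torch.randn(256, 8)
    with torch.no_grad():
        fake = G(z)
        real = target[torch.randperm(target.shape[0])[:256]]
        cost = torch.cdist(fake, real).pow(2).numpy()
        rows, cols = linear_sum_assignment(cost)
        mapped = real[cols]  # target point matched to each generated point

    # (b) Regression step: move each generated point toward its mapped
    # target, decreasing the (approximate) Wasserstein-2 distance.
    for _ in range(5):
        opt.zero_grad()
        loss = ((G(z) - mapped) ** 2).sum(dim=1).mean()
        loss.backward()
        opt.step()

    if step % 50 == 0:
        print(f"step {step:3d}  regression loss {loss.item():.4f}")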