
Stochastic differential equation approximations of generative adversarial network training and its long-run behavior

Published online by Cambridge University Press:  02 October 2023

Haoyang Cao* (École Polytechnique) and Xin Guo** (University of California, Berkeley)

*Postal address: Centre de Mathématiques Appliquées, École Polytechnique, Route de Saclay, 91128 Palaiseau Cedex, France. Email: haoyang.cao@polytechnique.edu
**Postal address: Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, CA 94720, USA. Email: xinguo@berkeley.edu

Abstract

This paper analyzes the training process of generative adversarial networks (GANs) via stochastic differential equations (SDEs). It first establishes SDE approximations for the training of GANs under stochastic gradient algorithms, with precise error bound analysis. It then describes the long-run behavior of GAN training via the invariant measures of its SDE approximations under proper conditions. This work builds a theoretical foundation for GAN training and provides analytical tools to study its evolution and stability.
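The abstract's central idea, approximating stochastic-gradient training by an SDE simulated at the same step size, can be illustrated on a toy bilinear min-max game. The sketch below is illustrative only and is not the paper's construction: the game min_x max_y xy, the additive gradient noise, the step size, and the noise level are all assumptions chosen for simplicity. It compares noisy gradient descent-ascent iterates with an Euler–Maruyama discretisation of a candidate approximating SDE whose diffusion is scaled by the square root of the learning rate, the standard scaling in SDE approximations of stochastic gradient methods.

```python
import numpy as np

# Toy illustration (assumptions, not the paper's setup): stochastic
# gradient descent-ascent (SGDA) on the bilinear game min_x max_y x*y,
# alongside an Euler--Maruyama simulation of a candidate SDE limit
#   dX = -Y dt + sqrt(eta)*sigma dW1,   dY = X dt + sqrt(eta)*sigma dW2,
# run at time step eta so the per-step noise matches SGDA's.
rng = np.random.default_rng(0)
eta, sigma, n_steps = 0.01, 0.1, 2000

# Discrete SGDA with additive gradient noise.
x, y = 1.0, 0.0
for _ in range(n_steps):
    gx = y + sigma * rng.standard_normal()   # noisy d/dx of x*y
    gy = x + sigma * rng.standard_normal()   # noisy d/dy of x*y
    x, y = x - eta * gx, y + eta * gy

# Euler--Maruyama discretisation of the approximating SDE.
X, Y = 1.0, 0.0
for _ in range(n_steps):
    dW1, dW2 = np.sqrt(eta) * rng.standard_normal(2)  # dW ~ N(0, eta)
    X, Y = (X - eta * Y + np.sqrt(eta) * sigma * dW1,
            Y + eta * X + np.sqrt(eta) * sigma * dW2)

# The deterministic flow of this game is a rotation, so over a moderate
# horizon both paths keep x^2 + y^2 close to its initial value 1
# (up to the slow radial drift of the explicit discretisation).
r_sgda = x * x + y * y
r_sde = X * X + Y * Y
print(r_sgda, r_sde)
```

Because the bilinear game has no attracting equilibrium for plain descent-ascent, the long-run behavior here is oscillatory; the SDE view makes this kind of non-convergence, and the invariant measures that describe the long-run regime in better-behaved settings, amenable to analysis.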

Type: Original Article
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

