
Stochastic differential equation approximations of generative adversarial network training and its long-run behavior

Published online by Cambridge University Press:  02 October 2023

Haoyang Cao* (École Polytechnique) and Xin Guo** (University of California, Berkeley)

*Postal address: Centre de Mathématiques Appliquées, École Polytechnique, Route de Saclay, 91128 Palaiseau Cedex, France. Email: haoyang.cao@polytechnique.edu
**Postal address: Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, CA 94720, USA. Email: xinguo@berkeley.edu

Abstract

This paper analyzes the training process of generative adversarial networks (GANs) via stochastic differential equations (SDEs). It first establishes SDE approximations for the training of GANs under stochastic gradient algorithms, with precise error bound analysis. It then describes the long-run behavior of GAN training via the invariant measures of its SDE approximations under proper conditions. This work builds a theoretical foundation for GAN training and provides analytical tools to study its evolution and stability.
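The abstract's central idea, approximating stochastic-gradient training by an SDE simulated at the same step size, can be illustrated on a toy bilinear min-max game. The sketch below is illustrative only and is not the paper's construction: the game min_x max_y xy, the additive gradient noise, the step size, and the noise level are all assumptions chosen for simplicity. It compares noisy gradient descent-ascent iterates with an Euler–Maruyama discretisation of a candidate approximating SDE whose diffusion is scaled by the square root of the learning rate, the standard scaling in SDE approximations of stochastic gradient methods.

```python
import numpy as np

# Toy illustration (assumptions, not the paper's setup): stochastic
# gradient descent-ascent (SGDA) on the bilinear game min_x max_y x*y,
# alongside an Euler--Maruyama simulation of a candidate SDE limit
#   dX = -Y dt + sqrt(eta)*sigma dW1,   dY = X dt + sqrt(eta)*sigma dW2,
# run at time step eta so the per-step noise matches SGDA's.
rng = np.random.default_rng(0)
eta, sigma, n_steps = 0.01, 0.1, 2000

# Discrete SGDA with additive gradient noise.
x, y = 1.0, 0.0
for _ in range(n_steps):
    gx = y + sigma * rng.standard_normal()   # noisy d/dx of x*y
    gy = x + sigma * rng.standard_normal()   # noisy d/dy of x*y
    x, y = x - eta * gx, y + eta * gy

# Euler--Maruyama discretisation of the approximating SDE.
X, Y = 1.0, 0.0
for _ in range(n_steps):
    dW1, dW2 = np.sqrt(eta) * rng.standard_normal(2)  # dW ~ N(0, eta)
    X, Y = (X - eta * Y + np.sqrt(eta) * sigma * dW1,
            Y + eta * X + np.sqrt(eta) * sigma * dW2)

# The deterministic flow of this game is a rotation, so over a moderate
# horizon both paths keep x^2 + y^2 close to its initial value 1
# (up to the slow radial drift of the explicit discretisation).
r_sgda = x * x + y * y
r_sde = X * X + Y * Y
print(r_sgda, r_sde)
```

Because the bilinear game has no attracting equilibrium for plain descent-ascent, the long-run behavior here is oscillatory; the SDE view makes this kind of non-convergence, and the invariant measures that describe the long-run regime in better-behaved settings, amenable to analysis.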

Type: Original Article
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

