
A multi-level procedure for enhancing accuracy of machine learning algorithms

Published online by Cambridge University Press:  14 July 2020

KJETIL O. LYE
Affiliation:
SINTEF Digital, Oslo, Norway, email: kjetil.olsen.lye@sintef.no
SIDDHARTHA MISHRA
Affiliation:
Seminar for Applied Mathematics (SAM), D-Math, ETH Zürich, Rämistrasse 101, Zürich-8092, Switzerland, emails: smishra@sam.math.ethz.ch; roberto.molinaro@sam.math.ethz.ch
ROBERTO MOLINARO
Affiliation:
Seminar for Applied Mathematics (SAM), D-Math, ETH Zürich, Rämistrasse 101, Zürich-8092, Switzerland, emails: smishra@sam.math.ethz.ch; roberto.molinaro@sam.math.ethz.ch

Abstract

We propose a multi-level method to increase the accuracy of machine learning algorithms for approximating observables in scientific computing, particularly those that arise in systems modelled by differential equations. The algorithm relies on judiciously combining a large number of computationally cheap training samples on coarse resolutions with a few expensive training samples on fine grid resolutions. Theoretical arguments for lowering the generalisation error, based on reducing the variance of the underlying maps, are provided, and numerical evidence, indicating significant gains over the underlying single-level machine learning algorithms, is presented. Moreover, we also apply the multi-level algorithm in the context of forward uncertainty quantification and observe a considerable speedup over competing algorithms.
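The abstract does not spell out the algorithm, but the idea it describes — many cheap coarse-resolution samples plus a few expensive fine-resolution ones — follows the familiar multi-level (telescoping) pattern. The sketch below is a hypothetical illustration, not the authors' method: a model is fit to the observable on the coarsest grid using many samples, and correction models are fit to the (low-variance) differences between successive resolutions using progressively fewer samples; the prediction is the sum. The `observable` map, the mesh sizes, and the use of polynomial least squares as a stand-in "machine learning algorithm" are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def observable(x, h):
    """Stand-in for an observable computed on a grid of mesh size h:
    the underlying map plus a resolution-dependent discretisation error.
    (Hypothetical; the paper's observables come from differential equations.)"""
    return np.sin(2 * np.pi * x) + h * np.cos(5 * x)

def fit_model(x, y, deg=5):
    """Stand-in 'machine learning algorithm': polynomial least squares."""
    return np.polynomial.Polynomial.fit(x, y, deg)

# Mesh sizes h_0 > h_1 > h_2 with sample counts N_0 > N_1 > N_2:
# many cheap coarse samples, few expensive fine samples (assumed values).
levels = [(0.5, 512), (0.25, 64), (0.125, 16)]

models = []
for ell, (h, n) in enumerate(levels):
    x = rng.uniform(0.0, 1.0, n)
    if ell == 0:
        # Coarsest level: learn the raw (cheap) map itself.
        y = observable(x, h)
    else:
        # Finer levels: learn only the detail between consecutive
        # resolutions, whose variance shrinks as the grids refine.
        h_prev = levels[ell - 1][0]
        y = observable(x, h) - observable(x, h_prev)
    models.append(fit_model(x, y))

def predict(x):
    """Multi-level prediction: coarse model plus summed corrections."""
    return sum(m(x) for m in models)

x_test = np.linspace(0.0, 1.0, 200)
err = np.max(np.abs(predict(x_test) - observable(x_test, levels[-1][0])))
print(f"max error vs finest-resolution map: {err:.3f}")
```

Because each correction has small variance, the few fine-grid samples suffice, which is the source of the claimed accuracy gain per unit of training cost.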

Type
Papers
Copyright
© The Author(s), 2020. Published by Cambridge University Press


