
A domain-theoretic framework for robustness analysis of neural networks

Published online by Cambridge University Press: 23 May 2023

Can Zhou
Affiliation: Department of Computer Science, University of Oxford, Oxford, UK

Razin A. Shaikh
Affiliation: Department of Computer Science, University of Oxford, Oxford, UK; Quantinuum Ltd., Oxford, UK

Yiran Li
Affiliation: School of Computer Science, University of Nottingham Ningbo China, Ningbo, China

Amin Farjudian*
Affiliation: School of Computer Science, University of Nottingham Ningbo China, Ningbo, China

*Corresponding author. Email: Amin.Farjudian@gmail.com

Abstract

A domain-theoretic framework is presented for validated robustness analysis of neural networks. First, global robustness of a general class of networks is analyzed. Then, using the fact that Edalat’s domain-theoretic L-derivative coincides with Clarke’s generalized gradient, the framework is extended to attack-agnostic local robustness analysis. The proposed framework is ideal for designing algorithms which are correct by construction. This claim is exemplified by developing a validated algorithm for estimating the Lipschitz constant of feedforward regressors. The completeness of the algorithm is proved over differentiable networks and also over general position ${\mathrm{ReLU}}$ networks. Computability results are obtained within the framework of effectively given domains. Using the proposed domain model, differentiable and non-differentiable networks can be analyzed uniformly. The validated algorithm is implemented using arbitrary-precision interval arithmetic, and the results of some experiments are presented. The software implementation is truly validated, as it handles floating-point errors as well.
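
To make the abstract’s central computation concrete, here is a minimal sketch, in plain Python, of validated Lipschitz estimation by interval arithmetic for a one-hidden-layer ReLU regressor f(x) = w2 · relu(W1 x + b1). It is an illustration under stated assumptions, not the paper’s algorithm: the names (W1, b1, w2, box) are hypothetical; the interval operations below use ordinary floating point, whereas the paper’s implementation uses arbitrary-precision interval arithmetic with outward rounding to absorb floating-point error; and the Clarke gradient of ReLU at 0 is enclosed crudely by the interval [0, 1].

```python
# Hypothetical sketch: certified upper bound on the Lipschitz constant
# (max-norm on inputs) of f(x) = w2 . relu(W1 @ x + b1) over a box,
# using naive interval arithmetic.  The Clarke gradient satisfies
#   grad f(x) = W1^T diag(s) w2,   with s_i in relu'(z_i), a subset of [0, 1],
# so an enclosure of the pre-activations z = W1 @ x + b1 yields an
# enclosure of the gradient, hence a sound bound on its 1-norm.

def iv_add(a, b):
    """Interval addition."""
    return (a[0] + b[0], a[1] + b[1])

def iv_scale(c, a):
    """Scalar times interval (handles negative scalars)."""
    lo, hi = c * a[0], c * a[1]
    return (min(lo, hi), max(lo, hi))

def relu_deriv_iv(z):
    """Enclosure of the Clarke subdifferential of ReLU over interval z."""
    if z[0] > 0:
        return (1.0, 1.0)
    if z[1] < 0:
        return (0.0, 0.0)
    return (0.0, 1.0)  # z straddles 0, where relu' ranges over [0, 1]

def lipschitz_bound(W1, b1, w2, box):
    """Upper bound on sup ||grad f||_1 over the input box."""
    n_hidden, n_in = len(W1), len(W1[0])
    # Interval enclosure of the pre-activations z = W1 @ x + b1.
    z = []
    for i in range(n_hidden):
        acc = (b1[i], b1[i])
        for j in range(n_in):
            acc = iv_add(acc, iv_scale(W1[i][j], box[j]))
        z.append(acc)
    s = [relu_deriv_iv(zi) for zi in z]
    # Component j of the gradient: sum_i W1[i][j] * s_i * w2[i].
    bound = 0.0
    for j in range(n_in):
        g = (0.0, 0.0)
        for i in range(n_hidden):
            g = iv_add(g, iv_scale(W1[i][j] * w2[i], s[i]))
        bound += max(abs(g[0]), abs(g[1]))
    return bound

# Toy network: f(x) = relu(x1 + x2) - 0.5 * relu(x1 - x2) on [-1, 1]^2.
W1 = [[1.0, 1.0], [1.0, -1.0]]
b1 = [0.0, 0.0]
w2 = [1.0, -0.5]
print(lipschitz_bound(W1, b1, w2, [(-1.0, 1.0), (-1.0, 1.0)]))  # 2.5
```

On this toy network the sketch returns 2.5, a sound overestimate of the exact constant 2: such conservatism is the price of a validated enclosure and, as with interval methods generally, can be reduced by subdividing the input box.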

Type: Paper
Copyright: © The Author(s), 2023. Published by Cambridge University Press

References

Abramsky, S. and Jung, A. (1994). Domain theory. In: Abramsky, S., Gabbay, D. M. and Maibaum, T. S. E. (eds.) Handbook of Logic in Computer Science, vol. 3, Oxford, Clarendon Press, 1–168.
Abramsky, S. (1990). Abstract interpretation, logical relations, and Kan extensions. Journal of Logic and Computation 1 (1) 5–40.
Albiac, F. and Kalton, N. J. (2006). Topics in Banach Space Theory, New York, Springer.
Araujo, A., Négrevergne, B., Chevaleyre, Y. and Atif, J. (2021). On Lipschitz regularization of convolutional layers using Toeplitz matrix theory. In: AAAI.
Bartlett, P. L., Long, P. M., Lugosi, G. and Tsigler, A. (2020). Benign overfitting in linear regression. Proceedings of the National Academy of Sciences 117 (48) 30063–30070. doi: 10.1073/pnas.1907378117.
Belkin, M. (2021). Fit without fear: Remarkable mathematical phenomena of deep learning through the prism of interpolation. Acta Numerica 30 203–248. doi: 10.1017/S0962492921000039.
Bhowmick, A., D’Souza, M. and Raghavan, G. S. (2021). LipBaB: Computing exact Lipschitz constant of ReLU networks. In: Farkaš, I., Masulli, P., Otte, S. and Wermter, S. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2021, Cham, Springer International Publishing, 151–162. doi: 10.1007/978-3-030-86380-7_13.
Burn, G. L., Hankin, C. and Abramsky, S. (1986). Strictness analysis for higher-order functions. Science of Computer Programming 7 249–278.
Carlini, N. and Wagner, D. A. (2017). Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22–26, 2017, IEEE Computer Society, 39–57. doi: 10.1109/SP.2017.49.
Chaudhuri, S., Gulwani, S. and Lublinerman, R. (2012). Continuity and robustness of programs. Communications of the ACM 55 (8) 107–115.
Chaudhuri, S., Gulwani, S., Lublinerman, R. and Navidpour, S. (2011). Proving programs robust. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ESEC/FSE’11, Association for Computing Machinery, 102–112.
Chen, T., Lasserre, J. B., Magron, V. and Pauwels, E. (2020). Semialgebraic optimization for Lipschitz constants of ReLU networks. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. and Lin, H. (eds.) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6–12, 2020, virtual.
Clarke, F. H., Ledyaev, Y. S., Stern, R. J. and Wolenski, P. R. (1998). Nonsmooth Analysis and Control Theory, New York, Springer.
Clarke, F. H. (1990). Optimization and Nonsmooth Analysis, Philadelphia, Classics in Applied Mathematics, Society for Industrial and Applied Mathematics.
Cousot, P. and Cousot, R. (1977). Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. POPL’77, 238–252.
Di Gianantonio, P. (1996). Real number computability and domain theory. Information and Computation 127 (1) 11–25.
Di Gianantonio, P. and Edalat, A. (2013). A language for differentiable functions. In: Pfenning, F. (ed.) Foundations of Software Science and Computation Structures, Berlin, Heidelberg, Springer Berlin Heidelberg, 337–352. doi: 10.1007/978-3-642-37075-5_22.
Dietterich, T. G. (2017). Steps toward robust artificial intelligence. AI Magazine 38 (3) 3–24. doi: 10.1609/aimag.v38i3.2756.
Dragičević, T., Wheeler, P. and Blaabjerg, F. (2019). Artificial intelligence aided automated design for reliability of power electronic systems. IEEE Transactions on Power Electronics 34 (8) 7161–7171. doi: 10.1109/TPEL.2018.2883947.
Edalat, A. (1995a). Domain theory in learning processes. Electronic Notes in Theoretical Computer Science 1. MFPS XI, Mathematical Foundations of Programming Semantics, Eleventh Annual Conference, 114–131.
Edalat, A. (1995b). Dynamical systems, measures and fractals via domain theory. Information and Computation 120 (1) 32–48.
Edalat, A., Lieutier, A. and Pattinson, D. (2013). A computational model for multi-variable differential calculus. Information and Computation 224 23–45.
Edalat, A. and Pattinson, D. (2007). A domain-theoretic account of Picard’s theorem. LMS Journal of Computation and Mathematics 10 83–118.
Edalat, A. and Pattinson, D. (2005). Inverse and implicit functions in domain theory. In: Proceedings of the 20th Annual IEEE Symposium on Logic in Computer Science, IEEE Computer Society, 417–426.
Edalat, A. and Sünderhauf, P. (1999). A domain theoretic approach to computability on the real line. Theoretical Computer Science 210 73–98.
Edalat, A. (2008). A continuous derivative for real-valued functions. In: New Computational Paradigms: Changing Conceptions of What is Computable, Springer, 493–519.
Edalat, A. (1997). Domains for computation in mathematics, physics and exact real arithmetic. Bulletin of Symbolic Logic 3 (4) 401–452.
Edalat, A. and Escardó, M. H. (2000). Integration in real PCF. Information and Computation 160 (1–2) 128–166.
Edalat, A., Farjudian, A., Mohammadian, M. and Pattinson, D. (2020). Domain theoretic second-order Euler’s method for solving initial value problems. Electronic Notes in Theoretical Computer Science 352. The 36th Mathematical Foundations of Programming Semantics Conference, 2020 (MFPS 2020), Paris, France, 105–128.
Edalat, A. and Lieutier, A. (2004). Domain theory and differential calculus (functions of one variable). Mathematical Structures in Computer Science 14 (6) 771–802.
Edalat, A. and Pattinson, D. (2007). Denotational semantics of hybrid automata. The Journal of Logic and Algebraic Programming 73 (1) 3–21.
Erker, T., Escardó, M. H. and Keimel, K. (1998). The way-below relation of function spaces over semantic domains. Topology and its Applications 89 (1) 61–74. doi: 10.1016/S0166-8641(97)00226-5.
Escardó, M. H. (1996). PCF extended with real numbers. Theoretical Computer Science 162 79–115.
Escardó, M. H. (1998). Properly injective spaces and function spaces. Topology and its Applications 89 (1) 75–120. doi: 10.1016/S0166-8641(97)00225-3.
Farjudian, A. (2007). Shrad: A language for sequential real number computation. Theory of Computing Systems 41 (1) 49–105.
Farjudian, A. and Moggi, E. (2022). Robustness, Scott Continuity, and Computability. arXiv:2208.12347 [cs.LO].
Fazlyab, M., Robey, A., Hassani, H., Morari, M. and Pappas, G. J. (2019). Efficient and accurate estimation of Lipschitz constants for deep neural networks. In: Wallach, H. M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E. B. and Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, 11423–11434.
Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S. and Vechev, M. (2018). AI2: Safety and robustness certification of neural networks with abstract interpretation. In: 2018 IEEE Symposium on Security and Privacy (SP), 3–18.
Gierz, G., Hofmann, K. H., Keimel, K., Lawson, J. D., Mislove, M. W. and Scott, D. S. (2003). Continuous Lattices and Domains, Encyclopedia of Mathematics and its Applications, vol. 93, New York, Cambridge University Press.
Gierz, G., Hofmann, K. H., Keimel, K., Lawson, J. D., Mislove, M. W. and Scott, D. S. (1980). A Compendium of Continuous Lattices, Berlin Heidelberg, Springer.
Goodfellow, I. J., Shlens, J. and Szegedy, C. (2015). Explaining and harnessing adversarial examples. CoRR abs/1412.6572. http://arxiv.org/abs/1412.6572.
Goubault-Larrecq, J. (2013). Non-Hausdorff Topology and Domain Theory, New York, Cambridge University Press.
de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction, New York, Springer.
Hashemi, N., Ruths, J. and Fazlyab, M. (2021). Certifying incremental quadratic constraints for neural networks via convex optimization. In: Jadbabaie, A., Lygeros, J., Pappas, G. J., Parrilo, P. A., Recht, B., Tomlin, C. J. and Zeilinger, M. N. (eds.) Proceedings of the 3rd Annual Conference on Learning for Dynamics and Control, L4DC 2021, 7–8 June 2021, Virtual Event, Switzerland, Proceedings of Machine Learning Research, vol. 144, 842–853.
Heaven, D. (2019). Why deep-learning AIs are so easy to fool. Nature 574 163–166.
Hein, M. and Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, Long Beach, California, USA, 2263–2273.
Hertling, P. (2017). Clarke’s generalized gradient and Edalat’s L-derivative. Journal of Logic & Analysis 9 1–21.
Jech, T. (2002). Set Theory, Berlin Heidelberg, Springer.
Jia, K. and Rinard, M. (2021). Exploiting verified neural networks via floating point numerical error. In: Dragoi, C., Mukherjee, S. and Namjoshi, K. S. (eds.) Static Analysis – 28th International Symposium, SAS 2021, Chicago, IL, USA, October 17–19, 2021, Proceedings, Lecture Notes in Computer Science, vol. 12913, Springer, 191–205. doi: 10.1007/978-3-030-88806-0_9.
Jordan, M. and Dimakis, A. G. (2020). Exactly computing the local Lipschitz constant of ReLU networks. In: 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
Jordan, M. and Dimakis, A. G. (2021). Provable Lipschitz certification for generative models. In: Meila, M. and Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 139, 5118–5126.
Keimel, K. (2017). Domain theory: its ramifications and interactions. Electronic Notes in Theoretical Computer Science 333 (Supplement C). The Seventh International Symposium on Domain Theory and Its Applications (ISDT), 3–16.
Ko, C.-Y., Lyu, Z., Weng, L., Daniel, L., Wong, N. and Lin, D. (2019). POPQORN: Quantifying robustness of recurrent neural networks. In: Chaudhuri, K. and Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 97, 3468–3477.
Ko, K.-I. (1991). Complexity Theory of Real Functions, Boston, Birkhäuser.
Konečný, M. and Farjudian, A. (2010a). Compositional semantics of dataflow networks with query-driven communication of exact values. Journal of Universal Computer Science 16 (18) 2629–2656.
Konečný, M. and Farjudian, A. (2010b). Semantics of query-driven communication of exact values. Journal of Universal Computer Science 16 (18) 2597–2628.
Latorre, F., Rolland, P. and Cevher, V. (2020). Lipschitz constant estimation of neural networks via sparse polynomial optimization. In: International Conference on Learning Representations.
Laurel, J., Yang, R., Singh, G. and Misailovic, S. (2022). A dual number abstraction for static analysis of Clarke Jacobians. Proceedings of the ACM on Programming Languages 6. doi: 10.1145/3498718.
Lee, S., Lee, J. and Park, S. (2020). Lipschitz-certifiable training with a tight outer bound. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F. and Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, Curran Associates, Inc., 16891–16902.
Mirman, M., Gehr, T. and Vechev, M. T. (2018). Differentiable abstract interpretation for provably robust neural networks. In: Dy, J. G. and Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018, Proceedings of Machine Learning Research, vol. 80, PMLR, 3575–3583.
Moggi, E., Farjudian, A., Duracz, A. and Taha, W. (2018). Safe & robust reachability analysis of hybrid systems. Theoretical Computer Science 747 75–99.
Moggi, E., Farjudian, A. and Taha, W. (2019a). System analysis and robustness. In: Margaria, T., Graf, S. and Larsen, K. G. (eds.) Models, Mindsets, Meta: The What, the How, and the Why Not? Essays Dedicated to Bernhard Steffen on the Occasion of His 60th Birthday, Springer International Publishing, 36–44.
Moggi, E., Farjudian, A. and Taha, W. (2019b). System analysis and robustness. In: Proceedings of the 20th Italian Conference on Theoretical Computer Science, ICTCS 2019, Como, Italy, September 9–11, 2019, 1–7. http://ceur-ws.org/Vol-2504/paper1.pdf.
Moore, R. E., Kearfott, R. B. and Cloud, M. J. (2009). Introduction to Interval Analysis, Philadelphia, PA, USA, Society for Industrial and Applied Mathematics.
Muthukumar, V., Vodrahalli, K., Subramanian, V. and Sahai, A. (2020). Harmless interpolation of noisy data in regression. IEEE Journal on Selected Areas in Information Theory 1 (1) 67–83. doi: 10.1109/JSAIT.2020.2984716.
Neumaier, A. (1993). The wrapping effect, ellipsoid arithmetic, stability and confidence regions. In: Albrecht, R., Alefeld, G. and Stetter, H. J. (eds.) Validation Numerics: Theory and Applications, Springer, 175–190. doi: 10.1007/978-3-7091-6918-6_14.
Pauli, P., Koch, A., Berberich, J., Kohler, P. and Allgöwer, F. (2022). Training robust neural networks using Lipschitz bounds. IEEE Control Systems Letters 6 121–126. doi: 10.1109/LCSYS.2021.3050444.
Prabhakar, P. and Afzal, Z. R. (2019). Abstraction based output range analysis for neural networks. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS 2019), 15788–15798.
Revol, N. and Rouillier, F. (2005). Motivations for an arbitrary precision interval arithmetic and the MPFI library. Reliable Computing 11 (4) 275–290.
Rudd, K. (2013). Solving Partial Differential Equations Using Artificial Neural Networks. PhD thesis, Department of Mechanical Engineering and Material Sciences, Duke University.
Rudin, W. (1991). Functional Analysis, 2nd ed., Singapore, McGraw-Hill.
Scott, D. (1970). Outline of a mathematical theory of computation. In: Proceedings of the Fourth Annual Princeton Conference on Information Sciences and Systems, 169–176.
Serban, A., Poll, E. and Visser, J. (2020). Adversarial examples on object recognition: A comprehensive survey. ACM Computing Surveys 53 (3) 1–38. doi: 10.1145/3398394.
Simpson, A. K. (1998). Lazy functional algorithms for exact real functionals. In: Brim, L., Gruska, J. and Zlatuška, J. (eds.) Mathematical Foundations of Computer Science 1998, Springer, 456–464.
Singh, G., Gehr, T., Püschel, M. and Vechev, M. (2019). An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages 3 (POPL) 1–30.
Smyth, M. B. (1977). Effectively given domains. Theoretical Computer Science 5 257–274.
Sotoudeh, M. and Thakur, A. V. (2020). Abstract neural networks. In: Pichardie, D. and Sighireanu, M. (eds.) Static Analysis – 27th International Symposium, SAS 2020, Virtual Event, November 18–20, 2020, Proceedings, Lecture Notes in Computer Science, vol. 12389, 65–88. doi: 10.1007/978-3-030-65474-0_4.
Stanley, R. P. (2006). An Introduction to Hyperplane Arrangements, IAS/Park City Mathematics Series.
Stoltenberg-Hansen, V. and Tucker, J. V. (1999). Concrete models of computation for topological algebras. Theoretical Computer Science 219 (1) 347–378.
Stoltenberg-Hansen, V. and Tucker, J. V. (1995). Effective algebras. In: Abramsky, S., Gabbay, D. M. and Maibaum, T. S. E. (eds.) Handbook of Logic in Computer Science, Semantic Modelling, vol. IV, Oxford University Press, 357–526.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. and Fergus, R. (2014). Intriguing Properties of Neural Networks. arXiv: 1312.6199 [cs.CV].
Tankimanova, A. and James, A. P. (2018). Neural network-based analog-to-digital converters. In: James, A. P. (ed.) Memristor and Memristive Neural Networks, Rijeka, IntechOpen, Chap. 14. doi: 10.5772/intechopen.73038.
Virmaux, A. and Scaman, K. (2018). Lipschitz regularity of deep neural networks: Analysis and efficient estimation. In: Bengio, S., Wallach, H. M., Larochelle, H., Grauman, K., Cesa-Bianchi, N. and Garnett, R. (eds.) Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, 3839–3848.
Wang, S., Pei, K., Whitehouse, J., Yang, J. and Jana, S. (2018). Formal security analysis of neural networks using symbolic intervals. In: Proceedings of the 27th USENIX Conference on Security Symposium. SEC’18, Baltimore, MD, USA, USENIX Association, 1599–1614.
Weihrauch, K. (2000). Computable Analysis: An Introduction, Berlin Heidelberg, Springer.
Weng, T.-W., Zhang, H., Chen, P.-Y., Yi, J., Su, D., Gao, Y., Hsieh, C.-J. and Daniel, L. (2018a). Evaluating the robustness of neural networks: An extreme value theory approach. In: International Conference on Learning Representations.
Weng, T.-W., Zhang, H., Chen, H., Song, Z., Hsieh, C.-J., Boning, D., Dhillon, I. S. and Daniel, L. (2018b). Towards fast computation of certified robustness for ReLU networks. In: International Conference on Machine Learning (ICML).
Wong, E. and Kolter, J. Z. (2018). Provable defenses against adversarial examples via the convex outer adversarial polytope. In: Dy, J. G. and Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018, Proceedings of Machine Learning Research, vol. 80, PMLR, 5283–5292.
Wong, E., Schmidt, F. R. and Kolter, J. Z. (2019). Wasserstein Adversarial Examples via Projected Sinkhorn Iterations. arXiv: 1902.07906 [cs.LG].
Zombori, D., Bánhelyi, B., Csendes, T., Megyeri, I. and Jelasity, M. (2021). Fooling a complete neural network verifier. In: International Conference on Learning Representations.