In this chapter we formulate the general regression problem relevant to function estimation. We begin with simple frequentist methods and quickly move to regression within the Bayesian paradigm. We then present two complementary mathematical formulations: one that relies on Gaussian process priors, appropriate for the regression of continuous quantities, and one that relies on Beta–Bernoulli process priors, appropriate for the regression of discrete quantities. In the context of the Gaussian process, we discuss more advanced topics including various admissible kernel functions, inducing point methods, sampling methods for nonconjugate Gaussian process prior-likelihood pairs, and elliptical slice samplers. For Beta–Bernoulli processes, we address questions of posterior convergence in addition to applications. Taken together, both Gaussian processes and Beta–Bernoulli processes constitute our first foray into Bayesian nonparametrics. With end of chapter projects, we explore more advanced modeling questions relevant to optics and microscopy.
Edited by
Alik Ismail-Zadeh, Karlsruhe Institute of Technology, Germany, Fabio Castelli, Università degli Studi, Florence, Dylan Jones, University of Toronto, Sabrina Sanchez, Max Planck Institute for Solar System Research, Germany
Abstract: In this chapter, we survey some recent developments in the field of geophysical inversion. We aim to provide an accessible general introduction to the breadth of current research, rather than focusing in depth on particular topics. We hope to give the reader an appreciation for the similarities and connections between different approaches, and their relative strengths and weaknesses.
Mobile robots are a key component for the automation of many tasks that either require high precision or are deemed too hazardous for human personnel. One of the typical duties for mobile robots in the industrial sector is to perform trajectory tracking, which involves pursuing a specific path through both space and time. In this paper, an iterative learning-based procedure for highly accurate tracking is proposed. This contribution shows how data-based techniques, namely Gaussian process regression, can be used to tailor a motion model to a specific reoccurring reference. The procedure is capable of explorative behavior meaning that the robot automatically explores states around the prescribed trajectory, enriching the data set for learning and increasing the robustness and practical training accuracy. The trade-off between highly accurate tracking and exploration is done automatically by an optimization-based reference generator using a suitable cost function minimizing the posterior variance of the underlying Gaussian process model. While this study focuses on omnidirectional mobile robots, the scheme can be applied to a wide range of mobile robots. The effectiveness of this approach is validated in meaningful real-world experiments on a custom-built omnidirectional mobile robot where it is shown that explorative behavior can outperform purely exploitative approaches.
Kernel methods provide an alternative family of non-linear methods to neural networks, with the support vector machine being the best known among kernel methods. Almost all linear statistical methods have been non-linearly generalized by the kernel approach, including ridge regression, linear discriminant analysis, principal component analysis, canonical correlation analysis, and so on. The kernel method has also been extended to probabilistic models, for example Gaussian processes.
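As an illustration of how a linear method is generalized by the kernel approach, the following minimal sketch compares ordinary ridge regression with its kernel (dual) form. With a linear kernel the two give the same predictions; swapping in a Gaussian (RBF) kernel gives a non-linear fit. The function names, the synthetic data, and the regularization value `lam=0.5` are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def ridge_primal_predict(X, y, X_new, lam):
    # Ordinary ridge regression: w = (X^T X + lam I)^{-1} X^T y
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return X_new @ w

def kernel_ridge_predict(X, y, X_new, lam, kernel):
    # Dual form: alpha = (K + lam I)^{-1} y, prediction = k(x_new, X) alpha
    K = kernel(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return kernel(X_new, X) @ alpha

def linear_kernel(A, B):
    return A @ B.T

def rbf_kernel(A, B, length_scale=1.0):
    # Squared Euclidean distances between all rows of A and B
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
X_new = rng.normal(size=(5, 3))

pred_primal = ridge_primal_predict(X, y, X_new, lam=0.5)
pred_dual = kernel_ridge_predict(X, y, X_new, lam=0.5, kernel=linear_kernel)
pred_rbf = kernel_ridge_predict(X, y, X_new, lam=0.5, kernel=rbf_kernel)
```

Only the kernel function changes between the linear and non-linear versions, which is the sense in which the kernel approach generalizes the linear method.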
Simulations of future climate contain variability arising from a number of sources, including internal stochasticity and external forcings. However, to the best of our abilities climate models and the true observed climate depend on the same underlying physical processes. In this paper, we simultaneously study the outputs of multiple climate simulation models and observed data, and we seek to leverage their mean structure as well as interdependencies that may reflect the climate’s response to shared forcings. Bayesian modeling provides a fruitful ground for the nuanced combination of multiple climate simulations. We introduce one such approach whereby a Gaussian process is used to represent a mean function common to all simulated and observed climates. Dependent random effects encode possible information contained within and between the plurality of climate model outputs and observed climate data. We propose an empirical Bayes approach to analyze such models in a computationally efficient way. This methodology is amenable to the CMIP6 model ensemble, and we demonstrate its efficacy at forecasting global average near-surface air temperature. Results suggest that this model and the extensions it engenders may provide value to climate prediction and uncertainty quantification.
Wind turbine towers are subjected to highly varying internal loads, characterized by large uncertainty. The uncertainty stems from many factors, including what the actual wind fields experienced over time will be, modeling uncertainties given the various operational states of the turbine with and without controller interaction, the influence of aerodynamic damping, and so forth. To monitor the true experienced loading and assess the fatigue, strain sensors can be installed at fatigue-critical locations on the turbine structure. A more cost-effective and practical solution is to predict the strain response of the structure based only on a number of acceleration measurements. In this contribution, an approach is followed where the dynamic strains in an existing onshore wind turbine tower are predicted using a Gaussian process latent force model. By employing this model, both the applied dynamic loading and strain response are estimated based on the acceleration data. The predicted dynamic strains are validated using strain gauges installed near the bottom of the tower. Fatigue is subsequently assessed by comparing the damage equivalent loads calculated with the predicted as opposed to the measured strains. The results confirm the usefulness of the method for continuous tracking of fatigue life consumption in onshore wind turbine towers.
Tree-ring chronologies encode interannual variability in forest growth rates over long time periods from decades to centuries or even millennia. However, each chronology is a highly localized measurement describing conditions at specific sites where wood samples have been collected. The question of whether these local growth variabilities are representative of large geographical regions remains an open issue. To overcome the limitations of interpreting a sparse network of sites, we propose an upscaling approach for annual tree-ring indices that approximate forest growth variability and compute gridded data products that generalize the available information for multiple tree genera. Using regression approaches from machine learning, we predict tree-ring indices in space and time based on climate variables, while also considering species range maps as constraints for the upscaling. We compare various prediction strategies in cross-validation experiments to identify the best performing setup. Our estimated maps of tree-ring indices are the first data products that provide a dense view on forest growth variability at the continental level with 0.5° and 0.0083° spatial resolution covering the years 1902–2013. Furthermore, we find that different genera show very variable spatial patterns of anomalies. We have selected Europe as study region and focused on the six most prominent tree genera, but our approach is very generic and can easily be applied elsewhere. Overall, the study shows perspectives but also limitations for reconstructing spatiotemporal dynamics of complex biological processes. The data products are available at https://www.doi.org/10.17871/BACI.248.
Reduced-order models (ROMs) are computationally inexpensive simplifications of high-fidelity complex ones. Such models can be found in computational fluid dynamics where they can be used to predict the characteristics of multiphase flows. In previous work, we presented a ROM analysis framework that coupled compression techniques, such as autoencoders, with Gaussian process regression in the latent space. This pairing has significant advantages over the standard encoding–decoding routine, such as the ability to interpolate or extrapolate in the initial conditions’ space, which can provide predictions even when simulation data are not available. In this work, we focus on this major advantage and show its effectiveness by performing the pipeline on three multiphase flow applications. We also extend the methodology by using deep Gaussian processes as the interpolation algorithm and compare the performance of our two variations, as well as another variation from the literature that uses long short-term memory networks, for the interpolation.
Electromagnetic simulation software has become an important tool for antenna design. However, high-fidelity simulation of wideband or ultra-wideband antennas is very expensive. Therefore, antenna design optimization using an electromagnetic solver may be limited by its high computational cost. This problem can be alleviated by the use of fast and accurate surrogate models. Unfortunately, conventional surrogate models for antenna design are usually prohibitive because training data acquisition is time-consuming. To solve this problem, a modeling method named progressive Gaussian process (PGP) is proposed in this study. Specifically, once a Gaussian process (GP) is trained, the test sample with the largest predictive variance is passed to an electromagnetic solver to simulate its response. The test sample is then added to the training set, and the GP is retrained progressively. The process incrementally adds informative, trusted training data and improves the generalization performance of the model. Based on the proposed PGP, two monopole antennas are optimized. The optimization results show the effectiveness and efficiency of the method.
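The progressive training loop described in this abstract (fit a GP, query the candidate with the largest predictive variance, add it to the training set, refit) can be sketched as follows. This is a minimal one-dimensional illustration under stated assumptions, not the authors' implementation: `expensive_solver` is a hypothetical stand-in for the electromagnetic solver, and the RBF kernel, its length scale, and the candidate grid are arbitrary choices.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Unit-variance squared-exponential kernel on 1-D inputs
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    # Standard zero-mean GP regression posterior mean and variance
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

def expensive_solver(x):
    # Hypothetical placeholder for a costly simulation
    return np.sin(6.0 * x)

def progressive_gp(x_init, candidates, n_rounds):
    x_train = np.array(x_init, dtype=float)
    y_train = expensive_solver(x_train)
    max_vars = []
    for _ in range(n_rounds):
        _, var = gp_posterior(x_train, y_train, candidates)
        idx = int(np.argmax(var))            # most uncertain candidate
        max_vars.append(float(var[idx]))
        x_new = candidates[idx]
        x_train = np.append(x_train, x_new)  # grow the training set
        y_train = np.append(y_train, expensive_solver(x_new))
    return x_train, y_train, max_vars

candidates = np.linspace(0.0, 1.0, 101)
x_train, y_train, max_vars = progressive_gp([0.0, 1.0], candidates, n_rounds=6)
```

Each round costs one solver call, so the largest predictive variance over the candidates shrinks while the number of expensive simulations stays small.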
This paper studies the joint tail asymptotics of extrema of the multi-dimensional Gaussian process over random intervals defined as $P(u)\;:\!=\; \mathbb{P}\{\cap_{i=1}^n (\sup_{t\in[0,\mathcal{T}_i]} ( X_{i}(t) +c_i t )>a_i u )\}$, $u\rightarrow\infty$, where $X_i(t)$, $t\ge0$, $i=1,2,\ldots,n$, are independent centered Gaussian processes with stationary increments, $\boldsymbol{\mathcal{T}}=(\mathcal{T}_1, \ldots, \mathcal{T}_n)$ is a regularly varying random vector with positive components, which is independent of the Gaussian processes, and $c_i\in \mathbb{R}$, $a_i>0$, $i=1,2,\ldots,n$. Our result shows that the structure of the asymptotics of $P(u)$ is determined by the signs of the drifts $c_i$. We also discuss a relevant multi-dimensional regenerative model and derive the corresponding ruin probability.
We present a hierarchical Dirichlet regression model with Gaussian process priors that enables accurate and well-calibrated forecasts for U.S. Senate elections at varying time horizons. This Bayesian model provides a balance between predictions based on time-dependent opinion polls and those made based on fundamentals. It also provides uncertainty estimates that arise naturally from historical data on elections and polls. Experiments show that our model is highly accurate and has a well-calibrated coverage rate for vote share predictions at various forecasting horizons. We validate the model with a retrospective forecast of the 2018 cycle as well as a true out-of-sample forecast for 2020. We show that our approach achieves state-of-the-art accuracy and coverage despite relying on few covariates.
Chapter 11 addresses time- and/or space-variant structural reliability problems. It begins with a description of problem types as encroaching or outcrossing, subject to the type of dependence on the time or space variable. A brief review of essentials from the random process theory is presented, including second-moment characterization of the process in terms of mean and auto-covariance functions and the power spectral density. Special attention is given to Gaussian and Poisson processes as building blocks for stochastic load modeling. Bounds to the failure probability are developed in terms of mean crossing rates or using a series system representation through parameter discretization. A Poisson-based approximation for rare failure events is also presented. Next, the Poisson process is used to build idealized stochastic load models that describe macro-level load changes or intermittent occurrences with random magnitudes and durations. The chapter concludes with the development of the load-coincidence method for combination of stochastic loads. The probability distribution of the maximum combined load effect is derived and used to estimate the failure probability.
Nonlinear stochastic dynamics is a broad topic well beyond the scope of this book. Chapter 13 describes a particular method of solution for a certain class of nonlinear stochastic dynamic problem by use of FORM. The approach belongs to the class of solution methods known as equivalent linearization. In this case, the linearization is carried out by replacing the nonlinear system with a linear one that has a tail probability equal to the FORM approximation of the tail probability of the nonlinear system – hence the name tail-equivalent linearization method. The equivalent linear system is obtained non-parametrically in terms of its unit impulse response function. For small failure probabilities, the accuracy of the method is shown to be far superior to those of other linearization methods. Furthermore, the method is able to capture the non-Gaussian distribution of the nonlinear response. This chapter develops this method for systems subjected to Gaussian and non-Gaussian excitations and nonlinear systems with differentiable loading paths. Approximations for level crossing rates and the first-passage probability are also developed. The method is extended to nonlinear structures subjected to multiple excitations, such as bi-component base motion, and to evolutionary input processes.
We study, under mild conditions, the weak approximation constructed from a standard Poisson process for a class of Gaussian processes, and establish its sample path moderate deviations. The techniques consist of a good asymptotic exponential approximation in moderate deviations, the Besov–Lévy modulus embedding, and an exponential martingale technique. Moreover, our results are applied to the weak approximations associated with the moving average of Brownian motion, fractional Brownian motion, and an Ornstein–Uhlenbeck process.
We investigate joint modelling of longevity trends using the spatial statistical framework of Gaussian process (GP) regression. Our analysis is motivated by the Human Mortality Database (HMD) that provides unified raw mortality tables for nearly 40 countries. Yet few stochastic models exist for handling more than two populations at a time. To bridge this gap, we leverage a spatial covariance framework from machine learning that treats populations as distinct levels of a factor covariate, explicitly capturing the cross-population dependence. The proposed multi-output GP models straightforwardly scale up to a dozen populations and moreover intrinsically generate coherent joint longevity scenarios. In our numerous case studies, we investigate predictive gains from aggregating mortality experience across nations and genders, including by borrowing the most recently available “foreign” data. We show that in our approach, information fusion leads to more precise (and statistically more credible) forecasts. We implement our models in R, as well as a Bayesian version in Stan that provides further uncertainty quantification regarding the estimated mortality covariance structure. All examples utilise public HMD datasets.
For a zero-mean, unit-variance stationary univariate Gaussian process we derive the probability that a record at the time $n$, say $X_n$, takes place, and derive its distribution function. We study the joint distribution of the arrival time process of records and the distribution of the increments between records. We compute the expected number of records. We also consider two consecutive and non-consecutive records, one at time $j$ and one at time $n$, and we derive the probability that the joint records $(X_j,X_n)$ occur, as well as their distribution function. The probability that the records $X_n$ and $(X_j,X_n)$ take place and the arrival time of the $n$th record are independent of the marginal distribution function, provided that it is continuous. These results actually hold for a strictly stationary process with Gaussian copulas.
The purpose of this chapter is to introduce the canonical Gaussian field defined by the energy norm of the operator, which will play a central role in the interplay among the results of the previous chapters, Gaussian process regression, and game theory. The chapter begins with a presentation of basic definitions and results related to Gaussian random variables, Gaussian vectors, Gaussian spaces, Gaussian conditioning, Gaussian processes, Gaussian measures, and Gaussian fields.
We obtain an asymptotic formula for the persistence probability in the positive real line of a random polynomial arising from evolutionary game theory. It corresponds to the probability that a multi-player two-strategy random evolutionary game has no internal equilibria. The key ingredient is to approximate the sequence of random polynomials indexed by their degrees by an appropriate centered stationary Gaussian process.
Let $X_n(k)$ be the number of vertices at level $k$ in a random recursive tree with $n+1$ vertices. We are interested in the asymptotic behavior of $X_n(k)$ for intermediate levels $k=k_n$ satisfying $k_n\to\infty$ and $k_n=o(\log n)$ as $n\to\infty$. In particular, we prove weak convergence of finite-dimensional distributions for the process $(X_n([k_n u]))_{u>0}$, properly normalized and centered, as $n\to\infty$. The limit is a centered Gaussian process with covariance $(u,v)\mapsto(u+v)^{-1}$. One-dimensional distributional convergence of $X_n(k_n)$, properly normalized and centered, was obtained with the help of analytic tools by Fuchs et al. (2006). In contrast, our proofs, which are probabilistic in nature, exploit a connection of our model with certain Crump–Mode–Jagers branching processes.
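To make the level counts concrete, here is a small simulation sketch of a random recursive tree and its profile. It assumes the standard construction, in which each new vertex attaches to a uniformly chosen existing vertex and the root sits at level 0; the function name and the seed are illustrative choices, not part of the paper.

```python
import numpy as np

def recursive_tree_profile(n, rng):
    # depth[i] is the level of vertex i; vertex 0 is the root at level 0
    depth = np.zeros(n + 1, dtype=int)
    for i in range(1, n + 1):
        parent = rng.integers(0, i)   # uniform over existing vertices 0..i-1
        depth[i] = depth[parent] + 1
    # profile[k] = number of vertices at level k
    return np.bincount(depth)

rng = np.random.default_rng(1)
profile = recursive_tree_profile(1000, rng)
```

Here `profile[k]` is one realization of the level count for a tree with 1001 vertices; repeating the simulation over many seeds gives an empirical view of its fluctuations at a fixed level.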
Brown‒Resnick processes are max-stable processes that are associated to Gaussian processes. Their simulation is often based on the corresponding spectral representation which is not unique. We study to what extent simulation accuracy and efficiency can be improved by minimizing the maximal variance of the underlying Gaussian process. Such a minimization is a difficult mathematical problem that also depends on the geometry of the simulation domain. We extend Matheron's (1974) seminal contribution in two directions: (i) making his description of a minimal maximal variance explicit for convex variograms on symmetric domains, and (ii) proving that the same strategy also reduces the maximal variance for a huge class of nonconvex variograms representable through a Bernstein function. A simulation study confirms that our noncostly modification can lead to substantial improvements among Gaussian representations. We also compare it with three other established algorithms.