Gaussian processes are widely used for prior modeling in data-centric applications including regression, classification, interpolation of computer output, and Bayesian inverse problems. However, despite their widespread adoption, practical and efficient ways of specifying flexible Gaussian processes are still lacking, particularly in the nonstationary case. In addition, Gaussian process methodology faces pressing computational challenges concerning its scalability to large datasets and high-dimensional settings. This workshop will bring together computational and applied mathematicians, statisticians, and subject-matter researchers to push forward modeling and computation with Gaussian processes, their novel use in classical scientific computing tasks, and the emerging theoretical analysis of the associated methodology.
A related but distinct challenge facing Gaussian process methodology is its computational scalability to large datasets. This tractability challenge, which stems from the need to compute the Cholesky factorization of a dense covariance matrix, has been at the forefront of researchers' minds for decades, resulting in methods such as circulant embedding, Vecchia approximations, and the use of sparse representations. In recent times, new classes of approximation have appeared, including Hutchinson estimators for the trace terms of the likelihood and hierarchical off-diagonal low-rank approximations of the covariance matrix that exploit the smoothness of the covariance kernel between well-separated regions; these have the potential to vastly extend the scalability of Gaussian processes. These and other recent computational developments have facilitated the use of Gaussian processes with larger datasets and have also prompted a renewed interest in employing Gaussian processes in several classical scientific computing tasks, such as the numerical solution of partial differential equations, dimension reduction, and experimental design. However, a full understanding of the error caused by such computational techniques is still missing, and their scalability to high-dimensional settings needs further investigation.
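To make the first of these ideas concrete, the Hutchinson estimator replaces an exact trace (such as the trace terms arising in the Gaussian log-likelihood and its gradient) with a Monte Carlo average over random probe vectors, requiring only matrix-vector products. Here is a minimal illustrative NumPy sketch; all function names are hypothetical, not from any particular library:

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes=50, rng=None):
    """Estimate tr(A) for an implicit n x n matrix A using Hutchinson's
    estimator: E[z^T A z] = tr(A) for Rademacher probe vectors z."""
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        total += z @ matvec(z)                # one matrix-vector product per probe
    return total / num_probes

# Example: estimate the trace of a dense SPD matrix through matvecs alone,
# the situation one faces when A is, e.g., K^{-1} dK/dtheta applied implicitly.
rng = np.random.default_rng(0)
B = rng.standard_normal((200, 200))
A = B @ B.T + 200 * np.eye(200)   # SPD test matrix
est = hutchinson_trace(lambda v: A @ v, 200, num_probes=500, rng=1)
exact = np.trace(A)
```

The estimator never forms `A`, so when the matvec itself is fast (e.g., via hierarchical or FFT-based representations) the trace terms cease to be the bottleneck.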
There is a clear need to bring together computational and applied mathematicians, statisticians, and subject-matter researchers to push the field beyond its present practices, and in particular to investigate (a) new models that go beyond the very narrow classes of covariance functions in common use, even under stationarity; (b) how to express and understand the structure in the modeled process and exploit it in the computation phase; (c) novel applications of Gaussian processes in classical scientific computing tasks; and (d) rigorous error analysis of the associated methodology.
This workshop will include a poster session for graduate students and postdocs. The application for proposing a poster will be available on this page after you register.
Hierarchical models and computing with stochastic PDEs
Speaker: Finn Lindgren (The University of Edinburgh)
10:15-10:45 CDT
Break
10:45-11:45 CDT
Statistical Finite Elements for Nonlinear PDEs
Speaker: Connor Duffin (University of Cambridge)
In this talk, I will present a statistical finite element method for nonlinear, time-dependent phenomena. The statistical finite element method (statFEM) is a statistical augmentation of the finite element method that enables model-data synthesis through the admission of model misspecification inside the governing equations, as represented by a Gaussian process. The method is Bayesian, coherently updates model mismatch upon receipt of observed data, and is applicable to a wide range of problems across science and engineering for which finite element methods are appropriate. Scalability is ensured through making a low-rank approximation to the posterior covariance matrix. I’ll first introduce the statFEM, before detailing the methodology for nonlinear problems. I will discuss various case studies, applying the method to experimental and synthetic data in canonical nonlinear PDEs.
11:45-13:00 CDT
Lunch
13:00-14:00 CDT
Maximum likelihood estimation for Gaussian processes under inequality constraints
Speaker: Francois Bachoc (Institut de Mathématiques de Toulouse)
We consider covariance parameter estimation for a Gaussian process under inequality constraints (boundedness, monotonicity or convexity) in fixed-domain asymptotics. We address the estimation of the variance parameter and the estimation of the microergodic parameter of the Matérn and Wendland covariance functions. First, we show that the (unconstrained) maximum likelihood estimator has the same asymptotic distribution both unconditionally and conditionally on the Gaussian process satisfying the inequality constraints. Then, we study the recently suggested constrained maximum likelihood estimator. We show that it has the same asymptotic distribution as the (unconstrained) maximum likelihood estimator. In addition, we show in simulations that the constrained maximum likelihood estimator is generally more accurate in finite samples.
14:00-15:00 CDT
Generalized probabilistic principal component analysis of microscopic and macroscopic dynamics
Speaker: Mengyang Gu (University of California, Santa Barbara (UCSB))
In the first half of the talk, we will introduce the generalized probabilistic principal component analysis (GPPCA) to study the latent factor model for multiple correlated outcomes, where each factor is modeled by a Gaussian process. The method generalizes the previous probabilistic formulation of principal component analysis (PPCA) by providing the closed-form maximum marginal likelihood estimator of the factor loadings and other parameters. Based on the explicit expression of the precision matrix in the marginal likelihood that we derived, the number of computational operations is linear in the number of output variables. In the second half of the talk, we will introduce connections of random factor processes with fixed or estimated basis to a number of real applications in probing dynamics, such as differential dynamic microscopy (DDM), agent-based models of collective motions, and geophysical models for ground deformation. In particular, DDM is a widely used Fourier-based scattering tool for analyzing particle movements from microscopic videos without tracking individual particles. The connections between physical and statistical models enable more accurate estimation of dynamical information and more scalable computation for statistical models.
We propose NonStGM, a general nonparametric graphical modeling framework for studying dynamic associations among the components of a nonstationary multivariate time series. It builds on the framework of Gaussian Graphical Models (GGM) and stationary time series Graphical models (StGM), and complements existing works on parametric graphical models based on change point vector autoregressions (VAR). Analogous to StGM, the proposed framework captures conditional noncorrelations (both intertemporal and contemporaneous) in the form of an undirected graph. In addition, to describe the more nuanced nonstationary relationships among the components of the time series, we introduce the new notion of conditional nonstationarity/stationarity and incorporate it within the graph architecture. This allows one to distinguish between direct and indirect nonstationary relationships among system components, and can be used to search for small subnetworks that serve as the “source” of nonstationarity in a large system. Together, the two concepts of conditional noncorrelation and nonstationarity/stationarity provide a parsimonious description of the dependence structure of the time series. In GGM, the graphical model structure is encoded in the sparsity pattern of the inverse covariance matrix. Analogously, we explicitly connect conditional noncorrelation and stationarity between and within components of the multivariate time series to zero and Toeplitz embeddings of an infinite-dimensional inverse covariance operator. In order to learn the graph, we move to the Fourier domain. We show that in the Fourier domain, conditional stationarity and noncorrelation relationships in the inverse covariance operator are encoded with a specific sparsity structure of its integral kernel operator. 
Within the locally stationary framework, we show that these sparsity patterns can be recovered from finite-length time series by node-wise regression of discrete Fourier transforms (DFTs) across different Fourier frequencies. We illustrate the features of our general framework in the special case of time-varying VAR models. We demonstrate the feasibility of learning the NonStGM structure from data using simulation studies.
10:15-10:45 CDT
Break
10:45-11:45 CDT
Statistical Emulators for High-Dimensional Complex Forward Models in Remote Sensing
Speaker: Emily Kang (University of Cincinnati)
The retrieval algorithms in remote sensing generally involve complex physical forward models that are nonlinear and computationally expensive to evaluate. Statistical emulation provides an alternative with cheap computation and can be used to quantify uncertainty, calibrate model parameters, and improve computational efficiency of the retrieval algorithms. Motivated by this, we introduce a framework of building statistical emulators by combining dimension reduction of input and output spaces and Gaussian process modeling. The functional principal component analysis (FPCA) via a conditional expectation method is chosen to reduce the dimension of the output space of the forward model. In addition, instead of making restrictive assumptions regarding the dependence structure of the high-dimensional input space, we apply the gradient-based kernel dimension reduction method to reduce the dimension of input space. A computationally efficient Gaussian process model is then constructed at the low-dimensional input and output spaces. Theoretical properties of the resulting statistical emulator are explored, and the performance of our approach is demonstrated with numerical studies with NASA’s Orbiting Carbon Observatory-2 (OCO-2) data.
11:45-13:00 CDT
Lunch
13:00-14:00 CDT
Multivariate Gaussian Random Fields: Statistical and Sample Path Properties
Speaker: Yimin Xiao (Michigan State University)
In this talk, we present some recent results on statistical and sample path properties of several classes of multivariate Gaussian random fields including multivariate Matérn Gaussian fields, operator fractional Brownian motion, vector-valued operator-scaling random fields, and matrix-valued Gaussian random fields. These results illustrate explicitly the effects of the dependence structures among the coordinate processes on statistical and sample path properties of multivariate Gaussian random fields. A useful technical tool for establishing these results is the property of strong local nondeterminism.
14:00-15:00 CDT
Efficient solvers for Bayesian inverse problems and Gaussian random fields
Speaker: Arvind Saibaba (North Carolina State University)
A commonly used prior distribution for the infinite-dimensional Bayesian formulation is the Whittle-Matérn prior, which involves a covariance operator based on a fractional elliptic operator. The approach is computationally challenging for non-integer exponents. A similar computational bottleneck arises in generating samples using the SPDE approach to Gaussian random fields. We present two different solvers for efficiently applying the discretized covariance operator (and its square root) for all admissible values of the exponent: the first uses a multipreconditioned Krylov subspace method, and the second exploits a certain low-rank structure in the solution. We also discuss how the multipreconditioned solver can be used to accelerate the solution of linear Bayesian inverse problems to obtain the maximum a posteriori estimate and the approximate posterior variance. Numerical experiments demonstrate the performance and scalability of the solvers and their applicability to model and real-data inverse problems in tomography and a time-dependent heat equation. This is joint work with Harbir Antil (George Mason) and Hussam Al Daas (Rutherford Appleton Laboratory).
15:00-15:30 CDT
Break
15:30-16:30 CDT
Scalable Gaussian-Process Inference Using Vecchia Approximations
Speaker: Matthias Katzfuss (Texas A&M University)
Gaussian processes (GPs) are popular, flexible, and interpretable probabilistic models for functions in geospatial analysis, computer-model emulation, and machine learning. However, direct application of GPs involves dense covariance matrices and is computationally infeasible for large datasets. We consider a framework for fast GP inference based on the so-called Vecchia approximation, which implies a sparse Cholesky factor of the inverse covariance matrix. The approximation can be written in closed form and computed in parallel, and it includes many popular existing approximations as special cases. We discuss various applications and extensions of the framework, including latent and nonisotropic GPs.
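The core of the Vecchia approximation can be illustrated with a small self-contained sketch (Python/NumPy; the kernel, ordering, and neighbor-selection choices below are illustrative assumptions, not the specifics of the talk): each observation is conditioned only on a small set of previously ordered nearest neighbors, so the joint density factors into cheap low-dimensional conditionals.

```python
import numpy as np

def expsq_kernel(A, B, ls=0.5):
    """Squared-exponential covariance matrix between point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def vecchia_loglik(y, X, m=10, ls=0.5, nugget=1e-6):
    """Vecchia-approximate GP log-likelihood: condition observation i only on
    its m nearest previously ordered neighbors. The product of these small
    Gaussian conditionals corresponds to a sparse Cholesky factor of the
    inverse covariance matrix; with m >= n it is the exact likelihood."""
    n = len(y)
    ll = 0.0
    for i in range(n):
        if i == 0:
            mu = 0.0
            var = expsq_kernel(X[:1], X[:1], ls)[0, 0] + nugget
        else:
            d = np.linalg.norm(X[:i] - X[i], axis=1)   # distances to earlier points
            nb = np.argsort(d)[: min(m, i)]            # nearest-neighbor conditioning set
            Knn = expsq_kernel(X[nb], X[nb], ls) + nugget * np.eye(len(nb))
            kin = expsq_kernel(X[i : i + 1], X[nb], ls)[0]
            w = np.linalg.solve(Knn, kin)
            mu = w @ y[nb]
            var = expsq_kernel(X[i : i + 1], X[i : i + 1], ls)[0, 0] + nugget - w @ kin
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

# sanity data: with m >= n the approximation reproduces the exact likelihood
rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 2))
K = expsq_kernel(X, X) + 1e-6 * np.eye(30)
y = np.linalg.cholesky(K) @ rng.standard_normal(30)
```

Each conditional involves only an m x m solve, so the cost is O(n m^3) rather than the O(n^3) of a dense Cholesky factorization, and the n conditionals are independent and can be evaluated in parallel.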
Wednesday, August 31, 2022
9:15-10:15 CDT
Deep Gaussian Process Surrogates for Computer Experiments
Speaker: Robert Gramacy (Virginia Polytechnic Institute & State University (Virginia Tech))
Deep Gaussian processes (DGPs) upgrade ordinary GPs through functional composition, in which intermediate GP layers warp the original inputs, providing flexibility to model non-stationary dynamics. Recent applications in machine learning favor approximate, optimization-based inference for fast predictions, but applications to computer surrogate modeling — with an eye towards downstream tasks like calibration, Bayesian optimization, and input sensitivity analysis — demand broader uncertainty quantification (UQ). We prioritize UQ through full posterior integration in a Bayesian scheme, hinging on elliptical slice sampling the latent layers. We demonstrate how our DGP’s non-stationary flexibility, combined with appropriate UQ, allows for active learning: a virtuous cycle of data acquisition and model updating that departs from traditional space-filling design and yields more accurate surrogates for fixed simulation effort. But not all simulation campaigns can be developed sequentially, and many existing computer experiments are simply too big for full DGP posterior integration because of cubic scaling bottlenecks. For this case we introduce the Vecchia approximation, popular for ordinary GPs in spatial data settings. We show that Vecchia-induced sparsity of Cholesky factors allows for linear computational scaling without compromising DGP accuracy or UQ. We vet both active learning and Vecchia-approximated DGPs on numerous illustrative examples and a real simulation involving drag on satellites in low-Earth orbit. We showcase implementation in the deepgp package for R on CRAN.
10:15-10:45 CDT
Break
10:45-11:45 CDT
Regularity and efficient simulation of Gaussian processes defined through SPDEs
Speaker: Kristin Kirchner (Delft University of Technology)
Many models in spatial statistics are based on Gaussian random fields (GRFs). Motivated by the SPDE approach which exploits a well-known relation between the Matérn class of GRFs and stochastic partial differential equations (SPDEs), in this talk I will consider GRFs on a bounded Euclidean domain which are solutions of fractional-order elliptic SPDEs driven by additive spatial white noise. I will discuss the regularity of these GRFs in Sobolev and Hölder spaces and propose a numerical approximation, which provably converges in these spaces at explicit and sharp rates. Furthermore, I will address the computational benefits of the proposed method compared to kernel-based approaches, illustrated by several numerical experiments.
Finally, I will give an outlook on spatiotemporal models which are based on SPDEs involving fractional powers of parabolic space-time differential operators.
This talk is based on joint works with David Bolin, Sonja Cox, Mihály Kovács, Christoph Schwab and Joshua Willems.
11:45-13:00 CDT
Lunch
13:00-14:00 CDT
Fast methods for conditional simulation
Speaker: Douglas Nychka (Colorado School of Mines)
An advantage of a Gaussian process (GP) model for surface fitting is the companion estimate of the function’s uncertainty. The standard method for assessing uncertainty of a GP estimate is conditional simulation, a Monte Carlo sampling algorithm for the multivariate Gaussian distribution. Conditional simulation is a powerful tool, for example allowing for Monte Carlo based uncertainty on surface contours (level sets), a difficult and nonlinear inference problem. This algorithm, however, has two bottlenecks: generating spatial predictions on large regular grids, and simulating a Gaussian process both on a large regular grid and at irregular locations. Here, accurate approximations are proposed that allow for fast computation of these steps. In both cases the computational efficiency is achieved by relying on the fast Fourier transform for 2D convolution and on sparse matrix multiplication. In common spatial applications a speedup by a factor of 10 to 100 or more is obtained, making it possible to determine the uncertainty of GP estimates on a laptop, often in an interactive setting. Besides their practical benefits, the two approximations illustrate interesting features of GP analysis: exploiting the “screening effect” for spatial prediction, and the error bounds in interpolation when the GP is related to an element of a reproducing kernel Hilbert space. This work is in collaboration with Maggie Bailey and Soutir Bandyopadhyay. See Bailey, Maggie D., Soutir Bandyopadhyay, and Douglas Nychka. “Adapting conditional simulation using circulant embedding for irregularly spaced spatial data.” Stat 11.1 (2022): e446.
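The FFT-based machinery behind such speedups can be sketched in a few lines. The following illustrative Python/NumPy example (a generic textbook construction, not the authors' code) draws an exact sample of a stationary process on a regular 1-D grid via circulant embedding; the 2-D case used for surfaces works analogously with 2-D FFTs:

```python
import numpy as np

def circulant_embedding_sample(cov_fn, n, rng=None):
    """Sample a stationary Gaussian process on a regular grid of size n by
    embedding its Toeplitz covariance in a circulant matrix, whose
    eigenvalues are the FFT of the first row; total cost O(n log n)."""
    rng = np.random.default_rng(rng)
    r = cov_fn(np.arange(n).astype(float))   # covariances at lags 0..n-1
    row = np.concatenate([r, r[-2:0:-1]])    # circulant embedding of size 2(n-1)
    lam = np.fft.fft(row).real               # eigenvalues of the embedding
    if lam.min() < -1e-10 * lam.max():
        raise ValueError("embedding is not nonnegative definite; enlarge it")
    m = len(row)
    zeta = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    y = np.sqrt(m) * np.fft.ifft(np.sqrt(np.maximum(lam, 0.0)) * zeta)
    return y.real[:n]   # y.imag[:n] is an independent second sample

# exponential covariance with range 10; its minimal embedding is valid
sample = circulant_embedding_sample(lambda h: np.exp(-np.abs(h) / 10.0), 512, rng=0)
```

Two independent samples come out of each pair of FFTs (real and imaginary parts), and the same eigenvalue array can be reused across all Monte Carlo draws in a conditional simulation loop.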
14:00-16:00 CDT
Poster Session + Social Hour
Thursday, September 1, 2022
9:15-10:15 CDT
Gaussian Whittle-Matérn fields on metric graphs
Speaker: David Bolin (King Abdullah Univ. of Science and Technology (KAUST))
We define a new class of Gaussian processes on compact metric graphs such as street or river networks. The proposed models, the Whittle-Matérn fields, are defined via a fractional stochastic partial differential equation on the compact metric graph and are a natural extension of Gaussian fields with Matérn covariance functions on Euclidean domains to the non-Euclidean metric graph setting. Existence of the processes is established, and their sample path regularity properties are derived. The model class in particular contains differentiable Gaussian processes. To the best of our knowledge, this is the first construction of a valid differentiable Gaussian field on general compact metric graphs. We then focus on a model subclass which we show contains processes with Markov properties. For this case, we show how to evaluate finite-dimensional distributions of the process exactly and computationally efficiently. This facilitates using the proposed models for statistical inference without the need for any approximations.
This is joint work with Alexandre Simas and Jonas Wallin.
10:15-10:45 CDT
Break
10:45-11:45 CDT
Graph-Based Approximation of Matérn Gaussian Fields
Speaker: Ruiyi Yang (Princeton University)
Matérn Gaussian fields (MGFs) have been popular modeling choices in many aspects of Bayesian methodologies. In this talk we will discuss a generalization of MGFs to manifolds and graphs. In the first part, we formalize the definition of MGFs on manifolds and introduce a sparse approximation based on graph discretization together with a convergence analysis, complementing the finite element approximations of MGFs on Euclidean domains. Numerical examples will demonstrate their wide applicability. In the second part, we introduce a framework for choosing the number of discretization nodes when approximating MGFs and demonstrate the ideas through the finite element approaches. We propose a conceptually simple selection rule and show that in many practical scenarios a low-rank finite element approximation is sufficient to further reduce computational cost without hindering the estimation accuracy.
10:45-11:45 CDT
Physics-Informed Learning Machines
Speaker: Maziar Raissi (University of Colorado Boulder)
A grand challenge with great opportunities is to develop a coherent framework that enables blending conservation laws, physical principles, and/or phenomenological behaviors expressed by differential equations with the vast data sets available in many fields of engineering, science, and technology. At the intersection of probabilistic machine learning, deep learning, and scientific computations, this work is pursuing the overall vision to establish promising new directions for harnessing the long-standing developments of classical methods in applied mathematics and mathematical physics to design learning machines with the ability to operate in complex domains without requiring large quantities of data. To materialize this vision, this work is exploring two complementary directions: (1) designing data-efficient learning machines capable of leveraging the underlying laws of physics, expressed by time dependent and non-linear differential equations, to extract patterns from high-dimensional data generated from experiments, and (2) designing novel numerical algorithms that can seamlessly blend equations and noisy multi-fidelity data, infer latent quantities of interest (e.g., the solution to a differential equation), and naturally quantify uncertainty in computations.
11:45-13:00 CDT
Lunch
13:00-14:00 CDT
Large Scale Kriging by Substituting Optimization for Inversion
Speaker: Matt Plumlee (Northwestern University)
Gaussian processes are popular tools for prediction that have been shown to be orders of magnitude more accurate than modern competitors on a host of prediction tasks. However, the computational cost of fitting them can be daunting. Inspired by the recent deployments of large-scale optimization in deep learning, this talk illustrates how carefully written optimization problems can be used to replace the usual matrix decomposition used to fit Gaussian process predictors. This can be used throughout hyperparameter tuning and is extremely scalable with modern linear algebra libraries. A computational deployment of the methodology has been used on large-scale machines and produces significant improvements to accuracy over existing methods while being quick to query.
14:00-15:00 CDT
Computational Graph Completion
Speaker: Houman Owhadi (Caltech)
We present a framework for generating, organizing, and reasoning with computational knowledge. It is motivated by the observation that most problems in Computational Sciences and Engineering (CSE) can be formulated as that of completing (from data) a computational graph (or hypergraph) representing dependencies between functions and variables. Nodes represent variables, and edges represent functions. Functions and variables may be known, unknown, or random. Data comes in the form of observations of distinct values of a finite number of subsets of the variables of the graph (satisfying its functional dependencies). The underlying problem combines a regression problem (approximating unknown functions) with a matrix completion problem (recovering unobserved variables in the data). Replacing unknown functions by Gaussian Processes (GPs) and conditioning on observed data provides a simple but efficient approach to completing such graphs. Since this completion process can be reduced to an algorithm, one could, with the automation of the proposed framework, solve a complex CSE problem by drawing a diagram, just as one computes $\sqrt{2}$ on a pocket calculator without thinking about it. We will in particular show how the numerical Bayesian inversion of PDE models can be formulated as a computational graph completion problem by reinterpreting recent results (https://arxiv.org/abs/2103.12959) with Y. Chen, B. Hosseini and A. Stuart on solving/learning nonlinear PDEs with GPs.
15:00-15:30 CDT
Break
15:30-16:30 CDT
Gaussian Process Regression Constrained by Boundary Value Problems
Speaker: Mamikon Gulian (Sandia National Laboratories)
As part of a broader effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview of several classes of constrained Gaussian processes and discuss some of their advantages and numerical properties. We then develop a framework for Gaussian process regression constrained by boundary value problems. The framework may be applied to infer the solution of a well-posed boundary value problem with a known second-order differential operator and boundary conditions, but for which only scattered observations of the source term are available. Scattered observations of the solution may also be used in the regression. The framework combines co-kriging with the linear transformation of a Gaussian process, together with the use of kernels given by spectral expansions in eigenfunctions of the boundary value problem; thus, it benefits from a reduced-rank property of covariance matrices. We demonstrate that the resulting framework yields more accurate and stable solution inference as compared to physics-informed Gaussian process regression without boundary condition constraints.
Friday, September 2, 2022
9:15-10:15 CDT
Convergence rates of non-stationary and deep Gaussian process regression
Speaker: Aretha Teckentrup (University of Edinburgh)
We are interested in the task of estimating an unknown function from data, given as a set of point evaluations. In this context, Gaussian process regression is often used as a Bayesian inference procedure, and we are interested in the convergence as the number of data points goes to infinity. Using results from scattered data approximation, we provide a convergence analysis of the method applied to a given, unknown function of interest. We are particularly interested in the case of non-stationary covariance kernels, and the extension of the results to deep Gaussian processes.