This talk was part of "The Multifaceted Complexity of Machine Learning".
Sampling Beyond Log-concavity
Andrej Risteski, Carnegie Mellon University
Thursday, April 15, 2021
Abstract: Many tasks involving generative models require sampling from distributions parametrized as p(x) = e^{-f(x)}/Z, where Z is the normalizing constant, for some function f whose values and gradients we can query. This mode of access to f is natural; it arises, for instance, when sampling from posteriors in latent-variable models. Classical results show that a natural random walk, Langevin diffusion, mixes rapidly when f is convex. Unfortunately, even in simple examples, the applications above entail working with functions f that are nonconvex.
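As a minimal sketch (not part of the talk), the standard discretization of Langevin diffusion, often called the unadjusted Langevin algorithm, uses exactly the query model described above: only gradient evaluations of f. The function names, step size, and iteration count below are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_f, x0, step=1e-2, n_steps=10_000, rng=None):
    """Unadjusted Langevin algorithm (illustrative sketch):
    x_{k+1} = x_k - step * grad_f(x_k) + sqrt(2 * step) * noise,
    targeting p(x) proportional to exp(-f(x)) using only gradient queries."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * noise
        samples.append(x.copy())
    return np.array(samples)

if __name__ == "__main__":
    # Log-concave sanity check: f(x) = ||x||^2 / 2, so p is a standard Gaussian.
    grad_f = lambda x: x
    samples = langevin_sample(grad_f, x0=np.zeros(2))
    print(samples[len(samples) // 2:].mean(axis=0))  # should be close to the origin
```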
Paralleling the move away from convexity in optimization, we initiate a study of practically relevant instances in which Langevin diffusion, combined with other tools, provably mixes rapidly: distributions p that are multimodal, as well as distributions p that have a natural manifold structure on their level sets.
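For concreteness, a standard example of a multimodal target (an illustration chosen here, not drawn from the talk) is an equal mixture of two well-separated unit-variance Gaussians. Its negative log-density f below is nonconvex, and its gradient can be plugged into the Langevin sketch above as grad_f; the separation parameter mu is an assumed name.

```python
import numpy as np

# Illustrative multimodal target: p(x) proportional to
# exp(-(x - mu)^2 / 2) + exp(-(x + mu)^2 / 2).
# For large mu, f = -log p is nonconvex and p has two well-separated modes.
mu = 4.0

def f(x):
    # Negative log-density up to an additive constant, computed via log-sum-exp
    # for numerical stability.
    a, b = -0.5 * (x - mu) ** 2, -0.5 * (x + mu) ** 2
    m = np.maximum(a, b)
    return -(m + np.log(0.5 * np.exp(a - m) + 0.5 * np.exp(b - m)))

def grad_f(x):
    # Gradient expressed through the posterior weight w of the +mu component:
    # f'(x) = x - mu * (2w - 1), with w = 1 / (1 + exp(-2 * mu * x)).
    w = 1.0 / (1.0 + np.exp(-2.0 * mu * x))
    return x - mu * (2.0 * w - 1.0)
```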