This talk was part of "The Multifaceted Complexity of Machine Learning".
Sampling Beyond Log-concavity
Andrej Risteski, Carnegie Mellon University
Thursday, April 15, 2021
Abstract: Many tasks involving generative models require sampling from distributions parametrized as p(x) = e^{-f(x)}/Z, where Z is the normalizing constant, for some function f whose values and gradients we can query. This mode of access to f is natural; it arises, for instance, when sampling from posteriors in latent-variable models. Classical results show that a natural random walk, Langevin diffusion, mixes rapidly when f is convex. Unfortunately, even in simple examples, the applications above entail working with functions f that are nonconvex.
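As a minimal sketch (not part of the talk), the standard discretization of Langevin diffusion, often called the unadjusted Langevin algorithm, uses exactly the query model described above: only gradient evaluations of f. The function names, step size, and iteration count below are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_f, x0, step=1e-2, n_steps=10_000, rng=None):
    """Unadjusted Langevin algorithm (illustrative sketch):
    x_{k+1} = x_k - step * grad_f(x_k) + sqrt(2 * step) * noise,
    targeting p(x) proportional to exp(-f(x)) using only gradient queries."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * noise
        samples.append(x.copy())
    return np.array(samples)

if __name__ == "__main__":
    # Log-concave sanity check: f(x) = ||x||^2 / 2, so p is a standard Gaussian.
    grad_f = lambda x: x
    samples = langevin_sample(grad_f, x0=np.zeros(2))
    print(samples[len(samples) // 2:].mean(axis=0))  # should be close to the origin
```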
Paralleling the move away from convexity in optimization, we initiate a study of practically relevant instances in which Langevin diffusion, combined with other tools, provably mixes rapidly: distributions p that are multimodal, as well as distributions p that have a natural manifold structure on their level sets.
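For concreteness, a standard example of a multimodal target (an illustration chosen here, not drawn from the talk) is an equal mixture of two well-separated unit-variance Gaussians. Its negative log-density f below is nonconvex, and its gradient can be plugged into the Langevin sketch above as grad_f; the separation parameter mu is an assumed name.

```python
import numpy as np

# Illustrative multimodal target: p(x) proportional to
# exp(-(x - mu)^2 / 2) + exp(-(x + mu)^2 / 2).
# For large mu, f = -log p is nonconvex and p has two well-separated modes.
mu = 4.0

def f(x):
    # Negative log-density up to an additive constant, computed via log-sum-exp
    # for numerical stability.
    a, b = -0.5 * (x - mu) ** 2, -0.5 * (x + mu) ** 2
    m = np.maximum(a, b)
    return -(m + np.log(0.5 * np.exp(a - m) + 0.5 * np.exp(b - m)))

def grad_f(x):
    # Gradient expressed through the posterior weight w of the +mu component:
    # f'(x) = x - mu * (2w - 1), with w = 1 / (1 + exp(-2 * mu * x)).
    w = 1.0 / (1.0 + np.exp(-2.0 * mu * x))
    return x - mu * (2.0 * w - 1.0)
```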