A high-bias, low-variance review of data-driven collective variable discovery and enhanced sampling

This was part of Learning Collective Variables and Coarse Grained Models

Andrew Ferguson, University of Chicago

Monday, April 22, 2024

Abstract:

The effectiveness of enhanced sampling techniques that apply accelerating biases to selected collective variables (CVs) to drive free energy barrier crossing and rare events depends sensitively on the quality of the chosen CVs. This talk will review the mathematical and algorithmic underpinnings of a variety of popular data-driven techniques to learn CVs from simulation trajectories based on principles of maximal variance or autocorrelation and their deployment within iterative CV discovery and enhanced sampling protocols. This will be followed by a presentation of the molecular latent space simulators (LSS) approach to learn highly efficient and accurate surrogate models for the dynamics of molecular systems by stacking three specialized deep learning networks to (i) encode a molecular system into a slow latent space, (ii) propagate dynamics in this latent space, and (iii) generatively decode a synthetic molecular trajectory. The talk will be followed by a short hands-on session demonstrating some of these CV discovery and enhanced sampling techniques on simple toy and molecular systems within user-friendly Python codes packaged into Jupyter notebooks.
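As a rough illustration of the two principles mentioned in the abstract, the sketch below contrasts a maximal-variance CV with a maximal-autocorrelation CV on a synthetic two-dimensional trajectory. It is not taken from the talk or its notebooks. The abstract does not name specific methods, so principal component analysis (PCA) and time-lagged independent component analysis (TICA) are used here as assumed representatives, and the function names `make_toy_trajectory`, `pca_cv`, and `tica_cv`, the toy model, and the lag time are all hypothetical choices made for this example.

```python
import numpy as np


def make_toy_trajectory(n_steps=20000, seed=0):
    """Hypothetical toy system: one slow, small-amplitude coordinate and
    one fast, large-amplitude coordinate."""
    rng = np.random.default_rng(seed)
    slow = np.zeros(n_steps)
    for t in range(1, n_steps):
        # AR(1) process: strongly correlated in time, modest variance
        slow[t] = 0.99 * slow[t - 1] + 0.1 * rng.standard_normal()
    # Uncorrelated noise with large variance
    fast = 3.0 * rng.standard_normal(n_steps)
    return np.column_stack([slow, fast])


def pca_cv(X):
    """Leading PCA direction: maximizes the variance of the projection."""
    Xc = X - X.mean(axis=0)
    C0 = Xc.T @ Xc / len(Xc)
    evals, evecs = np.linalg.eigh(C0)  # eigenvalues in ascending order
    return evecs[:, -1]


def tica_cv(X, lag=10):
    """Leading TICA direction: maximizes the autocorrelation of the
    projection at the chosen lag (assumed here to be 10 frames)."""
    Xc = X - X.mean(axis=0)
    A, B = Xc[:-lag], Xc[lag:]
    # Symmetrized instantaneous and time-lagged covariance estimates
    C0 = (A.T @ A + B.T @ B) / (2 * len(A))
    Ct = (A.T @ B + B.T @ A) / (2 * len(A))
    # Solve Ct v = lambda C0 v by whitening with the Cholesky factor of C0
    L = np.linalg.cholesky(C0)
    Linv = np.linalg.inv(L)
    M = Linv @ Ct @ Linv.T
    M = 0.5 * (M + M.T)  # remove round-off asymmetry before eigh
    evals, W = np.linalg.eigh(M)
    return Linv.T @ W[:, -1]


if __name__ == "__main__":
    X = make_toy_trajectory()
    print("PCA direction: ", pca_cv(X))
    print("TICA direction:", tica_cv(X, lag=10))
```

In this toy setup the PCA direction should align with the fast, high-variance coordinate, while the TICA direction should align with the slow coordinate, which is the more useful one to bias in an enhanced sampling calculation. Real molecular trajectories would require featurization and a lag time chosen for the system at hand.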