
No MCMC for me: Amortized sampling for fast and stable training of energy-based models

Published: 12 Jan 2021, Last Modified: 22 Oct 2023
Venue: ICLR 2021 Poster
Readers: Everyone
Keywords: Generative Models, EBM, Energy-Based Models, Energy Based Models, semi-supervised learning, JEM
Abstract: Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty. Despite recent advances, training EBMs on high-dimensional data remains a challenging problem, as state-of-the-art approaches are costly, unstable, and require considerable tuning and domain expertise to apply successfully. In this work, we present a simple method for training EBMs at scale which uses an entropy-regularized generator to amortize the MCMC sampling typically used in EBM training. We improve upon prior MCMC-based entropy regularization methods with a fast variational approximation. We demonstrate the effectiveness of our approach by using it to train tractable likelihood models. Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training. This allows us to extend JEM models to semi-supervised classification on tabular data from a variety of continuous domains.
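
The abstract describes an alternating scheme: the energy model is trained with a maximum-likelihood-style objective whose negative samples come from a generator rather than from MCMC, while the generator is trained to produce low-energy samples subject to an entropy regularizer. The sketch below is a minimal, hypothetical PyTorch illustration of that scheme, not the authors' implementation: the network sizes, `DATA_DIM`, `LAMBDA_ENT`, and the nearest-neighbour `entropy_surrogate` are assumptions made for illustration (the paper instead uses a fast variational approximation of the entropy gradient).

```python
# Minimal sketch (illustrative, not the authors' code) of generator-amortized
# EBM training with an entropy-regularized generator.
import torch
import torch.nn as nn

DATA_DIM, LATENT_DIM, NOISE_STD, LAMBDA_ENT = 784, 32, 0.1, 1.0  # assumed values

# Energy network E_theta(x) -> scalar, and generator g_phi(z) -> x.
energy = nn.Sequential(nn.Linear(DATA_DIM, 256), nn.SiLU(), nn.Linear(256, 1))
generator = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.SiLU(), nn.Linear(256, DATA_DIM))

opt_e = torch.optim.Adam(energy.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)

def sample_generator(batch_size):
    """Amortized sampler: push a Gaussian latent through g_phi and add output noise."""
    z = torch.randn(batch_size, LATENT_DIM)
    return generator(z) + NOISE_STD * torch.randn(batch_size, DATA_DIM)

def entropy_surrogate(x):
    """Crude nearest-neighbour entropy proxy; a stand-in for the paper's
    variational entropy-gradient estimator, used here only to keep the
    sketch self-contained."""
    d = torch.cdist(x, x) + 1e9 * torch.eye(x.shape[0])  # mask self-distances
    return d.min(dim=1).values.clamp_min(1e-6).log().mean()

def train_step(x_real):
    b = x_real.shape[0]

    # EBM update: the usual maximum-likelihood gradient, with generator samples
    # standing in for MCMC samples from the model distribution.
    x_fake = sample_generator(b).detach()
    loss_e = energy(x_real).mean() - energy(x_fake).mean()
    opt_e.zero_grad(); loss_e.backward(); opt_e.step()

    # Generator update: move samples toward low energy while keeping their
    # entropy high, so the amortized sampler does not collapse onto one mode.
    x_fake = sample_generator(b)
    loss_g = energy(x_fake).mean() - LAMBDA_ENT * entropy_surrogate(x_fake)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_e.item(), loss_g.item()

if __name__ == "__main__":
    for _ in range(5):
        train_step(torch.randn(64, DATA_DIM))  # random tensors stand in for real data
```

For the estimator actually proposed in the paper and the full training details, see the linked wgrathwohl/VERA repository below.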
One-sentence Summary: We present a new generator-based approach for training EBMs and demonstrate that it trains models which obtain high likelihoods while overcoming the stability issues common in EBM training.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Code: [wgrathwohl/VERA](https://github.com/wgrathwohl/VERA)
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2010.04230/code)