Characterizing the Role of a Single Coupling Layer in Affine Normalizing Flows

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12544)

Included in the following conference series:

DAGM German Conference on Pattern Recognition

Abstract

Deep Affine Normalizing Flows are efficient and powerful models for high-dimensional density estimation and sample generation. Yet little is known about how they succeed in approximating complex distributions, given the seemingly limited expressiveness of individual affine layers. In this work, we take a first step towards theoretical understanding by analyzing the behaviour of a single affine coupling layer under maximum likelihood loss. We show that such a layer estimates and normalizes conditional moments of the data distribution, and derive a tight lower bound on the loss depending on the orthogonal transformation of the data before the affine coupling. This bound can be used to identify the optimal orthogonal transform, yielding a layer-wise training algorithm for deep affine flows. Toy examples confirm our findings and stimulate further research by highlighting the remaining gap between layer-wise and end-to-end training of deep affine flows.
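
The abstract's central claim, that a single affine coupling layer trained by maximum likelihood estimates and normalizes conditional moments, can be illustrated with a small numerical sketch. The example below is not the authors' method or code. It assumes a two-dimensional toy distribution, and it replaces the neural network that would normally parameterize the shift and scale with a piecewise-constant (quantile-binned) estimate of the conditional mean and standard deviation of the active coordinate given the passive one. All names (`fit_binned_moments`, `coupling_forward`, `mean_nll`) are hypothetical, and the orthogonal transform and the loss bound discussed in the abstract are not covered.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Toy data (an assumption for illustration): the passive coordinate p is
# standard normal; the active coordinate a has a p-dependent mean and spread.
p = rng.standard_normal(n)
a = np.sin(2.0 * p) + (0.3 + 0.5 * np.abs(p)) * rng.standard_normal(n)

def bin_index(p, edges):
    """Map each passive value to the index of its quantile bin."""
    n_bins = len(edges) - 1
    return np.clip(np.searchsorted(edges, p, side="right") - 1, 0, n_bins - 1)

def fit_binned_moments(p, a, n_bins=20):
    """Estimate E[a | p] and std[a | p] as piecewise-constant functions of p."""
    edges = np.quantile(p, np.linspace(0.0, 1.0, n_bins + 1))
    idx = bin_index(p, edges)
    mean = np.array([a[idx == k].mean() for k in range(n_bins)])
    std = np.array([a[idx == k].std() for k in range(n_bins)])
    return edges, mean, std

def coupling_forward(p, a, edges, mean, std):
    """Affine coupling: leave p untouched, shift and scale a conditioned on p."""
    idx = bin_index(p, edges)
    z_a = (a - mean[idx]) / std[idx]
    log_det = -np.log(std[idx])  # log |dz_a / da| per sample
    return z_a, log_det

def mean_nll(z, log_det):
    """Average negative log-likelihood under a standard normal latent."""
    d = z.shape[1]
    log_prob_z = -0.5 * np.sum(z ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)
    return -np.mean(log_prob_z + log_det)

edges, mean, std = fit_binned_moments(p, a)
z_a, log_det = coupling_forward(p, a, edges, mean, std)

# Compare the loss of the identity map with the moment-normalizing coupling.
nll_before = mean_nll(np.stack([p, a], axis=1), np.zeros(n))
nll_after = mean_nll(np.stack([p, z_a], axis=1), log_det)
print(f"NLL without coupling: {nll_before:.3f}")
print(f"NLL with moment-normalizing coupling: {nll_after:.3f}")
```

Within each bin the transformed active coordinate has zero mean and unit variance by construction, so the coupling removes the dependence of the first two conditional moments on the passive coordinate. Any remaining non-Gaussianity of the conditional distribution is left untouched, consistent with the limited expressiveness of individual affine layers noted in the abstract.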

Acknowledgement

This work is supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC-2181/1 – 390900948 (the Heidelberg STRUCTURES Cluster of Excellence).

Furthermore, we thank our colleagues Lynton Ardizzone, Jakob Kruse, Jens Müller, and Peter Sorrenson for their help, support and fruitful discussions.

Author information

Authors and Affiliations

Heidelberg Collaboratory for Image Processing, Heidelberg University, Heidelberg, Germany
Felix Draxler, Jonathan Schwarz, Christoph Schnörr & Ullrich Köthe
Visual Learning Lab, Heidelberg University, Heidelberg, Germany
Felix Draxler & Ullrich Köthe
Image and Pattern Analysis Group, Heidelberg University, Heidelberg, Germany
Felix Draxler, Jonathan Schwarz & Christoph Schnörr

Authors

Felix Draxler
Jonathan Schwarz
Christoph Schnörr
Ullrich Köthe

Corresponding author

Correspondence to Felix Draxler.

Editor information

Editors and Affiliations

University of Tübingen, Tübingen, Germany
Zeynep Akata
University of Tübingen, Tübingen, Germany
Andreas Geiger
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler

1 Electronic supplementary material

Supplementary material 1 (pdf 328 KB)

Cite this paper

Draxler, F., Schwarz, J., Schnörr, C., Köthe, U. (2021). Characterizing the Role of a Single Coupling Layer in Affine Normalizing Flows. In: Akata, Z., Geiger, A., Sattler, T. (eds) Pattern Recognition. DAGM GCPR 2020. Lecture Notes in Computer Science, vol 12544. Springer, Cham. https://doi.org/10.1007/978-3-030-71278-5_1

DOI: https://doi.org/10.1007/978-3-030-71278-5_1
Published: 17 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71277-8
Online ISBN: 978-3-030-71278-5
eBook Packages: Computer Science (R0)
