Characterizing the Role of a Single Coupling Layer in Affine Normalizing Flows

  • Conference paper
Pattern Recognition (DAGM GCPR 2020)

Abstract

Deep Affine Normalizing Flows are efficient and powerful models for high-dimensional density estimation and sample generation. Yet little is known about how they succeed in approximating complex distributions, given the seemingly limited expressiveness of individual affine layers. In this work, we take a first step towards theoretical understanding by analyzing the behaviour of a single affine coupling layer under maximum likelihood loss. We show that such a layer estimates and normalizes conditional moments of the data distribution, and derive a tight lower bound on the loss depending on the orthogonal transformation of the data before the affine coupling. This bound can be used to identify the optimal orthogonal transform, yielding a layer-wise training algorithm for deep affine flows. Toy examples confirm our findings and stimulate further research by highlighting the remaining gap between layer-wise and end-to-end training of deep affine flows.
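
The claim that a single affine coupling layer estimates and normalizes conditional moments can be illustrated with a minimal sketch. It assumes a two-dimensional toy distribution and replaces the trained subnetwork with a simple histogram estimator over the passive coordinate. The data, the binning, and all function names are hypothetical choices made for illustration and are not taken from the paper. The reported loss covers only the active coordinate, since the passive coordinate passes through unchanged and contributes equally in both cases.

```python
import math
import random

random.seed(0)

# Toy data (an assumption for illustration): x1 is the passive coordinate,
# x2 the active one, with a mean and a spread that both depend on x1.
N = 20000
x1 = [random.gauss(0.0, 1.0) for _ in range(N)]
x2 = [math.sin(2.0 * a) + (0.3 + 0.5 * abs(a)) * random.gauss(0.0, 1.0) for a in x1]

N_BINS, LO, HI = 20, -3.0, 3.0
WIDTH = (HI - LO) / N_BINS


def bin_of(a):
    """Histogram bin index of a passive value; tails fall into the edge bins."""
    return min(N_BINS - 1, max(0, int((a - LO) / WIDTH)))


def fit_conditional_moments(x1, x2):
    """Estimate E[x2 | x1] and Std[x2 | x1] per bin of x1.

    The histogram is a stand-in for the subnetwork of a real coupling layer.
    """
    counts = [0] * N_BINS
    sums = [0.0] * N_BINS
    sq_sums = [0.0] * N_BINS
    for a, b in zip(x1, x2):
        k = bin_of(a)
        counts[k] += 1
        sums[k] += b
        sq_sums[k] += b * b
    means, stds = [], []
    for c, s, q in zip(counts, sums, sq_sums):
        if c < 2:
            # Too few samples: fall back to the identity map in this bin.
            means.append(0.0)
            stds.append(1.0)
        else:
            m = s / c
            means.append(m)
            stds.append(math.sqrt(max(q / c - m * m, 1e-12)))
    return means, stds


def coupling_forward(x1, x2, means, stds):
    """Affine coupling: shift and scale x2 by quantities that depend on x1."""
    y2, log_det = [], []
    for a, b in zip(x1, x2):
        k = bin_of(a)
        y2.append((b - means[k]) / stds[k])
        log_det.append(-math.log(stds[k]))  # log |dy2/dx2|
    return y2, log_det


def mean_nll(y2, log_det):
    """Average negative log-likelihood of x2 under a standard normal latent."""
    const = 0.5 * math.log(2.0 * math.pi)
    return sum(0.5 * y * y + const - ld for y, ld in zip(y2, log_det)) / len(y2)


means, stds = fit_conditional_moments(x1, x2)
y2, log_det = coupling_forward(x1, x2, means, stds)

nll_identity = mean_nll(x2, [0.0] * N)   # no coupling layer applied
nll_coupling = mean_nll(y2, log_det)     # moments normalized per bin

print("NLL without coupling:", round(nll_identity, 3))
print("NLL with coupling:   ", round(nll_coupling, 3))
```

In this sketch the transformed coordinate has zero mean and unit variance within every bin of x1, and the loss drops relative to the identity map. Any remaining non-Gaussian structure, such as skew within a bin, is untouched by the affine transform.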



Acknowledgement

This work is supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC-2181/1 – 390900948 (the Heidelberg STRUCTURES Cluster of Excellence).

Furthermore, we thank our colleagues Lynton Ardizzone, Jakob Kruse, Jens Müller, and Peter Sorrenson for their help, support and fruitful discussions.

Author information

Corresponding author

Correspondence to Felix Draxler.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 328 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Draxler, F., Schwarz, J., Schnörr, C., Köthe, U. (2021). Characterizing the Role of a Single Coupling Layer in Affine Normalizing Flows. In: Akata, Z., Geiger, A., Sattler, T. (eds) Pattern Recognition. DAGM GCPR 2020. Lecture Notes in Computer Science, vol 12544. Springer, Cham. https://doi.org/10.1007/978-3-030-71278-5_1

  • DOI: https://doi.org/10.1007/978-3-030-71278-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71277-8

  • Online ISBN: 978-3-030-71278-5

  • eBook Packages: Computer Science, Computer Science (R0)
