
Revisiting Neuron Coverage and Its Application to Test Generation

  • Conference paper
Computer Safety, Reliability, and Security. SAFECOMP 2020 Workshops (SAFECOMP 2020)

Abstract

The use of neural networks in perception pipelines of autonomous systems such as autonomous driving is indispensable due to their outstanding performance. At the same time, however, their complexity poses a challenge with respect to safety. An important question in this regard is how to substantiate test sufficiency for such a function. One approach from the software testing literature is that of coverage metrics. Similar notions of coverage, called neuron coverage, have been proposed for deep neural networks; they aim to assess to what extent test inputs activate the neurons in a network. Still, the correspondence between high neuron coverage and safety-related network qualities remains elusive. Potentially, high coverage could imply sufficiency of the test data. In this paper, we argue that the coverage metrics discussed in the current literature do not satisfy these high expectations and present a series of experiments from the field of computer vision to support this claim.
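For illustration, the basic neuron coverage metric the abstract refers to can be sketched as follows. This is a minimal sketch in the spirit of the literature (e.g. DeepXplore), not the paper's own implementation; the function name, threshold default, and data layout are our own assumptions:

```python
import numpy as np

def neuron_coverage(layer_activations, threshold=0.0):
    """Fraction of neurons whose activation exceeds `threshold`
    for at least one test input.

    `layer_activations` is a list with one array per layer, each of
    shape (n_test_inputs, n_neurons_in_layer), e.g. recorded by
    running the test set through the network.
    """
    covered = 0
    total = 0
    for layer in layer_activations:
        # a neuron counts as covered if any test input drives it above the threshold
        covered += int(np.any(layer > threshold, axis=0).sum())
        total += layer.shape[1]
    return covered / total

# toy example: one layer with 3 neurons, observed on 2 test inputs
acts = [np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.8, 0.0]])]
print(neuron_coverage(acts, threshold=0.5))  # neurons 0 and 1 covered -> 2/3
```

Note that, as the paper argues, a high value of this ratio by itself says little about safety-related qualities of the network; adding inputs can only increase the metric, regardless of whether those inputs are meaningful tests.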



Notes

  1. In a software architecture inspired view, both activation and non-activation should be included for coverage (cf. branch coverage). Since non-activation is the standard case and typically achieved with few tests, we focus on the activation part.

  2. Obviously, the performance of the weakly trained model does not generalize to strong augmentation.


Acknowledgment

The research leading to the results presented above was funded by the German Federal Ministry for Economic Affairs and Energy within the project KI Absicherung (Safe AI for Automated Driving).

Author information

Corresponding author

Correspondence to Sebastian Houben.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Abrecht, S. et al. (2020). Revisiting Neuron Coverage and Its Application to Test Generation. In: Casimiro, A., Ortmeier, F., Schoitsch, E., Bitsch, F., Ferreira, P. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2020 Workshops. SAFECOMP 2020. Lecture Notes in Computer Science, vol 12235. Springer, Cham. https://doi.org/10.1007/978-3-030-55583-2_21


  • DOI: https://doi.org/10.1007/978-3-030-55583-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55582-5

  • Online ISBN: 978-3-030-55583-2

  • eBook Packages: Computer Science, Computer Science (R0)
