
HPC AI500: A Benchmark Suite for HPC AI Systems

Conference paper in Benchmarking, Measuring, and Optimizing (Bench 2018)

Abstract

In recent years, with the growing adoption of deep learning (DL) in high-performance scientific computing, the unique characteristics of emerging DL workloads in HPC have raised significant challenges in designing and implementing HPC AI systems. The community needs a new yardstick for evaluating future HPC systems. In this paper, we propose HPC AI500, a benchmark suite for evaluating HPC systems that run scientific DL workloads. Covering the most representative scientific fields, each workload in HPC AI500 is based on a real-world scientific DL application. Currently, we choose 14 scientific DL benchmarks from the perspectives of application scenarios, data sets, and software stack. We also propose a set of metrics for comprehensively evaluating HPC AI systems, considering accuracy and performance as well as power and cost. We provide a scalable reference implementation of HPC AI500. The specification and source code are publicly available at http://www.benchcouncil.org/HPCAI500/index.html. Meanwhile, the AI benchmark suites for datacenter, IoT, and edge are also released on the BenchCouncil web site.
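The abstract's metric set spans accuracy, performance, power, and cost. As an illustration only, one common way such concerns are combined in DL benchmarking is a time-to-target-accuracy style measurement (popularized by DAWNBench and MLPerf-era suites); the sketch below is hypothetical and not taken from the HPC AI500 specification — `EpochRecord`, `time_to_accuracy`, and the log values are invented for demonstration:

```python
from dataclasses import dataclass

@dataclass
class EpochRecord:
    epoch: int          # training epoch index
    accuracy: float     # validation accuracy after this epoch
    wall_time_s: float  # cumulative wall-clock time (seconds)
    energy_j: float     # cumulative energy consumed (joules)

def time_to_accuracy(log, target):
    """Return (wall_time_s, energy_j) at the first epoch whose validation
    accuracy reaches the target, or None if the target is never reached."""
    for rec in log:
        if rec.accuracy >= target:
            return rec.wall_time_s, rec.energy_j
    return None

# Hypothetical training log for a scientific DL workload.
log = [
    EpochRecord(1, 0.62, 1200.0, 3.0e6),
    EpochRecord(2, 0.71, 2400.0, 6.1e6),
    EpochRecord(3, 0.75, 3600.0, 9.2e6),
]

print(time_to_accuracy(log, 0.70))  # → (2400.0, 6100000.0)
```

Reporting time and energy at a fixed quality target, rather than raw throughput alone, is what lets a single number reflect both accuracy and system efficiency; the paper's own metric definitions are given in its specification.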



Acknowledgments

This work is supported by the Standardization Research Project of the Chinese Academy of Sciences, No. BZ201800001.

Author information

Corresponding author: Jianfeng Zhan.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Jiang, Z., et al. (2019). HPC AI500: A Benchmark Suite for HPC AI Systems. In: Zheng, C., Zhan, J. (eds.) Benchmarking, Measuring, and Optimizing. Bench 2018. Lecture Notes in Computer Science, vol. 11459. Springer, Cham. https://doi.org/10.1007/978-3-030-32813-9_2


  • DOI: https://doi.org/10.1007/978-3-030-32813-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32812-2

  • Online ISBN: 978-3-030-32813-9

  • eBook Packages: Computer Science (R0)
