Abstract
The Distributed Asynchronous Object Storage (DAOS) is an open-source scale-out storage system designed from the ground up to support NVMe storage in user space. DAOS can run over any TCP network, but it can also take advantage of high-performance fabrics such as 100/200/400 Gbps Ethernet, InfiniBand, Slingshot, or Omni-Path. This paper describes the networking architecture of DAOS and discusses the scaling and performance aspects of running DAOS over those high-performance fabrics.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hennecke, M., Oganezov, A., Soumagne, J., Carrier, J., Moore, J. (2025). High Performance Fabric Support in DAOS. In: Weiland, M., Neuwirth, S., Kruse, C., Weinzierl, T. (eds) High Performance Computing. ISC High Performance 2024 International Workshops. ISC High Performance 2024. Lecture Notes in Computer Science, vol 15058. Springer, Cham. https://doi.org/10.1007/978-3-031-73716-9_13
DOI: https://doi.org/10.1007/978-3-031-73716-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73715-2
Online ISBN: 978-3-031-73716-9
eBook Packages: Computer Science, Computer Science (R0)