Abstract
This paper presents a framework to easily build and execute parallel applications in container-based distributed computing platforms in a user-transparent way. The proposed framework combines the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential codes, with container management platforms (such as Docker, Mesos or Singularity) that ease the deployment of applications in computing environments. This framework provides scientists and developers with an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype that integrates COMPSs with different container engines in three scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead in the building, deployment and execution phases of two benchmark applications compared to a Cloud testbed based on KVM and OpenStack and to the usage of bare metal nodes. We have observed a significant gain over cloud environments during the building and deployment phases, which enables better adaptation of resources to the computational load. In contrast, we detected an extra overhead during execution, which is mainly due to the multi-host Docker networking.
Acknowledgments
This work is partly supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316 project, by the Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272, and by the European Union through the Horizon 2020 research and innovation program under grant 690116 (EUBra-BIGSEA Project). Results presented in this paper were obtained using the Chameleon testbed supported by the National Science Foundation.
Cite this article
Ramon-Cortes, C., Serven, A., Ejarque, J. et al. Transparent Orchestration of Task-based Parallel Applications in Containers Platforms. J Grid Computing 16, 137–160 (2018). https://doi.org/10.1007/s10723-017-9425-z
DOI: https://doi.org/10.1007/s10723-017-9425-z