Abstract
Within the European DataGrid project, Work Package 2 has designed and implemented a set of integrated replica management services for use by data-intensive scientific applications. These services, based on the web services model, enable high-speed movement and replication of data between geographical sites, management of distributed replicated data, and optimization of access to data, and they provide a metadata management tool. In this paper we describe the architecture and implementation of these services and evaluate their performance under demanding Grid conditions.
Additional information
This work was partially funded by the European Commission program IST-2000-25182 through the European DataGrid Project.
Cite this article
Cameron, D., Casey, J., Guy, L. et al. Replica Management in the European DataGrid Project. J Grid Computing 2, 341–351 (2004). https://doi.org/10.1007/s10723-004-5745-x