Dendrochronologia 30 (2012) 249–251

TECHNICAL NOTE

The DCCD: A digital data infrastructure for tree-ring research

Esther Jansma a,b,c,∗, Rowin J. van Lanen c, Peter W. Brewer d, Rutger Kramer e

a Netherlands Cultural Heritage Agency (Rijksdienst voor het Cultureel Erfgoed), P.O. Box 1600, 3800 BP Amersfoort, The Netherlands
b Faculty of Geosciences, Utrecht University, Utrecht, The Netherlands
c Netherlands Centre for Dendrochronology RING, Amersfoort, The Netherlands
d The Malcolm and Carolyn Wiener Laboratory for Aegean and Near Eastern Dendrochronology, Cornell University, Ithaca, NY 14853, USA
e Data Archiving and Networked Services (DANS), The Hague, The Netherlands

Article info
Article history: Received 24 April 2011; accepted 7 December 2011
Keywords: Dendrochronology; Collaboration; Internationalization; Research infrastructure

Abstract

Existing on-line databases for dendrochronology are not flexible in terms of user permissions, tree-ring data formats, metadata administration and language. This is why we developed the Digital Collaboratory for Cultural Dendrochronology (DCCD).
This TRiDaS-based multi-lingual database allows users to control data access, to perform queries, to upload and download (meta)data in a variety of digital formats, and to edit metadata on line. The content of the DCCD conforms to EU best practices regarding the long-term preservation of digital research data.

© 2012 Elsevier GmbH. All rights reserved.

∗ Corresponding author at: Netherlands Cultural Heritage Agency (Rijksdienst voor het Cultureel Erfgoed), P.O. Box 1600, 3800 BP Amersfoort, The Netherlands. Tel.: +31 6 25 00 00 55. E-mail address: e.jansma@cultureelerfgoed.nl (E. Jansma).
1125-7865/$ – see front matter. http://dx.doi.org/10.1016/j.dendro.2011.12.002

Introduction

In Europe, dendrochronological research is carried out by laboratories in the public and private sectors and in academic settings. Large data collections have been developed since the 1960s and many of these are stored and managed locally. Overviews are scarce and incomplete, and researchers interested in working with these data are insufficiently aware of their existence because many of these substantial data collections are unavailable through existing on-line databases such as the ITRDB (Grissino-Mayer and Fritts, 1997), the European DendroDB (http://dendrodb.cerege.fr/dendrodb webpage.htm) and the WSL dendro database (Schmatz et al., 2001).

At the same time, the demand for well-documented and comprehensive dendrochronological datasets derived from cultural-heritage studies has increased. Humanities-based dendrochronology has turned from object and site-level research to interregional comparisons of tree-ring data, to answer questions about the former (cultural) landscape, climate, economy, the wood-processing industry and wood technology. Such research depends on the accessibility of large, interregional and well-documented datasets.

Following consultation with dendrochronologists from Belgium, France, Germany and the Netherlands, it is clear that the existing dendro data repositories have a number of limitations (although not all limitations apply to all of the repositories) that prevent users from contributing their data. These include: the inability of contributors to control data access permissions; language-specific interfaces and metadata; restriction to a single data format for upload and download; limited metadata capabilities; and domain-specific (typically dendroclimatological) design.

This paper presents a new digital infrastructure, including online and stand-alone functionality, for the storage, enrichment, analysis, exchange and publication of dendrochronological data and metadata. The repository can be accessed at http://dendro.dans.knaw.nl. It is the culmination of the Digital Collaboratory for Cultural Dendrochronology (DCCD) project and builds upon three earlier products. The first is the Tree-Ring Data Standard (TRiDaS, Jansma et al., 2010), an XML-based data standard for the comprehensive description of dendro data and metadata. Next is the universal dendro data conversion tool TRiCYCLE (Brewer et al., 2011), which enables users to convert data back and forth between any of twenty-two different data formats. Finally, there is the stand-alone dendro metadatabase TRiDaBASE (Jansma et al., 2012) for the local preparation and analysis of descriptive metadata. In combination with these products the new DCCD repository provides a leap forward in dendro (meta)data management and sharing.

Requirements of the DCCD

During the preparation stage for the DCCD project, many dendrochronologists were consulted about their data archiving and sharing requirements through workshops, interviews and questionnaires. The results of these consultations provided a number of clear specifications for the DCCD repository:

1. Multidisciplinary data model: The DCCD data model should fit the needs of research and data preservation in the humanities, e.g. archaeology, architectural history and conservation/restoration studies. Since dendrochronology is a multidisciplinary field it should also fit the requirements of other research domains. The model should also be flexible enough to be adapted to future changes in requirements.
2. Reliable content: The DCCD should store data and descriptive/interpretative metadata of research projects, as well as associated documents related to these projects. It should offer users the opportunity to correct erroneous content and to add new scientific interpretations and data reviews. The future digital preservation of the content of the DCCD should be guaranteed.
3. Reliable search functionality: The DCCD should enable users to extract the information they are looking for by offering reliable online search functionality as well as functionality to query downloaded datasets locally.
4. Compatibility with existing data formats: The DCCD should be able to import and export dendrochronological information in all common dendrochronological data formats.
5. Good data protection: The DCCD should enable users to protect their (meta)data from unauthorized use by others. Users should be able to determine the degree to which their research projects are visible to, and can be searched by, other users. Users should also be able to determine which other users have full authorization to specific data, including the right to download such data.
6. Functionality to foster collaboration: The DCCD should include a digital collaborative environment for the exchange of (future) DCCD-related functionality and ideas, the implementation of research collaborations, educational activities, and outreach to wider audiences.
7. Dissemination of results: The code written for the DCCD should follow open standards and be open source, so that the wider community can profit from this project, deliver feedback, and create improvements. In addition the DCCD architecture should allow linkage of this repository to other digital archives and infrastructure.

Implementation

The DCCD offers users the opportunity to improve the metadata of uploaded project files (which have a ‘draft’ status) before they are archived. We are also developing functionality, similar to that of many social networking systems, that allows users to comment on the quality of datasets within the system.

The DCCD repository is developed and hosted by Data Archiving and Networked Services (DANS; http://www.dans.knaw.nl), an institute of the Royal Netherlands Academy of Arts and Sciences (KNAW) and the Netherlands Organisation for Scientific Research (NWO). The long-term durability of the DCCD is guaranteed by structuring it to conform to the requirements of the international ‘Data Seal of Approval’ (see footnote 1).

Search functionality

Visitors to the DCCD repository website can query the public metadata stored in the repository. These are:

(a) project title (the human-readable title under which a project is stored);
(b) project type (e.g., dating, anthropology, forest dynamics);
(c) the head of the laboratory or lead investigator on the project;
(d) period of research (the chronological timeframe during which the research took place);
(e) type of material (the source of the material that was studied, e.g. archaeology, ship archaeology, standing building, present vegetation);
(f) laboratory, department, or institution (details of the agency that produced the data);
(g) object title (the human-readable title given to the studied object(s) in the file).

Access to all other information in the DCCD is user-defined (see ‘Data protection’ section).
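As an illustration only, the seven public metadata fields listed above can be pictured as a flat record that is filtered by simple criteria. The Python sketch below is not the DCCD's code or API: the class name `PublicMetadata`, its field names, the `search` helper and the two sample records are all hypothetical, and matching is reduced to a case-insensitive substring test.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PublicMetadata:
    """Hypothetical record holding the public fields (a) to (g)."""
    project_title: str
    project_type: str
    investigator: str
    research_period: str
    material_type: str
    laboratory: str
    object_title: str


def search(records: Iterable[PublicMetadata], **criteria: str) -> List[PublicMetadata]:
    """Return the records whose named fields contain every given search text.

    Matching is a case-insensitive substring test, a simplifying assumption.
    """
    def matches(record: PublicMetadata) -> bool:
        return all(
            wanted.lower() in getattr(record, name).lower()
            for name, wanted in criteria.items()
        )

    return [record for record in records if matches(record)]


# Invented sample records, for demonstration only.
SAMPLE_RECORDS = [
    PublicMetadata("Project A", "dating", "Investigator A", "2009",
                   "standing building", "Laboratory A", "Church roof"),
    PublicMetadata("Project B", "dating", "Investigator B", "2010",
                   "ship archaeology", "Laboratory B", "Shipwreck"),
]

if __name__ == "__main__":
    for hit in search(SAMPLE_RECORDS, material_type="ship"):
        print(hit.project_title, "/", hit.object_title)
```

Keeping the public fields in one small, immutable record mirrors the split the text describes between freely searchable metadata and everything else, which remains subject to owner-defined permissions.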
To optimize the search functionality we have implemented controlled vocabularies based on existing thesauri such as the Art and Architecture Thesaurus (http://www.getty.edu/research/tools/vocabularies/aat/). Terminology for these fields has been defined in Dutch, English, French and German; users need to select one of these languages when uploading data. A query for the English object term ‘church’ will therefore return a list of projects that also contain the equivalent Dutch, French and German terms. DCCD users can initiate additions and refinements through the Digital Collaboration Platform for Dendrochronology (see ‘Collaborative forum’ section).

Results of on-line queries are presented on screen as lists of object files that can be scrutinized further by opening them in the web application. GIS and timeline functionality allow users to search, among other things, for specific object types (e.g., water well, shipwreck, forest) dating from specified time intervals and to view the results on a geographical map.

For local querying of metadata, TRiDaBASE (the Microsoft Access-based stand-alone metadatabase developed as part of the DCCD project) can be used (Jansma et al., 2012). TRiDaBASE provides users with a desktop tool that they can adapt and extend to perform customized queries not yet provided by the DCCD web interface.

The DCCD data model

The DCCD leverages the flexibility of the Tree-Ring Data Standard, TRiDaS (Jansma et al., 2010), to provide a comprehensive and standardized method for describing not only the dendro data itself, but also the descriptive metadata that is becoming increasingly important in many sub-disciplines of dendrochronology.

The content of the repository

The DCCD stores integrated project files (time series and metadata) and associated files (e.g., correspondence, research reports, photos). The quality and detail of this content are the responsibility of the data owners. The DCCD validates uploaded (meta)data against TRiDaS.

Interaction with legacy data formats

The interaction of the DCCD with existing dendrochronological data formats has been realized using code written for TRiCYCLE (Brewer et al., 2011). The flexible nature of the TRiCYCLE architecture means that the underlying logic used in the TRiCYCLE desktop application can also be used internally by the DCCD repository. Users are therefore able to upload data to the DCCD in any one of 22 different data formats that are commonly used in the dendrochronology community. Likewise, when retrieving the results of queries performed on the repository, users can choose to download data in any of the supported formats. Data generated by many different researchers, using many different applications, can thus be combined, downloaded and then analyzed in the researchers' preferred software without manual conversion or manipulation.

1 See http://assessment.datasealofapproval.org/seals/.

Data protection

One has to become a member of the DCCD in order to upload and download data and to query for information that is not classified as public metadata. Members who upload data are automatically assigned the role of repository manager for those data and retain full rights to them. They also control the extent to which others can see, query and download their data.

The permission levels of the DCCD are structured according to the TRiDaS data model. Data owners determine the permission levels by setting a default access level for all data they upload to the DCCD. It is also possible to assign more permissive access to specific DCCD members. Access permissions can be altered by the repository manager at any time.
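The combination of a default access level with more permissive grants for individual members can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the DCCD implementation: the text says only that permission levels follow the TRiDaS data model, so the level names and their ordering (based on the TRiDaS entities project, object, element, sample, radius, series and values), the class `ProjectPermissions` and the member names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    """Assumed access levels, ordered from least to most detailed."""
    MINIMAL = 0   # public metadata only
    PROJECT = 1
    OBJECT = 2
    ELEMENT = 3
    SAMPLE = 4
    RADIUS = 5
    SERIES = 6
    VALUES = 7    # measurement values, i.e. full access


@dataclass
class ProjectPermissions:
    """Hypothetical per-project settings controlled by the data owner."""
    default_level: Level
    member_levels: Dict[str, Level] = field(default_factory=dict)

    def grant(self, member: str, level: Level) -> None:
        # The owner may add or change a member-specific level at any time.
        self.member_levels[member] = level

    def level_for(self, member: str) -> Level:
        # Member-specific grants can only widen access, never narrow it,
        # so the effective level is the larger of the two.
        granted = self.member_levels.get(member, self.default_level)
        return max(self.default_level, granted)

    def can_view(self, member: str, entity: Level) -> bool:
        # An entity is visible if it is no deeper than the member's level.
        return entity <= self.level_for(member)


if __name__ == "__main__":
    perms = ProjectPermissions(default_level=Level.OBJECT)
    perms.grant("colleague", Level.VALUES)
    print(perms.can_view("colleague", Level.VALUES))  # True
    print(perms.can_view("visitor", Level.SAMPLE))    # False
```

Modelling the levels as an ordered enumeration keeps the check to a single comparison and makes it evident that a member-specific grant never reduces what the default already allows.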
Members of the DCCD who consult or download data sets of which they are not the owner agree to follow a well-defined set of rules and guidelines regarding data and research integrity (see http://dendro.dans.knaw.nl/termsofuse for the full conditions of use). We suggest that DCCD members in need of more detailed (for example time-specific) data contracts organize these among themselves.

Over time, data contributed to the DCCD may become ‘orphaned’ as owners move to other fields or retire and laboratories disband. Repository managers can transfer ownership to other managers within their own organization. If a whole laboratory disbands, ownership can be transferred to another organization with the assistance of DANS. Following the terms of the data upload agreement, data that are orphaned and unclaimed are transferred to the public domain. This procedure will help to ensure that data are not lost, as has unfortunately happened in the past.

Collaborative forum

The durability of the new infrastructure can be ensured only if participants continue to exchange ideas and products, and develop research collaborations based on the new infrastructure. This is why Utrecht University, in collaboration with the Netherlands Cultural Heritage Agency (RCE), has set up the Digital Collaboration Platform for Dendrochronology at http://www.uu.nl/vkc/dendrochronology. This Virtual Knowledge Centre (VKC) includes wiki, blog and forum functionality and is used for international research collaboration, education, information, and discussion. The platform offers protected environments for research and other activities and links through to the DCCD web application. All DCCD-related functionality can be found through this platform.

Dissemination of results

Access to the dendro data has been discussed above, but the source code of the DCCD repository is also a valuable resource.
Therefore, the computer code of the DCCD follows open standards and is open source. This ensures that other groups or organizations that would like to build upon the products developed during the DCCD project can do so.

Conclusion

The DCCD repository and its associated infrastructure offer a substantially new and improved method for archiving, exchanging and standardizing dendro (meta)data. The DCCD has already been chosen as the central data repository by 15 laboratories in 9 countries. We are certain that, in time, the DCCD will prove to be not only an excellent technical resource, but also a fertile platform for the discussion and development of new techniques and projects. As data holdings continue to increase, the DCCD will enable exciting new analyses that are not possible with current technologies and infrastructure.

Acknowledgements

The DCCD has been funded by The Netherlands Organization for Scientific Research (NWO, section Humanities), the Netherlands Cultural Heritage Agency (RCE, Ministry of Education, Culture and Science; the Netherlands), the patrons of The Malcolm and Carolyn Wiener Laboratory for Aegean and Near Eastern Dendrochronology (Cornell University, Ithaca, New York), Data Archiving and Networked Services (DANS; the Netherlands), the Epison Group (Ithaca, New York) and the participating laboratories.

References

Brewer, P., Murphy, D., Jansma, E., 2011. TRiCYCLE: a universal conversion tool for digital tree-ring data. Tree-Ring Research 67, 135–144.
Grissino-Mayer, H., Fritts, H., 1997. The International Tree-Ring Data Bank: an enhanced global database serving the global scientific community. The Holocene 7 (2), 235–238.
Jansma, E., Brewer, P., Zandhuis, I., 2010. TRiDaS 1.1: the tree-ring data standard. Dendrochronologia 28 (2), 99–130.
Jansma, E., van Lanen, R., Sturgeon, K., Mohlke, S., Brewer, P., 2012. TRiDaBASE: a stand-alone database for storage, analysis and exchange of dendrochronological metadata. Dendrochronologia, http://dx.doi.org/10.1016/j.dendro.2011.09.002, in press.
Schmatz, D., Ghosh, S., Heller, I., 2001, September. Tree ring web and alternative chronologies. In: Kaennel, M., Bräker, O. (Eds.), International Conference Tree Rings and People. Swiss Federal Research Institute WSL, Birmensdorf, p. 120.