Significant Properties Of Spreadsheets
Remco van Veenendaal (National Archives of the Netherlands), Frederik Holmelund Kjærskov (Danish National Archives), Kati Sein (National Archives of Estonia), Jack O’Sullivan (Preservica), Anders Bo Nielsen (Danish National Archives), Philip Mike Tømmerholt (Danish National Archives) and Jacob Takema (National Archives of the Netherlands)
In this extended abstract, the Open Preservation Foundation’s Archives Interest Group (AIG) reports on its ongoing investigation of significant properties of spreadsheets. Using the InSPECT methodology for investigating significant properties of electronic content, our goal is to get hands-on experience in investigating the significant properties of deposited spreadsheets by adding a Spreadsheet Testing Report to the body of InSPECT Testing Reports. An additional result of the AIG investigation is a Spreadsheet Complexity Analyser tool that extracts spreadsheet-specific properties and can be used to calculate the complexity of a spreadsheet based on the values of those properties.
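As an illustration only, a complexity score of the kind described above could be computed as a weighted sum of extracted property counts. The property names and weights in this sketch are hypothetical and are not taken from the Spreadsheet Complexity Analyser itself.

```python
# Hypothetical property names and weights, chosen only for illustration;
# they do not come from the actual Spreadsheet Complexity Analyser.
WEIGHTS = {
    "formulas": 1.0,
    "external_links": 3.0,
    "macros": 5.0,
    "charts": 2.0,
}


def complexity_score(properties, weights=WEIGHTS):
    """Return a weighted sum of property counts.

    Properties without a weight are ignored rather than treated as errors.
    """
    return sum(
        weights[name] * count
        for name, count in properties.items()
        if name in weights
    )


# Example input: counts of properties extracted from one spreadsheet.
example = {"formulas": 12, "macros": 1, "charts": 2, "sheets": 3}
print(complexity_score(example))
```

A weighted sum keeps the score easy to explain to depositors; a real tool might instead use thresholds or categories per property.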
Preferred, obsolete or in-between? Developing a criteria catalogue for AV-Material - Preservation Planning at the German National Library of Science and Technology (TIB)
Merle Friedrich (German National Library of Science and Technology)
The born-digital audio-visual (AV) holdings of the German National Library of Science and Technology are analyzed with regard to the file formats present. The most frequent AV file formats are assessed against a catalogue of criteria for their suitability as preservation formats. Furthermore, their risk of obsolescence is evaluated using view paths. The examined file formats are not preferred as preservation formats, but they are not obsolete either.
Engaging Decision makers: An Executive Guide on Digital Preservation
Sarah Middleton (DPC) and Sharon McMeekin (DPC)
The Executive Guide on Digital Preservation provides practitioners with a combination of generic and specific messages and motivators designed to communicate with senior executives, legislators and budget holders, as well as decision and policy makers, with a view to embedding the value of digital preservation at the core of every organization.
A Bayesian model of Digital Preservation risk for the disruptive Digital Archive
Alec Mulinder (The National Archives), Sonia Ranade (The National Archives) and David Underdown (The National Archives)
Our digital heritage is rich, complex and fragile. It faces increasing threats over time as our organisations, technology and societies change, but archives and memory institutions must confidently preserve it for the future. This poster proposes a disruptive approach to safeguarding our digital archival heritage. We offer the iPres community an overview of work at The National Archives (UK) to map and explain the complex and shifting digital preservation risk environment using Bayesian networks. The project aims to evaluate a Bayesian statistical approach to understanding, managing and reducing digital preservation risk. We envisage a new digital preservation practice that complements our existing (standards-based) methods and supports evidence-led decision making by the archive.
This poster focuses on the range of technical work undertaken during the project, highlights the potential benefits of the approach and identifies areas for further investigation and wider collaboration. The poster will also touch on the broader implications of this work for inclusion, trust and transparency in the archive and make the case for more extensive adoption of data-driven approaches in digital preservation.
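To make the Bayesian approach concrete, the following minimal sketch works through a two-node network by hand: one node for a media failure and one for file loss that depends on it. All probabilities and node names are invented for illustration and do not come from the project described above.

```python
# All probabilities below are hypothetical, chosen only to show the arithmetic.
PRIOR_MEDIA_FAILURE = 0.10       # P(media failure)
P_LOSS_GIVEN_FAILURE = 0.50      # P(file loss | media failure)
P_LOSS_GIVEN_NO_FAILURE = 0.01   # P(file loss | no media failure)


def marginal_loss(prior, p_loss_fail, p_loss_ok):
    """P(file loss), summing over both states of the parent node."""
    return prior * p_loss_fail + (1 - prior) * p_loss_ok


def posterior_failure_given_loss(prior, p_loss_fail, p_loss_ok):
    """P(media failure | file loss), by Bayes' rule."""
    return prior * p_loss_fail / marginal_loss(prior, p_loss_fail, p_loss_ok)


p_loss = marginal_loss(
    PRIOR_MEDIA_FAILURE, P_LOSS_GIVEN_FAILURE, P_LOSS_GIVEN_NO_FAILURE
)
p_failure = posterior_failure_given_loss(
    PRIOR_MEDIA_FAILURE, P_LOSS_GIVEN_FAILURE, P_LOSS_GIVEN_NO_FAILURE
)
print(round(p_loss, 3), round(p_failure, 3))
```

A full Bayesian network chains many such conditional tables together; the same two operations (marginalising and updating on evidence) apply at each node.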
Enhancing Services to Preserve New Forms of Scholarship
Kate Wittenberg (Portico), Karen Hanson (Portico), David Millman (New York University), Craig Van Dyck (CLOCKSS) and Susan Doerr (University of Minnesota)
The advance in technologies for publishing digital scholarship has outpaced the development of technologies for reliably preserving it. Authors and publishers are creating increasingly sophisticated products without realizing that some of their enhancement choices might put preservability--and valuable scholarship--at risk. The poster describes the in-progress work and findings of a collaboration between preservation organizations, libraries, and publishers that are creating enhanced digital publications. The work aims to identify what can be effectively preserved with existing technologies, and to produce a recommended set of practices to help authors and publishers prioritize and plan their enhanced digital products for maximum preservability.
The Australasia Preserves Story: Building a digital preservation community of practice in the Australasian region
Jaye Weatherburn (University of Melbourne)
Building capacity through collaboration is essential to drive successful ongoing digital curation and digital preservation practice. This poster highlights the growth of the Australasia Preserves digital preservation community of practice, an initiative aiming to increase collaborative opportunities for varied institutions and individuals.
The Web Curator Tool relaunch
Jeffrey van der Hoeven (KB National Library of The Netherlands) and Ben O'Brien (National Library of New Zealand)
This poster will highlight the new features of the Web Curator Tool (WCT), added from January 2018 onwards through collaboration between the National Library of New Zealand (NLNZ) and the National Library of the Netherlands (KB-NL). One of the themes of the collaboration has been to future-proof the WCT. This involves learning the lessons from the previous development, recognising the advancements and trends occurring in the web archiving community, and realizing technical uplift. On this foundation, a revamped WCT has been developed and released as version 2.x. The poster presents the latest developments and outlines the roadmap and community building for the WCT in the coming years.
ARCHIVER - Archiving and Preservation for Research Environments
João Fernandes (CERN), Jamie Shiers (CERN), Bob Jones (CERN) and Sara Pittonnet Gaiarin (TRUST-IT)
Do you need to acquire standards-based, cost-effective archiving and preservation services? Are ingest rates, data volume and long-term support important to you? The ARCHIVER project aims to introduce significant improvements in these areas of archiving and digital preservation services, supporting the IT requirements of European scientists developing end-to-end archival and preservation services for data generated in the context of scientific research. With a total procurement budget of 3.4 million euros, the project will use a Pre-Commercial Procurement (PCP) approach to competitively procure R&D services from firms in three stages covering design, prototyping and pilot, over a 3-year period (Jan 2019 - Dec 2021). The resulting services will become part of the catalogue of the European Open Science Cloud (EOSC) initiative funded by the European Commission (EC). This contribution will showcase the results obtained during the project phases up to iPRES2019, providing an overview of the PCP process for the supply side and how the wider demand side community can benefit from the ARCHIVER resulting services.
Software Preservation Services in Cultural Heritage Organizations
Wendy Hagenmaier (Georgia Institute of Technology), Christa Williford (Council on Library and Information Resources), Monique Lassere (University of Arizona), Lauren Work (University of Virginia), Jessica G. Benner (Carnegie Mellon University) and Seth Erickson (The Pennsylvania State University)
Preserving software is a prerequisite for preserving and providing access to digital cultural heritage and research. The recent formation of the Software Preservation Network (SPN) has provided momentum for a better understanding of the landscape of software preservation activities. This poster discusses preliminary results from a study undertaken by SPN’s Research Working Group. Our specific research questions are: What software preservation services are cultural heritage professionals currently providing? What are the gaps in services? What are the opportunities for future service provision? Our Service Provider Study focuses on software preservation activities happening in libraries, archives and museums. This study will inform a foundational agenda that SPN members and other cultural heritage professionals can use to conduct further research on sustainable software preservation services.
Creating Continuity for Digital Preservation Projects
Edith Halvarsson (Bodleian Libraries Oxford) and Sarah Mason (University of Oxford)
This poster abstract summarises how the Digital Preservation at Oxford and Cambridge Project self-archived its research outputs, with the aim of extending the impact of digital preservation activities at the end of the project.
Science Europe Core Requirements and Domain Protocols for Research Data Management
Peter Doorn (DANS)
Science Europe (S.E.), the European organization of Research Funding and Research Performing Organizations, has recently adopted a set of core requirements to be included in data management plans. It also formulated criteria for trustworthy data repositories, as a temporary measure while more repositories get certified.
This will lead to a much better international alignment of data management policies and practices, and will give researchers clarity about what requirements to expect and where they can safely deposit their data for preservation and sharing.
In addition to this, S.E. favors the formulation of data protocols for domains by the research (data) communities themselves, which will ease the process of formulating DMPs.
Long-Term Preservation of PDF Files in Institutional Repositories in Japan
Teru Agata (Asia University), Yosuke Miyata (Teikyo University) and Atsushi Ikeuchi (University of Tsukuba)
In the open access environment, many textual resources have become available in the PDF format on the Web. This research aims to survey PDF files in Japanese institutional repositories (IRs) to address the problems encountered during their long-term preservation. With that aim, 1.5 million PDF files collected from Japanese IRs were analyzed with regard to file format, encryption, and metadata. Most PDF files did not conform to PDF/A. A total of 30.5% of PDFs were encrypted and many PDFs did not have embedded metadata. These results imply that PDF files in Japanese IRs have several serious problems for their long-term preservation.
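The three aspects surveyed above (file format, encryption, metadata) can be approximated with a rough byte-level check, sketched below. This is a heuristic written for illustration, not the method used in the study: the function name is hypothetical, markers can appear by coincidence or be hidden inside compressed or encrypted streams, and a PDF/A identifier in the metadata only shows that a file claims conformance. A validator is needed to confirm it.

```python
import re


def inspect_pdf_bytes(data: bytes) -> dict:
    """Heuristically inspect raw PDF bytes (hypothetical helper, not a validator)."""
    header = re.match(rb"%PDF-(\d\.\d)", data)
    return {
        # Version declared in the file header, e.g. "1.4".
        "version": header.group(1).decode("ascii") if header else None,
        # An /Encrypt entry in the trailer dictionary signals encryption.
        "encrypted": b"/Encrypt" in data,
        # An XMP packet indicates embedded metadata (if stored uncompressed).
        "has_xmp": b"<x:xmpmeta" in data,
        # PDF/A files declare their part in XMP via the pdfaid namespace.
        "claims_pdfa": b"pdfaid:part" in data,
    }


# Synthetic example bytes, not a complete PDF file.
sample = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\ntrailer\n<< /Encrypt 5 0 R >>\n%%EOF"
print(inspect_pdf_bytes(sample))
```

Running such a check over a large collection and tallying the results gives a first approximation of how many files are encrypted or lack embedded metadata.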
Concept of Preservation System for Scientific Experiments in HPC
Kyryll Udod (Ulm University), Volodymyr Kushnarenko (Ulm University) and Stefan Wesner (Ulm University)
This poster presents a concept of a preservation system for computations on High Performance Computing (HPC) resources. It covers some important challenges and possible solutions related to the preservation of scientific experiments on HPC systems for their later reproduction. Storing an experiment as code with some related data is not sufficient for its future reproduction, especially in the long term. Preserving the experiment’s whole environment (operating system, libraries used, environment variables, input data, etc.) using containerization technology (e.g. Docker, Singularity) is proposed as a suitable solution. This approach makes it possible to preserve an entire environment, but leaves open the problem of how to deal with commercial software that was used within the experiment. As a solution, the authors propose replacing all commercial software with open source equivalents during the preservation procedure, which should allow future reproduction of the experiment without any legal issues. A prototype of such a system was developed; the poster provides a scheme of the system and the first experimental results.
Malware Threats in Digital Preservation: Extending the evidence base
Maureen Pennock (British Library), Michael Day (British Library) and Evanthia Samaras (University of Technology Sydney)
Virus checking is an established process in most pre-ingest digital preservation workflows. It is typically included as part of a general threat model response, and to date there has been relatively little research into the virus checking function specifically within a long-term context. The British Library recently began a small research project to explore this issue, using data from a legacy digital collection established by the ‘Flashback’ project and supplementary data provided by the UK Web Archive. Our poster presents this research and findings to date, raising questions about the overhead of virus checking at scale, when organizations should virus-check content, and the legacy capabilities of anti-virus software.
Digital Preservation in a high security environment: Student Records, Encryption, and Preservation
Annalise Berdini (Princeton University)
For the past five years, Princeton University Library – specifically the University Archives – has striven to create a robust digital preservation program for its born-digital and digitized records. Along with standard archival digital preservation needs, a major concern and requirement for the archives was the capability to safely encrypt born-digital student records as part of any long-term digital preservation solution. Based on these requirements, the Library chose to partner with a third party digital preservation service, Arkivum, which would provide a robust and customizable digital preservation system that could accommodate high-security records.
This poster will discuss the process and strategies used to gain support from University administration for digital preservation of highly sensitive records, how to work with a vendor to develop a repository-specific solution for digital preservation, and the process of investigating and developing options for an encryption key management system that protects student records while maintaining preservation goals. It will be useful to other institutions and practitioners seeking buy-in for their own systems, whether in-house or third-party, and will address the question of long-term encryption key management.
Videotex Art Restoration: Technical and Conceptual Challenges
John Durno (University of Victoria)
This poster will discuss the technical and conceptual challenges associated with achieving an authentic restoration of videotex art, in the context of a project currently underway to recover Canadian Telidon videotex artwork from the early 1980s. Strengths and weaknesses of various strategies will be discussed, including emulation, format migration, software reconstruction, and the use of period hardware. Goals of the poster include showcasing the strategies employed to date, and inviting criticism and comment from others with relevant experience to share, so as to refine and improve our methodology going forward.
CD-ARK: A Tool for Cooperative Processing of Optical Discs
Zdeněk Hruška (Moravian Library)
Optical discs are a valuable and unique part of library collections, but they are vulnerable to a variety of risks. CD-Ark is a collaborative software tool for processing and storing data from optical data discs in Czech libraries. It creates data packages that include an ISO disc image, technical and bibliographic metadata, and checksums. This helps protect data stored on optical discs.
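The checksum part of such a data package can be sketched as follows. This is a generic illustration, not CD-Ark's actual package layout: the function names, the choice of SHA-256 and the file names are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute a SHA-256 checksum, reading in chunks so large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir):
    """Map each file in the package (by relative path) to its checksum."""
    package_dir = Path(package_dir)
    return {
        path.relative_to(package_dir).as_posix(): sha256_of(path)
        for path in sorted(package_dir.rglob("*"))
        if path.is_file()
    }


# Demo with placeholder content; a real package would hold a disc image and metadata.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "disc.iso").write_bytes(b"abc")
    (Path(tmp) / "metadata.xml").write_bytes(b"<record/>")
    manifest = build_manifest(tmp)
    for name, checksum in manifest.items():
        print(checksum, name)
```

Recomputing the manifest later and comparing it with the stored one reveals any file that has changed or gone missing.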
Safe Havens for Archives At Risk - Guidelines, Principles and Approaches
Afelonne Doek (IISH) and Tim Gollins (National Records of Scotland)
The poster will report on the development of the “Guiding Principles for safe havens for archives at risk”, as published and endorsed by the ICA. It will provide background to their development, and report on the continued work to develop Commentaries on the Guiding Principles, which involves the authors. The principles are format neutral; the poster will highlight specific digital considerations that have emerged in developing the Commentaries. The poster will also include details to enable institutions to declare an interest in becoming a Safe Haven. The main goal will be to raise awareness among the audience of the need for (digital) safe havens for archives at risk.
Archiving the Scholarly Git Experience
Vicky Steeves (New York University) and Genevieve Milliken (New York University)
Our poster will reflect our recent efforts to understand the workflows and policies needed for the long-term preservation of code, annotations, and other scholarly ephemera from Git hosting platforms. We undertook an environmental scan of the existing processes and tools for capturing and actively archiving Git data and their associated, supplemental materials. We will present the results of this extensive environmental scan, covering a wide variety of approaches, organizations, and workflows that could be used to create a baseline on which to build and expand archival tools. Our efforts are geared toward acquiring, archiving, and providing permanent access to source code and the materials around it, and we argue that the whole should be considered part of the scholarly record.
A versatile solution for Long-Term Preservation of Research Data
Pierre-Yves Burgi (Université de Genève), Hugues Cazeaux (Université de Genève) and Lydie Echernier (Université de Genève)
Developed in the context of the 2017-2020 Swiss national programme "Scientific information: Access, processing and safeguarding", the DLCM solution (dlcm.ch) consists of an open and modular architecture for long-term preservation of research data. Backed by multiple Swiss data centers, this solution, compliant with the OAIS standard, allows researchers to safely manage, publish and archive their data. Users have the possibility to either ingest data directly from their working environments through APIs or to deposit them manually through a user-friendly portal. A multilingual National coordination desk offering tailored support, consulting and training to the academic community through a network of experts is another main outcome of this project, which contributes to the success of the approach.
Introduction on Authorized Preservation Practice of the National Digital Preservation Program in China
Chao Wang (National Science Library, Chinese Academy of Science), Zhenxin Wu (National Science Library, Chinese Academy of Science) and Jiancheng Zheng (National Science Library, Chinese Academy of Science)
This work will introduce the structure of the National Digital Preservation Program in China and explain why we established the form of authorized preservation.
Preservation Metadata Dictionary - PREMIS implementation in practice
Marjolein Steeman (The Netherlands Institute for Sound and Vision) and Yvette Hollander (The Netherlands Institute for Sound and Vision)
This poster tells the story of designing the Preservation Metadata Dictionary (PMD) in a way that is fully conformant with PREMIS, the leading standard on preservation metadata. It will give insight into the main structure of the PMD and illustrate its practical use with some examples.