DAIS - Digital Archive of the Serbian Academy of Sciences and Arts: Preservation plan

From TRAP-RCUB

Revision as of 07:46, 27 March 2021

See also:

Preservation policy

SASA, SASA institutes and RCUB are committed to the long-term care of items deposited in their repository and strive to adopt current best practices in digital preservation. Access to repository administration functions is strictly limited to authorized staff. All staff involved in repository maintenance and daily operations have well-defined roles, are familiar with the relevant policies, and understand their part in supporting and implementing the preservation policy.

Metadata and files deposited in the repository are stored permanently. Content may be removed only in exceptional circumstances. Records may be withdrawn from the repository in case of:

  • Proven copyright violation;
  • Plagiarism;
  • Falsified research;
  • Research containing major errors;
  • Threat to national security.

Withdrawn items are not deleted per se, but are removed from public view. The metadata of withdrawn items will not be searchable. Withdrawn items' identifiers/URLs are retained indefinitely.

Data integrity and authenticity

Only registered users can deposit items, and registered-user status is granted only to internal users. Accordingly, SASA and SASA institutes are responsible for verifying user identity. Provenance information is saved for each item. Once an item is approved, only repository managers are able to change its metadata and content files. Deposited items are reviewed by qualified staff to check metadata quality and completeness, compliance with data format, best-practice and preservation requirements, data integrity and quality, and potential legal issues. Changes to submitted and approved content files are not supported. If necessary, users may deposit a new version, in which case relations between the old and new versions are established in the metadata.

DSpace ensures the integrity of both data and metadata over time regardless of possible changes in the physical storage media. To verify that a digital object has not been altered or corrupted, the repository periodically checks the integrity of the data. The checks include verification of MD5 checksums and metadata integrity, and tests that URLs resolve correctly.
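The checksum part of such a check can be illustrated with a minimal Python sketch. It is not the DSpace implementation; the function names `md5_of_file` and `verify_checksums`, and the idea of passing the recorded checksums as a plain dictionary, are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(expected):
    """Compare recorded MD5 checksums with the files currently on disk.

    `expected` is a hypothetical mapping of file path -> recorded MD5 hex digest.
    Returns a mapping of file path -> "ok", "changed" or "missing".
    """
    report = {}
    for path, recorded in expected.items():
        file_path = Path(path)
        if not file_path.is_file():
            report[str(path)] = "missing"
        elif md5_of_file(file_path) == recorded.lower():
            report[str(path)] = "ok"
        else:
            report[str(path)] = "changed"
    return report
```

Any entry reported as "changed" or "missing" would indicate that a stored file no longer matches the checksum recorded at deposit and needs to be investigated or restored from backup.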

Operational continuity and disaster recovery

DAIS is hosted by the University of Belgrade Computer Centre on a virtual machine in a Proxmox environment running the CentOS operating system. Hardware resources are incrementally adjusted to the database size and the number of visitors. The repository database is stored on a PostgreSQL 9.5 server inside the production-level virtual machine. Database export is enabled.

The software platform of DAIS is based on DSpace 5.10. To facilitate DSpace upgrades, the core DSpace code and Java code have not been modified. Major modifications have been made to the configuration, the localization files and the XMLUI configuration. The system has been enriched with additional applications (displaying citation counts from the Web of Science, Scopus and Dimensions, as well as Altmetric Attention Scores; full ORCID integration; displaying human-readable funding information in the selected interface language). The source code of the customized version of DSpace and of all additional applications is stored on a local Git server accessible only to the repository development team. Detailed documentation about software, installation, configuration, maintenance, and troubleshooting is available on Confluence. This enables easy replication of procedures and ensures continuity in case of staff changes.

Backups are performed regularly at the virtual machine level, to RAID storage that is physically separated from the virtualization server. The monitoring and alerting service MONIT, developed by the RCUB team, constantly monitors the operation of the repository and sends alerts to system administrators in case of unexpected events. DAIS is protected by a restrictive IPTABLES firewall provided by the NREN AMRES (Academic Network of Serbia). The repository follows a regular upgrade cycle and, where possible, existing and widely accepted best practices.

In case of major software configuration changes or updates, the virtual machine is cloned and all changes are tested on the clone. Before any intervention on the production machine, a snapshot is created in the virtualization system, to enable roll-back and prevent data loss. End-users are duly informed about planned changes and upgrades.

DAIS has the right to copy, transform, store and provide access to the data. This right is granted by contributors upon submission. The repository has the right to convert file formats if this is necessary to ensure permanent access to a resource.

Continuity of Access

According to the law, SASA is the national academy and the most prominent scholarly institution in Serbia. The institutes are independent legal entities, but their work is closely tied to the mission and activities of SASA (e.g. joint projects, co-publishing projects, joint conferences, etc.). In line with their mission and their role as publicly funded institutions, SASA, SASA institutes, and RCUB (as the Outsource Partner) seek to provide reliable and secure archiving for the diverse outputs of SASA and SASA institutes, while ensuring easy access to, and the widest dissemination of, Open Access content. The current level of funding is sufficient to maintain and develop DAIS. Development and maintenance, as well as data security, are ensured through an SLA with RCUB.

SASA and SASA institutes are able to preserve data access in case of unexpected emergency budget cuts. DAIS is easy to keep running and service costs are not high. All repository managers are employed under regular contracts at participating institutions, and their activities related to repository management do not incur any additional cost. The SLA with RCUB foresees a Post-Cancellation Service Time, i.e. a period after the termination of the SLA during which the repository will remain available with minimum maintenance services provided. Accordingly, even in case of funding disruption, the services will be kept running, providing sufficient time to find a sustainable solution.