DAIS - Digital Archive of the Serbian Academy of Sciences and Arts: Preservation plan: Difference between revisions

From TRAP-RCUB

Revision as of 22:41, 26 September 2021

This public wiki is about the DAIS – Digital Archive of the Serbian Academy of Sciences and Arts

Preservation policy

SASA, SASA institutes and RCUB are committed to the long-term care of items deposited in the repository and strive to adopt current best practices in digital preservation. They aim to preserve the repository content for re-use while retaining authenticity and ensuring the readability of data files. Efforts are also made to mitigate the risk of deterioration, damage, data loss and corruption, as well as the obsolescence of file formats, storage or dissemination means.

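In order to ensure this, depositors should meet a set of requirements during the submission step:

  • Submitted data must fit into the scope of the Content policy;
  • Data formats should be suitable for long-term preservation (see Preferred file formats);
  • Sufficient metadata must be provided (see Metadata);
  • Legal issues must be addressed (see Rights and Distribution License).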

To help depositors meet these requirements, training and consultations are provided before data submission. This helps ensure data and metadata quality, resolve legal issues, and reduce the costs of data ingest and curation.

Access to repository administration functions is strictly limited to authorized staff. All staff involved in repository maintenance and daily operations have well-defined roles and are familiar with the relevant policies and with their own part in supporting and implementing the preservation policy.

Retention

Metadata and files deposited in the repository are stored permanently. Content may be removed only in exceptional circumstances. Records may be withdrawn from the repository in case of:

  • Proven copyright violation;
  • Plagiarism;
  • Falsified research;
  • Research containing major errors;
  • Threat to national security.

Withdrawn items are not deleted per se, but are removed from public view. The metadata of withdrawn items will not be searchable. Withdrawn items' Handles and URLs are retained indefinitely.
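
The withdrawal rule above can be illustrated with a minimal sketch. It is not the DSpace implementation: the Item and Repository classes, and the sample Handle used in the usage note, are hypothetical. It only shows the idea that withdrawal sets a flag rather than deleting the record, so the item disappears from search while its identifier still resolves.

```python
from dataclasses import dataclass


@dataclass
class Item:
    handle: str              # persistent identifier, kept even after withdrawal
    title: str
    withdrawn: bool = False  # withdrawal is a flag, not a deletion


class Repository:
    """Hypothetical in-memory model of the withdrawal policy."""

    def __init__(self) -> None:
        self._items: dict[str, Item] = {}

    def add(self, item: Item) -> None:
        self._items[item.handle] = item

    def withdraw(self, handle: str) -> None:
        # The record stays in storage; only its public visibility changes.
        self._items[handle].withdrawn = True

    def search(self, term: str) -> list[str]:
        # Metadata of withdrawn items is excluded from search results.
        return [
            item.handle
            for item in self._items.values()
            if not item.withdrawn and term.lower() in item.title.lower()
        ]

    def resolve(self, handle: str) -> str:
        # The identifier keeps resolving; a withdrawn item reports its status.
        return "withdrawn" if self._items[handle].withdrawn else "available"
```

For example, after adding an item under the made-up Handle "123456789/1" and calling withdraw on it, search no longer returns that Handle, while resolve still answers for it with "withdrawn".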

Data integrity and authenticity

Only registered users can deposit items, and registered-user status is granted only to internal users. Accordingly, SASA and SASA institutes are responsible for verifying user identity. Provenance information is saved for each item. Submissions are reviewed by qualified staff to ensure metadata quality and completeness, the compliance of data formats with best practice and preservation requirements, and data integrity and quality, and to resolve potential legal issues. Once an item is approved, only repository managers are able to change its metadata and bitstreams; changes to submitted and approved items by end users are not supported. If necessary, users may deposit a new version. Each version is assigned a unique and persistent identifier (Handle), and relations between versions are recorded in the metadata.

DSpace ensures the integrity of both data and metadata over time, regardless of possible changes in the physical storage media. To verify that a digital object has not been altered or corrupted, the repository periodically checks the integrity of the data. The checks include verifying MD5 checksums and metadata integrity, and testing that URLs are working.
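
A checksum-based integrity check of the kind described above can be sketched as follows. This is an illustration only, not the DSpace checksum checker: the function names and the manifest (a mapping from relative file path to the MD5 recorded at ingest) are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(expected: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths whose file is missing or whose MD5 differs.

    `expected` is a hypothetical manifest mapping a relative path to the
    checksum recorded at ingest; it is not the DSpace storage layout.
    """
    failures = []
    for relative_path, recorded in sorted(expected.items()):
        target = root / relative_path
        if not target.is_file() or md5_of(target) != recorded.lower():
            failures.append(relative_path)
    return failures
```

Running find_corrupted on a schedule and alerting on a non-empty result is one way to turn the periodic check into a routine task. MD5 detects accidental corruption but is not suitable as protection against deliberate tampering.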

DAIS has the right to copy, transform, store and provide access to the data. This right is granted by contributors upon submission. The repository has the right to convert file formats if this is necessary to ensure permanent access to a resource.

Independent understandability of data

Data is described at the individual resource level using metadata. The metadata schema is generic and sufficiently flexible to preserve various resources from a wide range of research disciplines. It is also possible to establish relations in the metadata to other publications and related data. Metadata properties can be mandatory, recommended or optional.
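
The three requirement levels can be made concrete with a small validation sketch. The profile below is hypothetical: the field names are Dublin Core-style examples chosen for illustration and do not reproduce the actual DAIS metadata schema.

```python
from enum import Enum


class Level(Enum):
    MANDATORY = "mandatory"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


# Hypothetical profile: example field names, not the real DAIS schema.
PROFILE = {
    "dc.title": Level.MANDATORY,
    "dc.date.issued": Level.MANDATORY,
    "dc.description.abstract": Level.RECOMMENDED,
    "dc.relation": Level.OPTIONAL,
}


def check_record(
    record: dict[str, str], profile: dict[str, Level] = PROFILE
) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one metadata record.

    Errors are mandatory fields that are missing or blank; warnings are
    recommended fields that are missing or blank. Optional fields never
    produce a message.
    """

    def missing(level: Level) -> list[str]:
        return [
            name
            for name, required in profile.items()
            if required is level and not record.get(name, "").strip()
        ]

    return missing(Level.MANDATORY), missing(Level.RECOMMENDED)
```

In a review workflow, errors would block approval of a submission, while warnings would only prompt the repository manager to ask the depositor for more detail.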

Since the primary designated community is multidisciplinary, repository managers work in close cooperation with the depositors, as described under Reviewing submissions. Repository managers perform metadata quality checks and enhance metadata as described under Curation.

Currently, the quality of metadata in DAIS varies.

Operational continuity and disaster recovery

DAIS is hosted by the University of Belgrade Computer Centre on a virtual machine in a Proxmox environment under a CentOS operating system. Hardware resources are incrementally adjusted to the database size and the number of visitors. The repository database is stored on a PostgreSQL 9.5 server inside the production-level virtual machine. Database export is enabled.

The software platform of DAIS is based on DSpace 5.10. To facilitate DSpace upgrades, the core DSpace code and Java code have not been modified. Major modifications have been made to the configuration, the localization files and the XMLUI configuration. The system has been enriched with additional applications: display of citation counts from the Web of Science, Scopus and Dimensions, and of Altmetric Attention Scores; display of a recommended citation; full ORCID integration; and display of human-readable funding information in the selected interface language. The source code of the customized version of DSpace and of all additional applications is stored on a local Git server accessible only to the repository development team. Detailed documentation about software, installation, configuration, maintenance, and troubleshooting is available on Confluence. This enables easy replication of procedures and ensures continuity in case of staff changes.

Backups are regularly performed at the virtual machine level. Both live instances and their passive backups reside on hardware-enabled and redundant RAID setups. The monitoring and alerting service MONIT, maintained by the RCUB team, constantly monitors the operation of the repository and sends alerts to system administrators in case of unexpected events. Local firewall and intrusion-prevention tools, such as iptables and Fail2ban, are used to protect and restrict access to the DAIS instance. The repository follows a regular upgrade cycle and, where possible, existing and widely accepted best practices.

In case of major software configuration changes or updates, the virtual machine is cloned and all changes are tested on the clone. Before any intervention on the production machine, a snapshot is created in the virtualization system, to enable roll-back and prevent data loss. End-users are duly informed about planned changes and upgrades.

Continuity of Access

According to the law, SASA is the national academy and the most prominent scholarly institution in Serbia. The institutes are independent legal entities, but their work is closely tied to the mission and the activities of SASA (e.g. joint projects, co-publishing projects, joint conferences, etc.). In line with their mission and their role as publicly funded institutions, SASA, SASA institutes, and RCUB (as the Outsource Partner) seek to provide reliable and secure archiving for the diverse outputs of SASA and SASA institutes, while ensuring easy access to, and the widest dissemination of, the Open Access content. The current level of funding is sufficient to maintain and develop DAIS. Development and maintenance, as well as data security, are ensured through an SLA with RCUB.

SASA and SASA institutes are able to preserve data access in case of unexpected emergency budget cuts. DAIS is easy to keep running and service costs are not high. All repository managers are employed under regular contracts at participating institutions and their activities related to repository management do not incur any additional cost. The SLA with RCUB foresees a Post-Cancellation Service Time, i.e. a period after the termination of the SLA during which the repository will remain available with minimum maintenance services provided. Accordingly, even in case of funding disruption, the services will be kept running, providing sufficient time to find a sustainable solution.

Hardware security

The computer hardware that runs the repository is the property of RCUB. A dedicated team at RCUB takes care of the configuration, maintenance, security, software updates and development. RCUB also has a dedicated team responsible for infrastructure security. RCUB security officers are responsible for general network security, server security, and service maintenance, and they collaborate closely with the repository development team. Servers and network devices are kept in a dedicated area with physical access strictly limited to authorized staff. Access to the backup facilities is also strictly limited. The premises are equipped with fire alarms and a fire retardant system. An uninterrupted power supply is ensured by means of an automatic stand-by electric power generator. Dedicated staff members are physically present on the premises 24/7. Remote security services are also provided.