CHIN Digital Preservation Case Study – Medalta Museum

The following document is a write-up of the digital preservation case study of the Medalta Museum of Medicine Hat, Alberta. This write-up includes links to related documents, including Medalta’s Digital Preservation Plan and Digital Preservation Policy.

Related documents

Canadian Heritage Information Network Digital Preservation Toolkit

Medalta Digital Preservation Policy

Medalta Digital Preservation Plan

Medalta Digital Asset Retention and Disposition Schedule Markup

Acknowledgements

The Canadian Heritage Information Network (CHIN) would like to thank Barry Finkelman, Executive Director of Medalta Museum, for volunteering the museum as a CHIN case study so that all may learn from our experience. CHIN would also like to thank Jenna Stanton and Nicole McIntosh of Medalta Museum for the significant effort both have put towards this project. Thank you also to Owen Thompson and Claire Neily of the Alberta Museums Association for their collaboration. Finally, CHIN would like to thank members of the Digitization and Digital Preservation Discussion Group (DDPDG; see Footnote 1), based in the National Capital Region but composed of members across Canada, for their valuable feedback.

Introduction and project background

In 2011, CHIN conducted a survey on the state of digital preservation in Canadian Museums and found that while museums often held digital assets, almost no museum had a formal policy or plan for the long-term preservation of these assets. In response, CHIN developed a Digital Preservation Toolkit, consisting of resources produced by both CHIN and its partners, which helped museums create these policies, plans and procedures.

However, the toolkit did not provide concrete examples of what these documents should look like. While such finished products will be different for each museum, CHIN understands that one of the best ways to learn is by example. CHIN also wanted to learn how to improve the toolkit, so it invited museums to take up these tools and provide feedback. The 8th Hussars Museum, representing a typical volunteer-run Canadian museum, was first to commit, and the details of their experience can now be found online. Medalta Museum has also committed, and this organization is an excellent example of a fast-growing mid-sized Canadian museum.

Background of Medalta Museum

Medalta Museum, a living museum celebrating the history of Medicine Hat’s brick, clay and pottery industry, is one of many operations which function as part of the Medicine Hat Clay Industries National Historic District. While the Medalta factory is central to this district and is the location of the museum’s main office, other sites which are now part of Medalta’s operations include the Shaw International Centre for Contemporary Ceramics (a purpose-built 12,000-square-foot studio for international ceramicists visiting on residency); the non-operational Hycroft China Ltd. Factory, which contains over 70,000 non-accessioned objects, including a factory inventory of molds and products; and the Medicine Hat Brick & Tile Plant, which only recently ceased commercial operation. In total, Medalta’s operations in this historic district consist of over 7 acres of buildings on 150 acres of land.

While the museum has significant holdings and growth, it is still best described as a mid-sized museum; core funding has been approximately $1.6 million per annum, and there are approximately 17 full-time and 20 part-time staff onsite. This may change, as the museum has been proactive in establishing financial partnerships and in developing new initiatives at all levels from local to international. Medalta is also a cultural centre for the city of Medicine Hat; it hosts various functions and events, both private and public. It stages professional exhibits and installations, and its ceramics centre draws top-level artists from around the world.

The entire Medicine Hat Historic Clay District is located in a floodplain, and in June of 2014, Medalta and the surrounding sites were overcome by flood. Medalta received financial resources to digitize its holdings as part of its flood recovery funding, and the museum approached CHIN to help ensure that these new assets would be preserved according to existing standards and best practices.

Discussion of project activities

The Open Archival Information System (OAIS) reference model is an International Organization for Standardization (ISO) standard for the archiving of digital assets. The model focuses on the activity needed at each stage of managing a digital asset, and this work can be realized in various ways. A trusted digital repository (TDR) is a model of a digital archive proposed by the Research Libraries Group (RLG) and Online Computer Library Center (OCLC) which includes broader criteria relating to the organization, its finances, operations, security and the like. All TDRs must also comply with the OAIS model. Both these models are generally accepted standards in the digital archival community.

By the time the Medalta digital preservation planning got underway, CHIN had already learned that the OAIS model (let alone the TDR model) was too burdensome for Canada’s small to mid-sized museums. A solution was needed which benefited from standards and best practices already observed in the archival community, but which respected the resources at the museum’s disposal. This was central to the work that took place in this study.

Using the Digital Preservation Toolkit

CHIN’s Digital Preservation Toolkit provides a framework that allows any cultural institution of any size to develop a digital preservation plan, policy and procedures to fit its needs and resources. The workflow for using this kit can be summarized in the following diagram:

Workflow diagram for the digitization of audio tapes

The steps laid out in this diagram are described in detail below.

Step 1 – Taking stock of the digital inventory held by Medalta Museum

The first step in the above diagram is to take stock of what digital assets the museum holds and to consider the risk and impact of losing access to this material. While this is normally done using the content inventory template found in CHIN’s Digital Preservation Toolkit, the Excel version co-produced by CHIN and the Canadian Museum of History (Footnote 2) was considered easier to complete.

An interesting note is that much of what was added to this inventory template for Medalta did not yet exist. Medalta has very little in the way of current digital assets, but the museum does have funding to digitize materials by various means, including photography, flatbed scanning and possibly 3D scanning. This is an important point; it is never too late to implement a digital preservation policy and plan, but the best time is prior to creating the content. By developing procedures upfront for the content creators, the task of preserving this content can be more efficient and effective. Some generic tips for content creators can be found in the InterPARES 2 Brochure on Creator Guidelines.

A summary of Medalta’s digital assets (current and anticipated) can be found in the following table:

Group 1 – MS Access Database (soon to be PastPerfect)

  • Brief description: This is Medalta’s collections management system. It is being converted to PastPerfect Online with the help of the Alberta Museums Association (AMA).
  • Approximate number of digital assets: 27,139 records
  • Approximate file space required: Not determined (a few GB)
  • Minimum number of copies: 2 copies – the master is on the Medalta office machine. A second copy is with AMA in Calgary.

Group 2 – Medalta Newsletter

  • Brief description: Museum newsletter in word-searchable PDF format.
  • Approximate number of digital assets: 26 issues in total
  • Approximate file space required: Under 1 GB
  • Minimum number of copies: 2 copies – hard copies and office hard drive, both in Medalta office.

Group 3 – Other Medalta Publications

  • Brief description: MS Word copies of publications produced by Ron Getty, mostly documenting his knowledge of the brick and pottery industry in Southeast Alberta. Some publications with images. Some text only. One additional publication by Bruce Douglas.
  • Approximate number of digital assets: 9 text-only publications; 3 publications with images
  • Approximate file space required: Under 1 GB
  • Minimum number of copies: Hard drive on office machine.

Group 4 – Word Documents of Collections

  • Brief description: Word documents outlining items in the collections room and the associated manufacturers.
  • Approximate number of digital assets: Unknown quantity
  • Approximate file space required: Under 1 GB
  • Minimum number of copies: Hard drive on office machine.

Group 5 – Archaeology Collection: Talva

  • Brief description: Database and documents relating to building #13.
  • Approximate number of digital assets: Unknown number of records
  • Approximate file space required: Under 1 GB
  • Minimum number of copies: On hard drive.

Group 6 – Interviews

  • Brief description: 2004 video interview with Bill Yuil (requires digitization). Interviews with past workers on CD.
  • Approximate number of digital assets: 1 video tape; at least 3 CDs
  • Approximate file space required: A few GB for interviews on CD; a few MB for tape once it has been digitized
  • Minimum number of copies: Medalta main office.

Group 7 – Future 3D scanning

  • Brief description: Not yet produced. Funding is in place. Posting a job description. AMA flood project cash to hire two people full-time for a year (6 months with Hycroft / 6 months with Medalta), mostly to do accessioning. Some of this would also be 3D scanned. The Shaw Centre is currently working with the Canada Research Council to obtain an 8' X 8' foam cutting machine to scale up molds.
  • Approximate number of digital assets: Not yet produced.
  • Approximate file space required: Not yet produced. 70,000 objects in Hycroft alone. At 20 MB/scan, a single scan per object would amount to 1.4 TB for Hycroft. (Medalta has another 27,000, and the Shaw Centre has fewer still, but these are more likely to be scanned.)
  • Minimum number of copies: Not yet produced.

Group 8 – Future digitization (photographing) of museum holdings

  • Brief description: Not yet produced. Medalta is considering digital scanning and photography of its holdings. PastPerfect (web version) allows one to choose which records to share online. No timelines set, but it is a near-term goal.
  • Approximate number of digital assets: Not yet produced.
  • Approximate file space required: Not yet produced. At 10 MB per object, potentially 0.5 to 1 TB for existing objects, although only a fraction of these are likely to be digitized.
  • Minimum number of copies: Not yet produced.

Group 9 – Historical Corporate Documents

  • Brief description: Corporate documents from defunct operations purchased by Medalta: business archives, old ledgers, sales bills, catalogues of what was sold. Would like to have a record of the paper, but not necessarily scan the paper. Mostly Hycroft and the Brick & Tile factories. Some from Medalta potteries.
  • Approximate number of digital assets: 16 soft documents; 100s of boxes of hard copy documents
  • Approximate file space required: Under 1 GB for soft. Likely a few GB for a database recording of the hard copies.
  • Minimum number of copies: One copy of each at distributed locations in the District.

Group 10 – Medalta Museum Operational Documents

  • Brief description: All soft documents required for the regular operation of the museum, including email, Word documents, financial statements, etc. Museum retail data (Point of Sale data, and Inventory). Documents are also being generated by the Shaw Centre.
  • Approximate number of digital assets: Mostly soft copies, some hard copy materials
  • Approximate file space required: Unknown (a few GB at most).
  • Minimum number of copies: One copy each. Hard drives and filing cabinets of Medalta and Shaw Centre offices.

One of the most striking things about this inventory assessment is the highly variable amount of content that may (or may not) exist in the immediate future. A minimum estimate (should scanning not take place) is 20 GB, but should scanning occur on a full scale, this amount could easily run upwards of 2 TB within a few years. Any solution will need to be flexible to storage capacity needs, and it should be done in a way that is affordable if larger capacities are not required.
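
The upper-end figures in this estimate follow directly from the object counts and per-object sizes recorded in the inventory. The short Python sketch below reproduces that arithmetic; the function name and the `fraction` parameter are illustrative assumptions, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
# Object counts and per-object sizes come from the inventory (Groups 7 and 8).
# The function name and the `fraction` parameter are illustrative assumptions.
MB_PER_TB = 1_000_000  # decimal units assumed


def scan_storage_tb(object_count: int, mb_per_object: float, fraction: float = 1.0) -> float:
    """Estimate storage in TB for digitizing a fraction of a collection."""
    return object_count * fraction * mb_per_object / MB_PER_TB


# Group 7: one 20 MB 3D scan for each of the 70,000 Hycroft objects.
hycroft_3d = scan_storage_tb(70_000, 20)

# Group 8: one 10 MB photograph for each Hycroft and Medalta object.
all_photos = scan_storage_tb(70_000 + 27_000, 10)

print(f"Hycroft 3D scans: {hycroft_3d:.2f} TB")
print(f"Photographs of all objects: {all_photos:.2f} TB")
```

Lowering `fraction` shows how quickly the requirement shrinks if only part of the holdings is digitized, which is why a solution that can start small and grow is attractive.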

Another item of note is the current collections management database. As of February 2015, the database was in MS Access, but by mid-2015, AMA had succeeded in migrating the content to PastPerfect Online. Letting PastPerfect Online manage the preservation of this material will greatly simplify Medalta’s own preservation plan. However, Medalta should obtain assurances from PastPerfect regarding the preservation of this material, and it should have a contingency plan in place (a backup of the database records in an accessible format).

Other items of note include the lack of duplicate copies and the lack of distributed locations for storing duplicates (in the event of another flood or a fire for instance). Both these factors increase the risk of access being lost. The preservation plan will have to weigh this risk and the impact of loss against the cost of duplicating this material, all the while keeping existing resources and priorities in mind.

Step 2 – Drafting of a digital preservation policy

The next step was to draft a digital preservation policy. The policy document states what is to be accomplished and why. It answers questions such as what should be preserved, for how long and who should have access to the content. It acknowledges existing archival standards and takes into account administrative and organizational viability, financial limitations and similar factors. The policy document also identifies risks that may arise as a result of its implementation. Because the risks are (in part) a product of whatever plan is developed, the final draft of the policy should not be completed until a plan is selected. The policy is always drafted first, but the drafting of both documents is an iterative process.

The most important goal of drafting this (or any) policy is to ensure that all involved, and in particular those who manage the finances and the human resources (i.e. top administration, boards of directors and the like), are onside. Because digital preservation is never a project (it is ongoing and, therefore, part of regular operations), financing cannot be project-based; it must be core funding.

As with the 8th Hussars Museum, Medalta used the Digital Preservation Policy Framework: Development Guideline Version 2.1 produced by Nancy McGovern. This template is meant for digital archives that will adhere to trusted digital repository (TDR) standards (a process too onerous for a museum of this scale). Even at five pages, the resulting document, Medalta Museum Digital Preservation Policy, is heavy and will likely be revised downwards.

Some of the items of note include:

  • An open acknowledgement that OAIS-compliance will not be possible by any solution managed by the museum, but that many principles could be observed, including:
    1. Digital content will be created with preservation in mind (populating metadata for the content, where possible).
    2. Preservation copies of material will be made on a regular basis.
    3. An automated process will be used to create basic preservation metadata as preservation copies are made.
    4. Preservation copies will be kept in multiple locations.
    5. Preservation copies will be periodically verified for data integrity.
    6. Preservation copies will be refreshed to new media on a regular basis.
    7. Preservation copies will be migrated to new formats as required.
    8. Access to preservation copies will be limited to specific museum staff.
  • All asset groups were to be preserved indefinitely with the exception of group 10 (administrative documents) which would be kept for a minimum of seven years.
  • A select group of museum staff would have access to all preserved content, and the technology chosen to preserve the content should not fetter the employee’s access (e.g. offsite, non-networked access would not be acceptable).

Some of the challenges and risks identified within the policy were:

  1. Limited time and resources: OAIS compliance could not be managed internally. Also, the digitization (and hence digital preservation) of many physical objects would have to be prioritized.
  2. Existing physical media: Some recordings over 10 years old are on analog magnetic tape. Others are on CD of an unknown type. Digitization from older analog magnetic media needs to be addressed immediately, as does refreshing of content from CDs of unknown quality.
  3. Training: In spite of the simplified process, training will be required. CHIN has offered to assist with this step.
  4. Unknown risks: Because the chosen procedures (discussed further on) do not follow the traditional OAIS model, there is a risk that an important feature of this model may be overlooked.

This last risk is an example of how the policy needs to be revisited after a plan has been selected. In fact, both the plan and policy are living documents, indefinitely.

Step 3 – Drafting of a digital preservation plan

The next, and longest, step is the drafting of a digital preservation plan. The plan itself should be treated as a case study document. In other words, the plan starts by stating a problem and by incorporating all background information (e.g. Medalta’s background, its digital preservation needs, reference to the policy and the digital asset inventory, the organization’s resources, existing standards, etc.). It then considers several options and weighs the advantages and disadvantages of each. Finally, it selects a course of action (i.e. an Action Plan) and justifies this selection. Procedures are then drafted around the selected action plan, and for Medalta, these were kept as appendices in the plan.

The template used for drafting this plan was the Digital Preservation Plan Framework for Museums, created by CHIN. Five options were considered:

  • Option 1: Multiple backups, with preservation copies made using a checksum generator.
  • Option 2: Multiple backups, with preservation copies made using a checksum generator, stored on Windows Storage Spaces disk pools.
  • Option 3: Multiple backups, with preservation copies made using a checksum generator, using the cloud for preservation copies.
  • Option 4: OAIS model managed by Medalta.
  • Option 5: OAIS model managed externally.

Note that these options approach preservation at different layers: the first three focus on the storage technology (i.e. where the content will be stored), whereas the last two focus on the preservation architecture (in this case, the software that might be used). Consideration needs to be given to all aspects of a chosen solution, but the discussion here focuses on the physical layer for the first three options and on the architecture for the last two, because that is what sets them apart.

The first option considered is similar to that adopted by the 8th Hussars Museum and consists of a series of “intelligent” backups to external drives from a centralized machine on which all content is stored. For files that cannot easily be renamed (e.g. some databases), weekly and monthly copies are kept then overwritten; copies of the last 4 weeks and the previous 12 months are always available. For files that can be renamed, version control is practised, and files that are deleted on the working drive are automatically archived by the backup software. Preservation copies of all files are made per annum and retained indefinitely. Checksum information is generated and stored with these copies.
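
The rolling scheme for files that cannot be renamed can be pictured as a simple retention rule. The Python sketch below is only an illustration of that rule, not the configuration of the backup software used in the plan; the function name, the `(kind, date)` tuple layout and the sample dates are assumptions made for the example.

```python
from datetime import date


def snapshots_to_keep(snapshots, weekly_limit=4, monthly_limit=12):
    """Return the snapshots retained under a rolling backup scheme.

    `snapshots` holds (kind, taken_on) tuples, where kind is "weekly" or
    "monthly". The newest `weekly_limit` weekly snapshots and the newest
    `monthly_limit` monthly snapshots are kept; older ones may be overwritten.
    """
    snapshots = list(snapshots)
    keep = []
    for kind, limit in (("weekly", weekly_limit), ("monthly", monthly_limit)):
        of_kind = sorted(
            (s for s in snapshots if s[0] == kind),
            key=lambda s: s[1],
            reverse=True,  # newest first
        )
        keep.extend(of_kind[:limit])
    return keep


# Hypothetical example: five weekly snapshots and one monthly snapshot.
example = [("weekly", date(2015, 3, d)) for d in (1, 8, 15, 22, 29)]
example.append(("monthly", date(2015, 1, 1)))
kept = snapshots_to_keep(example)
```

With these sample dates, the oldest weekly snapshot (1 March) falls outside the four-week window and is the one that would be overwritten.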

Option 1 is sufficient for Medalta’s needs today, but given the potentially large amount of content the organization may generate in the near future, it is conceivable that storage needs will outstrip the capacity of available hard drives. For that reason, a second version of option 1 was considered. Option 2 is identical to option 1 except that it takes advantage of Windows Storage Spaces. Storage Spaces is a Windows 10 feature which allows users to easily pool disks so that they appear as one large virtual disk. As additional capacity is needed, disks can be added to the pool. Storage Spaces also has multiple methods of protecting files against drive failure (i.e. “resiliency”). For archival purposes, the ideal resiliency setting is “Parity” mode. This mode maximizes usable storage space in a pool but writes slightly slower than other modes.

The smallest number of physical disk drives that can be used to create a pool of disks for a “Parity Storage Space” is three. However, it is recommended that Medalta begin with option 1 and migrate to option 2 only if required. One cannot add a drive with existing content into a pool without losing the contents of that disk. Thus, three new drives should be purchased for each Storage Space (one Storage Space to replace each of the two external “Preservation Copy” hard drives). Once a pool of three disks has been created, contents from the original external hard drive can be copied to it, and then the original disk can be reformatted and added to the pool. Drives can then be added to the pool as required, and if a single disk fails, Windows will inform the user, and it can be replaced without loss of content.

A third option was to use the cloud, or online servers, as storage space. Dropbox and Google Drive are both examples of this service, and while the cost is higher than that of options 1 and 2, the difference is not significant. Upload time was not considered a barrier, given Medalta’s relatively low amount of data; if it did become one, Medalta could consider investing in a higher-speed Internet connection or, failing that, revert to option 1 or 2. On principle, Medalta is reluctant to adopt this option because most commercial cloud services have servers located in the United States, meaning its content may be accessed without its express consent. However, Medalta may revisit this option at a future date.

Option 4 was to consider an Open Archival Information System (OAIS) architecture managed by Medalta. This was rejected outright for the same reasons that it was not seriously considered for the 8th Hussars Museum; namely, the added value of the model (i.e. being able to search across various forms of content for specific objects) does not justify the in-house software and skilled labour necessary to maintain such an archive.

Finally, a fifth option, the management of Medalta’s digital assets via an external trusted digital repository (TDR), is now becoming a possibility. Canadiana has a TDR in place and will soon be making it commercially available. Medalta is encouraged to consider this option for digital assets (other than collections management records, which are already managed in PastPerfect Online) as further details are made available.

The advantages and disadvantages of each option were weighed (the details of which can be found in the Preservation Plan document), and ultimately, option 1 (backups and preservation copies to two external hard drives) was recommended, followed by option 2 (replacing individual drives with storage pools).

As with the 8th Hussars Museum Digital Preservation Case Study, the use of hard drives for all three copies, rather than optical CD for at least one copy, was considered acceptable given the potentially limited lifespan of optical CD, even though it is a violation of the 3-2-1 rule (Footnote 3).

Option 5 was considered a possibility, but further information is required, and migration to it can take place at a later date, if desired.

A more detailed list of procedures to follow for options 1 and 2 can be found in Appendices A and B of the Preservation Plan, but a summary of key points is made here.

Summary of procedures for option 1

  • Make working copies of all digital assets on a centralized, shareable working drive.
  • Acquire two external drives to be used for backups and preservation copies.
  • Use Bvckup 2 intelligent backup software to make weekly and monthly backups to the first external drive.
    • Treat files that cannot easily be renamed (database files that are part of a commercial package for instance) differently by keeping rolling backups (i.e. weekly snapshots over the previous 4 weeks, and monthly snapshots over the previous 12 months).
    • Use version control for all other files.
    • Let Bvckup 2 archive any file that has been deleted from the working drive.
  • Make annual preservation copies that include checksum (i.e. fixity) metadata (MD5summer is recommended), also to the first external drive.
  • Keep the second external drive offsite, out of the floodplain, and update it with the contents from the first external hard drive once a month.
  • Replace drives every five years.
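
The fixity step in the procedures above (recording checksums with each annual preservation copy, then checking them later) can be sketched in Python with the standard library. This is an illustrative assumption of how such a check might work, not the tool named in the plan; the function names and the manifest layout (`<md5>  <relative path>` per line) are chosen for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root: Path, manifest: Path) -> int:
    """Record a checksum for every file under `root`; return the file count."""
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.resolve() != manifest.resolve()
    )
    lines = [f"{md5_of(p)}  {p.relative_to(root).as_posix()}" for p in files]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def verify_manifest(root: Path, manifest: Path) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split("  ", 1)
        target = root / name
        if not target.is_file() or md5_of(target) != expected:
            problems.append(name)
    return problems
```

In this sketch, `write_manifest` would be run when a preservation copy is made, and `verify_manifest` would be run periodically against each copy; an empty list means every file still matches its recorded checksum.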

Summary of procedures for option 2

To migrate from option 1 to option 2:

  • Acquire six external drives identical in size to those already in use. Also acquire two USB hubs with an external power source, each having a minimum of four USB ports (ideally more).
  • Upgrade all machines in the office from Windows 7 to Windows 10.
  • Follow instructions to create a new Storage Spaces disk pool for three disks (with the resiliency type set to “parity” and volume capacity set to 15 TB; larger if the current version of Windows will allow it).
  • Copy content from external drive 1 to the new Storage Space (using the Windows drag and drop or copy and paste feature).
  • Add the original drive to the pool (this will reformat the original drive).

Repeat the above process for the second external drive, and replace drives only as the system indicates it is necessary.

Current status of the preservation plan

The current drafts of both the policy and the plan are under review with Medalta, and the museum is weighing the pros and cons of options 1, 2 and (possibly) 3. Once a solution has been selected, the Digital Preservation Plan (a living document) will be revised and the revision date will be added to the front cover. Updates on recommendations and selected solutions will also be provided to the Digitization and Digital Preservation Discussion Group, and feedback will be sought. Also, as part of this case study, CHIN has offered to visit Medalta to help install hardware and software and to review procedures; this step is anticipated sometime in 2017.
