Digitization guidelines

Introduction

Many Government of Canada institutions are choosing to scan analogue records in order to save on physical storage space or make their records more accessible to users.

However, just scanning records is not enough. The goal is to create digitized records that are:

Library and Archives Canada strives to receive the best archival records, records that will serve as witness to the history of Canada and be accessible to the public over the long term. This includes digitized archival records.

Like original source records, authoritative digitized records provide evidence and serve as historical proof. However, the best technology and the highest resolution are of no benefit if the authority of the digitized record is not established. The International Organization for Standardization (ISO), in ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles, describes an authoritative record as one that has authenticity, reliability, integrity, and useability.

Producing authoritative records is not only about the technical specifications of digitization, but also about ensuring that the process of digitizing is documented and auditable.

The digitization process involves more than just capturing images. It also includes planning, assessing, preparing, digitizing, compiling metadata, running quality assurance checks, and storing and managing the digitized records. It is necessary to have in place policies and procedures, and to fully plan and document the digitization process. Each of the sections below relates to one aspect of the digitization process, and sets out best practices to follow and requirements to consider.

Purpose

These guidelines are written to help Government of Canada (GC) institutions ensure that they have defensible digitization processes. This means that institutions should be able to demonstrate that a digitized record is a true and accurate version of the source record. It also means that the digitization process is documented to show that institutions have adhered to all requirements for producing an authoritative digitized record.

These guidelines set out best practices for digitization activities. For some institutions, digitization activities mean an occasional project to digitize records; for others, they mean a formal digitization program. In both cases, decisions on all of the relevant considerations should be made before digitization work begins. This is necessary to ensure the authenticity, reliability, integrity, and useability of every digitized record. Following these guidelines will help ensure that institutions’ digitized records fulfill the same ongoing business needs, and meet the same future requirements, as the source records did.

These guidelines do not presume to be a comprehensive method for the digitization process. Rather, they are a starting point to guide GC institutions in establishing digitization programs and projects that will ensure authoritative digitized records. If your institution does not plan to dispose of the source originals (i.e., the scanned version will be an access copy used only for reference), these guidelines do not apply since the authoritativeness of the digitized version is not as critical when the original version still exists as the official copy.

Policy and planning

All GC institutions should have a broad internal policy in place to guide all digitization projects and programs. The fundamental policy principles should include the following:

The policy should state the purpose of digitization, specify when digitization is appropriate, and set the institution’s criteria for document selection. The policy should outline what criteria need to be set, approved, and documented for each digitization project.

Best practice is to create and implement a departmental procedures manual so that all digitization projects follow the same process. Additionally, a comprehensive plan should be put in writing for each digitization project. This includes an outline of the digitization plan, details regarding the project-specific benchmarks, and a list of the required approvals. As well, all steps taken must be documented.

It is important to distinguish between the policy, which will apply to all projects (i.e., your digitization program), and the requirements that will be specific to a particular project. For example, the policy should state that each digitization project must have set quality assurance criteria and that the criteria must be project-specific, based on business needs, and recorded in a digitization manual.

Document selection

Criteria for the selection of records suitable for digitization should be included in your institutional policy. Consider user requirements and document attributes when selecting documents.

Records are suitable for digitization if:

Some records will require more time and effort, and will be more costly, to scan: those in a very large or very small format; files containing a variety of documents; records needing extra preparation because they are stapled, bound, rolled, etc.; and fragile records requiring specialized handling and care.

Records NOT suitable for digitization include the following:

Outsourcing digitization

The choice of in-house or outsourced digitization will be based on many factors, including whether digitization will be a one-time project or an ongoing program requirement.

Outsourcing provides many benefits: institutions do not pay the up-front cost of technology; the budget is more contained; economies of scale can be achieved from the volume that vendors manage; and vendors may have expertise that the institution lacks.

However, when digitization is outsourced, consideration needs to be given to the security of the records during transport and digitization, and ongoing communication between the GC institution and the vendor is highly recommended throughout the process.

When choosing an outsourced vendor, other considerations include (but are not limited to): a vendor’s ability to scan according to digitization or evidence standards; use of appropriate technology; and ability to handle particular formats.

Additionally, the vendor should adhere to quality assurance practices, and provide certification of assurance for all digitization activities.

Public Services and Procurement Canada (PSPC) offers a document imaging service for all levels of government. Please consult the PSPC webpage for more information.

Roles and responsibilities

Governance should be defined for all digitization projects. Documenting approvals and accountability helps establish the authenticity and reliability of the record.

Digitization projects require a combination of skills from staff with different areas of expertise. Clear roles and responsibilities, well-defined reporting lines, and detailed communication plans will ensure that the project runs smoothly and that the final product is authentic and reliable.

Remember: Authorization to destroy source records once they have been digitized can be given only through LAC’s disposition authorizations (for example, an institution-specific Disposition Authorization or MIDA 2018/013 Disposition Authorization for the Destruction of Source Records following Digitization). Your LAC archivist can assist with related questions.

Risk assessment

GC institutions should assess the risks involved in digitizing (or not digitizing) records before they undertake any digitization work. Institutions should document the level of risk they are prepared to accept and specify any planned mitigation. The following are examples of risk:

Format, indexing, and metadata requirements

The final format of the digitized image should be determined on the basis of business needs and legal requirements. Annex A provides a chart of recommended technical requirements, including best practice choices for format. Records identified as archival will need to be in a format acceptable for transfer to Library and Archives Canada. Please see LAC’s Guidelines on File Formats for further information.

There are three types of digitization of paper records, each allowing a different degree of access:

  1. Page images—the digitized record is static. It cannot be changed, and its contents cannot be searched.
  2. Full text—the digitized record is transformed into machine-readable text through either manual keying or use of an Optical Character Recognition (OCR) program.
  3. Encoded text, or full text with mark-up—the digitized record has the same options as the full-text, with further annotations to increase search functionality.

The choice of format and the type of access needed will determine how much indexing is required or possible.

The type of digitization chosen for other formats will vary. For example, for spatial information you may use manual processes such as tablet digitizing or heads-up digitizing, or an automated process such as scanning and vectorization. Each institution should determine the best way to proceed with digitization based on the specific format of the record.

Indexing ensures that records will be reliable in the future, that records are accessible and retrievable, and that they are appropriately stored and managed throughout their lifecycle. Digitized records must be indexed; otherwise, they will not be found by users.

Indexing can occur at several points in the digitization process: image capture and recapture, quality assurance, and transfer into a designated corporate repository. All previous audits and metadata/indexing associated with the source system should be preserved so that the integrity of the record can always be established. The indexes created should be retained for at least as long as the records to which they relate.

Bibliographic indexing relates to the contents of the record and the management thereof; this should align with the metadata required for any electronic records. For further information about GC metadata requirements, please see the GC Standard on Metadata and LAC Minimum Metadata Set for Digital Archival Government Records.

Biographic indexing refers to the digitization process and needs to be captured at the time of digitization. Institutions should determine the biographic metadata they need.

Biographic metadata can include the following:
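
As an illustration, the elements named in the Annex B definition of biographic information (date and time captured, operator identification, capture device identification and location, and details of modification) could be recorded at capture time along the following lines. This is a minimal sketch, not a prescribed schema: the class name, field names, and the use of ISO 8601 UTC timestamps are assumptions made for the example.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


def _now_utc() -> str:
    """Current date and time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class CaptureRecord:
    """Biographic metadata for one captured image (hypothetical field names)."""
    file_name: str
    operator_id: str
    device_id: str
    device_location: str
    captured_at: str = field(default_factory=_now_utc)
    modifications: list[str] = field(default_factory=list)

    def note_modification(self, description: str) -> None:
        """Append a timestamped note describing a change made to the image."""
        self.modifications.append(f"{_now_utc()} {description}")


# Example: record a capture, then log a later adjustment to the image.
record = CaptureRecord("0001.tif", "op-17", "scanner-02", "Room 114")
record.note_modification("deskewed")
metadata = asdict(record)  # plain dict, ready to store alongside the image
```

Keeping modifications as an append-only list means the history of changes travels with the record, which supports the auditability these guidelines call for.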

Preparation of records

The proper preparation of documents is necessary to ensure the highest-quality digitized images. The amount of preparation will depend on the condition and format of the documents being digitized. Typically, only basic preparation of source records is required to ensure efficient digitization processes. However, in some cases, such as for folded, rolled or fragile documents, or for damaged tapes, more extensive work may be required. In some circumstances (such as large format items requiring specialized equipment), it may be necessary to scan related records separately, and to ensure their original order is documented in the metadata.

Personnel who prepare the documents should identify potential issues with the documents so the person operating the scanner can make the appropriate adjustments.

Quality assurance

Quality assurance is the process of verifying whether the digitized record meets requirements. It involves checking the operation and output of digitization processes against agreed benchmarks to ensure that these benchmarks have been met and that the digitized images are acceptable as a substitute for the original record. The quality of the image is subjective, and the criteria for whether an image is acceptable should be determined prior to digitization and documented for each project. The required image quality is based on the purpose of the digitized record (i.e., whether the digitized record will serve as the official version of the record), though it is recommended to always scan to the highest possible quality to ensure that the record will be accepted as authentic.

As well as determining the criteria for an acceptable image, each project or program should determine:

Quality assurance activities should be logged, and each batch of digitized images should be certified as having passed quality control. It is recommended that someone other than the equipment operator perform the quality assessment.

The quality control process should at a minimum:

Errors can be major or minor, and include such things as the image being skewed, insufficient contrast, illegibility of characters, and speckle on the image. Quality control should be performed at several intervals throughout the digitization process: image capture, recapture, indexing, quality assurance and transfer of images. Quality control inspections should also be performed regularly on all the equipment being used to ensure proper function and calibration.

AIIM TR34-1996, Sampling Procedures for Inspection by Attributes of Images in Electronic Image Management (EIM) and Micrographics Systems provides further guidance on sample size and acceptable error ratio.
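
The sampling step can be sketched as follows. This is an illustration only: the function names are hypothetical, and the sample size and acceptable error ratio are placeholders that each project would set and document (for example, with reference to AIIM TR34-1996), not values taken from that report.

```python
import random
from typing import Optional


def draw_sample(image_ids: list[str], sample_size: int,
                seed: Optional[int] = None) -> list[str]:
    """Pick a random sample of images from a batch for inspection."""
    rng = random.Random(seed)
    size = min(sample_size, len(image_ids))  # small batches are inspected in full
    return rng.sample(image_ids, size)


def batch_passes(inspected: int, defects: int, max_error_ratio: float) -> bool:
    """Return True when the observed error ratio is within the agreed limit."""
    if inspected <= 0:
        raise ValueError("at least one image must be inspected")
    return defects / inspected <= max_error_ratio


# Example with placeholder values: inspect 50 of 1000 images, allow 2% errors.
batch = [f"img{i:04d}.tif" for i in range(1000)]
to_inspect = draw_sample(batch, sample_size=50, seed=2024)
accepted = batch_passes(inspected=len(to_inspect), defects=1, max_error_ratio=0.02)
```

Recording the seed, the sample drawn, and the result in the quality assurance log makes the inspection repeatable if the batch is later challenged.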

Classification and integration into the electronic document and records management system (EDRMS)

Classification ensures that the digitized images can be integrated into the file classification system and the corporate electronic document and records management system (EDRMS). The digitized records need to be identified with the file number and the retention and disposition information. These specifications should be aligned for analogue, digital and digitized records. It is the contents of a record, not the format, that determines retention and disposition, sensitivity, and access permissions, and the digitized versions should have the same characteristics as the source records.

For various reasons, derivative copies may be created along with the official version of the record. Derivative copies may include:

These should be clearly identified as copies through the use of naming conventions in the title of the file.

Security requirements

Institutions should plan for both the physical security of the records during the digitization process and the security of the information. Records that have a sensitivity of Protected B or higher should be digitized in an environment that protects the records against unauthorized access, disclosure and removal.

Identification of protected and secret information should be added into the metadata of digitized records. It should be possible to apply the same access restrictions to the digitized image as were applied to the source record. If the source records are being destroyed, they should be destroyed in a manner consistent with their security level.

Follow any GC policy instruments and procedures regarding the security of electronic information.

Transfer of source records and digitized images

The digitization process necessitates physically moving records, either within the department itself for in-house digitization or to an off-site vendor. Departments should define procedures for the transfer of the source documents and digitized files to ensure that records remain secure and that their authenticity has been maintained during transfer.

When any transfer is implemented, information such as the date of transfer, the name of the courier, the location, the receipt of records, and the same information for the return of the source records and digitized files should be documented.

Whatever method is used for transferring digitized records, it is important to ensure that the digitized files are not altered during transfer by using fixity information, such as a checksum.
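
A minimal sketch of such a fixity check is shown below. It assumes SHA-256 as the checksum algorithm (these guidelines do not prescribe one), and the function names are hypothetical. A manifest of checksums is built before transfer and verified once the files arrive.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum (run before transfer)."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify_manifest(folder: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or altered after transfer."""
    problems = []
    for name, expected in manifest.items():
        target = folder / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

An empty list from `verify_manifest` indicates that every file listed in the manifest arrived unaltered; any other result should be investigated and documented before the transfer is accepted.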

Storage and preservation

If active, records need to be stored in a designated corporate repository that meets all requirements for the management of the records throughout the full lifecycle.

If the digitized records are being put in dormant storage, departments should ensure that the storage solution, whether online or on physical carriers, has search and access capacity, can manage the records over the long term, including their disposition, and protects the authenticity of the record. Digitized records, like all other records, are subject to ATIP and litigation requests, even when in storage. When choosing storage solutions for dormant digitized records, consider the response times required by end users if access is needed.

The life expectancy of the technology may be much shorter than the required retention period of the records. As a result, the records will need active management and migration over time. Departments should plan for the necessary storage and for a schedule of migration and/or conversion of the digital records as technology changes; this applies to both active and dormant records.

The costs associated with the medium- to long-term maintenance and accessibility of digital records, including for responding to ATIP requests, are often overlooked and should be included in planning. These costs may be more than required to physically store the original source records, and should be considered as a factor when making the decision to digitize.

Disposition of source records

Source records may be kept after digitization. More frequently, however, they are destroyed. Remember: GC institutions may dispose of original source records only as prescribed in their institution-specific Disposition Authorization or in MIDA 2018/013 (Disposition Authorization for the Destruction of Source Records following Digitization). Source records must not be destroyed if LAC has identified them as archival records that must be transferred in their original format. Always consult with your LAC archivist before disposition of archival source records.

There is a risk in destroying source records, even those records not identified as archival. Institutions should ensure that risk has been assessed and documented.

Failure to articulate a policy and procedures for the disposal of source records may give the appearance that these records have been disposed of in bad faith.

Some records may need to be maintained in their original format for legal reasons, even if the digitized version is sufficient for business needs. GC institutions should seek legal advice before destroying source records.

Original source records subject to a preservation order relating to litigation (including records that are scheduled for destruction) should not be destroyed while the preservation order is in place.

Before destroying source records, sufficient time should be allowed to ensure that quality control and indexing are fully completed and that the digitized records have been accurately and completely transferred to the EDRMS or secure storage.

As with all records, disposition actions should be fully documented.

If the source records are being kept, they should have the same retention and disposition actions as the digitized version and their accessibility and preservation over time should be managed.

Documentation requirements

Adequate documentation is key to defensibility. If the authenticity of the digitized record is challenged, institutional policy and procedures and evidence that they were followed contribute to proving the authenticity of the records.

Most directives on digitization recommend the creation of a project manual for each digitization initiative, which should include all the necessary information, documented in one place. While institutional policy and procedures on digitization should outline criteria for making decisions on requirements, such as metadata, quality control, and format, the manual should document the specific choices made for the records in the current project.

The necessary documentation includes the following:

Conclusion

These guidelines were written to provide a high-level overview of best practices for digitization activities. Producing authoritative records is not only about the technical specifications of digitization; it is also about ensuring that the process of digitizing is documented and auditable for as long as needed for business, legal, and regulatory reasons.

Institutions are encouraged to consult the sources listed in Annex C for additional information. For clarification about the value of records, including archival value, please consult your LAC archivist.

Annex A: Technical specifications

Textual documents – Black-and-white
  Resolution: 300 ppi to 600 ppi; 4000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: greyscale
  Bit depth: 8
  Compression: lossless
  Format: Tagged Image File Format (TIFF); PDF/A

Textual documents – Colour
  Resolution: 300 ppi to 600 ppi; 4000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: red-green-blue (RGB)
  Bit depth: 24
  Compression: lossless
  Format: TIFF; PDF/A

Photographs – Black-and-white
  Resolution:
    35 mm: 2700 ppi
    4 x 5; 5 x 7: 800 ppi
    8 x 10: 400 ppi
    4000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: greyscale
  Bit depth: 8
  Compression: lossless
  Format: TIFF

Photographs – Colour
  Resolution:
    35 mm: 2700 ppi
    4 x 5; 5 x 7: 800 ppi
    8 x 10: 400 ppi
    4000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: RGB
  Bit depth: 24
  Compression: lossless
  Format: TIFF

Maps, architectural plans, blueprints
  Resolution: 300 ppi to 600 ppi; 6000 pixels to 8000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: greyscale or RGB
  Bit depth: 16 (greyscale) or 24 (RGB)
  Compression: lossless
  Format: TIFF; PDF/A; GeoTIFF

Microfilm and microfiches
  Resolution: 300 ppi to 600 ppi
  Scanning ratio: 1:1
  Colour profile: greyscale
  Bit depth: 8
  Compression: lossless
  Format: TIFF; PDF/A; JPEG 2000

Negatives – Black-and-white
  Resolution:
    35 mm: 2400 ppi
    4 x 5; 5 x 7: 600 ppi
    8 x 10: 300 ppi
    3000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: greyscale
  Bit depth: 8
  Compression: lossless
  Format: TIFF; PDF/A; JPEG 2000

Negatives – Colour
  Resolution:
    35 mm: 2400 ppi
    4 x 5; 5 x 7: 600 ppi
    8 x 10: 300 ppi
    3000 pixels across longest dimension
  Scanning ratio: 1:1
  Colour profile: RGB
  Bit depth: 24
  Compression: lossless
  Format: TIFF; PDF/A; JPEG 2000
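
Specifications like these can be checked automatically as part of quality control. The sketch below transcribes only the two textual-document rows of Annex A; the dictionary keys, class, and function names are hypothetical, and the code treats the resolution range as a recommended window rather than a hard limit. Compression is not checked here, since confirming that a file is lossless depends on the format and tooling in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSpec:
    """Recommended settings for one record type (one row of Annex A)."""
    min_ppi: int
    max_ppi: int
    colour_profile: str
    bit_depth: int
    formats: tuple[str, ...]


# Two rows transcribed from Annex A; the keys are hypothetical labels.
ANNEX_A = {
    "textual_bw": ScanSpec(300, 600, "greyscale", 8, ("TIFF", "PDF/A")),
    "textual_colour": ScanSpec(300, 600, "RGB", 24, ("TIFF", "PDF/A")),
}


def check_scan(kind: str, ppi: int, colour_profile: str,
               bit_depth: int, file_format: str) -> list[str]:
    """Return a list of deviations from the Annex A row for this record type."""
    spec = ANNEX_A[kind]
    issues = []
    if not spec.min_ppi <= ppi <= spec.max_ppi:
        issues.append(
            f"resolution {ppi} ppi is outside the recommended "
            f"{spec.min_ppi} to {spec.max_ppi} ppi"
        )
    if colour_profile != spec.colour_profile:
        issues.append(f"colour profile {colour_profile} should be {spec.colour_profile}")
    if bit_depth != spec.bit_depth:
        issues.append(f"bit depth {bit_depth} should be {spec.bit_depth}")
    if file_format not in spec.formats:
        issues.append(f"format {file_format} is not one of {', '.join(spec.formats)}")
    return issues
```

For example, `check_scan("textual_bw", 300, "greyscale", 8, "TIFF")` returns an empty list, while a 200 ppi JPEG scan of the same record type would be reported with two deviations.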

Annex B: Terminology

Access copy: A file that captures the minimum amount of information in order to meet basic demands to view the informational content of a record. Source: Operational Standards for Digitization, Library and Archives Canada, Digital Operations and Preservation Branch.

Authenticity: An authentic record is one that can be proven to be what it purports to be; has been created or sent by the agent purported to have created or sent it; and was created or sent when purported. Source: International Organization for Standardization (2016). ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles.

Bibliographic Information: Information regarding the content and context of a document. It is created by the organization (possibly obtained from the Source Record) and aids in the retrieval of an image. Source: CAN/CGSB-72.11-93 Microfilm and Electronic Images as Documentary Evidence.

Biographic Information: Information regarding image capture that may include the date captured, the time, the operator identification, the capture device identification and location and details of modification, if any. Source: CAN/CGSB-72.11-93 Microfilm and Electronic Images as Documentary Evidence.

Digitization: The process of converting analog records into digital format. The process broadly includes: selection, assessment, prioritization, project management and tracking, preparation of originals for digitization, metadata creation, collection and management, digitizing (the creation of digital objects from physical originals), quality management, submission of digital resources to delivery systems and into a repository environment, and assessment and evaluation of the digitization effort. Source: Operational Standards for Digitization, Library and Archives Canada, Digital Operations and Preservation Branch.

Digitization project: Retrospective, back-capture of existing sets of non-digital records to enhance accessibility and maximize re-use. Note 1 to entry: In such projects, the business action has been completed on non-digital form of the record prior to digitization and for ongoing management purposes the non-digital record on which the business action took place, or which evidences the action, remains the official record of action. Note 2 to entry: The non-digital source records for both forms of digitization should be subject to an assessment process to determine whether there are good reasons to retain them prior to any consideration of disposition. Once non-digital records are converted into digital records, many of the management and preservation issues for born-digital records apply. Source: International Organization for Standardization (2010). ISO/TR 13028:2010 Information and documentation - Implementation guidelines for digitization of records.

File format: Specific pattern or structure, which organizes and defines data. Some formats contain only one stream of uncompressed data, others may contain codecs to encode and compress the data, and others still may support several streams of media. Source: Operational Standards for Digitization, Library and Archives Canada, Digital Operations and Preservation Branch.

Format: The arrangement of information. Source: Operational Standards for Digitization, Library and Archives Canada, Digital Operations and Preservation Branch.

Integrity: A record that has integrity is one that is complete and unaltered. Source: International Organization for Standardization (2016). ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles.

Intrinsic value: The usefulness or significance of a record derived from its physical or material qualities, inherent in its original form and generally independent of its content, that are integral to its nature and would be lost in reproduction. Intrinsic value is often associated to the rarity or age of the support as well as its artistic or esthetic quality. Source: Destruction of Source Records following Digitization, Library and Archives Canada, MIDA 2018/013.

Quality assurance: Part of quality management focused on providing confidence that quality requirements will be fulfilled. Source: International Organization for Standardization (2015). ISO 9000:2015 Quality management systems — Fundamentals and vocabulary.

Reliability: A reliable record is one whose contents can be trusted as a full and accurate representation of the transactions, activities or facts to which they attest, and which can be depended upon in the course of subsequent transactions or activities. Source: International Organization for Standardization (2016). ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles.

Source Record: A record from which a digitized version has been created. Source: Destruction of Source Records following Digitization. Library and Archives Canada, 2018/013.

Transitory records: Records that are not of business value. They may include records that serve solely as convenience copies of records held in a government institution repository, but do not include any records that are required to control, support, or document the delivery of programs, to carry out operations, to make decisions, or to provide evidence to account for the activities of government at any time. Source: Disposition Authorization for Transitory Records, Library and Archives Canada, 2016/001.

Useability: A useable record is one that can be located, retrieved, presented and interpreted within a time period deemed reasonable by stakeholders. Source: International Organization for Standardization (2016). ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles.

Annex C: Bibliography

Alberta. Digitization Process Standard (2015).

AIIM TR34-1996, Sampling Procedures for Inspection by Attributes of Images in Electronic Image Management (EIM) and Micrographics Systems.

ANSI/AIIM TR15-1997, Planning Considerations Addressing Preparation of Documents for Image Capture.

Archives of Manitoba. Digitizing Records (2018).

Canadian General Standards Board. CAN/CGSB-72.34-2017, National Standard of Canada. Electronic Records as Documentary Evidence. This standard is available at no cost from the Canadian General Standards Board.

Government of the Northwest Territories. Office of the Chief Information Officer (2018). Guideline – Digitization.

International Organization for Standardization (2010). ISO/TR 13028:2010 Information and documentation — Implementation guidelines for digitization of records.

International Organization for Standardization (2012). ISO 13008:2012 Information and documentation — Digital records conversion and migration process.

International Organization for Standardization (2016). ISO 15489-1:2016 Information and documentation — Records management — Part 1: Concepts and principles.

Library and Archives Canada. 2018/013 Destruction of Source Records following Digitization.

Library and Archives Canada. Operational Standards for Digitization. Library and Archives Canada, Digital Operations and Preservation Branch.

Newfoundland and Labrador. Office of the Chief Information Officer. Guideline – Record Imaging Services (2015).

Provincial Archives of New Brunswick. Digitization Standard (2019).

Page details

2025-10-16