Guidance on Assessing Metadata Needs

Date of publication: January 10, 2024

Preamble

This guidance provides advice to departmental officials on how to assess their metadata needs in respect of section L.2.2.1 of “Appendix L: Standard for Managing Metadata” of the Directive on Service and Digital.

1. Key concepts

1.1 What is a metadata needs assessment?

A metadata needs assessment is a systematic process for determining the gaps between the current state of metadata within the organization and the state that is desired. It can be used to clarify problems and identify appropriate metadata-related solutions.

1.2 Why is a metadata needs assessment important?

A metadata needs assessment can yield measurable insight into the metadata, metadata reference standards, and metadata-related practices and accountability structures that a department needs to have in place to manage its information and data efficiently and effectively for all purposes these assets may serve. The objective of a metadata needs assessment is practical: to identify and understand the metadata (that is, according to schema, elements, content, values, exchange formats) that departments need to implement to enable the utility, interoperability, reuse and sharing of information and data in delivering Government of Canada (GC) programs and services while minimizing duplication and risks to security and privacy. It can also reveal and document metadata-related responsibilities and accountabilities to be assigned to users to support this process. A metadata needs assessment can reveal risks to the open, strategic, and secure management of information and data. In evaluating metadata needs, departmental officials are advised to identify practical options and pursue a metadata management plan for mitigating the adverse impacts of allowing gaps in departmental metadata and metadata management practices to persist.

2. Metadata needs assessment

2.1 Timing

It is recommended that metadata needs assessments be performed routinely by departmental officials whenever projects, programs and services – including internal services – are planned or undergo revision. It may be useful to append a metadata needs assessment to a planning or development document, such as a Treasury Board submission, Enterprise Architecture Review Board proposal or a Major Project Investment Board submission. It is best if metadata needs are documented in time to support the design and implementation of information and data architectures and the information technology (IT) infrastructure (that is, systems, individual applications) that supports them. Also, metadata needs assessments can be performed to guide the ongoing development, implementation, improvement, maintenance, closing and/or decommissioning of systems and the activities where initial planning and design have already been completed. Metadata needs assessments can also be used during the process of program evaluation.

2.2 Scoping

Departmental officials are advised to define the scope of each metadata needs assessment in terms of a system or process for which it is being performed and the business requirements to be supported by the information and/or data managed within it. This approach supports departments in planning for and performing migrations between systems, including ensuring that related metadata is copied along with the migrated information or data.

2.3 Key considerations

A range of considerations factor into a metadata needs assessment. This section recommends a series of questions to help departmental officials document their information and data management priorities and the metadata ecosystems that support them by revealing the use cases, interoperability requirements, and responsibilities for metadata that must be in place to deliver GC projects, programs and services. The questions are not meant to be exhaustive but instead to help departmental officials elicit from stakeholders (for example, business owners and business, information, data and technology architects) the metadata and metadata management practices they need to have in place to ensure the open and strategic management of the GC information and data for which they are responsible.

Each question may not be relevant in all cases or carry the same significance from one context to the next. Therefore, it is left to the discretion of departmental officials to decide if, when and how to apply each question and to respond by either accepting, avoiding, mitigating or sharing the risks identified through the process.

Enabling search and discovery

Information and data need to be findable and discoverable, that is, via the various search platforms deployed within the department and any federated search applications built on top of them. Users rely on metadata to be able to search digital files in their various formats. Consider:

  • What type of content is being searched and in what context?
  • What are the defined user communities (internal and external)? What are the information needs of those defined user communities? How do the current metadata values adhere to the expectations of the defined user communities?
  • Where will users expect to find metadata? Will users expect metadata to be embedded within the file itself (for example, in a Microsoft Word file) or will they expect to find metadata in a separate file or system?
  • Who will be searching for information and data? Will it be primarily staff (that is, humans) searching for information and data, or will the information and data need to be searchable by machines as well?
  • How do the intended users search? Will they use keywords (tags) or defined vocabularies and identifiers?
  • Will users be able to access content from their search results or will a location (link, text field and so on) be required to direct users to the content?
  • How frequently does content need to be indexed?
  • How will the system provide access to search logs and usage analytics to improve search over time?
  • What kinds of searches will users be completing? Will users be repeating the same searches, looking for updated content?
  • Which search engines do clients use?
  • What tools are available within the repository to make information and data findable and usable?
Good practices to facilitate search and discovery
Describe information and data
  • Use metadata to extensively and richly describe assets managed within systems that manage information and data so users can find and discover them.
  • Provide more context to make it easier for users to locate information and data. Apply this principle to both the embedded metadata (that is, metadata recorded as part of a digital object itself) and the associated metadata (that is, metadata recorded separately outside of the digital object described).
  • Ensure the capacity to appropriately accrue metadata as the information or data it describes is accessed, used, reused and so on.
  • Consider indexing information and data.
Use standard terminology
  • Choose and use defined controlled vocabularies and standardized terminologies to enable discovery. Use GC-controlled vocabularies where possible to promote interoperability. Additionally, allow for non-controlled vocabulary fields to improve findability.
  • Identify those within the department who can develop thesauruses, ontologies and controlled vocabulary lists for new (or mapping to existing) departmental projects.
  • Enforce the use of standard terminology in departmental workflows. For example, use lookup tables with field level constraints in databases, or use XML (Extensible Markup Language) validation to ensure standard vocabularies are leveraged. This enforcement improves accuracy and machine readability for metadata and reduces the risk of ambiguous data.
  • Leverage tools such as application programming interfaces (APIs) to obtain information from authoritative sources, reducing the risk associated with manual data entry.
  • Create a plan to update the standard terminology when metadata reference standards change.
  • Share vocabularies to enable interoperability, reuse and standardization across different departmental platforms.
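
For illustration only, the enforcement practice described above can be sketched in Python as a check that rejects values outside a controlled vocabulary. The vocabulary terms, the field names and the `validate_record` function are hypothetical; a real implementation would draw its terms from an authoritative GC or departmental source.

```python
# Hypothetical controlled vocabularies keyed by metadata field name.
# The terms are illustrative only and are not an official GC vocabulary.
CONTROLLED_VOCABULARIES = {
    "language": {"eng", "fra"},
    "resource_type": {"dataset", "report", "guidance"},
}


def validate_record(record):
    """Return a list of error messages for missing or unapproved values."""
    errors = []
    for field, allowed in CONTROLLED_VOCABULARIES.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: missing value")
        elif value not in allowed:
            errors.append(f"{field}: '{value}' is not an approved term")
    return errors


# "memo" is not in the hypothetical resource_type vocabulary.
record = {"language": "eng", "resource_type": "memo"}
problems = validate_record(record)
```

A check of this kind could run at data entry or during batch validation, so that ambiguous terms are caught before they reach downstream systems.
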
Use persistent identifiers (PIDs)
  • Use a globally unique and persistent identifier to help humans and machines find content, for example, digital object identifiers (DOIs), Open Researcher and Contributor IDs (ORCIDs) or Legal Entity Identifiers (LEIs). Information assets often change system location, and managing the link between data and metadata is critical. Persistent identifiers can support this linking of data and metadata and assist in making data more accessible to different systems.
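
As a minimal sketch of why persistent identifiers help when assets change location, the Python example below keeps the identifier in the metadata record and the current location in a separate registry. The identifier, the paths and the `resolve` and `migrate` functions are hypothetical and do not represent any particular PID service.

```python
# Hypothetical registry mapping a persistent identifier to the asset's
# current storage location. The identifier and paths are made up.
pid_registry = {
    "pid:example-0001": "system-a/reports/annual-2023.pdf",
}

# The metadata record refers to the asset by its PID, never by its location.
metadata_record = {"title": "Annual report 2023", "identifier": "pid:example-0001"}


def resolve(pid):
    """Look up the current location of the asset identified by pid."""
    return pid_registry[pid]


def migrate(pid, new_location):
    """Record a move; the metadata record itself does not change."""
    pid_registry[pid] = new_location


# After a migration, the unchanged metadata record still leads to the asset.
migrate("pid:example-0001", "system-b/archive/annual-2023.pdf")
current_location = resolve(metadata_record["identifier"])
```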

Enabling content sharing, access and accessibility throughout its life cycle

Metadata can describe information and data in a way that facilitates the sharing of content. It can also be used to communicate the extent to which, when and how information and data can be accessed within and outside of the organization, including over the long term. In terms of realizing a barrier-free Canada, metadata can proactively serve to identify and remove barriers to the accessibility of GC information and data. Consider:

  • How will the information or data have to meet Open Government requirements?
  • Who needs access to view the information and data stored in the system?
  • What metadata is needed to ensure that users have long-term access and use of file formats?
  • Who needs permission to edit the content in the system? How are permissions managed, and who is responsible for managing them (for example, will permission information be viewable for users of the system)?
  • Can metadata be used to support data loss prevention? If so, how?
  • Are there any specific actions pertaining to access, use or reuse (for example, access outside the department or the GC, printing, emailing) that must be restricted to certain users or conditions or altogether prevented? If so, what are they?
  • How will actions pertaining to metadata reuse be documented?
  • In which departmental repository or registry should metadata describing the information or data in the system (including metadata products such as technical reference guides) be centrally stored?
  • What are the accessibility requirements for metadata? Are there any accessibility requirements for the information or data it describes, and how are those requirements described through metadata?
  • How will metadata indicate whether the file format being used meets accessibility requirements?
  • How will metadata support interoperability and integration of diverse datasets?
Good practices to enable content sharing, access and accessibility
  • Document usage rights, including who is allowed to view or modify the information, data or metadata and under what conditions. These are often called usage rights statements and can include requirements for use, requirements for attribution or other requirements.
  • Ensure the system displays access and usage permissions and makes it easy to update and keep a history of those permissions.
  • Use GC preferred or acceptable digital file formats to ensure interoperability across systems, accessibility and long-term access to information and data.
  • If the system contains information or data that should be open, ensure that the protocol to obtain the information is widely used and well understood (for example, HTTP). Avoid protocols that involve human intervention. In cases where automation is not possible, provide contact information (for example, a program or business owner email or phone number) to enable access to the data or information.
  • If the system contains information or data that should not be open, ensure that the access rights metadata clearly states the requirements for access.

Safeguarding and restricting access, use and reuse (copyright, licensing compliance, privacy, confidentiality, sensitivity, security)

Metadata can help define the access, use and reuse restrictions applied to information and data. Consider:

  • Where does the content originate? Are any copyrighted or licensed materials being stored?
  • What are the data sovereignty considerations for the information, data or metadata, if any?
  • How will the organization’s Access to Information and Privacy program interact with this content?
  • How secure and/or safe and/or private does this metadata need to be? How is this answer the same or different for the information or data that the metadata describes and the IT environment in which the content is being hosted?
  • What is the security categorization of this metadata?
  • How will metadata be de-identified if necessary?
  • If the data that the metadata is describing contains personal information, what measures are in place to protect the metadata in line with privacy requirements?
  • In the event of a privacy or security breach of the metadata, how will the breach be identified and contained, and how will future breaches be mitigated?
Good practices to safeguard and restrict access
  • Ensure metadata documents what the department or users can access, use and share from the information asset. When possible, share information about the licence to use the information or data, as well as the terms and conditions for doing so.
  • Assign a security category to all information and data commensurate with the degree of injury that could reasonably be expected as a result of its compromise at the time of creation or collection.
  • Wherever it is feasible to do so, automatically populate the sensitivity level of the information and data for consistency and to alleviate end-user burden.
  • Be sure that the level of sensitivity as captured through metadata is clear and unambiguous to both humans and machines.
  • Confirm with the authority responsible for creating the metadata that the security categorization of the information or data was assigned in consideration of the impact of aggregation.
  • Consult with the institution’s subject matter experts to ensure that personal information and the metadata that describes personal information is appropriately protected throughout all uses and disclosures (including sharing and interoperability) of metadata. Add the management of metadata to existing or new Privacy Impact Assessments, as required, to identify and mitigate risks to privacy of the program and its related metadata. Consider establishing a privacy protocol for the collection, creation, use or disclosure of metadata that is or is about personal information.

Enabling all users to understand what metadata represents

Departmental officials are encouraged to consider where metadata may be reused and possibly repurposed outside of the organization or certain domains, and to identify any metadata “crosswalks,” “mappings” or other tools needed to support end users (humans or machines) in interpreting, applying and appropriately reusing metadata. Consider:

  • Will defined elements (fields), content standards or externally produced vocabularies be used for metadata input into the system? If not, why, and do the benefits outweigh the costs of doing so?
  • What plans are there to develop unique vocabularies? Have existing metadata schemas been considered that may meet the needs of users? What opportunities are there to work with organizations with similar content to jointly develop a controlled vocabulary?
    • How will the organization maintain any new controlled vocabularies? What plans are there for the automated classification of content?
    • What is the plan to share and register these vocabularies?
  • How effectively can both humans and machines interpret and action the metadata?
  • What are the language requirements for metadata? Are there multilingual requirements? Does the metadata support Indigenous language requirements, specifically with respect to special characters?
Good practices to enable all users to understand what metadata represents
  • Keep the user’s perspective in mind and add detailed descriptive metadata to information and data. Provide details about context, quality and conditions of information and data.
  • Use metadata reference standards or departmental metadata reference standards (see Guidance on Prescribing Metadata Reference Standards for more information). Generally, metadata reference standards should exist that suit the needs of the information and data being described. If an existing metadata reference standard doesn’t quite meet departmental needs, it can be extended or shortened. In cases where adjustments to an existing metadata reference standard are made (that is, a profile is developed), these must be documented. Create data dictionaries that outline definitions, formats used (for example, date formats), and business rules to document changes.

Enabling data exchange and migration

Metadata can be used to facilitate the exchange of information and data among internal users, the GC or externally. Therefore, it is important that metadata can be parsed and is as interoperable and actionable as possible. Departments may need to consider whether they rely on metadata produced externally or legacy metadata that may need to be translated into current reference standards and practices for interoperability. Consider:

  • What systems will be interacting with the content? At what frequency will the systems be interacting with the content? Will it be a scheduled interaction or on an ad hoc basis?
  • What stakeholders in other jurisdictions (that is, provincial, territorial, international) need to be considered for the potential sharing of information or data? How has interoperability with these stakeholders been considered?
  • Will the metadata for content have to conform to one or more established metadata reference standards? If so, which ones?
  • Are there international scientific, security, economic or regulatory partners with whom information and data must or should be exchanged? How has interoperability with these partners been considered?
  • Who will develop and maintain any maps or crosswalks required to support data exchange?
  • By what process or protocol will metadata, or the information or data it describes, be shared?
  • What geospatial compatibilities need to be considered (for example, raster- versus vector-based)?
Good practices to enable data exchange and migration
  • Ensure departmental metadata can be easily exchanged within systems in the department’s environment or with partners. Using metadata reference standards is essential for interoperability to be successful.
  • Ensure a good data model and a well-defined framework to structure the metadata and assist in exchanges between systems. Some examples to consider include OWL Web Ontology Language Guide or RDF (Resource Description Framework) Primer.
  • Ensure that metadata is sufficiently interoperable to adapt to system changes in the future.
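
The maps or crosswalks mentioned in the questions above can be sketched, under stated assumptions, as a simple lookup from legacy field names to the element names of a target reference standard. In this Python example the legacy field names and the `apply_crosswalk` function are hypothetical; the target names (title, creator, date) are taken from the Dublin Core element set purely as an example.

```python
# Hypothetical crosswalk from legacy field names to Dublin Core element names.
CROSSWALK = {
    "doc_name": "title",
    "author_name": "creator",
    "created_on": "date",
}


def apply_crosswalk(legacy_record):
    """Translate a legacy record; return (mapped, unmapped) dictionaries."""
    mapped, unmapped = {}, {}
    for field, value in legacy_record.items():
        if field in CROSSWALK:
            mapped[CROSSWALK[field]] = value
        else:
            # Keep unmapped fields visible so nothing is silently dropped.
            unmapped[field] = value
    return mapped, unmapped


legacy = {"doc_name": "Budget summary", "author_name": "Finance unit", "room": "B-12"}
mapped, unmapped = apply_crosswalk(legacy)
```

Returning the unmapped fields separately gives whoever maintains the crosswalk a list of legacy elements that still need a decision.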

Understanding direct operational needs for the business system or process owner

Good metadata design is informed by, and supports, business needs. Departmental officials are advised to ensure that use cases for metadata design are clear and well understood. Consider:

  • What are the use cases for the information and/or data stored in the system?
  • Who owns or will own the information and/or data (to be) managed within the system? Who is or will be responsible or accountable for this information and data? Is someone or will someone be the steward of this information or data on behalf of another group (for example, such that there may be implications with regard to data sovereignty), and who will be responsible for that? How is this documented and where?
  • Who are the current users of the system? Which users create metadata for the information and/or data managed within the system?
  • Have all potential users of the system been identified? What are their roles, points of access and any security limitations? Which potential users of the system have not yet been identified?
    • Who will be entering information and/or data into the system? What user guides or training materials are or will be provided to define the various fields in the system and instructions for completion?
  • What metadata will the users of the system need to enter versus what will automatically be entered? What opportunities are available to reduce the burden on users to manually enter metadata?
  • How will metadata be updated when information or data assets change?
  • What metadata describing the system is needed (that is, to support its management and maintenance)? Who is responsible for creating it, and where is it stored and accessed?
  • What metadata is needed in order to comply with Appendix J: Standard on Systems that Manage Information and Data of the Directive on Service and Digital?
  • Which, if any, types of metadata should be prioritized over others to support operational needs?
Good practices to understand direct operational needs for the business system or process owner
  • Where required and appropriate, ensure that information and data are updated in a timely manner that meets operational needs, and update the metadata accordingly.
  • Treat metadata in systems with the same importance as content. Consider metadata a strategic output that needs to be maintained.
  • Consider the costs associated with maintaining and evolving metadata as the system evolves.
  • Identify key intersections between IT users and workpoints and metadata management activities by cross-referencing to the IT user profiles identified in section D.1.2 of Appendix D: Standard on Information Technology User and Workpoint Profiles of the Directive on Service and Digital.
  • Identify opportunities for metadata management activities to intersect with the planning, implementation, monitoring, control and closure of government projects and programs by cross-referencing to the Directive on the Management of Projects and Programmes.

Meeting legal, audit, reporting and information and data management requirements

Metadata is one means by which departments document their information and data management practices to respond to legal, audit, reporting and other requirements. Consider:

  • What legal, audit, reporting and records management requirements are there? For example, what metadata is needed to support retention specifications?
  • Does information need to be displayed about legislative requirements for the information? How will users need to be able to search this content?
    • Evidence of disposition and preservation actions?
    • Evidence supporting digital signatures?
    • Date stamps?
    • Evidence of transactions or task execution?
    • Validation of identity?
    • Fixity or other integrity checks?
  • Will metadata enable compliance with “Annex B: Metadata” of Electronic Records as Documentary Evidence (CAN/CGSB-72.34:2017e)? Why or why not?
  • Who will be required to make changes in the system? What responsibilities will information owners or data stewards have in order to make changes to metadata?
  • What are the retention and disposition schedules for the metadata that describes the department’s information and data assets? How do these differ from the retention and disposition schedules for the corresponding information and data? How does the disposition of metadata need to be documented?
Good practices to meet legal, audit, and information and data management requirements
  • Discuss requirements with the departmental audit and evaluation, legal services, and information management teams.
  • Consider whom to contact if there are questions about data or metadata.
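
One of the evidence types listed in the questions above is fixity. As a minimal sketch, the Python example below records a SHA-256 checksum in an asset's metadata at ingest and recomputes it later to detect change. The metadata field names and the `fixity_ok` function are hypothetical; `hashlib.sha256` is part of the Python standard library.

```python
import hashlib


def sha256_of(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


# At ingest: record the checksum and algorithm in the asset's metadata.
original = b"Record of decision, version 1"
metadata = {"checksum_algorithm": "SHA-256", "checksum": sha256_of(original)}


def fixity_ok(content: bytes, metadata: dict) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    return sha256_of(content) == metadata["checksum"]


# Later: an unchanged asset passes the check and an altered one fails.
unchanged = fixity_ok(original, metadata)
altered = fixity_ok(b"Record of decision, version 2", metadata)
```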

Analytical or decision-making needs across the department, or external entities (if necessary)

Aside from the main business needs for metadata, other departmental business units and external users may leverage a department’s information and data. Therefore, it is important to engage stakeholders in assessing and planning for metadata. Consider:

  • How have key stakeholders been consulted to identify needs as part of this assessment process?
  • What other organizational units within the department or external organizations need to leverage information or data from the system, and by what process will the information or data be shared?

Metadata quality assurance (demonstrating quality dimensions and controls)

Metadata quality may be critical to success, particularly where it is relied upon to indicate dimensions of information and data quality. Departmental officials must be assured that the metadata involved in and supporting GC policy- and decision-making processes is fit to serve this and any other purpose in delivering services to Canadians. Consider:

  • How will metadata be checked for adherence to GC data quality guidance and how frequently? What metadata quality controls are or will be in place? Who will be responsible for completing these quality checks?
  • Is or will metadata be checked for adherence to other quality indicators, principles or reference standards? If so, which ones and how frequently?
  • What plans or obligations are there to document and share information about metadata quality?
  • What authoritative sources are there from which metadata must be drawn to describe information or data?
Good practices to enable metadata quality assurance
  • Ensure metadata is complete, consistent, credible and as accurate and comprehensive as possible. Semantic and structural values and elements should be represented consistently across the different records or aggregates in the system. Careful planning is needed, not only for metadata design, but also to support users in entering values that conform to the metadata schema.
  • Record information about quality control and make it available to users. Describe the quality control methods applied to information and data assets and any assumptions in the metadata. Be sure to note what software was used to perform quality analysis, and be sure to make code available to users for transparency. Document who conducted the quality control, when it was done, and what changes were made.
  • Schedule regular quality control checks, and create a plan to resolve gaps and errors.
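
A scheduled quality control check of the kind described above could, for example, measure completeness. The Python sketch below reports the share of records that carry a non-blank value for each required element. The list of required elements, the sample records and the function names are hypothetical; a department would substitute the elements prescribed by its own reference standards.

```python
# Hypothetical required elements; a department would define its own.
REQUIRED_ELEMENTS = ("title", "creator", "date", "security_category")


def has_value(record, element):
    """True when the element is present and not blank."""
    value = record.get(element)
    return value is not None and str(value).strip() != ""


def completeness(records):
    """Return, per required element, the share of records with a value."""
    total = len(records)
    report = {}
    for element in REQUIRED_ELEMENTS:
        filled = sum(1 for record in records if has_value(record, element))
        report[element] = filled / total if total else 0.0
    return report


# Illustrative sample: the second record has a blank creator and no category.
records = [
    {"title": "A", "creator": "Unit 1", "date": "2024-01-10",
     "security_category": "Unclassified"},
    {"title": "B", "creator": "", "date": "2024-02-01"},
]
report = completeness(records)
```

Storing each report alongside the date and the person who ran it would also address the practice of documenting who conducted the quality control and when.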

Understanding when and how metadata is altered (manipulated or transformed)

It is recommended that departmental officials consider the conditions that warrant metadata to be edited and altered in any way, including by means of manipulation, transformation or deletion. Given the evidentiary role that metadata can play in certain contexts, it is important for departments to be able to account for where, when and how metadata evolves or changes over time. Consider:

  • What information is there about the source of the metadata? How will the provenance of the metadata be tracked? What specific reference standard(s) do business owners need to follow when documenting the provenance of the metadata? How can metadata transformations be tracked back to the original form of the metadata record?
  • How will the history of when metadata has been created or modified within the system be documented?
  • Which users (will) have permission to add or alter (including deleting) metadata within the system?
  • How must the chain of custody of the information or data asset be documented?
  • What metadata can and cannot be automatically updated, given potential evidentiary considerations? When the resource described by the metadata changes, does the metadata update accordingly? When additional metadata becomes available or when metadata reference standards change, does the metadata associated with the resource change and how?
  • What metadata alterations, manipulations or transformations need to be documented, and how, to meet applicable digital evidence standards?
  • When metadata is altered, how is the security categorization assessed to ensure it is still appropriate for the sensitivity of the metadata? If the sensitivity level of the metadata needs to be downgraded or upgraded, how is the relevant, responsible authority consulted before modifying the security categorization of the metadata?
Good practices to understand when and how metadata is altered
  • Capture information about where the metadata originated, how it was collected or generated, by whom, and when. Such details would be included as part of a basic metadata schema.
  • Make note of any technologies (instruments, code, software and so on) that were used to alter, manipulate or transform the metadata.
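
To illustrate how the history of metadata changes might be documented, the Python sketch below appends an entry to a change log each time a field is altered, capturing the previous value, the new value, who made the change, the tool used and the time. The log structure, the field names and the `update_metadata` function are hypothetical; applicable digital evidence standards may require additional details.

```python
from datetime import datetime, timezone

# Append-only history of metadata changes (a list is used for illustration).
change_log = []


def update_metadata(record, field, new_value, changed_by, tool):
    """Change one field and log the old value, new value, actor, tool and time."""
    change_log.append({
        "field": field,
        "old_value": record.get(field),
        "new_value": new_value,
        "changed_by": changed_by,
        "tool": tool,
        "changed_at": datetime.now(timezone.utc).isoformat(),
    })
    record[field] = new_value


record = {"title": "Draft guidance"}
update_metadata(record, "title", "Guidance on metadata", "j.doe", "manual edit")
```

Because every entry keeps the previous value, a transformed record can be traced back to its original form by reading the log in reverse.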

Additional considerations

  • How does the file format of the information or data support metadata requirements? If it does not support them now, what opportunities exist to address these metadata requirements in the future?
  • How can the outcomes of a gender-based analysis plus (GBA+) be incorporated into metadata to improve the quality of information and data and advance employment equity, diversity and inclusion priorities?
  • How will metadata be stored? How could it be structured to optimize environmental and energy use impact? How does the corresponding metadata management plan respond to Greening Government Strategy: A Government of Canada Directive?

© His Majesty the King in Right of Canada, represented by the President of the Treasury Board, 2024,
ISBN: 978-0-660-69826-7