What to Consider When Calibrating Evaluations

Notice to the reader

The Policy on Results came into effect on July 1, 2016, and it replaced the Policy on Evaluation and its instruments.

Since 2016, the Centre of Excellence for Evaluation has been replaced by the Results Division.

For more information on evaluations and related topics, please visit the Evaluation section of the Treasury Board of Canada Secretariat website.

Acknowledgements

The Centre of Excellence for Evaluation would like to thank the departments and agencies that assisted in the development of this document, including all the federal evaluators who participated in the Working Group on Calibrating Evaluations.

Catalogue No. BT53-20/2011E-PDF
ISBN 978-1-100-18884-3

© Her Majesty the Queen in Right of Canada,
represented by the President of the Treasury Board, 2015

1.0 Introduction

Calibration is the process of adjusting an item (e.g., a tool or instrument) to the sensitivity required to suit a particular function. In this document, calibration refers to the process of adjusting how evaluations are conducted, based on a number of different factors, in ways that produce quality evaluations cost-effectively. Depending on the particular evaluation, calibration can involve adjustments that increase or decrease the required level of effort, scope or depth of analysis.

This document is intended to assist departments and agencies in planning and implementing calibrated evaluations. Calibrated evaluations allow for the effective use of evaluation resources while maintaining the credibility and usability of evaluation results. The document is designed to be useful for evaluation managers, evaluation directors and other evaluators; its primary audience is those involved in evaluation in federal departments and agencies.

2.0 Flexibility Within Policy Requirements

Successful expenditure management requires departments to generate and use comprehensive, credible and current information on government programs. Deputy heads are responsible for ensuring that this information is available to support ministers and Cabinet in their decision-making processes. The Directive on the Evaluation Function (2009) identifies five core issues to be addressed in evaluations to provide evidence about the relevance and performance of government programs, as required under section 5.2 of the Policy on Evaluation. All evaluations that are intended to count toward the coverage requirement of section 6.1.8 of the Policy on Evaluation must address all these issues.

Addressing Core Evaluation Issues

To ensure that evaluations provide evidence about the value for money of programs, evaluations must address a set of core evaluation issues, as outlined in Annex A of the Directive on the Evaluation Function:

Continued need for program:
Assessment of the extent to which the program continues to address a demonstrable need and is responsive to the needs of Canadians.
Alignment with government priorities:
Assessment of the linkages between program objectives and (i) federal government priorities and (ii) departmental strategic outcomes.
Alignment with federal roles and responsibilities:
Assessment of the roles and responsibilities of the federal government in delivering the program.
Achievement of expected outcomes:
Assessment of progress toward expected outcomes (including immediate, intermediate and ultimate outcomes) with reference to performance targets, program reach and program design, including the linkage and contribution of outputs to outcomes.
Demonstration of efficiency and economy:
Assessment of resource utilization in relation to the production of outputs and progress toward expected outcomes.

Within these parameters, Annex A of the Directive on the Evaluation Function (2009) gives departments the flexibility to “determine the evaluation approach and level of evaluation effort in accordance with the program’s risk and characteristics, and the quality of performance information available for each individual program.” This allows heads of evaluation the opportunity to use evaluation resources cost-effectively.

When departments choose to calibrate their evaluations, it is important that the credibility and utility of each evaluation are maintained. As such, departments must ensure that:

  • Calibration does not affect the ability of the evaluation to fulfill its objectives and meet legislative and policy requirements (e.g., the Financial Administration Act, the Policy on Evaluation, the Directive on the Evaluation Function and the Standard on Evaluation for the Government of Canada);
  • A clear and reasonable rationale is provided for the calibration made in a particular evaluation, particularly where the resources have been minimized in a particular area of the evaluation; and
  • Any consequences arising from calibrations are identified in the evaluation report in a specific section on limitations.

3.0 Preparing for Calibration: Assessing Risk

The Directive on the Evaluation Function states that heads of evaluation are responsible for “identifying and recommending to the deputy head and the Departmental Evaluation Committee a risk-based approach for determining the evaluation approach and level of effort to be applied to individual evaluations comprised in the five-year departmental evaluation plan, and the appropriate level of resources required to conduct individual evaluations included in the plan.”

There are a number of overall risk factors that departments may take into account when calibrating an evaluation. Factors that may be considered in the risk assessment include, but are not limited to, the following:

  • The potential risks related to the health and safety of the public or the environment, including both the degree and magnitude of negative consequences associated with program failure, as well as the probability of the risk materializing;
  • User information needs, including, for example, whether the evaluation is being done to inform program renewal or whether there are significant senior management information needs;
  • The materiality of the program, including, for example, the percentage of the department’s Main Estimates that the program represents;
  • Public/stakeholder interest and/or political sensitivity, both current and potential, including media, parliamentary or ministerial interests;
  • The size of the population affected or targeted by the program;
  • Known problems, challenges or weaknesses in the program, ideally based on a previous evaluative assessment or arising from performance monitoring;
  • Program complexity, including program design, delivery and expected results, as well as the complexity of measuring performance. For example, decentralized program delivery from many regions or locations can yield varying degrees of success in different locations, while programs involving a large number of partners can add to the complexity of delivery—both of which can require more evaluation resources to examine; and
  • The quality and extent of performance data and evaluation or review information available on the program.

During the initial conception of an evaluation, existing information on the likelihood and potential impact of program risks should be reviewed. This can include information in an organization’s risk assessment profile, in the Performance Measurement Strategy, in the Departmental Evaluation Plan, or in other sources. If information is not available or sufficient, a risk assessment should be built into the planning of the evaluation.
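
Purely for illustration, and not as a method prescribed by the Policy on Evaluation or its instruments, the sketch below shows one way the risk factors listed above could be rated for likelihood and impact and rolled up into an overall score to inform the level of evaluation effort. The factor names, ratings and thresholds are hypothetical assumptions.

```python
# Hypothetical sketch only: rolling up likelihood-and-impact ratings of the risk
# factors listed above into an overall score that suggests a level of evaluation
# effort. Factor names, ratings and thresholds are illustrative assumptions, not
# values prescribed by the Policy on Evaluation or its instruments.

# Ratings on a 1 (low) to 5 (high) scale for each factor: (likelihood, impact)
factor_ratings = {
    "health_safety_environment": (2, 5),
    "user_information_needs": (4, 3),
    "materiality": (3, 4),
    "public_stakeholder_interest": (2, 3),
    "population_affected": (3, 3),
    "known_problems": (1, 4),
    "program_complexity": (4, 4),
    "performance_data_quality": (3, 2),  # weak data quality raises evaluation risk
}

# Average likelihood x impact across factors (each product ranges from 1 to 25).
scores = [likelihood * impact for likelihood, impact in factor_ratings.values()]
overall = sum(scores) / len(scores)

# Illustrative thresholds for suggesting a level of evaluation effort.
if overall >= 15:
    level = "higher effort (broader scope, more lines of evidence)"
elif overall >= 8:
    level = "moderate effort"
else:
    level = "lower effort (narrower scope, greater reliance on existing data)"

print(f"Overall risk score: {overall:.1f} -> suggested calibration: {level}")
# For the ratings above: "Overall risk score: 9.4 -> suggested calibration: moderate effort"
```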

For more information on an evaluation risk assessment, please see the Guide to Developing a Departmental Evaluation Plan, Appendix E.

4.0 Calibrating an Evaluation

Once the level of risk has been established for an evaluation, evaluators can determine how best to calibrate their evaluation. Calibration can be undertaken through adjustments to the following elements of an evaluation:

  • Evaluation scope;
  • Evaluation approach and design;
  • Data collection methods;
  • Reporting; and/or
  • Project governance and management.

These elements are described in the following sections, along with examples of factors to consider in addressing each of these elements and different ways of calibrating evaluations. Appendix A presents this information in the form of a checklist.

4.1 Evaluation Scope

The scope of an evaluation comprises the parameters of what is being assessed. This can include 1) the parameters of the programming being examined, and 2) the questions being addressed in the evaluation.

When calibrating an evaluation based on the scope, the following may be considered:

Information needs / interests of management:
How will the information from the evaluation be used (e.g., to inform program renewal, to address specific senior management questions, to meet Financial Administration Act and/or Policy on Evaluation requirements)? Are there any core issues that are not as pertinent and can justifiably be addressed with comparatively few resources?
  • When trying to limit the level of resources for an evaluation, it may be useful to assess the relative importance of evaluation issues and questions to the evaluation users. Limiting the number of evaluation questions can reduce the amount of resources required for an evaluation, or provide evaluators with the opportunity to explore a smaller number of questions in greater depth.
  • Evaluators may choose to reduce the level of effort expended on addressing certain core issues, based on a number of factors (e.g., reducing the effort expended on core issue 1, “Continued need for program,” if there have been no or few changes to the programming context).
Issues raised in previous evaluations:

Have previous evaluations raised issues that require further examination? Did past evaluations include recommendations, and did management adequately respond to these recommendations (i.e., have the action items been properly implemented)? What impacts have been observed based on the response to recommendations, if any (i.e., did the recommendations have the desired effects)?

As argued by Pawson and Tilley, evaluations benefit from building on the knowledge generated from previous evaluations and other research in a process of cumulation.Footnote 1

  • Evaluations of programs for which significant issues and/or recommendations were identified in past evaluations may choose to expend additional effort to examine whether the identified issues persist and/or recommended actions have addressed these issues.
Elements/components of the program to be included in the evaluation:
Are there different levels of risk associated with different elements/components of the initiative being evaluated? Have there been any changes to the Program Alignment Architecture that affect the scope of what is being evaluated?
  • Evaluators may decide to focus greater attention on assessing the results achieved from components of the program with the most funding.
Elements of the results chain (logic model) to evaluate:
Does the program have a strong, well-developed logic model and program theory? What elements of the results chain or logic model are appropriate to evaluate?
  • Employing theory-based evaluations, such as contribution analysis, can help in calibration by focusing questions and data collection on specific mechanisms of change.
  • It may be sensible to focus more resources on examining outputs and immediate outcomes for evaluations of newer programs, whereas evaluations of more mature programs may focus more attention on assessing the achievement of longer-term outcomes.
Time frame to be evaluated:
What period of the program’s life span is being examined? Does the life span need to include any years that precede the current funding cycle?
  • Examining outcomes over longer time frames can require more evaluation resources, but can be important in assessing overall program impacts. Continually limiting the time frame to the most recent funding cycle can result in multiple evaluations focused solely on short- and medium-term outcomes.
Longevity of the program and contextual stability:
Has the program been in existence for a long time? Does it have a recent track record of demonstrated achievement of relevant outcomes? Has the program experienced any significant shifts in contextFootnote 2 that may affect performance and relevance?
  • A program that has already been evaluated may require less effort in certain areas of its evaluation because, for example, baselines have been established and performance measurement has been tested and adjusted.
  • Programs that have clearly demonstrated good performance over long periods of time may justify fewer resources to evaluate their performance.
  • For programs in which the context has not significantly changed, evaluations may choose to establish whether the previous analysis of relevance still stands or whether there are elements of program relevance that should be assessed.
  • Programs that have experienced changes in the context may require more resources to explore those changes and their potential impacts on the program.

4.2 Evaluation Approach and Design

The evaluation approach is the high-level conceptual model used in undertaking the evaluation. There is a wide range of evaluation approaches (e.g., theory-based evaluation, participatory evaluation, contribution analysis, utilization-focused evaluation) to choose from, and each should be considered in light of the needs and circumstances of the program being evaluated. Different evaluation approaches require different levels of effort and resources—within a context of cost containment, the availability of resources (including time, people with the required experience and skills, and financial resources) may be a primary factor determining the evaluation approach.

Determining the evaluation approach will also require considering the most appropriate research design, including whether the evaluation will use an experimental, quasi-experimental or non-experimental design.

When calibrating an evaluation based on the evaluation approach or design, the following may be considered:

Suitability of an evaluation approach to the scope and the characteristics of the program being evaluated:
Do the evaluation scope and the characteristics of the program being examined call for a specific evaluation approach/design?
  • It is often advantageous for evaluators to work with programs at the program design stage, or at least well in advance of the evaluation, to identify an appropriate evaluation approach or design. This can facilitate setting up appropriate performance measurement and data collection (such as longitudinal data, for example) in a cost-effective manner.
  • There is a range of effort-reducing approaches/designs that may be adopted depending on the particular characteristics of the program.
    • Where the nature of the program and its risks warrant it, the target group can itself be the comparison group when using “pre/post” designs.
    • Where secondary data on similar programs or groups are available for comparison purposes, the need for control groups and/or baselines may be reduced or eliminated.Footnote 3
    • In programs with relatively uniform/consistent activities and outputs, the evaluation may be able to rely on a small number of representative case studies.Footnote 4
Appropriateness of the combination of data sources in an evaluation:
Does the evaluation have an appropriate combination of primary, secondary, quantitative and qualitative data to address the evaluation scope and questions/issues? What is the ideal balance between subjective data (i.e., personal feelings, attitudes and perceptions) and objective data (i.e., facts)? Is the number of data sources justified?
  • As stated in the Standard on Evaluation for the Government of Canada, evaluators should ensure that “multiple lines of evidence, including evidence produced from both quantitative and qualitative analysis, are used to ensure reliable and sufficient evidence” (section 6.2.2.c).
Organizational/program preferences/context:
Are there benefits to the organization/program that can be derived from a particular evaluation approach? What are the preferences of the organization regarding the depth and level of information?
  • Evaluation planning should examine whether the benefits associated with any particular organizational preferences in the evaluation approach would justify potentially higher costs.

4.3 Data Collection Methods

Data collection methods are the actual techniques used to gather data (e.g., surveys, interviews).

When calibrating an evaluation based on data collection methods, the following may be considered:

Existing information:
To what extent can each evaluation question be answered by using existing data? To what extent is the performance measurement strategy being effectively implemented, and are the data being collected reliable, credible and relevant? What new data sources are required to answer the evaluation questions and/or to supplement existing data?
  • Evaluators should examine and make full use of existing data before planning additional data collection. If existing data are fully used, new data collection can often just involve filling in gaps in information.
  • Information from previous evaluations may be useful, particularly if the program context has not changed.
  • Information from audits or other research can help to supplement or inform evaluation findings.
Saturation point:
To what extent can the amount of data being collected be limited while still addressing the evaluation questions? Does a specific data source tend to produce the same results over and over?
  • Throughout the data collection phase, evaluators should conduct an analysis of what they have gathered so they can judge whether they need to collect additional data in order to arrive at valid and credible conclusions.
  • A saturation point can be reached for secondary data, as well as for primary data. For example, in conducting document and literature reviews, it is sometimes not necessary to review additional documents once substantive evidence on an issue has been collected through a select number of key documents.
  • Evaluators may consider reducing the overall size of samples when stakeholders (including even those with competing interests) appear to be in general agreement about an issue.

The saturation point can, however, be difficult to determine, and evaluators should not be too quick to terminate data collection, particularly for higher-risk or complex programs.
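
As a hypothetical illustration of how saturation might be monitored, the sketch below tracks how many new themes each successive interview adds and applies a simple stopping rule; the themes, data and stopping rule are assumptions for illustration, not a prescribed practice.

```python
# Illustrative sketch: monitoring thematic saturation during qualitative data
# collection. The themes are hypothetical; in practice they would come from
# coding interview transcripts.
interview_themes = [
    {"access", "awareness", "cost"},     # interview 1
    {"cost", "timeliness"},              # interview 2
    {"access", "cost", "partnerships"},  # interview 3
    {"cost", "awareness"},               # interview 4
    {"timeliness", "access"},            # interview 5
]

seen = set()
new_per_interview = []
for themes in interview_themes:
    new_themes = themes - seen          # themes not heard in earlier interviews
    new_per_interview.append(len(new_themes))
    seen |= themes

# One possible (assumed) stopping rule: no new themes in the last two interviews.
saturated = sum(new_per_interview[-2:]) == 0
print(new_per_interview, "saturated:", saturated)  # [3, 1, 1, 0, 0] saturated: True
```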

Sampling strategies:

Is the sampling strategy (e.g., for surveys and/or interviews) appropriate?

The sampling strategy employed in an evaluation can have an impact on the level of effort required. For surveys, sample sizes are most effectively determined based on the desired effect size and statistical power.Footnote 5 A minimal illustrative sketch of such a power calculation follows the list below.

There are a number of different ways in which the level of effort can be minimized through sampling strategies:

  • Strategic or purposeful sampling: Sampling only the most relevant subjects based on the need for specific information and/or their ability to comment authoritatively on a specific topic.
  • Cluster sampling: Concentrating on inherent groupings (e.g., sampling people who live in a proximate geographic area) when selecting samples can reduce the costs associated with travel.
  • Reducing the level of data disaggregation can limit sample sizes and reduce the effort required to process and interpret data.Footnote 6
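
For illustration only, the sketch below estimates a required survey sample size from a desired effect size and statistical power, assuming a simple two-group comparison and the Python statsmodels package; the specific effect size, significance level and power are hypothetical planning values, not prescribed figures.

```python
# Minimal power-analysis sketch for planning survey sample size, assuming a
# two-group comparison of means and the statsmodels package.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Hypothetical planning values: a "medium" standardized effect size (Cohen's d),
# a 5% significance level and 80% statistical power.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)

print(f"Required respondents per group: about {n_per_group:.0f}")  # about 64
```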
Options in conducting interviews or surveys:
There are a number of interview or survey approaches available to reduce the effort and resources required to complete an evaluation:
  • Shorter data collection instruments (e.g., survey questionnaires, interview guides) can reduce the time required for data collection.
  • Group interviews: In certain cases (e.g., when interviewees can be reasonably expected to provide their honest opinions in a group setting without fear of negative consequences), group interviews with a few (i.e., two or three) interviewees may be a cost-effective and efficient alternative to individual interviews.
  • Focus groups: In situations where data collection can be focused on a limited number of questions, focus groups may be a cost-effective and efficient method of collecting data.
  • Telephone interviews: Telephone interviews can reduce the cost and effort associated with travel for face-to-face interviews.
  • Self-administered questionnaires: Though not appropriate for all evaluations, self-administered questionnaires can save time and effort in collecting data.
Technology:
Is there an opportunity to use technology to increase efficiency in data collection and/or analysis?
  • Where appropriate, technology-based data collection (e.g., online surveys, online video interfaces for conducting interviews) can greatly reduce the time, effort and cost associated with data collection and collation.
  • Quantitative and qualitative data management and analysis software can also reduce the time and effort required to process and analyze data.Footnote 7
  • Use of video or photographic evidence can be a cost-effective means of collecting and/or presenting evidence.
Expert consultations:

Is there an opportunity to reduce the level of effort required for data collection through consulting with experts?

Expert consultations can include informal communication, formal interviews, or engaging experts as members of the evaluation team or advisory committee.

  • For example, consulting with experts during evaluation planning can help to identify essential documents for review, which can entail less effort than reviewing a large, exhaustive body of information.
Project reports as a means of data collection:
Can outcome data be collected through project reports?
  • Where evaluation planning is undertaken early, such as at program implementation, some outcome information may be collected from program clients as part of project reporting. However, compiling information from individual project reports can sometimes require significant effort and time.
  • Internal program reporting may, however, be subject to issues of bias or reduced reliability.

4.4 Reporting

As outlined in the Standard on Evaluation for the Government of Canada, evaluation reports must present the findings, conclusions and recommendations in a clear and objective manner. Evaluation reports should be written so that senior managers and external readers can readily focus on and understand the important issues being reported.

When calibrating an evaluation based on reporting requirements, the following may be considered:

Appropriateness of the level of detail in the evaluation report:
Does the report provide the appropriate level of detail to adequately inform the intended audiences (central agencies, deputy heads, parliamentarians, Canadians) of the methodology, content, limitations and what was learned from the evaluation? Is the evidence supporting the findings summarized succinctly?
  • In some cases, it may be possible to simplify or shorten the evaluation report by presenting only the key findings that answer the evaluation questions or issues outlined in the evaluation matrix/framework, and/or by adding references to other monitoring or methodology reports.
Timing to begin drafting of the report:
When is the appropriate time to begin drafting the evaluation report?
  • The development of a skeleton report early in the evaluation may be beneficial. A skeleton report is the evaluation report minus the content (i.e., showing headings/sections and brief descriptions of what information goes where).Footnote 8 This annotated report can help to ensure that resources are not wasted collecting unneeded data, that enough evidence is provided to answer the evaluation questions, and that every piece of evidence is relevant to answering the evaluation questions.
  • Some sections of the report can be prepared before analysis is completed, such as the introduction, methodology, program profile, etc.

It is important to ensure that evaluation reporting is consistent with section 6.4 of the Standard on Evaluation for the Government of Canada. Furthermore, a clear and reasonable rationale for calibration, and any consequences arising from the calibration, should be discussed in the evaluation report.

4.5 Project Governance and Management

Project management is the application of knowledge, skills and techniques to execute projects effectively and efficiently.Footnote 9 The primary challenge of evaluation management is to achieve all of the evaluation’s goals and objectives while honouring the typical constraints of time and resources.

When calibrating an evaluation based on project management, the following may be considered:

Complexity of project management:
How broad an understanding of the program environment is needed? What is the appropriate level of stakeholder/program involvement in the evaluation?
  • Extensive evaluation planning, including broad stakeholder involvement, can be resource-intensive and may not be as important for programs that have recently been evaluated.
Human resources:
Are human resources allocations being optimized for the evaluation project? Has the assignment of tasks taken into consideration evaluation team members’ strengths and weaknesses? Are human resources reallocations required at different points in time over the course of the evaluation?
  • More experienced evaluators may be better positioned to complete evaluations where time is limited or where there are less comprehensive data on which to base conclusions.
  • It may be appropriate to involve more junior evaluators more extensively in lower-risk evaluations, which can be cost-effective.
  • Some evaluations will require external subject matter expertise due to the complexity of the subject matter, which typically entails additional time and effort for contracting. This can increase the level of resources required for the evaluation.
  • In some evaluations it may be possible to employ non-experts to collect data, which can, in some cases, be cost-effective. However, this can sometimes require extensive training, oversight and data validation, which can result in minimal or no cost savings.
  • In some cases, it may be appropriate to have external resources undertake data collection and have internal evaluators focus on analysis and reporting.
Project governance:
How is the evaluation being managed? Is it a horizontal evaluation being led by the department or by another organization? Is there an advisory committee or other body involved in reviewing and/or approving deliverables? If so, at which points must these bodies be involved, and for what specific tasks?
  • Horizontal evaluations can often require a significant amount of effort to manage, depending on the number of organizations involved. Managers should factor into their planning the often considerable time and effort required for consultations, feedback and approvals in horizontal evaluations.
Project deliverables:
Is the number of project deliverables appropriate? Can the evaluators minimize the activities needed to complete these deliverables?
  • Calibrate the number of deliverables to the program risk. In some cases, individual reports for each line of evidence, which can be resource-intensive, may not be necessary.
Project monitoring:
What level of project monitoring is required during the evaluation?
  • Limiting what is monitored (time or level of effort) and how often these variables are monitored can reduce the level of effort expended in project monitoring. For example, producing frequent progress reports can be resource-intensive.
Project communication:
How is information about the evaluation being communicated to key stakeholders?
  • It can be important to inform stakeholders (e.g., senior managers) of situations where evaluations have limited the level of effort in some areas, and the reasons for this, so that they are not surprised to see this reflected in the evaluation report.

5.0 Additional Information

Efficiencies in the Evaluation Function

Efficiencies can be gained through practices employed at the level of the evaluation function. For example:

  • Support from evaluators to programs in developing and testing performance measurement strategies can reduce the effort required during evaluations to collect data.
  • Developing generic evaluation tool templates that can be adjusted for individual evaluations can save time and effort. These can include templates for evaluation plans, surveys and interview guides, reports, and other documents.

For more information on evaluations and related topics, please visit the Secretariat’s Centre of Excellence for Evaluation (CEE) and Management, Resources and Results Structure websites.

You may also contact the CEE at:

The Centre of Excellence for Evaluation
Expenditure Management Division
Treasury Board of Canada Secretariat

Email: evaluation@tbs-sct.gc.ca

Appendix A: Calibration Checklist

Evaluation Scope
The parameters of the evaluation
Evaluation Element | Examples of Factors to Consider | Examples of Calibration
Information needs/interests of management

How will the information from the evaluation be used?

Are there any core issues that are not as pertinent and that can justifiably be addressed with comparatively few resources?

  • Reduce the number of evaluation questions to focus the evaluation.
  • Adjust the level of effort expended on addressing specific core issues, based on a number of factors.
Issues raised in previous evaluations

Have previous evaluations raised issues that require further examination?

Did past evaluations include recommendations, and did management adequately respond to these recommendations?

What impacts have been observed based on the response to recommendations, if any?

  • Evaluations of programs for which significant issues and/or recommendations were identified in past evaluations may choose to expend the additional effort required to examine whether the identified issues persist and/or recommended actions have addressed these issues.
Elements/components of the program to be included in the evaluation

Are there different levels of risk associated with different elements/components of the initiative being evaluated?

Have there been any changes to the Program Alignment Architecture that affect the scope of what is being evaluated?

  • Target evaluation effort to elements/components of highest materiality or risk.
Parts of the logic model to be evaluated

Which elements of the results chain are appropriate to evaluate?

  • Employ theory-based evaluations to focus evaluation questions and data collection on specific mechanisms of change.
  • Focus on outputs and immediate outcomes for newer programs; evaluations of more mature programs may focus more attention on longer-term outcomes.
Time frame to be evaluated

What period of the program’s life span is being examined? Does the evaluation need to include any years that precede the current funding cycle?

  • Examining outcomes over longer time frames can require more evaluation resources, but can be important in assessing overall program impacts.
Longevity of program and contextual stability

Has the program been in existence for a long time? Does it have a recent track record of demonstrated achievement of relevant outcomes?

Has the program experienced any significant shifts in context that may affect performance and/or impact the program’s relevance?

  • It may require fewer resources to evaluate a program that has already been evaluated as, for example, baselines have been established and performance measurement has been tested and adjusted.
  • Programs that have clearly demonstrated good performance over long periods of time may justify fewer resources to evaluate their performance.
  • For programs in which the context has not significantly changed, establish whether the previous analysis of relevance still stands or whether there are elements of program relevance that should be assessed.
  • Evaluations of programs that have faced contextual changes may warrant more resources to explore the impacts of changes.
Evaluation approach and design
The conceptual model for undertaking the evaluation
Evaluation Element | Examples of Factors to Consider | Examples of Calibration
Suitability of an evaluation approach to the scope and the characteristics of the program being evaluated

Do the scope and the program characteristics call for a specific evaluation approach?

Are there benefits to the organization that can be derived from a particular evaluation approach, and do these benefits justify a potentially higher level of resources for the evaluation?
  • Working with programs at the program design stage and/or well in advance of the evaluation to determine the appropriate approach and design can be cost-effective.
  • In some cases, the target group can also be the comparison group when using “before and after” designs.
  • Secondary data on similar programs or groups can reduce/eliminate the need for control groups and/or baselines.
  • Evaluations of programs of more uniform/consistent activities and outputs may be able to rely on a small number of representative case studies.
Combination of data sources

Does the evaluation have an appropriate combination of primary and secondary, quantitative and qualitative data?

What is the ideal balance between objective and subjective data?

Is the number of data sources justified?

  • As stated in the Standard on Evaluation for the Government of Canada, evaluators should ensure that “multiple lines of evidence, including evidence produced from both quantitative and qualitative analysis, are used to ensure reliable and sufficient evidence” (section 6.2.2.c).
Organizational preferences/context

Would the organization benefit from a particular evaluation approach?

What are the preferences of the organization regarding the depth and level of information?
  • Evaluation planning should examine whether the identified organizational benefits and preferences justify any additional resources required for the evaluation.
Data collection methods
Techniques used to gather data
Evaluation Element | Examples of Factors to Consider | Examples of Calibration
Existing information

To what extent can each evaluation question be answered by using existing data?

Are reliable, credible and relevant performance measurement data being collected?

What new data sources are required to answer the evaluation questions and/or to supplement existing data?

  • Examine and make full use of existing data before planning the collection of additional information.
  • The existence of information in program documents/databases and other federal government documents can reduce the level of effort required for primary data collection during evaluation.
  • Information from previous evaluations may be useful, particularly if the program context has not changed.
  • Information from audits or other research can help to supplement or inform evaluation findings.
Saturation point

To what extent can the amount of data being collected be limited, while still addressing the evaluation questions?

Does a specific source tend to produce the same results over and over?
  • Conduct an analysis throughout the data collection phase of what has been gathered to judge whether there is a need to collect additional data to arrive at conclusions.
  • Consider reducing the overall size of samples when stakeholders (including even those with competing interests) appear to be in general agreement about an issue.
Sampling strategies

Is the sampling strategy appropriate?

  • The level of effort can be minimized through sampling strategies, including strategic or purposeful sampling and cluster sampling.
  • Reducing data disaggregation can limit the effort required to process and interpret data.
Options in conducting interviews/surveys

Are interviews/surveys being conducted appropriately?

  • Reduce the time required for data collection through shorter data collection instruments.
  • The level of effort can be minimized through different approaches to interviewing, including group interviews and focus groups.
  • Conduct interviews by telephone, instead of in person, to reduce the level of effort involved in evaluations.
  • Self-administered questionnaires can save time and effort.
Technology

Is there an opportunity to use technology to increase the efficiency in data collection and/or analysis?

  • Technology-based data collection (such as through online surveys) can reduce the time and effort associated with data collection and collation.
  • Quantitative and qualitative data management and analysis software can also reduce the time and effort required to process and analyze data.
  • Photographic and video evidence can be a cost-effective means of collecting and presenting evidence.
Expert consultations

Is there an opportunity to reduce the level of effort required through consulting with experts in the field being evaluated?

  • Consulting with experts can help to identify essential documents for review, which can reduce the level of effort required in reviewing a large, exhaustive body of information.
Project reports as a means of data collection

Can outcome data be collected through project reports?

  • Collect outcome information from program clients as part of project reporting to reduce data collection at the time of the evaluation.
  • Internal program reporting may, however, be subject to issues of bias or reduced reliability.
Reporting
Presenting the findings, conclusions and recommendations in a clear and objective manner
Evaluation Element | Examples of Factors to Consider | Examples of Calibration
A clear and reasonable rationale for calibration, and any consequences arising from the calibration, should be discussed in the evaluation report.
Level of detail in the evaluation report

Does the report provide the appropriate level of detail to adequately inform the intended audiences of the methodology, limitations and what has been learned from the evaluation?

Is the evidence supporting the findings summarized succinctly and appropriately?
  • Present only the key findings that answer the evaluation questions or issues outlined in the evaluation matrix/framework, and/or add references to other monitoring or methodology reports.
Timing of drafting of report

When is the appropriate time to begin drafting the report?

  • A skeleton report developed early in the evaluation can reduce time and effort.
Project governance and management
Application of knowledge, skills and techniques to execute projects effectively and efficiently
Evaluation Element | Examples of Factors to Consider | Examples of Calibration
Complexity of project management

How broad an understanding of the program environment is required?

What is the appropriate level of stakeholder involvement in the evaluation?
  • Extensive evaluation planning, including broad stakeholder involvement, can be resource-intensive and may not be as important for programs that have recently been evaluated.
Human resources

Are human resources allocations being optimized for the evaluation project?

Have the assigned tasks taken into consideration evaluation team members' strengths and weaknesses?

Some evaluations will require external subject matter expertise due to the complexity of the subject matter, which usually entails additional time and effort for contracting.

Are human resources reallocations required at different points in time over the course of the evaluation?

  • More experienced evaluators may be better positioned to complete evaluations where time is more limited or that have less comprehensive data on which to arrive at conclusions.
  • Low-risk evaluations may be appropriate for more junior evaluators.
  • In some evaluations, it may be possible to employ non-experts to collect data, which can be cost-effective.
  • In some cases, it may be appropriate to have external resources undertake data collection and have internal evaluators focus on analysis.
Project governance

How is the evaluation being managed? For example, is it a horizontal evaluation being led by the department or another organization?

Is there an advisory committee or other body involved in reviewing and/or approving deliverables? If so, at which points must these bodies be involved, and for what specific tasks?
  • Managers should factor into their planning the often considerable time and effort required for consultations, feedback and approvals in horizontal evaluations.
Project deliverables

Is the number of deliverables appropriate?

Can the evaluators minimize the activities needed to complete these deliverables?
  • Calibrate the number of deliverables to the program risk. In some cases, individual reports for each line of evidence, which can be resource-intensive, may not be necessary.
Project monitoring

What level of project monitoring is required during the evaluation?

  • Consider whether it is possible to limit what is monitored (time, level of effort) and how often these variables are monitored.
Project communication

How is information about the evaluation being communicated to key stakeholders?

  • Inform stakeholders (e.g., senior managers) of situations where evaluations have limited the level of effort in some areas, and the reasons for this, so they are not surprised to see this reflected in the final report.

Appendix B: Bibliography
