Estimating chronic disease rates in Canada: which population-wide denominator to use? - HPCDP: Volume 36-10, October 2016
Volume 36 · Number 10 · October 2016
J. Ellison, MPH Footnote 1; C. Nagamuthu, MPH Footnote 2; S. Vanderloo, MSc Footnote 3; B. McRae Footnote 1; C. Waters, BSc Footnote 1
This article has been peer reviewed.
Correspondence: Joellyn Ellison, Centre for Chronic Disease Prevention, Public Health Agency of Canada, 785 Carling Avenue, 6th Floor, 623 A-3, Ottawa, ON K1A 0K9; Tel: 613-797-8721; Fax: 613-941-2057; Email: email@example.com
Introduction: Chronic disease rates are produced from the Public Health Agency of Canada's Canadian Chronic Disease Surveillance System (CCDSS) using administrative health data from provincial/territorial health ministries. Denominators for these rates are based on estimates of populations derived from health insurance files. However, these data may not be accessible to all researchers. Another source for population size estimates is the Statistics Canada census. The purpose of our study was to calculate the major differences between the CCDSS and Statistics Canada's population denominators and to identify the sources or reasons for the potential differences between these data sources.
Methods: We compared the 2009 denominators from the CCDSS and Statistics Canada. The CCDSS denominator was adjusted for the growth components (births, deaths, emigration and immigration) from Statistics Canada's census data.
Results: The unadjusted CCDSS denominator was 34 429 804, 3.2% higher than Statistics Canada's estimate of population in 2009. After the CCDSS denominator was adjusted for the growth components, the difference between the two estimates was reduced to 431 296 people, a difference of 1.3%. The CCDSS overestimates the population relative to Statistics Canada overall. The largest difference between the two estimates was from the migrant growth component, while the smallest was from the emigrant component.
Conclusion: By consulting the description of each data source, researchers can make informed decisions about which population estimate to use in their calculations of disease frequency.
Keywords: Canadian Chronic Disease Surveillance System, denominator, census, population estimates Canada, disease surveillance, measures of disease frequency, administrative health data
- Accurate population estimates are important when measuring the burden of chronic diseases.
- The authors calculated the major differences between the Canadian Chronic Disease Surveillance System and Statistics Canada's population denominators to find the best sources and the reasons for any differences between them.
- The unadjusted CCDSS denominator was 34 429 804, 3.2% higher than Statistics Canada's estimate of population in 2009. After the CCDSS denominator was adjusted for the growth components, the difference between the two estimates was reduced to 431 296 people, a difference of 1.3%.
- The largest difference between the two estimates was from the migrant growth component, while the smallest was from the emigrant component.
Many countries, including Canada, Australia, France and Italy, have administrative health databases that are established and/or supported by governments that provide universal medical care.Footnote 1 Administrative health data refer to data routinely collected through the administration of health care services.Footnote 2 These data can be used for health service planning, reporting performance evaluations, clinical decision making and answering research questions.Footnote 2 Administrative data can also be used to conduct disease surveillance.Footnote 3-7 Measures of disease frequency, such as prevalence, incidence and mortality rates, can be used to describe the burden of disease among a population. With this information, policy, public health and health economics professionals can make informed decisions. Therefore, it is important that researchers choose appropriate denominators to calculate these measures.
Incidence, prevalence and mortality rates are each calculated from a numerator and a denominator. Improper selection of the total population can lead to biased estimates of the rates of occurrence of disease and death.Footnote 8 Estimates of populations are often the most appropriate estimate available of the number of people at risk for an outcome.
One source of denominator estimates is Statistics Canada. Statistics Canada conducts a census every five years and collects data from citizens (including permanent residents), non-permanent residents and their families living in Canada.Footnote 9 Most of the population self-enumerates by completing census surveys by mail or electronically.Footnote 9
The objective of the census is to provide information about the demographic and social characteristics of the Canadian population.Footnote 9 Estimates are then derived from the census and adjusted for under- and overcoverage.Footnote 10
Another source for denominator estimates is the Public Health Agency of Canada's (PHAC) Canadian Chronic Disease Surveillance System (CCDSS). The CCDSS is a network of federal and provincial/territorial health insurance surveillance systems supported by PHAC. Denominator estimates are based on the number of people who hold valid health insurance at any given time during the fiscal year. The CCDSS is used to determine the number of Canadians living with chronic disease via interactions with the health care system, based on diagnostic and procedural codes; it adds to the breadth of information about disease burden in Canada. The system includes aggregate information for the following chronic diseases: diabetes, hypertension, ischemic heart disease, acute myocardial infarction, heart failure, mental illness, osteoporosis, asthma, chronic pulmonary disease, multiple sclerosis and parkinsonism.
This study is the first to compare the CCDSS denominator derived from national administrative health data with estimates of population from Statistics Canada. However, an Alberta Health and Wellness provincial-level study compared Alberta population counts, as covered under the Alberta Health Care Insurance Plan (AHCIP) Registry, and the 2006 census.Footnote 11 The study showed that the 2006/07 AHCIP Registry count was lower by 0.0988% (3249/3 287 101) than Statistics Canada's estimate of population. However, as of June 2015, the AHCIP estimate was higher than the estimate from Statistics Canada.Footnote 12 PHAC conducted a similar analysis in 2012, comparing the CCDSS denominator (fiscal year 2006/2007) and Statistics Canada's estimate of population (census year 2006), and found an overestimate for the CCDSS denominator of about 3.9% (955 358/24 258 902). However, this difference was difficult to interpret as data for Quebec and Newfoundland and Labrador were excluded for data quality reasons; Statistics Canada estimates were used instead. For Quebec, the data for Canadians without the disease were not available at the Institut national de santé publique du Québec (INSPQ), and for Newfoundland and Labrador the health insurance cards prior to 2008 did not have an expiration date, resulting in duplicate records in the medical care plan database.
The purpose of our study was to calculate the major differences between the CCDSS and Statistics Canada's estimate of population and to identify the sources or reasons for the potential differences between the CCDSS and Statistics Canada population denominators. Our objective was to inform researcher and analyst decisions about which estimate of population or denominator to use in disease frequency calculations.
The CCDSS uses provincial/territorial administrative databases to track chronic diseases among Canadians.Footnote 4-7 The CCDSS was developed by linking three administrative databases with individuals' unique lifetime identifiers.Footnote 7 The three databases include the (1) health insurance registry file; (2) fee-for-service and some shadow-billed physician services file; and (3) hospital files that capture hospital-based acute care interactions through diagnostic and procedural codes.Footnote 7 These health databases were records of health care interactions for residents who are eligible to receive provincial/territorial health care.Footnote 7 The health insurance file contains demographic information, including a unique lifetime identifier, linking the three databases together.Footnote 13
This health insurance registry also included a record for anyone who was alive and eligible to receive health care at any point in the fiscal year. Therefore, people who have died are captured in the year of their death. The federal government funds programs to provide subgroups of the population (First Nations and Inuit people, refugee protection claimants, eligible veterans, federal penitentiary inmates and serving members of the Canadian forces and the RCMP [Royal Canadian Mounted Police]) with health services and benefits that are not insured by provincial/territorial governments. These include coverage for dental and vision care, medical supplies and certain drugs.Footnote 14Footnote 15 First Nations and Inuit people still hold provincial health insurance and were captured by the CCDSS through the health insurance registry file.
Generally, members of the Canadian forces, RCMP and inmates of federal prisonsFootnote 15 are not captured by the CCDSS (approximately 110 000 people per year). The CCDSS denominator counts were obtained from each province's and territory's data submissions to PHAC as of November 2015, up to the 2009/2010 fiscal year. Data from this fiscal year were used as they were the latest available at the time of analyses. The aggregate datasets were composed of residents with valid health care insurance at any point during the fiscal year and included residents who died within the same year, by disease and demographic variables, such as age and sex.
Statistics Canada's estimates of population
Statistics Canada's estimates of population are derived from census data. Although Statistics Canada aims to enumerate the Canadian population on census day by collecting data at a single point in time (a cross-sectional view of the population), the census misses or over-counts some fraction of the population (2.7%; 868 657/32 500 000).Footnote 16 Some people may not be counted because they were away during the enumeration period or lived in a collective dwelling that provides care or assistance services, while others were counted more than once (e.g. students living away from home who were enumerated by themselves and their parents).Footnote 16 This is described as under- and over-coverage.Footnote 17Footnote 18
Statistics Canada conducts postcensal coverage studies using a representative sample of people to determine the number of people missed or counted more than once during enumeration.Footnote 19 The results of these studies are combined with the census estimates to produce current estimates of population, using postcensal and intercensal estimates.Footnote 10 Intercensal estimates are estimates of the population during the period between two censuses.Footnote 10 These adjustments render near complete estimates of population coverage.
We compared the CCDSS denominator from fiscal year 2009/10 with Statistics Canada's estimate of population for 2009Footnote 20 by age group. The age-at-reference date was March 31 for the CCDSS denominator and July 1 for the Statistics Canada estimate of population. The CCDSS denominator was adjusted for the growth components (births, deaths, emigration and immigration) using files from census data (Table 1).Footnote 21 The CCDSS denominator data included interprovincial migration (i.e. residents who migrate between provinces and territories were counted more than once), whereas Statistics Canada's estimates of population were already adjusted for interprovincial migration, net immigration and deaths.Footnote 26
|Growth component||Adjustment||Magnitude of the adjusted result|
|BirthsFootnote 23||379 373/3||126 457|
|DeathsFootnote 23||237 138/3||79 046|
|ImmigrantsFootnote 24||270 581/3||90 193|
|EmigrantsFootnote 24||52 335/3||17 445|
|Net non-permanent residentsFootnote 24||34 531||34 531|
|MigrantsFootnote 24Footnote 25||Out: (259 234 × 1.5) - In: (259 234)/3||302 440|
Abbreviation: CCDSS, Canadian Chronic Disease Surveillance System.
After CCDSS adjustment, the difference between the two denominators is 431 296 (1.3%).
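The arithmetic behind Table 1 can be sketched in a few lines of Python. This is an illustration only, not the SAS code used for the analyses. It assumes that every growth component in the "Magnitude of the adjusted result" column is subtracted from the unadjusted CCDSS denominator, which is the reading that reproduces the adjusted figure reported in the Results; the variable and function names are hypothetical.

```python
# Magnitudes of the adjusted results, copied from Table 1.
ADJUSTMENTS = {
    "births": 126_457,
    "deaths": 79_046,
    "immigrants": 90_193,
    "emigrants": 17_445,
    "net_non_permanent_residents": 34_531,
    "migrants": 302_440,
}

UNADJUSTED_CCDSS = 34_429_804   # unadjusted CCDSS denominator, 2009/10
STATCAN_ESTIMATE = 33_348_396   # Statistics Canada's estimate of population, 2009


def adjusted_denominator(unadjusted, adjustments):
    """Subtract every growth component from the unadjusted count.

    Assumption: all components are removed; the article does not spell out
    the sign of each one, but this reproduces its reported totals.
    """
    return unadjusted - sum(adjustments.values())


adjusted = adjusted_denominator(UNADJUSTED_CCDSS, ADJUSTMENTS)
difference = adjusted - STATCAN_ESTIMATE
percent = 100 * difference / STATCAN_ESTIMATE

print(f"Adjusted CCDSS denominator: {adjusted}")
print(f"Remaining difference: {difference} ({percent:.1f}%)")
```

Under that assumption the six components sum to 650 112, giving an adjusted denominator of 33 779 692 and a remaining difference of 431 296 people (1.3%).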
We ran analyses using SAS Enterprise Guide version 4.1 (SAS Institute Inc., Cary, NC, USA). To illustrate the differences between the CCDSS and Statistics Canada's estimate of population, our analyses were conducted with and without the growth components for the CCDSS denominator. See Table 2 for a summary of the data definitions by source.
|Data sources||PHAC's CCDSS denominator||Statistics Canada's estimates of population|
|Estimate type||Period estimate (Canadians with valid health insurance during the fiscal year)||Point in time estimate (estimate of the number of Canadians from the Census)|
|Inclusions||1) Deaths that occur during the fiscal year 2) Canadians who migrate in Canada during the fiscal year (double-counting)||1) A representative sample of Canadians 2) Adjustments for growth components (births, deaths, emigration, and immigration) and those missed (away during enumeration) or double-counted (students away from home)|
|Exclusions||Canadians covered under federal insurance||Canadians who are away during enumeration or lived in a collective dwelling, but adjusted for|
|Use/role||Is a companion for the CCDSS numerator (Canadians exposed to the health event are included in the denominator)||Is a companion for a numerator consisting of Canadians who were exposed to a health event|
Abbreviations: CCDSS, Canadian Chronic Disease Surveillance System; PHAC, Public Health Agency of Canada.
The overall percent difference between the CCDSS denominator and Statistics Canada's estimate of population, for Canada and by province/territory, was calculated as
[(CCDSS denominator - Statistics Canada's estimate of population) / Statistics Canada's estimate of population] × 100.
To examine the impact of the choice of denominator on rates across all age groups, we compared diabetes prevalence, incidence and mortality rates calculated using the CCDSS denominator with those calculated using Statistics Canada's estimate of population. We used the CCDSS data to count the number of all-cause deaths across the provinces/territories. We calculated each rate twice, once with each data source as the population denominator:
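As a minimal sketch, the percent difference can be written as a small Python function. The function name is hypothetical; the inputs are the national totals and the 20 to 24 age group counts reported in Table 3.

```python
def percent_difference(ccdss, statcan):
    """Percent difference of the CCDSS denominator relative to
    Statistics Canada's estimate of population."""
    return (ccdss - statcan) / statcan * 100


# National totals for 2009 (all ages).
canada = percent_difference(34_429_804, 33_348_396)

# A negative value means the CCDSS count is the smaller of the two,
# as for the 20 to 24 age group.
age_20_24 = percent_difference(2_304_740, 2_322_497)

print(f"Canada: {canada:+.1f}%")
print(f"Ages 20-24: {age_20_24:+.1f}%")
```

Because Statistics Canada's estimate is the reference value, a positive result indicates that the CCDSS denominator is larger.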
- Diabetes prevalence = [Total number of individuals with a case date during the capture period or prevalent cases / Total number of individuals with valid health insurance during the capture period] × 100 or Diabetes prevalence = [Total number of individuals with a case date during the capture period or prevalent cases / Statistics Canada's estimate of population] × 100
- Diabetes incidence = [Total number of incident cases / (Total number of individuals with valid health insurance during the capture period - Prevalent cases at the beginning of the fiscal year)] × 1000 or Diabetes incidence = [Total number of incident cases / (Statistics Canada's estimate of population - Prevalent cases at the beginning of the fiscal year) ] × 1000
- All-cause mortality = [Total number of CCDSS deaths / Total CCDSS population] × 100 000 or All-cause mortality = [Total number of CCDSS deaths / Statistics Canada's estimate of population] × 100 000
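The three formulas above can be sketched in Python as follows. This is an illustration rather than the SAS macros used in the study, and the function names are hypothetical. The prevalence and mortality examples use the national counts given in the Results; the incidence function is shown without a worked example because the count of prevalent cases at the beginning of the fiscal year is not reported in the text.

```python
def prevalence_percent(cases, population):
    """Prevalent cases per 100 population."""
    return cases / population * 100


def incidence_per_1000(incident_cases, population, prevalent_at_start):
    """Incident cases per 1000 people at risk; people who were already
    prevalent cases at the start of the period are removed from the
    denominator."""
    return incident_cases / (population - prevalent_at_start) * 1000


def mortality_per_100_000(deaths, population):
    """All-cause deaths per 100 000 population."""
    return deaths / population * 100_000


CCDSS_POPULATION = 34_429_804
STATCAN_POPULATION = 33_348_396
DIABETES_CASES = 2_489_520
CCDSS_DEATHS = 230_408

# The same numerator is divided by each of the two candidate denominators.
for label, population in (("CCDSS", CCDSS_POPULATION),
                          ("Statistics Canada", STATCAN_POPULATION)):
    prevalence = prevalence_percent(DIABETES_CASES, population)
    mortality = mortality_per_100_000(CCDSS_DEATHS, population)
    print(f"{label}: prevalence {prevalence:.1f}%, "
          f"mortality {mortality:.1f} per 100 000")
```

Holding the numerator fixed makes it plain that the larger CCDSS denominator yields the lower rate in every case.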
These rates were calculated using SAS macros (pre-programmed codes), and the counts randomly rounded.Footnote 26 Mortality data for Statistics Canada's estimate of population were obtained from vital statistics files for 2009 (to match the CCDSS 2009/10 fiscal year).Footnote 27 To assess the impact of mortality rates with both sources of denominators on the life expectancy calculation, we compared the CCDSS denominator life expectancy by sex and age group for each disease tracked by the CCDSS to Statistics Canada's estimate of population life expectancy. We stratified the CCDSS data by sex and 18 standard age groups (1-4, 5-9,...80-84, ≥ 85 years). Next we created a life table for the number of people who had a diagnostic code for a disease and the number of people who did not. The Gompertz function was used to provide an accurate estimate of life expectancy for the last open-ended age interval (≥ 85), to close the life table.Footnote 28Footnote 29
Since disease status was not available for infants younger than one year, the 2004 to 2006 sex-specific death rates for the Canadian population, from Statistics Canada, were used to model the mortality experience of infants with and without the disease. The 0- to 1-year sex-specific death rates, used to construct the life table, represented infants without the disease. Infant death rates for the disease were unavailable. Because the 0 to 1 age group experienced a high rate of mortality, we assumed that the number of infants with and without the disease would be about the same.
Our findings show that CCDSS and Statistics Canada estimates of population differ. Specifically, the CCDSS overestimates the population relative to Statistics Canada overall.
The largest difference between the two estimates was contributed by the migrant growth component (302 440), while the smallest was from the emigrant component (17 445) (Table 1). When deaths were included in the CCDSS denominator, we observed 1 081 408 more people in the CCDSS denominator (34 429 804) compared to Statistics Canada's estimate of population (33 348 396), a +3.2% (1 081 408/33 348 396) difference (Table 3).
|Age groups in years||CCDSS denominatorFootnote a||Statistics Canada's estimates of populationFootnote b||Percent difference, %|
|1-4||1 540 368||1 464 423||5.2|
|5-9||1 845 877||1 798 812||2.6|
|10-14||2 009 792||1 972 894||1.9|
|15-19||2 264 299||2 250 692||0.6|
|20-24||2 304 740||2 322 497||-0.8|
|25-29||2 358 661||2 348 492||0.4|
|30-34||2 310 455||2 258 092||2.3|
|35-39||2 377 282||2 297 458||3.5|
|40-44||2 512 797||2 480 011||1.3|
|45-49||2 858 523||2 787 129||2.6|
|50-54||2 686 876||2 573 413||4.4|
|55-59||2 322 492||2 215 710||4.8|
|60-64||1 999 094||1 888 212||5.9|
|65-69||1 480 822||1 406 971||5.2|
|70-74||1 137 703||1 080 535||5.3|
|75-79||947 527||909 136||4.2|
|80-84||728 883||676 759||7.7|
|≥85||743 613||617 160||20.5|
|All ages||34 429 804||33 348 396||3.2|
After the CCDSS denominator was adjusted for the growth components, we observed 431 296 more people in the CCDSS denominator (33 779 692) compared to Statistics Canada's estimate of population (33 348 396), a +1.3% (431 296/33 348 396) difference (Table 1). The largest difference between the two was observed among the 85-and-older age group (+20.5%; 126 453/617 160) and the smallest among the 20 to 24 age group (-0.8%; 17 757/2 322 497) (Table 3).
By province/territory, the largest difference between the two was observed for Northwest Territories (+13.0%; 5600/42 965), whereas the smallest was for Quebec (+0.5%; 37 595/7 737 335) (Table 4). A similar pattern was observed after excluding deaths from the CCDSS denominator. However, after adjustment for the growth components, the difference was reduced to 431 296 people (Δ1.3%; 431 296/33 348 396) (Table 1).
|Province/Territory||CCDSS denominatorFootnote a||Statistics Canada's estimates of populationFootnote b||Percent difference, %|
|Newfoundland and Labrador||537 862||504 141||6.7|
|Prince Edward Island||148 911||139 593||6.7|
|Nova Scotia||989 707||931 622||6.2|
|New Brunswick||755 910||742 506||1.8|
|Quebec||7 774 930||7 737 335||0.5|
|Ontario||13 563 855||12 928 815||4.9|
|Manitoba||1 239 544||1 204 232||2.9|
|Saskatchewan||1 067 733||1 015 590||5.1|
|Alberta||3 711 026||3 621 681||2.5|
|British Columbia||4 524 374||4 415 160||2.5|
|Yukon||33 745||33 342||1.2|
|Northwest Territories||48 565||42 965||13.0|
|Nunavut||33 642||31 414||7.1|
|Canada||34 429 804||33 348 396||3.2|
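The provincial and territorial comparison can be reproduced with a short Python sketch. The counts are copied from Table 4; the dictionary layout and variable names are assumptions made for illustration.

```python
# (CCDSS denominator, Statistics Canada estimate) by province/territory.
TABLE_4 = {
    "Newfoundland and Labrador": (537_862, 504_141),
    "Prince Edward Island": (148_911, 139_593),
    "Nova Scotia": (989_707, 931_622),
    "New Brunswick": (755_910, 742_506),
    "Quebec": (7_774_930, 7_737_335),
    "Ontario": (13_563_855, 12_928_815),
    "Manitoba": (1_239_544, 1_204_232),
    "Saskatchewan": (1_067_733, 1_015_590),
    "Alberta": (3_711_026, 3_621_681),
    "British Columbia": (4_524_374, 4_415_160),
    "Yukon": (33_745, 33_342),
    "Northwest Territories": (48_565, 42_965),
    "Nunavut": (33_642, 31_414),
}

# Percent difference relative to Statistics Canada's estimate.
percent_diff = {
    name: (ccdss - statcan) / statcan * 100
    for name, (ccdss, statcan) in TABLE_4.items()
}

largest = max(percent_diff, key=percent_diff.get)
smallest = min(percent_diff, key=percent_diff.get)

# The jurisdiction counts should add up to the national totals.
ccdss_total = sum(ccdss for ccdss, _ in TABLE_4.values())
statcan_total = sum(statcan for _, statcan in TABLE_4.values())

print(f"Largest: {largest} ({percent_diff[largest]:+.1f}%)")
print(f"Smallest: {smallest} ({percent_diff[smallest]:+.1f}%)")
print(f"Totals: {ccdss_total} vs {statcan_total}")
```

Summing the rows is a quick consistency check: the thirteen jurisdictions add up to the Canada row in both columns.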
For all Canadians, the CCDSS denominator diabetes prevalence rate (7.2%; 2 489 520/34 429 804) was 4.00% (-0.3/7.5) lower than Statistics Canada's estimate of population rate (7.5%; 2 489 520/33 348 396) and 3.03% (-0.2/6.6) lower for incidence rates with the CCDSS denominator (6.4 per 1000; 218 240/34 211 564) and Statistics Canada's estimates of population (6.6 per 1000; 218 240/33 130 156). The CCDSS denominator total all-cause mortality rate (669.2 per 100 000; 230 408/34 429 804) was 3.1% (-21.7/690.9) lower than Statistics Canada's estimate of population rate (690.9 per 100 000; 230 408/33 348 396; Table 5). The life expectancy at birth was 82.9 years for the CCDSS denominator and 81.2 years for Statistics Canada's estimate of population.
|Statistic||CCDSS denominatorFootnote b||Statistics Canada's estimates of populationFootnote c||Percent difference, %|
|Incidence||6.4 per 1000||6.6 per 1000||-3.0|
|Total (without the deaths in the populations)|
|Mortality||669.2 per 100 000||690.9 per 100 000||-3.1|
Notes: This study was made possible through collaboration between PHAC and the respective provincial governments of Alberta, British Columbia, Saskatchewan, Manitoba, Ontario, Quebec, New Brunswick, Prince Edward Island, Nova Scotia, Newfoundland and Labrador, and territorial governments of Yukon, Northwest Territories, and Nunavut. The opinions, results and conclusions reported in this paper are those of the authors. No endorsement by British Columbia, Saskatchewan, Manitoba, Ontario, Quebec, New Brunswick, Prince Edward Island, Nova Scotia, Newfoundland and Labrador, Yukon, Northwest Territories, Nunavut is intended or should be inferred.
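The relative differences between rates can be checked with the sketch below. The bracketed fractions in the text, such as -0.3/7.5, suggest that the published percentages for prevalence and incidence were computed from rates already rounded to one decimal place; that reading is an assumption, and the function name is hypothetical. For comparison, the sketch also computes the prevalence difference from the unrounded counts.

```python
def relative_difference_percent(rate, reference_rate):
    """Difference between two rates as a percentage of the reference rate."""
    return (rate - reference_rate) / reference_rate * 100


# From the rounded rates quoted in the text (CCDSS vs Statistics Canada).
prevalence_rounded = relative_difference_percent(7.2, 7.5)
incidence_rounded = relative_difference_percent(6.4, 6.6)
mortality_rounded = relative_difference_percent(669.2, 690.9)

# From the unrounded prevalence, using the reported case count and populations.
cases = 2_489_520
prevalence_unrounded = relative_difference_percent(
    cases / 34_429_804, cases / 33_348_396
)

print(f"Prevalence (rounded rates): {prevalence_rounded:.2f}%")
print(f"Incidence (rounded rates): {incidence_rounded:.2f}%")
print(f"Mortality (rounded rates): {mortality_rounded:.1f}%")
print(f"Prevalence (unrounded rates): {prevalence_unrounded:.1f}%")
```

When both rates share a numerator, the relative difference depends only on the ratio of the two denominators, so the unrounded prevalence difference is close to the mortality difference of about 3.1%; the 4.00% figure reflects rounding of the rates before the comparison.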
A difference between the CCDSS denominator and Statistics Canada's estimate of population is that the latter provides a cross-sectional view of the population at a specific time period (period of enumeration), whereas the CCDSS denominators have been used to provide an estimate of the population exposed, or "at risk," over an annual period. Although prevalence and incidence rates can be calculated for a single time point, it is more common to calculate these measures for a time period. Researchers should note that using different denominators may affect calculations of prevalence, incidence, mortality rates and life expectancy. For example, a military member who visits the hospital for diabetes will not be captured by the CCDSS, because military members are a subgroup of the population who are federally insured. Therefore, the most appropriate denominator to use for calculating a measure of disease frequency would be Statistics Canada's estimates. However, the CCDSS denominator is the most appropriate denominator for estimating measures of disease frequency when using administrative data (i.e. numerator is derived from CCDSS data).
In addition, a researcher may wish to calculate the prevalence of diabetes in 2006 among Canadians. The researcher could obtain the number of diabetes cases that occurred in 2006 from the CCDSS and may decide to use Statistics Canada estimates for the denominator (representing the total Canadian population for 2006). However, the population from which the numerator is drawn may not match the population represented by Statistics Canada's estimates. Or perhaps an individual with diabetes died before the 2006 census enumeration date, but was registered as a case in the data source before their death. In this scenario, this individual would be accounted for by the numerator, but not in the denominator, thereby resulting in an inaccurate estimate of the prevalence of diabetes in Canada during 2006.
To quantify the gaps in both the CCDSS and Statistics Canada's estimate of population, further research is needed on the data quality of the health insurance registries, along with work to quantify subgroups of the population (e.g. emigrants) that are unaccounted for by either data source.
The CCDSS captures nearly the entire Canadian population through the health insurance registry. We recognize that people who move to a new province or territory during a fiscal year (interprovincial migration) and receive valid health insurance are counted twice in the CCDSS denominator, for a limited period of time.
The difference between the adjusted CCDSS denominator and Statistics Canada's estimate (431 296 people) may be attributed to the discrepancies of the valid and eligible health numbers defined in the health insurance registries, possibly due to fraud.Footnote 30 In addition, the health insurance registries can include inaccurate information about deaths, due to the time and resources required to process this information.Footnote 31 Statistics Canada has found it challenging to count Canadians who emigrate for work and who are homeless, but the number of these emigrants is estimated to be small.Footnote 32-35 Both the health insurance registries and Statistics Canada staff continue to monitor the data, looking for these issues and finding explanations and mitigation strategies for them. Small differences by age (in the younger age groups) can also be attributed to the different age-at-reference dates between the two data sources. The underestimate found by the Alberta ministryFootnote 11 could be attributed to the different methods used by AHCIP (2006 census data was compared) and Statistics Canada Demography Division (conducting the provincial-specific CCDSS denominator adjustment).
Our results illustrate the importance of making an informed choice when selecting estimates of population for research, as the selection can have an effect on the calculation of rates. We found that even after adjusting the CCDSS denominator for deaths and interprovincial migration, the CCDSS denominator was greater than the Statistics Canada estimate.
These findings allow researchers to compare the major reasons for the differences between the CCDSS denominator and Statistics Canada's estimate of population in order to select the most appropriate denominator for their projects and for measuring disease frequency.
It is our opinion that the CCDSS denominator best represents the population at risk for events identified using health administrative data. The CCDSS denominator should be used to measure disease frequency, as it comprises those with valid health insurance over a period. When Statistics Canada's estimates of population were used as the denominator, the exposed population was underestimated, as the census was taken at a point in time; however, people who were deceased or away at that time were included in the period prevalence numerator. Although the magnitude of the differences in diabetes rates between the two sources was small, these findings could have a slight implication on the interpretation and conclusions drawn from previous studies that have estimated prevalence, incidence and mortality using Statistics Canada's estimates of population as population denominator estimates.
We wish to thank the members of the CCDSS Scientific and Technical Committee. This study was made possible through the collaboration of PHAC and the provincial governments of British Columbia, Alberta, Saskatchewan, Manitoba, Ontario, Quebec, New Brunswick, Prince Edward Island, Nova Scotia, Newfoundland and Labrador and the territorial governments of Yukon, Northwest Territories and Nunavut. The opinions, results and conclusions reported in this paper are those of the authors. No endorsement by any government is intended or should be inferred.