Original quantitative research – Examining the municipal-level representativeness of the Canadian Longitudinal Study on Aging (CLSA) cohort: an analysis using Calgary participant baseline data

Samantha J. Norberg, MSW; Ann M. Toohey, PhD; Sian Jones, MScEcon, MPP; Raynell McDonough, MSW; David B. Hogan, MD


This article has been peer reviewed.

Correspondence: Ann M. Toohey, 3280 Hospital Dr NW, Calgary, AB T2N 4Z6; email: amtoohey@ucalgary.ca


Introduction: The Canadian Longitudinal Study on Aging (CLSA) is a rich, nationally representative population-based resource that can be used for multiple purposes. Although municipalities may wish to use CLSA data to address local policy needs, how well localized CLSA cohorts reflect municipal populations is unknown. Because Calgary, Alberta, is home to one of 11 CLSA data collection sites, our objective was to explore how well the Calgary CLSA sample represented the general Calgary population on select sociodemographic variables.

Methods: Baseline characteristics (i.e. sex, marital status, ethnicity, education, retirement status, income, immigration, internal migration) of CLSA participants who visited the Calgary data collection site between 2011 and 2015 were compared to analogous profiles derived from the 2011 National Household Survey (NHS) and 2016 Census datasets, which spanned the years when data were collected on the CLSA participants.

Results: Calgary CLSA participants were representative of the Calgary population for age, sex and Indigenous identity. Discrepancies of over 5% with the NHS and/or 2016 Census were found for marital status, measures of ethnic diversity (i.e. immigrant status, place of birth, non-official language spoken at home), internal migration, income, retirement status and education.

Conclusion: Voluntary studies face challenges in recruiting fully representative cohorts. Communities opting to use CLSA data at a municipal level, including the 10 other CLSA data collection sites, should exercise caution when interpreting the results of these analyses, as CLSA participants may not be fully representative of the local population on select characteristics of interest.

Keywords: demographics, Calgary, cities, longitudinal studies, census, Canadian Longitudinal Study on Aging, CLSA


  • The Canadian Longitudinal Study on Aging (CLSA) is designed to be nationally and provincially representative for age and sex.
  • Municipal representativeness of CLSA data is unknown.
  • We compared baseline sociodemographic characteristics of Calgary CLSA participants in the comprehensive cohort with those derived from the 2011 National Household Survey and 2016 Canadian Census.
  • Calgary’s CLSA sample was representative for age, sex and Indigenous identity but was not fully representative for ethnic diversity, internal migration, education and income when compared to the true population.
  • Researchers, planners, policy makers and others using municipal-level CLSA data should consider representativeness of their CLSA sample when interpreting findings.


Like many other municipalities in Canada and around the world, the City of Calgary, Alberta, implemented an age-friendly strategy in 2015. Based on the World Health Organization’s Global Age-friendly Cities guide,Footnote 1Footnote 2Footnote 3 the strategy’s vision states that “Calgary is an age-friendly city where all people have lifelong opportunities to thrive.”Footnote 3,p.12 To achieve this vision, the City and other stakeholder groups identified short- and mid-range actions.Footnote 3 The Canadian Longitudinal Study on Aging (CLSA) was recommended as a data resource that could assist in both informing and evaluating the strategy by establishing baseline measures and tracking changes over time.Footnote 4

The CLSA is a national research platform designed to advance our understanding of the complexities of aging. The overall aims of the CLSA are to examine aging as a dynamic process; investigate the interrelationship between intrinsic and extrinsic factors, from mid-life to older age; capture the transitions, trajectories and profiles of aging; and provide infrastructure and build capacity for sustained high-quality research on aging.Footnote 5 Many of the characteristics being tracked longitudinally were selected to both support population-based research and lead to evidence-informed government policies.Footnote 6

Launched in 2011, the CLSA is a rich data source available to planners and policy makers at all levels of government and academic researchers. While participation is voluntary, the CLSA sampling framework was designed to achieve national and provincial representativeness for age and sex. The extent to which CLSA data are representative for these variables at the municipality level is not known. There are also concerns about its representativeness for other sociodemographic characteristics that may be of interest. These questions merit attention if municipalities use CLSA data to assess population-level needs, monitor sociodemographic changes and understand the impact of public policy implementation.

To understand the extent to which CLSA measures are generalizable to the true population, it was important to assess how closely the CLSA Calgary sample mirrored the sociodemographic profile of middle-aged and older residents of Calgary, Alberta. Other studies have examined the limitations of establishing study cohort representativeness through comparisons with census data.Footnote 7Footnote 8Footnote 9Footnote 10Footnote 11 Volunteer and selection biases that lead to underrepresentation of minorities and other vulnerable groups may threaten the generalizability of results derived from volunteer-based cohort studies.Footnote 7Footnote 8Footnote 9Footnote 10Footnote 11 Examining cohort representativeness can help those using these data to determine when other data sources or analytical approaches should be used.

Calgary hosts one of the 11 CLSA data collection sites located across Canada. Assessing the representativeness of Calgary’s CLSA sample would inform the City of Calgary administration of the strengths and limitations of CLSA data in evaluating the characteristics and needs of the local older population and the effectiveness of age-friendly policy implementation. The objective of this study was to evaluate the extent to which baseline CLSA data for Calgary were representative of the “true” municipal population, as captured by either the 2011 National Household Survey (NHS) or 2016 Census data. This examination would assess the utility of the CLSA as a prospective longitudinal data source for tracking age-friendly policy implementation. Although we focussed on Calgary data to explore representativeness, other cities with a CLSA data collection site could replicate our analysis to evaluate their age-friendly policy implementation.


Data sources – baseline CLSA data

The CLSA is a national voluntary study that consists of two cohorts.Footnote 5 The tracking cohort is made up of 21 241 randomly selected participants from across Canada who provide alphanumeric data via computer-assisted telephone interviews. The comprehensive cohort consists of 30 097 randomly selected participants living within a 25-km radius of one of 11 data collection sites spread across Canada. These participants provide alphanumeric data, undergo detailed in-person assessments and provide biological samples.

Study baseline data collection for both the tracking and comprehensive cohorts began in 2011 and was completed in 2015. The intent is to follow participants for 20 years or until death, whichever comes first.

Two different sampling designs were used to recruit the two study cohorts. A national sampling frame was used to ensure representativeness for age and sex. In the tracking cohort, people living in postal code areas with lower average educational attainment were oversampled to adjust for bias toward recruiting participants with higher socioeconomic status and to ensure sufficient heterogeneity for analyses.Footnote 6 Individuals living within a 25-km radius of CLSA data collection sites were intentionally oversampled for inclusion in the comprehensive cohort to receive physical examinations and provide biological samples.Footnote 5 For a more detailed description, see Raina et al.Footnote 5

For the tracking cohort, sampling was designed to provide results that would be generalizable at a national level and by province in relation to the overall age and sex distribution of the population. Three sampling frames were used: the Canadian Community Health Survey on Healthy Aging; provincial health registries (except in the provinces of Alberta and Quebec); and random-digit dialling to landlines (as distinguished from a mobile cellular line).Footnote 12 Participants in the comprehensive cohort were recruited using provincial health registries (as before, except for in Alberta and Quebec) and random-digit dialling sampling frames.

The overall response rate was 10% and the participation rate was 45%.Footnote 5 Individuals were excluded from participating in the CLSA if they could not speak or write in English or French, were living in an institution, were unable to provide informed consent at the time of enrolment, resided in any of the three Canadian territories, were living on federal First Nations reserves or were full-time members of the Canadian Armed Forces.

We initially examined baseline data of CLSA participants in both the tracking (n = 306) and comprehensive (n = 2956) cohorts who, at the time of recruitment, lived in Calgary. We examined data for participants aged 45 to 64 years and 65-plus years separately. After assessing the similarities and differences between the cohorts, we elected to utilize only the comprehensive cohort dataset, with its response rate of approximately 11%.Footnote 12

Participants in the tracking cohort were ultimately excluded because some measures of interest to the municipality for evaluating its age-friendly strategy were either not available (e.g. life space index) or were collected in a manner that may have affected the responses (e.g. elder abuse, as tracking cohort participants underwent a telephone interview where other people may have been within hearing distance, whereas comprehensive cohort participants underwent a confidential face-to-face interview).

Finally, we excluded 109 comprehensive cohort participants (3.7% of the sample) whose residential address was outside of the City of Calgary legal jurisdiction.

Of those included in the final sample (n = 2847), 1640 were aged 45 to 64 years and 1207 were aged 65-plus years.

Data sources – 2011 NHS and 2016 Census data

Because baseline CLSA data were collected between 2011 and 2015, we compared CLSA data with both 2011 NHSFootnote 13 and 2016 CensusFootnote 14 data to determine the distribution of characteristics for the “true” population of Calgary.

In 2011, completion of the long-form questionnaire of the Canadian census was not mandatory. The 2011 NHS, which collected data similar to the long-form questionnaire, sampled 3 out of 10 households; participation was voluntary. The NHS included questions from previous iterations of the national census. The present study accessed 2011 NHS data through a public use microdata file released by Statistics Canada, reporting a 25% sample of the collected data. Our approach mirrors the CLSA’s own assessment of the national representativeness of its data, which also used 2011 NHS data as the best available representation of the true Canadian population.Footnote 5

The 2016 Census long-form questionnaire sampled 25% of the Canadian population. Mandatory participation had been reinstituted in 2015. Due to the timing of our analysis, the 2016 Census public use microdata file for Calgary had not yet been released. Therefore, we accessed 2016 Census data through public aggregated tables reporting on a 25% sample of the collected data. The data tables were accessible using “Beyond 20/20,” a platform that Statistics Canada uses to disseminate aggregate data.

Because the 2011 NHS public use microdata file and the 2016 Census aggregated table data are organized into 5-year age cohorts, we merged available data to create the 45- to 64-year and the 65-plus-year age group categories. From the 2011 NHS, we created a dataset describing the Calgary population aged 45 to 64 years (n = 8808) and 65-plus years (n = 2849). The datasets created from the 2016 Census described the Calgary population aged 45 to 64 years (n = 319 600) and 65-plus years (n = 127 880).
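The merging of 5-year age bands into the two analysis groups can be illustrated with a minimal Python sketch. It is not the procedure the authors ran (they report using SPSS and Excel); the function name, the band representation and the counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# Each band is (lower_age, upper_age); upper_age is None for an open-ended band.
AgeBand = Tuple[int, Optional[int]]


def collapse_age_bands(counts_by_band: Dict[AgeBand, int]) -> Dict[str, int]:
    """Sum 5-year age-band counts into the 45-64 and 65+ analysis groups."""
    groups = {"45-64": 0, "65+": 0}
    for (lower, upper), count in counts_by_band.items():
        if lower >= 65:
            groups["65+"] += count
        elif lower >= 45 and upper is not None and upper <= 64:
            groups["45-64"] += count
        # Anything else (e.g. bands below age 45) is left out.
    return groups


# Made-up counts for illustration only; these are not NHS or Census figures.
toy_counts = {
    (40, 44): 7,
    (45, 49): 10,
    (50, 54): 20,
    (55, 59): 30,
    (60, 64): 40,
    (65, 69): 5,
    (70, 74): 4,
    (75, None): 3,
}
merged = collapse_age_bands(toy_counts)
print(merged)
```

The same collapse applies whether the input is a microdata frequency count (2011 NHS) or an aggregated table cell (2016 Census), which is why a single helper is assumed here.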


We examined sex, marital status, immigration status, place of birth, Indigenous identity, language most often spoken at home, education, working status, total personal income and internal migration status (defined as a within-Canada household relocation that took place within the past 5 years).

Data for these variables were categorized and presented as percentages. For accurate comparisons between data sources, variable recoding was used to collapse 2011 NHS and 2016 Census response categories as needed. Across all data sources, education, working status and language most spoken at home variables were recoded for comparability. Additional recoding of marital status, country of birth, Indigenous identity and total personal income variables in the 2011 NHS and 2016 Census was done to allow for comparability with CLSA data. Since neither comparison dataset reported a retirement variable, we used the “not in the labour force” measure as a proxy for retirement status. Comparative internal migration variables were also derived from corresponding variables in the CLSA, 2011 NHS and 2016 Census datasets. Finally, we created a category for Canada as the country of birth in the 2011 NHS and 2016 Census datasets using the “non-immigrant” measure.
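The recoding step described above amounts to mapping each source category onto a smaller set of shared categories. The Python sketch below shows one way this could be expressed; the category labels and mapping tables are illustrative assumptions, not the actual codes in the NHS, Census or CLSA files, and the `recode` helper is hypothetical.

```python
from typing import Dict, List

# Illustrative labels only; the real NHS/Census files use their own codes.
MARITAL_STATUS_MAP: Dict[str, str] = {
    "Legally married": "Married/common-law",
    "Living common law": "Married/common-law",
    "Never married": "Single",
    "Widowed": "Widowed",
    "Divorced": "Divorced",
    "Separated": "Separated",
}

# Proxy described in the text: "not in the labour force" stands in for retired.
# Treating everyone in the labour force as "Not retired" is an assumption here.
WORKING_STATUS_MAP: Dict[str, str] = {
    "In the labour force": "Not retired",
    "Not in the labour force": "Retired",
}


def recode(values: List[str], mapping: Dict[str, str]) -> List[str]:
    """Collapse source categories; fail loudly on anything unmapped."""
    unmapped = sorted({v for v in values if v not in mapping})
    if unmapped:
        raise ValueError(f"No target category for: {unmapped}")
    return [mapping[v] for v in values]


print(recode(["Living common law", "Divorced"], MARITAL_STATUS_MAP))
print(recode(["Not in the labour force"], WORKING_STATUS_MAP))
```

Raising an error on unmapped values, rather than silently assigning a default, is a design choice in this sketch: it keeps a category that exists in only one data source from being folded into another by accident.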

The CLSA dataset had 5.0% or less missing data (i.e. data coded as “refused,” “required question was not answered,” “at least one required question was not answered” and “don’t know/no answer” responses combined) for the sociodemographic variables in this analysis. The exception was “personal income,” with 7.3% missing data for the 45- to 64-year age group and 12.5% missing data for the 65-plus-year age group. The 2011 NHS variables contained 5% or less missing data for all sociodemographic variables assessed in this analysis. It is important to note that while we excluded observations with missing values from our CLSA sample, the NHS values for missing data were imputed using the nearest-neighbour method described in the NHS User Guide.Footnote 15 At a national level, response rates for education, income and work items in the NHS questionnaire were lower than for the other characteristics measured, and the values for these items were more likely to be imputed.Footnote 15


Sociodemographic profiles were established for baseline CLSA (collected between 2011 and 2015), 2011 NHS and 2016 Census cohorts living within Calgary’s jurisdictional boundaries, as defined using Forward Sortation Areas (i.e. the first three characters of participants’ postal codes). We analyzed the 45- to 64-year and 65-plus-year age groups separately because people become eligible for policy-driven programs like pensions and subsidies and other age-friendly programs and activities upon reaching age 65. For CLSA data, descriptive variables were adjusted for sampling probabilities using the trimmed analytic weights provided by CLSA, which adjust for inclusion probability.Footnote 12 The 2016 Census sociodemographic frequencies were calculated using Microsoft Excel, as per the format of the released data. All other analyses were conducted using SPSS version 25.0.Footnote 16
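As a rough illustration of the two operations in this paragraph (restricting records by Forward Sortation Area and computing weighted percentages), consider the Python sketch below. It is not the authors' SPSS workflow. The record layout, function names, weights and the set of Forward Sortation Areas are assumptions made for the example, and the set shown is not the real list of Calgary areas.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (postal_code, category, analytic_weight).
Record = Tuple[str, str, float]


def forward_sortation_area(postal_code: str) -> str:
    """Return the first three characters of a Canadian postal code."""
    return postal_code.replace(" ", "").upper()[:3]


def weighted_percentages(records: Iterable[Record], fsas: Set[str]) -> Dict[str, float]:
    """Weighted percentage of each category among records inside the given FSAs."""
    totals: Dict[str, float] = defaultdict(float)
    for postal_code, category, weight in records:
        if forward_sortation_area(postal_code) in fsas:
            totals[category] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {c: round(100.0 * w / grand_total, 1) for c, w in totals.items()}


# Toy records and weights, invented for illustration.
toy_records = [
    ("T2N 4Z6", "Male", 2.0),
    ("t3a1b2", "Female", 1.5),
    ("T2N 1A1", "Female", 0.5),
    ("T4B 0A1", "Male", 5.0),  # outside the illustrative FSA set, so dropped
]
illustrative_fsas = {"T2N", "T3A"}  # placeholder set, not the full Calgary list
result = weighted_percentages(toy_records, illustrative_fsas)
print(result)
```

Each category's percentage is the sum of its weights divided by the sum of all weights in scope, so a heavily weighted record outside the boundary (the last toy record) has no influence on the result.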

Rather than looking for statistical significance (trivial differences in proportions can be statistically significant in studies with a large number of participants), we were interested in the practical importance of any differences seen, that is, whether differences would have real and noticeable effects on the interpretation of the data. We decided a priori on a 10% difference in proportions (i.e. 10 percentage points) between the CLSA sample and the true population as a threshold for practical importance to consider when interpreting findings. We also noted percentage differences of 5% to 9% as being of questionable importance when interpreting findings.
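The thresholds above can be expressed as a small classification rule. The Python sketch below rests on several assumptions that the text does not state: differences are taken as absolute differences in percentage points (CLSA minus comparison), they are rounded to one decimal place before being compared with the thresholds, values between 9 and 10 are treated as questionable, and the label "negligible" is our own name for differences under 5. The function name is hypothetical.

```python
from typing import Tuple


def classify_difference(sample_pct: float, population_pct: float) -> Tuple[float, str]:
    """Return the signed difference (sample minus population, in percentage
    points, rounded to one decimal) and a label based on its absolute size."""
    diff = round(sample_pct - population_pct, 1)
    size = abs(diff)
    if size >= 10.0:
        label = "practical"
    elif size >= 5.0:
        label = "questionable"  # assumption: also covers the 9-to-10 gap
    else:
        label = "negligible"  # label chosen for this sketch
    return diff, label


# Inputs taken from Table 1 (45-64 years): immigrant status, married/common-law, male.
print(classify_difference(16.7, 32.2))
print(classify_difference(81.7, 73.9))
print(classify_difference(50.7, 50.6))
```

Rounding before comparison matters for borderline values: a difference reported as 5.0 in the tables can fall just under 5 in floating-point arithmetic and would otherwise be labelled differently from the published flag.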


Baseline CLSA (2011–2015) and 2011 NHS comparisons

Both sex and Indigenous identity distributions in the CLSA sample were representative across both age categories (see Table 1). In contrast, immigrants were underrepresented in the CLSA sample, as indicated by differences in place of birth (and Asia in particular) and language spoken at home, suggesting practical differences compared with 2011 NHS data.

In both age groups, but especially among those aged 65-plus years, lower-income Calgary residents were underrepresented in the CLSA sample. Marital status and educational achievement were representative for Calgary CLSA participants aged 65-plus years. For the 45- to 64-year age group, discrepancies in marital status were of questionable importance, with married individuals overrepresented and divorced individuals underrepresented. Differences in education were also of questionable importance. Older CLSA participants (≥65 years) were more likely not to be retired.

Table 1. Comparison of baseline CLSA and 2011 NHS demographic characteristics by age group, Calgary data (footnote a)

| Characteristic | CLSA, 45–64 years | 2011 NHS, 45–64 years | Difference, 45–64 years | CLSA, ≥ 65 years | 2011 NHS, ≥ 65 years | Difference, ≥ 65 years |
|---|---|---|---|---|---|---|
| Sample size (n) (footnote b) | 1640 | 8808 | | 1207 | 2849 | |
| Sex (%) | | | | | | |
| Male | 50.7 | 50.6 | 0.1 | 46.5 | 44.6 | 1.9 |
| Female | 49.3 | 49.4 | −0.1 | 53.5 | 55.4 | −1.9 |
| Marital status (%) | | | | | | |
| Married/common-law | 81.7 | 73.9 | 7.8* | 68.0 | 63.1 | 4.9 |
| Single | 7.9 | 8.8 | −0.9 | 5.0 | 3.8 | 1.2 |
| Widowed | 2.2 | 2.4 | −0.2 | 17.0 | 21.9 | −4.9 |
| Divorced | 6.0 | 11.7 | −5.7* | 8.6 | 9.6 | −1.0 |
| Separated | 2.1 | 3.2 | −1.1 | 1.3 | 1.6 | −0.3 |
| Immigrant status (%) | | | | | | |
| Immigrant | 16.7 | 32.2 | −15.5** | 25.4 | 39.5 | −14.1** |
| Non-immigrant | 83.3 | 67.8 | 15.5** | 74.6 | 60.5 | 14.1** |
| Place of birth (%) (footnote c) | | | | | | |
| Canada | 83.3 | 67.8 | 15.5** | 74.6 | 60.5 | 14.1** |
| Other North America | 1.5 | 1.4 | 0.1 | 1.7 | 1.3 | 0.4 |
| South America, Central America, Caribbean | 2.0 | 2.3 | −0.3 | 1.5 | 1.6 | −0.1 |
| Europe | 8.6 | 9.3 | −0.7 | 19.6 | 18.9 | 0.7 |
| Africa | 0.6 | 2.1 | −1.5 | 0.9 | 1.7 | −0.8 |
| Asia | 3.2 | 16.7 | −13.5** | 1.4 | 15.4 | −14.0** |
| Oceania and others | 0.9 | 0.4 | 0.5 | 0.3 | 0.6 | −0.3 |
| Indigenous identity (%) | | | | | | |
| Indigenous | 3.7 | 1.9 | 1.8 | 2.3 | 0.7 | 1.6 |
| Non-Indigenous | 96.3 | 98.1 | −1.8 | 97.7 | 99.3 | −1.6 |
| Language most spoken at home (%) | | | | | | |
| English/French | 94.8 | 86.4 | 8.4* | 97.2 | 79.8 | 17.4** |
| Other | 5.2 | 13.6 | −8.4* | 2.8 | 20.2 | −17.4** |
| Postsecondary degree/diploma (%) | | | | | | |
| No | 27.6 | 33.7 | −6.1* | 51.5 | 49.3 | 2.2 |
| Yes | 72.4 | 66.3 | 6.1* | 48.5 | 50.7 | −2.2 |
| Working status (%) | | | | | | |
| Not retired | 86.7 | 81.8 | 4.9 | 26.8 | 18.4 | 8.4* |
| Retired | 13.3 | 16.3 | −3.0 | 73.3 | 69.7 | 3.6 |
| Never worked | n/a | 2.0 | n/a | n/a | 11.9 | n/a |
| Total personal income (%) | | | | | | |
| < $20 000 | 14.3 | 23.2 | −8.9* | 16.1 | 30.6 | −14.5** |
| $20 000–49 999 | 21.7 | 26.8 | −5.1* | 46.9 | 44.0 | 2.9 |
| $50 000–99 999 | 35.3 | 30.3 | 5.0* | 27.4 | 19.4 | 8.0* |
| $100 000–149 999 | 15.7 | 9.4 | 6.3* | 5.5 | 2.4 | 3.1 |
| ≥ $150 000 | 13.0 | 10.3 | 2.7 | 4.1 | 3.7 | 0.4 |
| Internal migration status (%) (footnote d) | | | | | | |
| Non-movers | 97.0 | 92.6 | 4.4 | 97.9 | 95.6 | 2.3 |
| Moved home within community | 1.6 | 5.4 | −3.8 | 1.6 | 3.0 | −1.4 |
| Moved home and community | 1.4 | 1.3 | 0.1 | 0.6 | 0.9 | −0.3 |

Abbreviations: CLSA, Canadian Longitudinal Study on Aging; n/a, not applicable; NHS, National Household Survey.

Footnote a: Calgary data are derived from the CLSA comprehensive sample from the Alberta data collection site, which is located in Calgary. Only participants living within Calgary’s jurisdictional region were included in this analysis.

Footnote b: Sample sizes are not weighted. All other proportions were adjusted for sampling probabilities using inflation weights provided by the CLSA, based upon the size of the community-dwelling population living near the data collection site in 2011.

Footnote c: The “Canada” category was manually calculated and added to this variable for comparison purposes.

Footnote d: Defined as a within-Canada household relocation within the past 5 years.

Footnote *: Questionably important difference (5%–9%).

Footnote **: Practically important difference (≥ 10%).

Baseline CLSA (2011–2015) and 2016 Census comparisons

Comparisons between baseline CLSA (2011–2015) and 2016 Census data were similar to the CLSA and 2011 NHS comparisons, with a few exceptions (see Table 2). In the older age category, discrepancies in educational achievement were greater, reaching a practical rather than questionable level of importance. For both age groups, differences in the proportions of participants who had moved within Canada during the past 5 years were of practical importance compared with the 2016 Census data.

Table 2. Comparison of baseline CLSA and 2016 Census demographic characteristics by age group for Calgary data (footnote a)

| Characteristic | CLSA, 45–64 years | 2016 Census, 45–64 years | Difference, 45–64 years | CLSA, ≥ 65 years | 2016 Census, ≥ 65 years | Difference, ≥ 65 years |
|---|---|---|---|---|---|---|
| Sample size (n) (footnote b) | 1640 | 319 600 | | 1207 | 127 880 | |
| Sex (%) | | | | | | |
| Male | 50.7 | 50.0 | 0.7 | 46.5 | 46.8 | −0.3 |
| Female | 49.3 | 50.0 | −0.7 | 53.5 | 53.2 | 0.3 |
| Marital status (%) | | | | | | |
| Married/common-law | 81.7 | 73.1 | 8.6* | 68.0 | 64.1 | 3.9 |
| Single | 7.9 | 10.6 | −2.7 | 5.0 | 4.1 | 0.9 |
| Widowed | 2.2 | 2.2 | 0.0 | 17.0 | 18.8 | −1.8 |
| Divorced | 6.0 | 10.8 | −4.8 | 8.6 | 11.2 | −2.6 |
| Separated | 2.1 | 3.3 | −1.2 | 1.3 | 1.9 | −0.6 |
| Immigrant status (%) (footnote c) | | | | | | |
| Immigrant | 16.7 | 38.1 | −21.4** | 25.4 | 43.6 | −18.2** |
| Non-immigrant | 83.3 | 61.0 | 22.3** | 74.6 | 56.0 | 18.6** |
| Place of birth (%) (footnote d) | | | | | | |
| Canada | 83.3 | 61.6 | 21.7** | 74.6 | 56.2 | 18.4** |
| Other North America | 1.5 | 1.2 | 0.3 | 1.7 | 1.4 | 0.3 |
| South America, Central America, Caribbean | 2.0 | 3.2 | −1.2 | 1.5 | 2.5 | −1.0 |
| Europe | 8.6 | 7.9 | 0.7 | 19.6 | 17.3 | 2.3 |
| Africa | 0.6 | 3.3 | −2.7 | 0.9 | 2.4 | −1.5 |
| Asia | 3.2 | 22.4 | −19.2** | 1.4 | 19.8 | −18.4** |
| Oceania and others | 0.9 | 0.4 | 0.5 | 0.3 | 0.5 | −0.2 |
| Indigenous identity (%) | | | | | | |
| Indigenous | 3.7 | 2.3 | 1.4 | 2.3 | 1.3 | 1.0 |
| Non-Indigenous | 96.3 | 97.7 | −1.4 | 97.7 | 98.7 | −1.0 |
| Language most spoken at home (%) | | | | | | |
| English/French | 94.8 | 82.5 | 12.3** | 97.2 | 79.5 | 17.7** |
| Other language | 5.2 | 17.5 | −12.3** | 2.8 | 20.5 | −17.7** |
| Postsecondary degree/diploma (%) | | | | | | |
| No | 27.6 | 33.6 | −6.0* | 23.0 | 47.8 | −24.8** |
| Yes | 72.4 | 66.4 | 6.0* | 77.0 | 52.2 | 24.8** |
| Working status (%) | | | | | | |
| Not retired | 86.7 | 80.6 | 6.1* | 26.8 | 20.2 | 6.6* |
| Retired | 13.3 | 19.4 | −6.1* | 73.3 | 79.8 | −6.5* |
| Total personal income (%) | | | | | | |
| < $20 000 | 14.3 | 18.8 | −4.5 | 16.1 | 26.3 | −10.2** |
| $20 000–49 999 | 21.7 | 26.2 | −4.5 | 46.9 | 43.8 | 3.1 |
| $50 000–99 999 | 35.3 | 31.0 | 4.3 | 27.4 | 21.6 | 5.8* |
| $100 000–149 999 | 15.7 | 11.9 | 3.8 | 5.5 | 4.1 | 1.4 |
| ≥ $150 000 | 13.0 | 12.1 | 0.9 | 4.1 | 4.2 | −0.1 |
| Internal migration status (%) (footnote e) | | | | | | |
| Non-movers | 97.0 | 70.0 | 27.0** | 97.9 | 80.3 | 17.6** |
| Moved home within community | 1.6 | 20.9 | −19.3** | 1.6 | 13.1 | −11.5** |
| Moved home and community | 1.4 | 9.0 | −7.6* | 0.6 | 6.6 | −6.0* |

Abbreviations: CLSA, Canadian Longitudinal Study on Aging; NHS, National Household Survey.

Footnote a: Calgary data are derived from the CLSA comprehensive sample from the Alberta data collection site, which is located in Calgary. Only participants living within Calgary’s jurisdictional region were included in this analysis.

Footnote b: Sample sizes are not weighted. All other proportions were adjusted for sampling probabilities using inflation weights provided by the CLSA, based upon the size of the community-dwelling population living near the data collection site in 2011.

Footnote c: Percentages for 2016 data do not add up to 100% as we do not include a “non-permanent resident” category (not included in the CLSA).

Footnote d: The “Canada” category was manually calculated and added to this variable for comparison purposes.

Footnote e: Defined as a within-Canada household relocation within the past 5 years.

Footnote *: Questionably important difference (5%–9%).

Footnote **: Practically important difference (≥ 10%).


CLSA data can be used to provide valuable insights into the social and physical characteristics of middle-aged and older adults in Canada. In addition to being a key data source for researchers, the CLSA is of potential use for all levels of governments that are trying to understand their aging population and evaluate the impact of age-friendly policies such as “aging-in-place” (i.e. safely remaining in one’s community throughout older age).Footnote 17Footnote 18

The sampling frame the CLSA employs ensures national and provincial representativeness of the sample by age and sex. At the national level, the CLSA sample, and particularly the comprehensive cohort, is characterized by higher levels of education and household income as well as higher percentages of Canadian-born participants, compared with 2011 Census data.Footnote 5 The objective of our study was to explore how well baseline data collected from CLSA participants (between 2011 and 2015) reflected the “true” population at a local municipal level. To do this, we compared these data with both the 2011 NHS and 2016 Census data. Establishing a baseline description of the population would also be helpful when using CLSA and other data sources to evaluate policy implementation over time, given the longitudinal design of the CLSA.

In general, we found the largest discrepancies between CLSA and 2016 Census estimates. While Calgary CLSA participants were representative for sex and Indigenous identity, marital status, ethnic diversity (i.e. immigrant status, place of birth and language spoken at home), education, working status, personal income and internal migration diverged from the true population. For some of these measures, the differences between CLSA and NHS/Census estimates were greater than 10%, which we viewed as of practical importance when using CLSA data to assess the local population. These findings suggest that CLSA data for municipally defined populations may underrepresent certain marginalized populations, which could affect interpretation. The importance of accounting for these differences will depend on the research questions being asked and how the results will be used.

We used Calgary as a case study for evaluating the representativeness of CLSA data at a municipal level. While we cannot comment on the representativeness of CLSA data for other municipalities, our findings may inform others who are considering using CLSA data to describe the health and well-being of their aging population to assess the impact of public policy.Footnote 19Footnote 20 Calgary differs from other Canadian cities in terms of its economic and sociodemographic profile.Footnote 21 Others are advised to conduct similar comparisons for their own local setting. This would be particularly applicable for other cities with CLSA data collection sites (i.e. Surrey, BC; Vancouver, BC; Victoria, BC; Winnipeg, MB; Hamilton, ON; Ottawa, ON; Montréal, QC; Sherbrooke, QC; Halifax, NS; and St. John’s, NL). Although the CLSA was not designed to be representative at a municipal level, there is growing interest in using the data to address local questions.Footnote 21

The CLSA’s inclusion and exclusion criteria may partially account for some of the differences observed. The requirement that CLSA participants be fluent in English or French may have systematically excluded some older people who had been born outside of Canada. It is unclear, however, why people who reported being born in Asia were particularly underrepresented in our setting. And while efforts were made to oversample CLSA participants with lower levels of education,Footnote 5Footnote 6 we found that participants with lower educational attainment were nonetheless underrepresented.

Because of provincial privacy legislation, sampling from provincial health registries was not permitted in Alberta or Quebec.Footnote 12 Calgary participants were exclusively recruited using random-digit dialling of landlines.Footnote 12 This use of landlines may help explain the overrepresentation of those who had not moved in the past 5 years. The proportion of Canadian households with a landline has diminished over the last few years.Footnote 22 In addition, it is likely that someone who has recently moved would be more inclined to only use mobile telephone services in their new community,Footnote 22 and at the time of baseline data collection, Calgary was experiencing a high level of net internal migration.Footnote 23

The random-digit dialling approach may also help account for the underrepresentation of lower-income households; these households are more likely to report using mobile telephone services only and no landline.Footnote 22 In Australia, where telecommunications trends are similar to those in Canada, Barr et al. found that relying on landlines only when sampling for a cohort study reduced the accuracy of estimates for certain health indicators compared with using a combination of landline and mobile phones.Footnote 24

Our decision to remove CLSA participants who lived outside of the City of Calgary jurisdiction from our analysis may have also influenced the extent to which the representativeness of the sampling frame was maintained.


Strengths of this study include the relative novelty of assessing the representativeness of a municipally defined subsample of CLSA participants on characteristics other than age and sex. A similar methodological approach has been applied to comparable data sources in France,Footnote 7 AustraliaFootnote 9 and the USA,Footnote 10 and more recently in Canada.Footnote 8Footnote 11

The CLSA research team has explored and reported on the extent to which the CLSA cohort is representative, at a national level, on select sociodemographic characteristics.Footnote 5 This current research investigates this question at the municipal level, addressing concerns that representativeness at the municipal level of analysis might differ from that found at the national level. Our endeavour is relevant, as interest in using CLSA data for municipally driven age-friendly initiatives is growing, and those using CLSA data to evaluate, for example, the impact of age-friendly policy, should be aware that some local subgroups may not be well represented. Identifying discrepancies between the characteristics of CLSA participants and the “true” population, coupled with thoughtful judgements about the importance of these discrepancies in understanding the state of the older local population, will help guide policy makers in applying their findings.

Our current study also contributes to the growing interest in population health intervention research methodologies focussing on activities that lie outside of traditional public health or health care jurisdictions that influence the health and well-being of the population.Footnote 25 As the CLSA begins to release longitudinal data, understanding the generalizability of the data will equip policy makers and academic researchers to make more informed interpretations of the potential connection between policy implementation and population health.


Our study offers a model that others can adapt to their own municipal contexts. Nevertheless, there are also a number of limitations to consider. For instance, we made judgements based on what we considered to be practically or questionably important differences between CLSA and 2011 NHS or 2016 Census data. There is no consistent precedent for these judgements in the literature.

Another limitation is the exclusion of missing CLSA data from this analysis. For most variables, the proportion of missing data was nominal, but for one (i.e. “personal income”) it was above the desired threshold of 5% that we had established. Although high rates of missing income data are not unusual in population-based studies, missing data of this kind lead to a less precise categorization of the variable and possibly a degree of bias if the missing values are clustered within particular social subgroups.Footnote 26 However, the consistency of the trends in several variables (i.e. retirement and education in relation to income; immigration status and place of birth in relation to ethnic diversity) affirms our cautious conclusions regarding the representativeness of the CLSA sample. As already noted, the CLSA was not designed to provide representative data at a municipal level.
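For readers adapting this approach, the check against the 5% missing-data threshold can be sketched as follows. This is a minimal illustration only, not the code used in our analysis: the function names, variable names and example responses are hypothetical, and missing responses are assumed to be coded as `None`.

```python
from typing import Dict, Optional, Sequence

# The 5% threshold described in the text above.
MISSING_THRESHOLD = 0.05


def missing_share(values: Sequence[Optional[str]]) -> float:
    """Return the fraction of responses that are missing (coded as None)."""
    if not values:
        raise ValueError("no responses supplied")
    return sum(v is None for v in values) / len(values)


def flag_high_missingness(
    columns: Dict[str, Sequence[Optional[str]]],
    threshold: float = MISSING_THRESHOLD,
) -> Dict[str, float]:
    """Map each variable whose missing share exceeds the threshold to that share."""
    shares = {name: missing_share(vals) for name, vals in columns.items()}
    return {name: share for name, share in shares.items() if share > threshold}


# Made-up illustrative responses (hypothetical; not CLSA data).
example = {
    "marital_status": ["married"] * 99 + [None],
    "personal_income": ["under_20k"] * 90 + [None] * 10,
}
flagged = flag_high_missingness(example)
```

In this made-up example only the income variable is flagged. A flagged variable is a prompt for further thought about whether missing values cluster in particular subgroups, not an automatic reason to drop it.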

It is also important to note limitations with the census data sources that we used as proxies for the “true” population. We used Canadian population data from both 2011 and 2016 as comparison data because these straddled the period when CLSA baseline data were collected. Changes in federal government policy related to privacy made completion of portions of Canada’s 2011 Census voluntary rather than mandatory. This decision was reversed in time for the 2016 Census.Footnote 27 Although the national weighted response rate was 77.2% for the NHS,Footnote 15 the voluntary nature of participation may have introduced systematic biases into this dataset. Future waves of CLSA data should be compared with 2016 and subsequent census datasets, where participation is mandatory and the true population is better reflected.

The public use microdata file used to access the 2011 NHS data and the “Beyond 20/20” aggregated tables used to access the 2016 Census data report on just 25% of the data collected. Most municipalities will have access to these publicly available forms of census data in conducting their evaluation activities.

Finally, recoding variables contained within these data sources in order to make meaningful comparisons with CLSA data posed methodological challenges. These included merging age groups, as was necessary for both the 2011 NHS and 2016 Census data. We also note that our decision to use “Not in the labour force” to indicate retirement may have led to an inflated number of people considered retired in both datasets. However, these data are counterbalanced by comparisons of those reporting being “not retired” in the CLSA and those reporting being in the labour force in both the 2011 NHS and 2016 Census questionnaires.

Our analysis provides a case study that focuses on a single municipality. Exploring the representativeness in other municipalities with a CLSA data collection site was beyond the scope of our research objectives, data application agreement for the use of CLSA data and institutional ethics approval. We intended that our work support the City of Calgary in their exploration of data sources that could be used in the evaluation of their age-friendly strategy.Footnote 3


Conclusion

The CLSA is a valuable national resource for extending our understanding of determinants of health and well-being later in life, yet voluntary studies face challenges in recruiting representative cohorts. Our study examined the extent to which the baseline CLSA sample is representative of the actual Canadian population at the municipal level, specifically for Calgary, Alberta. Notable differences in sociodemographic characteristics were observed between the CLSA subsample of Calgary participants surveyed between 2011 and 2015 and true comparable populations as described by 2011 NHS and 2016 Census data. Ethnic diversity was underrepresented within Calgary’s CLSA subsample, as were older participants reporting lower education and personal income.

Researchers and policy makers who use this important dataset to explore locally defined populations should also be aware of the potential sampling limitations. We recommend that municipalities that utilize CLSA data at a local level conduct a similar analysis to the one we performed, comparing geographically defined CLSA data with local or national census data. The use of multiple data sources is also recommended in order to triangulate or enrich interpretation of the findings obtained, especially those with implications for traditionally underserved populations such as lower socioeconomic status and ethnically diverse older adults.
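As a starting point for such an analysis, the comparison of category percentages can be sketched as below. This is an illustrative outline under stated assumptions rather than our actual procedure: the function names and counts are hypothetical, the 5-percentage-point cut-off mirrors the "over 5%" discrepancies we report, and the sketch uses simple unweighted counts.

```python
from typing import Dict

# Assumed cut-off, in percentage points, for a discrepancy worth flagging.
PRACTICAL_DIFFERENCE = 5.0


def percentage_distribution(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert category counts to percentages of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {category: 100.0 * n / total for category, n in counts.items()}


def flag_discrepancies(
    sample_counts: Dict[str, int],
    reference_counts: Dict[str, int],
    threshold: float = PRACTICAL_DIFFERENCE,
) -> Dict[str, float]:
    """Return sample-minus-reference gaps (percentage points) above the threshold."""
    sample = percentage_distribution(sample_counts)
    reference = percentage_distribution(reference_counts)
    flagged = {}
    # Only categories present in both sources can be compared.
    for category in sample.keys() & reference.keys():
        gap = sample[category] - reference[category]
        if abs(gap) > threshold:
            flagged[category] = gap
    return flagged


# Made-up counts for illustration (hypothetical; not CLSA or census figures).
sample_counts = {"immigrant": 200, "non_immigrant": 800}
census_counts = {"immigrant": 3000, "non_immigrant": 7000}
gaps = flag_discrepancies(sample_counts, census_counts)
```

A negative gap indicates a category that is underrepresented in the study sample relative to the reference population. Flagged gaps still call for judgement about their practical importance in the local context.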

Ethics approval

REB approval obtained (University of Calgary REB17-0692).


Acknowledgements

This analysis was made possible using the data collected by the Canadian Longitudinal Study on Aging (CLSA). The Government of Canada funded the CLSA through the Canadian Institutes of Health Research (grant reference LSA 9447) and the Canada Foundation for Innovation. Samantha Norberg was funded through an Alberta Health Services “Seniors Health Strategic Clinical Network Summer Studentship” award and by the Brenda Strafford Centre on Aging at the University of Calgary.

The authors also wish to thank Peter Peller, Tyler Williamson, Sohel Nazmul and Lauren Griffith for their valuable assistance, as well as two anonymous reviewers and the associate editor of this journal for their contributions.

Conflicts of interest

Dr. David B. Hogan is the Local Responsible Investigator for the Calgary data collection site of the Canadian Longitudinal Study on Aging.

Authors’ contributions and statement

AMT and DBH developed the research objectives and research design. SJN and SJ led the analysis of the data. SJN and AMT led the drafting and revising of the manuscript. All authors contributed to interpreting the data and revising the manuscript. All authors approved the manuscript for publication.

The content and views expressed in this article are those of the authors and do not necessarily reflect those of the Government of Canada.
