ARCHIVED – Social Capital and Wages - Outcome of Recent Immigrants to Canada

4. Theoretical and Empirical Framework

4.1. Theoretical framework

The theoretical framework of the paper is inspired by a model proposed by Calvó-Armengol and Jackson (2007). The labour market includes social networks through which economic agents hear about jobs. The authors model the transmission of job information among individuals with a function that keeps track of job turnover and show that an improvement in the wage or employment status is positively associated with social networks across time and agents.

The model extends the earlier Calvó-Armengol and Jackson (2003) model by adding features that allow for heterogeneity in jobs (and hence in wages) and in agents’ skills, multiple offers, higher wages due to outside offers, job switching and so on. A brief description of the model follows.

N agents live and work in discrete periods indexed by t. The variable wit keeps track of the wage of agent i at time t; at the end of period t, wit = 0 if agent i is unemployed. An agent’s employment status sit can be deduced from his or her wage: sit = 1 when agent i is employed and sit = 0 when he or she is unemployed. The vectors wt and st thus represent realizations of the wage levels and employment status at time t.

A period begins with some agents employed and others not. In each period, a specific agent i learns about a job opening offering a wage wi with a probability that is between 0 and 1. The arrival of job information is assumed to be independent across agents. If the agent is unemployed, he or she will take the position. If the agent is employed, he or she will either keep the information or pass it on, depending on whether the job constitutes an improvement over the current one. Information that is passed on goes to a randomly chosen relative, friend, or acquaintance who is currently either unemployed or employed at a wage lower than that of the new job, depending on the current status of the connections. Generally, the higher the current wage of the agent, the higher the probability that the new job will not be an improvement and that the agent will pass on the information. Information flows only between agents who know each other. Meanwhile, some agents lose their jobs in a given period, with an exogenous break-up probability b. The probability of the joint event that agent i hears about a job and that this job ends up in agent j’s hands is pij(w), where w is the wage status of all the agents at the beginning of the period:

Mathematical Equation

where nij =1 when individuals i and j know each other and equals 0 when they do not know each other.
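The job-information process described above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions, not the Calvó-Armengol and Jackson specification: the function name `simulate_period`, the single hearing probability `p_hear` shared by all agents, and the rule that an employed agent accepts any strictly higher offer are choices made for illustration only.

```python
import random

def simulate_period(wages, network, offer_wages, p_hear, b, rng):
    """Simulate one period of a simplified job-information process.

    wages       : wages at the start of the period (0 means unemployed)
    network     : dict, agent -> set of acquaintances (n_ij = 1)
    offer_wages : wage of the job each agent may hear about
    p_hear      : probability of hearing about a job (assumed common to all)
    b           : exogenous break-up probability
    """
    n = len(wages)
    new_wages = list(wages)
    for i in range(n):
        if rng.random() >= p_hear:
            continue  # agent i hears nothing this period
        offer = offer_wages[i]
        if offer > wages[i]:
            # Unemployed agents, and employed agents for whom the job is
            # an improvement, take the job themselves.
            new_wages[i] = max(new_wages[i], offer)
        else:
            # Otherwise pass the information to a random contact who is
            # unemployed or earns less than the offered wage.
            candidates = sorted(j for j in network[i] if wages[j] < offer)
            if candidates:
                j = rng.choice(candidates)
                new_wages[j] = max(new_wages[j], offer)
    # Exogenous job loss at the end of the period.
    for i in range(n):
        if new_wages[i] > 0 and rng.random() < b:
            new_wages[i] = 0
    return new_wages

# Agent 0 is employed at wage 5 and hears about a job paying 3.
connected = {0: {1}, 1: {0}}
isolated = {0: set(), 1: set()}
with_ties = simulate_period([5, 0], connected, [3, 0], 1.0, 0.0, random.Random(0))
without_ties = simulate_period([5, 0], isolated, [3, 0], 1.0, 0.0, random.Random(0))
```

In this toy case the unemployed agent ends the period employed only when the two agents know each other, which is the mechanism through which networks affect wages in the model.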

In this model, the wage that agent i obtains is a function of past wage status and the person’s network. The model provides a tool for analyzing the effects of social networks on employment and wage dynamics. Calvó-Armengol and Jackson used this model to show that the wages of any connected agents are positively correlated across the network under the steady-state distribution and, furthermore, that the wages of connected individuals are positively correlated across time periods.[Note 7] In the short run there is a negative correlation between employment status, wages and network size, which results from competition for information about certain jobs. However, the long-run benefits of the improved wage status of networks outweigh the short-run competition effects. Wages increase with network diversity and quality. Social groups with identical job information networks but different starting wages have different wage outcomes.

The current paper is an empirical test, in the immigration context, of the results implied by this network model, especially the claim that network size and content matter to labour market outcomes.

4.2. Estimation framework and model specification

Panel or longitudinal data provide more information than cross-sectional data, which increases estimation precision and also enables researchers to control for unobserved heterogeneity related to the omitted variable bias in cross-section models. Thus the empirical work in this paper uses panel data models to identify social capital effects on immigrant wages, taking advantage of the longitudinal nature of the data.

The basic panel data model takes the form

$$ y_{it} = x_{it}\beta + u_{it}, \qquad u_{it} = \alpha_i + \varepsilon_{it} \tag{1} $$

where yit is the log of wages and xit is a vector of socio-economic characteristics of the LSIC immigrants, including social networks. Some of the variables in xit vary with time, such as age and marital status, while others do not, such as country of origin and immigration category. uit is the residual term, which includes the individual specific effect αi and the regular residual εit. αi differs between individuals but is constant for any particular individual. εit is the usual residual that is strictly exogenous with the usual properties, i.e. mean 0, uncorrelated with itself, uncorrelated with x, uncorrelated with αi, and homoskedastic:

$$ E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{iT}, \alpha_i) = 0 \tag{2} $$

The various panel data models depend on the assumptions made about the individual specific effects αi.

4.2.1. Random effects model

One variant of the model (1) assumes that the unobserved individual effects are random variables that are distributed independently of the explanatory variables (assumption (3) below). This is called the random effects (RE) model.

$$ E(\alpha_i \mid x_{it}) = 0 \tag{3} $$

The RE estimator is a generalised least squares (GLS) estimator, which uses both within-group (deviation from individual mean) and between-group (individual mean) variations, but weights them according to the relative sizes of the variance of the individual specific effect, σ²α, and the variance of the regular residual, σ²ε. It is equivalent to the following two steps: 1) transform the data using the weighting factor θ:

$$ y^*_{it} = y_{it} - \theta\,\bar{y}_i, \qquad x^*_{it} = x_{it} - \theta\,\bar{x}_i, \qquad \theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\alpha^2}} $$

where ȳi and x̄i are individual means and T is the number of periods observed per individual; 2) regress the transformed dependent variable y*it on the transformed regressors x*it. The variance parameters σ²ε and σ²α can be estimated from the within-group and between-group regression residuals.
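The two-step description of the RE estimator can be sketched in a few lines of NumPy. The sketch assumes a balanced panel and takes the two variance components as given rather than estimating them; the function name `re_gls` and the toy data are illustrative and do not come from the LSIC.

```python
import numpy as np

def re_gls(y, X, ids, sigma2_eps, sigma2_alpha):
    """Random-effects GLS by quasi-demeaning (balanced panel assumed)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups, inverse, counts = np.unique(ids, return_inverse=True,
                                        return_counts=True)
    T = counts[0]
    if not np.all(counts == T):
        raise ValueError("this sketch handles balanced panels only")

    # Weighting factor: 0 gives pooled OLS, 1 gives the within estimator.
    theta = 1.0 - np.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_alpha))

    def individual_mean(a):
        sums = np.zeros((groups.size,) + a.shape[1:])
        np.add.at(sums, inverse, a)      # sum rows within each individual
        return (sums / T)[inverse]       # broadcast back to observations

    # Step 1: transform the data.
    y_star = y - theta * individual_mean(y)
    X_star = X - theta * individual_mean(X)
    # Step 2: OLS on the transformed data.
    beta = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
    return theta, beta

# Toy balanced panel: 2 individuals, 3 periods, y = 1 + 2x exactly.
ids = np.array([0, 0, 0, 1, 1, 1])
x = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
theta, beta = re_gls(y, X, ids, sigma2_eps=1.0, sigma2_alpha=1.0)
```

With equal variance components and T = 3, the weighting factor is 0.5, so the estimator sits halfway between pooled OLS (θ = 0) and the within-group estimator (θ = 1).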

However, assumption (3) is unlikely to hold in many cases. In the present study, the unobserved individual invariant effects αi could include personal characteristics such as ability, motivation and preferences which are very likely related to some explanatory variables for wages, like educational attainment, social network type and content and so on.  In this case E (αi | xit) ≠ 0 and the random effects estimator is biased and inconsistent.

4.2.2. Fixed effects model

The fixed effects (FE) model treats the unobserved individual effects as random variables that are potentially correlated with the explanatory variables. Unlike the random effects estimator, the FE estimator assumes nothing regarding the correlation structure between αi and the explanatory variables. Because the statistical properties of αi are unknown, it is eliminated from the model. Among the various ways to eliminate αi, the within-group transformation, or deviation from the individual mean, is easy to understand: the FE estimator is a regression of yit − ȳi (the deviation of yit from its individual mean) on xit − x̄i (the deviation of xit from its individual mean). Given assumption (2), β can be consistently estimated using the FE estimator.
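As an illustration of the within-group transformation, the following NumPy sketch demeans the data by individual and runs OLS on the deviations. The helper names and the toy data are assumptions made for the example; individual identifiers are assumed to be integers 0, 1, 2, and so on.

```python
import numpy as np

def within_transform(a, ids):
    """Subtract each individual's mean from that individual's rows."""
    a = np.asarray(a, dtype=float)
    counts = np.bincount(ids)
    sums = np.zeros((counts.size,) + a.shape[1:])
    np.add.at(sums, ids, a)
    means = sums / counts.reshape((-1,) + (1,) * (a.ndim - 1))
    return a - means[ids]

def fe_estimator(y, X, ids):
    """Fixed-effects (within) estimator: OLS on deviations from means."""
    return np.linalg.lstsq(within_transform(X, ids),
                           within_transform(y, ids), rcond=None)[0]

# Toy panel in which the individual effect is correlated with x.
ids = np.array([0, 0, 0, 1, 1, 1])
x = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
alpha = np.array([10.0, 10.0, 10.0, -5.0, -5.0, -5.0])
y = alpha + 2.0 * x

beta_fe = fe_estimator(y, x[:, None], ids)

# A time-invariant variable is wiped out by the same transformation.
w = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
w_within = within_transform(w, ids)
```

The slope of 2 is recovered even though the individual effect is correlated with x, while the time-invariant variable `w` becomes a column of zeros after the transformation, so its coefficient cannot be estimated this way.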

Regarding the choice between random effects model and fixed effects model, Hausman (1978) suggests a specification test comparing the RE estimator and the FE estimator, both of which are consistent under the null hypothesis H0: E (αi | xit) = 0. A rejection would be interpreted as an adoption of the fixed effects model and non-rejection as an acceptance of the random effects model. This test was done for the current study and the results are reported in the estimation results table.
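The Hausman statistic is a quadratic form in the difference between the two coefficient vectors. The sketch below computes it with NumPy; the function name and the input numbers are purely illustrative and are not results from this study.

```python
import numpy as np

def hausman_statistic(b_fe, V_fe, b_re, V_re):
    """Hausman (1978) statistic comparing FE and RE estimates.

    b_fe, b_re : coefficient vectors on the time-varying regressors
    V_fe, V_re : their estimated covariance matrices
    Returns the statistic and its degrees of freedom.
    """
    d = np.atleast_1d(np.asarray(b_fe, dtype=float)
                      - np.asarray(b_re, dtype=float))
    V = np.atleast_2d(np.asarray(V_fe, dtype=float)
                      - np.asarray(V_re, dtype=float))
    # A pseudo-inverse guards against a singular difference matrix.
    H = float(d @ np.linalg.pinv(V) @ d)
    return H, d.size

# Illustrative one-coefficient example (hypothetical numbers).
H, df = hausman_statistic([0.5], [[0.04]], [0.3], [[0.03]])
```

Under the null hypothesis the statistic is asymptotically chi-squared with `df` degrees of freedom. With one degree of freedom the 5% critical value is about 3.84, so the hypothetical value of 4.0 here would lead to rejection of the random effects model.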

However, a major limitation of the fixed effects estimator is that the coefficients of time-invariant explanatory variables are not identified. Thus it is not suited to estimate the effects of time constant variables, such as ethnic group, education before landing and immigration class on earnings in the current study.

4.2.3. Hausman-Taylor model

As social capital is very likely to be correlated with the individual-specific effect αi, which may consist of ability and motivation, an obvious choice would be the use of a fixed effects model. However, if the effect of a time-invariant variable is of main interest, a fixed effects model cannot estimate it. Hausman and Taylor (1981) considered the following model

$$ y_{it} = x_{1it}\beta_1 + x_{2it}\beta_2 + w_{1i}\gamma_1 + w_{2i}\gamma_2 + \alpha_i + \varepsilon_{it} \tag{4} $$

where x1it and x2it are time varying variables (e.g. age, marital status, and number of friends)  while w1i and w2i are time-invariant variables (e.g. immigration category, country of origin, pre-migration experience and social networks upon landing). x1it and w1i are assumed to be uncorrelated with individual effect αi (e.g. age and country of origin), whereas x2it and w2i are assumed to be correlated with αi (e.g. education and social networks upon landing), i.e. endogenous,

$$ E(\alpha_i \mid x_{1it}, w_{1i}) = 0, \qquad E(\alpha_i \mid x_{2it}, w_{2i}) \neq 0 \tag{5} $$

Hausman and Taylor suggested using the time-varying exogenous variables x1it both to estimate β1 and as instruments for w2i, which permits estimation of γ2, the coefficients of the time-invariant endogenous variables. So, compared to the random effects model, which assumes that all the explanatory variables are exogenous with respect to the unobserved heterogeneity, and to the fixed effects model, which allows all the independent variables to be correlated with the individual heterogeneity, the Hausman-Taylor model allows only some of the independent variables to be correlated with the individual effects. In the current earnings equation, the individual specific term αi may denote ability, personality, motivation and attitudes towards networking and work, and it may be correlated with the social capital variables as well as with educational attainment, skill level, job tenure and working hours. Such variables are therefore assumed to be endogenous with respect to the individual specific effects (i.e. x2it or w2i). All the other variables are assumed to be exogenous (i.e. x1it or w1i).

Under assumptions (2) and (5), the Hausman-Taylor estimator provides consistent and efficient estimates of β, while the fixed effects estimator estimates β consistently, but not efficiently, under the weaker assumption (2) alone. Thus a Hausman test based on the difference between the Hausman-Taylor estimator and the fixed effects estimator is used to test assumption (5). The test results, presented in the estimation results table, indicate that instrumentation of the social capital variables, education, skill level, job tenure and working hours is sufficient to remove any correlation between the individual specific effects (ability, motivation and so on) and the remaining explanatory variables.
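The instrumenting idea behind the Hausman-Taylor model can be illustrated on simulated data. The sketch below is a simplified version that stops at the consistent (but not efficient) intermediate step: it estimates the coefficients of the time-varying variables by the within estimator, then instruments the endogenous time-invariant variable with the individual mean of the exogenous time-varying variable. It omits the final GLS step of the full estimator, and all variable names and the data-generating process are assumptions made for the example.

```python
import numpy as np

def group_means(a, ids, n_groups):
    """Mean of each individual's rows; ids must run from 0 to n_groups-1."""
    a = np.asarray(a, dtype=float)
    sums = np.zeros((n_groups,) + a.shape[1:])
    np.add.at(sums, ids, a)
    counts = np.bincount(ids, minlength=n_groups)
    return sums / counts.reshape((-1,) + (1,) * (a.ndim - 1))

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_sls(Z, Q, d):
    """2SLS of d on regressors Z using the instrument matrix Q."""
    Z_hat = Q @ ols(Q, Z)          # first-stage fitted values
    return ols(Z_hat, d)

rng = np.random.default_rng(0)
N, T = 2000, 3
ids = np.repeat(np.arange(N), T)

alpha = rng.normal(size=N)                    # unobserved individual effect
m = rng.normal(size=N)                        # persistent part of x1
w1 = rng.normal(size=N)                       # time-invariant, exogenous
w2 = m + alpha + rng.normal(size=N)           # time-invariant, endogenous
x1 = m[ids] + rng.normal(size=N * T)          # time-varying, exogenous
x2 = alpha[ids] + rng.normal(size=N * T)      # time-varying, endogenous
y = (1.0 * x1 + 0.5 * x2 + 0.8 * w1[ids] + 1.5 * w2[ids]
     + alpha[ids] + rng.normal(size=N * T))

# Step 1: within estimator for the time-varying coefficients.
X = np.column_stack([x1, x2])
beta_fe = ols(X - group_means(X, ids, N)[ids],
              y - group_means(y, ids, N)[ids])

# Step 2: regress individual-mean residuals on the time-invariant
# variables, instrumenting w2 with the individual mean of x1.
d = group_means(y, ids, N) - group_means(X, ids, N) @ beta_fe
Z = np.column_stack([np.ones(N), w1, w2])
Q = np.column_stack([np.ones(N), w1, group_means(x1, ids, N)])
gamma_iv = two_sls(Z, Q, d)      # instrumented estimate
gamma_ols = ols(Z, d)            # ignores the endogeneity of w2
```

In this simulated setting the instrumented coefficient on `w2` should lie close to its true value of 1.5, whereas the uninstrumented regression overstates it because `w2` is correlated with the individual effect.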

4.2.4. Instrumental variables estimator for panel data models

There are two forms of endogeneity in this context. One is unobserved common factors,[Note 8] which are addressed by the Hausman-Taylor estimator. The other is the so-called two-way causation: social capital is rewarded with higher pay, and workers tend to develop social networks in high-paid jobs (i.e. the covariance of the social capital indicators and the residual is not zero). This potential endogeneity of the social capital variables with respect to the disturbance term εit would require instrumental variables (IV) methods such as two-stage least squares (2SLS) to obtain consistent parameter estimates.

To check the sensitivity of the results to the identifying assumptions about the endogeneity of social capital, the panel data regression is expanded to include exogenous variables from outside the LSIC dataset. Inspired by Warman (2005), the ethnic concentration ratio in the CMA/CA where an LR lived is constructed from the 2001 Census and used as an instrument. The interaction terms of the ethnic concentration ratio in CMA/CAs with LRs’ ethnic groups are then used as additional instrumental variables. Job search channels, network diversity and organizational participation are instrumented by these instrumental variables.

Fixed effects IV (FE2SLS), random effects IV (RE2SLS) and Baltagi’s (1981) error component two-stage least squares (EC2SLS) estimators are employed to allow for the endogeneity of social capital variables and labour market success in terms of wages. While fixed effects 2SLS cannot provide estimates for time-invariant variables, Baltagi’s EC2SLS is a matrix-weighted average of between 2SLS and fixed effects 2SLS.[Note 9] The EC2SLS estimates are therefore reported in the paper as representative of panel IV models, to be compared with ordinary panel data models. Hausman tests are conducted to compare the results from the various panel data models, including the instrumental variable ones.
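Of the panel IV estimators mentioned here, the fixed effects version is the simplest to sketch: apply the within transformation to the outcome, the regressors and the instruments, then run 2SLS on the transformed data. The example below uses simulated data with a single generic instrument `z`, not the ethnic concentration ratios used in the paper; the function names and the data-generating process are assumptions for illustration.

```python
import numpy as np

def demean_by_group(a, ids):
    """Within transformation; ids must be integers 0, 1, 2, ..."""
    a = np.asarray(a, dtype=float)
    counts = np.bincount(ids)
    sums = np.zeros((counts.size,) + a.shape[1:])
    np.add.at(sums, ids, a)
    means = sums / counts.reshape((-1,) + (1,) * (a.ndim - 1))
    return a - means[ids]

def fe_2sls(y, X, Z, ids):
    """Fixed-effects 2SLS: 2SLS on within-transformed data."""
    y_w, X_w, Z_w = (demean_by_group(v, ids) for v in (y, X, Z))
    first_stage = np.linalg.lstsq(Z_w, X_w, rcond=None)[0]
    X_hat = Z_w @ first_stage                     # fitted regressors
    return np.linalg.lstsq(X_hat, y_w, rcond=None)[0]

rng = np.random.default_rng(1)
N, T = 1000, 3
ids = np.repeat(np.arange(N), T)

alpha = rng.normal(size=N)
z = rng.normal(size=N * T)            # instrument, unrelated to the errors
v = rng.normal(size=N * T)            # shock shared by x and the error
x = z + alpha[ids] + v                # regressor, endogenous through v
eps = 0.8 * v + rng.normal(size=N * T)
y = 2.0 * x + alpha[ids] + eps        # true coefficient is 2

beta_fe2sls = fe_2sls(y, x[:, None], z[:, None], ids)
beta_within = np.linalg.lstsq(demean_by_group(x[:, None], ids),
                              demean_by_group(y, ids), rcond=None)[0]
```

Because the regressor is correlated with the time-varying disturbance, the plain within estimator is biased upward in this simulation, while the instrumented version should stay close to the true coefficient of 2.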

Despite the large change in the social capital coefficients when IV models are used, there is no significant evidence that social capital is endogenous with respect to the disturbance term.

4.2.5. Variables used and model specification

In all models of the study, the dependent variable is the log of real weekly wages of the current job(s). Weekly wages are determined by summing the weekly wages of all the current jobs at each interview. The nominal wages are converted to real values in 2005 Canadian dollars (i.e. the Wave 3 interview period is treated as the base year) using the annual CPI from 2001 to 2005.[Note 10]
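The construction of the dependent variable amounts to a short calculation, sketched below. The CPI figures in the example are placeholders chosen for easy arithmetic, not actual Statistics Canada values, and the function name is hypothetical.

```python
import math

def log_real_weekly_wage(job_wages, cpi_interview_year, cpi_base_year):
    """Log of real weekly wages summed over all current jobs.

    job_wages          : nominal weekly wage of each current job
    cpi_interview_year : CPI for the year of the interview
    cpi_base_year      : CPI for the base year (2005 in the paper)
    """
    nominal_total = sum(job_wages)
    real_total = nominal_total * cpi_base_year / cpi_interview_year
    return math.log(real_total)

# Two current jobs paying 400 and 100 per week, with placeholder CPI values.
example = log_real_weekly_wage([400.0, 100.0],
                               cpi_interview_year=100.0,
                               cpi_base_year=110.0)
```

With these placeholder values the nominal total of 500 becomes 550 in base-year dollars before logs are taken.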

Control variables cover a range of individual, household and local characteristics:

  1. Demographic variables: age and marital status, which are time varying and exogenous with respect to the individual specific effect αi.
  2. Immigration category: dichotomous variables equal to unity if Skilled Worker Principal Applicants, Skilled Worker Spouses and Dependants, Refugees and Others, with Family Class immigrants as the reference. These variables are time invariant and exogenous.
  3. Region of birth: dichotomous variables equal to unity if born in Asia and Pacific, Central and South America, Europe other than UK and Western Europe, and Africa and Middle East, with North America, UK and Western Europe as the benchmark. These variables are time invariant and assumed to be exogenous.
  4. Province of residence: dichotomous variables equal to unity if lived in Atlantic Provinces, Quebec, Prairies Provinces and British Columbia with Ontario as the reference category; five dichotomous variables equal to unity if lived in the top five CMAs – Toronto, Montreal, Vancouver, Ottawa and Calgary. Inclusion of these variables is to capture the local labour market disparity. These variables are time varying and exogenous.
  5. Ethnic group: dichotomous variables equal to unity if Chinese, South Asian, Black, Filipino, Latin, West Asian and Arab, Other Asian (Southeast Asian, Korean and Japanese), and Other Visible Minority, with White as the benchmark.[Note 11]  Similar to region of birth variables, these variables are time constant and exogenous.
  6. Education: dichotomous variables equal to unity if LR had a master’s degree, college diploma or some university education, some post-secondary education, a high school diploma or less, with a bachelor’s degree as the reference; a dichotomous variable equal to unity if in school at the time of interview. The education variables are time varying and assumed to be correlated with unobserved ability.
  7. Languages: dichotomous variables equal to unity if has knowledge of English (speaking fairly well, well, very well and with English as the native language), or knowledge of French (speaking fairly well, well, very well and with French as the native language). Both variables are time varying and assumed exogenous.
  8. Experience: length of time in Canada measured in weeks and a set of dichotomous variables equal to unity if had work experience before immigration, had visited Canada before, had worked in Canada on a work permit before, had studied in Canada on a study permit before, or had an arranged job in Canada when landing. Obviously the time spent in Canada is time varying while other variables indicating experience before or upon landing are time constant. All of these are assumed to be exogenous.
  9. Occupation and skill level: Occupation major groups are defined using the Standard Occupational Classification (SOC) 1991 while skill levels are determined using the National Occupational Classification (NOC) 2001. Management occupations are considered as skill level A. For multiple-job holders, occupation group and skill level are determined by the current main job.[Note 12]  These variables change over time. Occupational variables are exogenous while the skill levels are assumed to be endogenous with individual ability as they are highly correlated with education level.
  10. Number of current jobs and total hours worked per week are included in the control variables to account for comparability, as the weekly wages are the summation of weekly wages of all current jobs. Job tenure is measured as the number of weeks worked at the job and is included as a control variable. These variables are all time varying. Hours worked per week and job tenure are assumed to be endogenous.

Social capital indicators are built according to the LSIC data structure (Xue 2007). Social networks are categorized into three types. The first type is kinship network, which includes relationships with family members and relatives living in Canada. The second type is friendship network, which consists of ties with friends and workmates. The third type is organizational network, defined as the relationships immigrants have with groups and organizations, such as community organizations, religious groups, ethnic or immigrant associations, etc. Different dimensions of social capital are also considered. For each type of network, indicators are built to measure the social capital stock: size, geographic closeness, diversity, frequency of contact.

Unlike the empirical analysis of employment likelihood (Xue 2007), workplace networks are included in the friendship network here to further investigate the effects of the characteristics of workplace interpersonal networks on the wage outcomes of immigrants. Specifically, meeting new friends at workplaces was excluded from the number of sources of meeting new friends in Xue (2007), given the endogeneity of the two variables in the analysis of employment likelihood. In the current paper it is counted as one source of meeting friends, i.e. as part of the size of the friendship network, because the current sample only includes those who were employed. A new variable, ethnic diversity of workplace network, which considers the relative number of supervisors and co-workers of the same ethnic group as the immigrant worker, is included in the social capital indicators in the wage estimations.

For group and organizational networks, due to the low participation in groups and organizations among all immigrants, only one dummy variable indicating whether the LR participated in any kind of group or organization is included in the estimation models, instead of size, diversity and density indexes for organizational networks.[Note 13]

In order to capture the direct effects of networks on wages, job search channels through which immigrants obtained their current main job are also included in the models in addition to the aforementioned social capital indicators. For complete variable descriptions, see the variable definition table in Appendix Table A.1.

All the social capital variables are assumed endogenous with unobserved individual heterogeneity in the Hausman-Taylor models.


7 For detailed proof of propositions, see Calvó-Armengol and Jackson (2007).

8 Here, unobserved ability is rewarded with high pay and people with high innate ability tend to have higher levels of social capital and education.

9 For technical details on Baltagi’s EC2SLS, see Baltagi (2005), Section 7.1 in Chapter 7.

10 The conversion considers Wave 1 to be in 2001, Wave 2 in 2003 and Wave 3 in 2005.

11 In the LSIC, there are questions on both ethnicity and visible minority group (population group). The question on population group was used to construct the ethnic group variables in the current paper.

12 The current main job is identified by the following criteria: 1. If the LR only had one current job, it was the main job. 2. If the LR had more than one current job, the job with the most hours worked per week was the main job. 3. If more than one current job met the above criteria, the job with the earliest start date was selected. 4. If the above criteria did not help to identify one job among the current jobs, the first job reported was selected.

13 In the preliminary estimations, size, diversity and density indexes were used to capture the characteristics of group and organizational networks, which led to insignificant coefficients for all organizational network indicators. This could result from the large number of missing values for the size, diversity and density indicators of organizational networks, due to the low participation in organizations among all immigrants. The results of estimations that include the size, diversity and density indexes of organizational networks are available upon request.
