Business intelligence research and development environment v 2.0 - Privacy impact assessment summary

Service, Innovation and Integration Branch
Agency Analytics and Data Directorate
Canada Revenue Agency

Overview & PIA Initiation

Government institution

Canada Revenue Agency

Government official responsible for the PIA

Mireille Laroche

Assistant Commissioner

Service, Innovation and Integration Branch 

Chief Data Officer

Head of the government institution or Delegate for section 10 of the Privacy Act

Marie-Claude Juneau

Director

Access to Information and Privacy Directorate

Name of program or activity of the government institution

Management and Oversight Services

Description of the class of record and personal information bank

Standard or institution specific class of record:

All CRA classes of records – list here

Standard or institution specific personal information bank:

All CRA personal information banks – list here

Legal authority for program or activity

Section 241(4)(d)(ix) of the Income Tax Act

Section 295(5)(d)(v) of the Excise Tax Act

Section 211(6)(e)(v) of the Excise Act  

Summary of the project / initiative / change

The CRA administers taxes, benefits and related programs on behalf of the Government of Canada and most provinces and territories. It promotes compliance with Canada's tax legislation and regulations and plays an important role in the economic and social well-being of Canadians. The CRA works closely with its stakeholders and clients, and ensures the responsible enforcement of tax legislation.

To fulfill its mandate and to manage the organization internally, the Agency collects and creates a significant amount of data. The Agency has been using data for many years to gain insight into its programs and activities, make strategic changes and take appropriate actions in support of its mandate. The information derived from the data also supports the fairness and integrity of Canada’s tax system through compliance. To be able to detect, deter, predict and correct the behaviours of taxpayers who are non-compliant, the CRA needs to learn and be proactive about taxpayer behaviours.

Many of the skills and technologies needed to capture and use data from Agency operational systems for queries and reporting are mature. The Agency has an established data environment specifically designed to support traditional BI activities (e.g. reporting and basic descriptive statistics). This environment will have the potential to contain any and all data collected by CRA taxpayer programs.

The following Branches conduct various research and analytics activities, projects and initiatives that are branch-specific or strategic in nature.

What's new

As per the Evergreen PIA action plan item from the previous version, the PIA has been updated to include external data sources, CRA organizational changes and housekeeping items such as software licensing, acronyms, terms and grammar/language.

External data refers to data that is collected from third-party sources, and not directly by the CRA. The Agency Data Lake now includes 14 external data sources, which represent 3 different categories of data determined to have horizontal value across the Agency for BI and research & development activities, namely:

  1. Public/open-source data: information in the public domain
  2. Federal government institution: data provided under an interdepartmental agreement
  3. Private sector: data submitted to the CRA under a court-approved Requirement for Information (RFI) or purchased through subscription

The CRA Information Management (IM) Strategy 2019-2020, an Agency-wide approach to IM, emphasizes the importance of increased collaboration across work functions and teams. Two Corporate Policy Instruments (CPIs) signed in 2019 link directly to this strategy and provide direction on the management of data and data quality within the Agency: the Directive on Managing Data, effective June 10, 2019, and the Data Quality Standards, effective June 21, 2019. Both were approved by the Assistant Commissioner of the Service, Innovation and Integration Branch. All CRA employees, and any other individuals required to follow CRA policy, must comply with these corporate policy instruments.

Scope of the privacy impact assessment

Over time, the CRA Branches noted above have identified the need for more sophisticated tools and environments to examine taxpayer behaviour in order to identify new insights about how Agency programs and services could be delivered in ways that are more efficient and effective.

To address these needs, the Agency identified the requirement to create an R&D environment where researchers and analysts can conduct research and develop strategies to identify, predict and address non-compliance and/or improve service to taxpayers and benefit recipients.

To accomplish this, an environment has been created where designated people with the necessary skills and knowledge, using appropriate tools, can explore and analyze data. This environment allows for timely and consistent responses to business challenges, and reduces the time between identifying a problem or question to be answered and delivering a data-driven, evidence-based strategy.

The following core BI components make up the activities undertaken within the CRA’s R&D environment and comprise the scope of this PIA:

Risk identification and categorization

A) Type of program or activity

Program or activity that does NOT involve a decision about an identifiable individual

Level of risk to privacy: 1

Details:

The Agency uses information to undertake administrative actions on individuals under other programs and activities; however, personal information accessed and used within the R&D environment will not involve a decision about an identifiable individual, unless the program responsible for the decision has the mandate and PIA that specifically permit such action. The Agency has been using data for many years to gain insight into its programs and activities, make strategic changes and take appropriate actions. The information derived from the data also supports the fairness and integrity of Canada’s tax system through compliance. To be able to proactively detect, deter and correct the behaviours of taxpayers who are non-compliant, the CRA needs to learn about taxpayer behaviours. The R&D environment allows specific employees to undertake the analytics activities to explore and uncover these insights and data relationships. Core R&D activities may include, but are not limited to:

B) Type of personal information involved and context

Social Insurance Number, medical, financial or other sensitive personal information and/or the context surrounding the personal information is sensitive. Personal information of minors or incompetent individuals or involving a representative acting on behalf of the individual.    

Level of risk to privacy: 3

Details:

The data used within the R&D environment draws from the data present within the Agency Data Warehouse (ADW) as well as other internal source systems, data marts and external sources of data. External source data refers to data that is collected from third-party sources and not by the CRA, such as Electronic Funds Transfer (EFT) data collected from financial service providers under FINTRAC, or databases purchased through a subscription such as from Dun & Bradstreet. Personal information may include sensitive information such as the SIN and financial information of individuals and businesses. According to the Agency's Classification Policy (Chapter 5 of the Finance and Administration Manual, Security Volume "Identifying Protected and Classified Information Assets"), this data has been classified at the Protected B level. Agency staff within the R&D environment must adhere to CRA's stringent policies and procedures surrounding privacy and confidentiality, security, and information management.

The data to be housed within the R&D environment on a BI Appliance consists of copies of the ADW as well as CRA source systems, data marts and external source data. The personal information extracted from these sources will be subject to the same considerations applied to the ADW and source systems/data marts.   

C) Program or activity partners and private sector involvement

Within the institution (amongst one or more programs within the same institution)

Level of risk to privacy: 1

Details:

The R&D environment includes only research, analytics, and development activities performed within the Agency. A select number of researchers and analysts from branches in CRA headquarters (Appeals; Assessment, Benefit, and Service Branch; Audit, Evaluation, and Risk Branch; Collections and Verification Branch; Domestic Compliance Programs Branch; International, Large Business and Investigations Branch; Information Technology Branch; Legislative Policy and Regulatory Affairs Branch; Finance and Administration Branch; Human Resources Branch; Public Affairs Branch; and Service, Innovation and Integration Branch) and the Atlantic Regional Office are involved in R&D activities within the Agency.

The R&D environment has the potential to contain any and all data collected by CRA taxpayer programs, which may include external sources of data. The acquisition of this external data is governed by information sharing agreements, international treaties and commitments, contracts between the respective taxpayer programs and the external data suppliers, and/or publicly available information.

D) Duration of the program or activity:

Long-term program

Level of risk to privacy: 3

Details:

The R&D activities within CRA are ongoing (long-term) activities. 

E) Program population

n/a

Level of risk to privacy: n/a

Details:

R&D activities within the CRA do not affect individuals for an administrative purpose. However, the program population includes most of the Canadian population (i.e. taxpayers). Business intelligence activities conducted for administrative purposes (e.g. workload selection) will be subject to separate PIAs.

F) Technology & privacy

Does the new or modified program or activity involve the implementation of a new electronic system, software or application program including collaborative software (or groupware) that is implemented to support the program or activity in terms of the creation, collection or handling of personal information? 

Risk to privacy: Yes

Does the new or modified program or activity require any modifications to IT legacy systems and/or services?

Risk to privacy: No

The new or modified program or activity involves the implementation of one or more of the following technologies:

Enhanced identification methods - this includes biometric technology (e.g. facial recognition, gait analysis, iris scan, fingerprint analysis, voice print, radio frequency identification (RFID)) as well as easy pass technology, new identification cards including magnetic stripe cards, "smart cards" (i.e. identification cards that are embedded with either an antenna or a contact pad that is connected to a microprocessor and a memory chip or only a memory chip with non-programmable logic).

Risk to privacy: No

Details: n/a

Use of Surveillance - this includes surveillance technologies such as audio/video recording devices, thermal imaging, recognition devices, RFID, surreptitious surveillance / interception, computer aided monitoring including audit trails, satellite surveillance, etc.

Risk to privacy: No

Details: 

The program does not involve the use of surveillance on taxpayers.

However, as part of the CRA security program, CRA employees who have access to personal information will be monitored through Online Audit Trail Search (OATS). OATS records information such as user logon ID, date and time of logon and logout, user location, terminal identity, and the name and ID of client records accessed, including edits or changes made during each user session.

The information is used to verify that only authorized users have accessed personal information and to ensure that access can be linked to specific individuals to support the investigation of suspected or alleged misuse.

Every time CRA employees log in to their computers, a notice requires them to acknowledge that they are aware that all access to CRA networks is monitored and that access is on a need-to-know basis. This information is already described in the standard personal information bank Electronic Network Monitoring Logs PSU 905.

Use of automated personal information analysis, personal information matching and knowledge discovery techniques - for the purposes of the Directive on PIA, government institutions are to identify those activities that involve the use of automated technology to analyze, create, compare, identify or extract personal information elements. Such activities would include personal information matching, record linkage, personal information mining, personal information comparison, knowledge discovery, information filtering or analysis. Such activities involve some form of artificial intelligence and/or machine learning to uncover knowledge (intelligence), trends/patterns or to predict behavior.

Risk to privacy: Yes

Details: Automated personal information analysis, personal information matching and knowledge discovery techniques are used by the Agency. The R&D environment employs statistical analysis software (SAS), data mining software (IBM SPSS Modeler), and other advanced BI software tools to explore and visualize data for research, analytics and reporting. 

G) Personal information transmission

The personal information is used in a system that has connections to at least one other system.  

Level of risk to privacy: 2

Details:

Workspace set up for R&D users will:

H) Risk impact to the individual or employee

Details:

The sensitivity of information utilized through the R&D environment is considered high (Protected B). Unauthorized use or disclosure of this information could result in loss of privacy, severe personal financial injury and/or embarrassment to the taxpayer.
