Science Policy Note: Assigning Values to Nondetected/Nonquantified Pesticide Residues in Food
July 28, 2003
ISBN: 0-662-34732-3
Cat. No.: H113-13/2003-2E-PDF
(SPN2003-02)
The following document is a policy/guidance document that reflects the United States Environmental Protection Agency's (U.S. EPA's) recent dietary risk assessment science policy/guidance paper entitled, Choosing a Percentile of Acute Dietary Exposure as a Threshold of Regulatory Concern (March 16, 2000).
The Pest Management Regulatory Agency (PMRA) has adopted the policy and guidance outlined in the U.S. EPA document as part of efforts to harmonize dietary risk assessment procedures for determination of the safety of pesticide residues in domestic and imported treated foods.
This endeavour, to harmonize methodologies, is part of the North American Free Trade Agreement (NAFTA) goals within the Pesticides Technical Working Group Subcommittee.
The U.S. EPA has taken the lead in developing science policies related to the U.S. Food Quality Protection Act (FQPA). Harmonization of these policies between our agencies has been key to our ability to do joint reviews. Such policies play an increasingly important role in the evaluation and assessment of risks posed by pesticides and improve the regulator's ability to make decisions that fully protect public health and sensitive subpopulations. These policies are vetted by the NAFTA Technical Working Group on Pesticides and have been approved for adoption only after extensive consultation by scientific experts from governmental, academic and all nongovernmental interested parties. The consultation process utilized by the PMRA for science policy notices is outlined in a memo entitled, Memorandum to Registrants, Applicants and Agents (January 25, 2001), and may be obtained from the PMRA web site.
The following policy document is intended to provide guidance and information to PMRA personnel and decision-makers and to the public. As a guidance document, the policy in this document describes the process used by PMRA scientists in dietary risk assessments. Stakeholders remain free to comment on the application of the policy to individual pesticides. The PMRA will carefully take into account all comments that are received.
Executive Summary
Residue data are used by the Pest Management Regulatory Agency (PMRA) to support the establishment or reassessment of a pesticide maximum residue limit (MRL) associated with a particular food use. In some cases, a portion of the measurements of the levels of pesticide residue present on food shows no detection of residues. These "nondetects" (NDs) do not necessarily mean that the pesticide is not present at any level, but simply that any amount of pesticide present is below the level that could be detected or reliably quantified using a particular analytical method.
The primary science policy issue concerning NDs is the value that should be assigned to NDs when estimating exposure and risk from a pesticide in food. The PMRA's goal is to make exposure and risk assessments as accurate and realistic as possible, while not underestimating exposure or risk, so that all humans, including infants and children, are fully protected. The specific issues addressed in this paper concern the values the Agency should assign to NDs in order to meet this goal. The value assigned is an important criterion in chronic and, especially, acute dietary risk assessments.
In general, the PMRA recommends use of a default value of half the Limit of Detection (LOD) or half the Limit of Quantitation (LOQ) for commodities which have been treated but for which no detectable residues are measured.
This paper also describes PMRA's policy of performing a "sensitivity analysis" to determine the impact of using different assumptions (e.g., assuming NDs = full LOD or full LOQ vs. NDs = zero), on the PMRA's risk assessment for the pesticide under evaluation. If it is demonstrated through the sensitivity analysis that the default assumptions have no effect on the final PMRA risk decision, then there is little reason for the PMRA to attempt to further refine these default assignments. If the PMRA finds that these default assignments do have a significant effect on the risk estimate or risk decision or decides that a more refined risk estimate is needed, a second, more accurate set of statistical methods can be used instead to determine the values or distribution of values for NDs.
These statistical methods provide a more accurate way of estimating food exposure and risk than assuming that, for NDs, exposure occurs at ½ LOD or some other single, finite value. They allow risk assessors to impute (attribute, ascribe) a series of values that represent concentrations below the stated detection limit. These methods would generally be used only in situations where the NDs comprise a significant (but less than half) portion of the data set and the rest of the data are normally or lognormally distributed, but exceptions can be considered on a case-by-case basis.
Table of Contents
- Introduction
- Assigning values to nondetected/nonquantified pesticide residues
- Definitions
- Refining anticipated residue estimates using ½ LOD or ½ LOQ for nondetects
- Sensitivity analysis
- Use of percentage of crop treated
- Considerations related to pesticides having analytes of concern
- Essentially zero residues: Use of zero (or near zero) residue concentrations
- A statistical method for incorporating nondetected pesticide residues
- List of abbreviations
- Appendix I
- References
Introduction
Pesticide manufacturers (i.e., registrants) who petition the PMRA to establish a maximum residue limit (MRL) are required to submit data on the level of pesticide residues that remain in or on food. Data on the levels of pesticide residues in food are also available from a number of other sources. Often, instrumentation in the laboratory is not able to detect any residue below the limit of detection (LOD).
However, even though the laboratory instrumentation cannot detect a residue, a residue may be present at some level below the LOD and may still present a potential concern to human health. This paper describes the PMRA's policy for assigning values for use in human health exposure and risk assessment to nondetected/nonquantified pesticide residues in food. In general, and as described more fully later in the document, PMRA recommends use of a value of one half the analytical Limit of Detection (½ LOD), one half the Limit of Quantitation (½ LOQ), the (full) Lower Limit of Method Validation (LLMV), or true zero for these nondetected residues.
One issue that arises from use of the aforementioned default assumptions of ½ LOD, ½ LOQ, etc. is whether the Agency's method for assigning finite values to nondetects (NDs) in its risk assessments may either overestimate or underestimate risk depending on the actual distribution of data below the LOD. Specifically, the question arises as to whether the PMRA's default assumptions regarding the residue values to associate with nondetected or nonquantifiable residues are a significant factor in controlling the risk decision per se.
Should there be concern about the effect of PMRA's default procedure of assigning one half the limit of detection/quantification values to treated commodities with nondetected residues on the risk estimate or risk decision, this paper also describes the PMRA's policy of performing a "sensitivity analysis" to determine the impact of different assumptions (e.g., assuming NDs = LOQ or NDs = zero) on the Agency's risk assessment for the pesticide under evaluation. If it is demonstrated through the sensitivity analysis that the risk estimate or final risk decision is unaffected by the default assumptions, the PMRA will conclude that the relevant risk estimate is sufficiently "robust" so as not to warrant a more refined estimate of exposure and risk.
In those instances in which the default assignment is critical or decisive in determining PMRA risk-management action or it is simply desired that a more refined risk and exposure estimate that relies to a lesser extent on default assumptions be developed, one of a series of more accurate statistical methods can be used to estimate the values or distribution of values associated with the ND values. Such statistical methods provide a more accurate way of estimating exposure and risk from pesticides in food than assuming that exposure through the NDs occurs at ½ LOD or some other single, finite concentration. These methods are fully described in EPA's Guidance for Data Quality Assessment: Practical Methods for Data Analysis originally issued in July 1996 (EPA/600/R-96-084). In general, these methods would be used only in situations where the NDs comprise less than half the data set and the rest of the data are normally or lognormally distributed, but exceptions will be considered on a case-by-case basis. It is expected that many of the ND values obtained from this method would be less than ½ LOD or ½ LOQ but greater than zero.
The policy for assigning values to nondetectable residues is intended to avoid underestimating exposure to potentially sensitive or highly exposed groups such as infants and children while attempting to approximate actual residue concentrations as closely as possible. Both biological information and empirical residue measurements support the PMRA's belief that these science policies are consistent with these goals.
The policy paper is divided into several sections. Section I is this introduction. Section II, entitled "Assigning Values to Nondetected/Nonquantified Pesticide Residues", provides the rationale for assigning ½ LOD or ½ LOQ to commodities that have been treated with a pesticide but that show no analytically detectable residues for any or all commodities sampled. Section III, entitled "A Statistical Method for Incorporating Nondetected Pesticide Residues", provides a more accurate, statistically based method for estimating nondetected pesticide residues than simply assigning a default value of ½ LOD or ½ LOQ to these NDs. Section IV provides a list of references. The Appendix to this document is a sample calculation illustrating one (of many available) methods for calculating LOD or LOQ.
This science policy applies at this time only to exposure to pesticide residues via the food supply and, more specifically, only to the refinement of pesticide exposure from food by calculation of Anticipated Residues (ARs), a risk-assessment refinement tool. This policy is not appropriate for, and is not to be used in, the determination of the actual residue level that will be established as the "MRL" (maximum acceptable residue level) for a pesticide in a particular commodity.
This policy document is intended to provide guidance and information to PMRA personnel and decision-makers and to the public. As a guidance document, the policy in this document describes the process used by PMRA scientists in dietary risk assessments. Stakeholders remain free to comment on the application of the policy to individual pesticides or on the appropriateness of the policy itself. The PMRA will carefully take into account all comments that are received.
Assigning values to nondetected/nonquantified pesticide residues
Definitions
In the discussion of which values to assign to nondetected and/or nonquantified residues, it is important that consistent definitions be employed for the various terms being used to describe these concepts. Over the years, a variety of different practices have arisen because of definitional differences between LOD and LOQ, a lack of distinction between the two, preference for one over the other, the proliferation of several synonymous terms such as "limit of determination" or "limit of sensitivity", and the fact that there are situations in which one is, indeed, more appropriate to use than the other. In many cases, a sample is reported to contain nondetectable residues when, upon further investigation, the proper designation should have been "nonquantifiable", or vice versa. In a number of instances, the PMRA has noted in residue chemistry submissions that these terms have been inappropriately used, used interchangeably, or used without supporting documentation and/or information concerning their derivation. In the PMRA's policy, these terms will have the definitions provided below.
Limit of Detection (LOD)
The LOD is defined as the lowest concentration that can be determined to be statistically different from a blank. This concentration is recommended to be three standard deviations above the measured average difference between the sample and blank signals, which corresponds to the 99% confidence level. In practice, detection of an analyte by an instrument is often based on the extent to which the analyte signal exceeds peak-to-peak noise (Keith et al. 1983). Samples that do not bear residues at or above the LOD are referred to as NDs.
Method Detection Limit (MDL)
The MDL is the lowest concentration that can be reliably detected in either a sample or a blank, and the Instrument Detection Limit (IDL) is the smallest signal above background noise that an instrument can reliably detect; both MDL and IDL are related LOD concepts.
Limit of Quantitation (LOQ)
The LOQ is defined as the level above which quantitative results may be obtained. The corresponding sample/blank difference is recommended to be 10 standard deviations above the blank, which corresponds to the 99% confidence level (Keith et al. 1983) and to an uncertainty of ±30% in the measured value at the LOQ. LOQ is typically used to define the lower limit of the useful range of the measurement technology in use. Samples that do not bear residues at or above the LOQ are often referred to as "nonquantifiable".
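The 3 and 10 standard deviation conventions above can be illustrated with a minimal sketch. It assumes that replicate blank signals are available and that the thresholds are expressed in instrument signal units; conversion to a concentration would still require a calibration curve, which is not shown. The function name and the blank readings are hypothetical, and this sketch is not the procedure illustrated in the Appendix.

```python
from statistics import mean, stdev


def detection_limits(blank_signals):
    """Return (lod_signal, loq_signal) in instrument signal units.

    Assumption: the limits are taken as the mean blank signal plus
    3 (LOD) or 10 (LOQ) sample standard deviations of the blank.
    """
    if len(blank_signals) < 2:
        raise ValueError("need at least two blank measurements")
    blank_mean = mean(blank_signals)
    blank_sd = stdev(blank_signals)  # sample standard deviation (n - 1)
    return blank_mean + 3 * blank_sd, blank_mean + 10 * blank_sd


# Hypothetical blank readings (arbitrary signal units).
blanks = [1.0, 2.0, 3.0, 2.0, 2.0]
lod_signal, loq_signal = detection_limits(blanks)
```

A signal below `lod_signal` would be treated as a nondetect, and a signal between `lod_signal` and `loq_signal` as detected but nonquantifiable.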
Lower Limit of Method Validation (LLMV)
There are cases in which a laboratory does not stringently determine the LOD and LOQ of a particular substrate/method/equipment combination but, rather, a "Lower Limit of Method Validation" (LLMV) is reported that could be higher than the true LOQ within the capability of the method. The LLMV is simply the lowest concentration at which the method was validated. In these cases, neither the method limit of first choice (LOD) nor second choice (LOQ) was demonstrated, and the PMRA would normally request that an LOQ be estimated by the study sponsor from the LLMV, chromatograms and other available information.
In general, the PMRA discourages the determination and use of the LLMV if a consequence of this is that a legitimate LOD or LOQ cannot or will not be determined. To date, the PMRA has not issued formal guidance or suggested/recommended procedures, or made available a list of acceptable methodologies for the estimation of LOD and/or LOQ values for pesticide residue analyses. Due, in part, to the many valid operational definitions of LOD and LOQ and procedures used to estimate these, the PMRA believes it unwise to prescribe any one specific procedure or protocol as a standard universal requirement for pesticide registration submissions.
Nevertheless, in the interest of informing registrants and other data submitters of at least one method for LOD/LOQ determination that has been acceptable in the past, an Appendix to this Science Policy Notice illustrating this method is attached.
Any reasonable generally recognized statistical procedure may be considered and will be evaluated. It is recommended that registrants and other data submitters fully document the procedures and protocols used to estimate the LOD and/or LOQ for review by the PMRA.
Refining anticipated residue estimates using ½ LOD or ½ LOQ for nondetects
Pesticide food risk assessments are initially conducted using conservative assumptions such as MRL-level residues in crops, maximum theoretical livestock diets, highest field trial residue values and 100% of the crop being treated. Worst-case assessments using such assumptions may result in an apparently unacceptable acute and/or chronic food risk.
In such cases, refinement of food-exposure assessments to derive more realistic estimates is often warranted. To further refine food exposure, calculations may include use of percent crop treated data, more realistic livestock diets, averages of field trial data, statistically collected monitoring data from the Canadian Food Inspection Agency (CFIA) of Agriculture and Agri-food Canada (AAFC), Health Protection Branch of Health Canada, U.S. Department of Agriculture (USDA) or the U.S. Food and Drug Administration (U.S. FDA) and/or incorporation of residue reduction factors to account for cooking or processing.
In some cases, probabilistic analyses of composited or even single serving-size samples may be used. The use of these anticipated residues (ARs) produces more refined exposure estimates, which more closely approximate the pesticide residues that humans will actually consume in their diets.
The ways in which the data are refined depend on such considerations as what data are available, the relative confidence the Agency has in these data, the residues of toxicological significance, which of these residues are detected by the analytical method(s) used, as well as the metabolic profile over time. Not infrequently, data on ARs contain at least some measurements for which the chemical analyst reported residue concentrations at levels "below the limits of detection or quantitation". The fact that no residues are detected does not necessarily mean there are none.
Residues may exist at levels that are too low to detect. If the Agency has information demonstrating that a crop sample was treated with the pesticide in question, but residues were not analytically detected, there are a number of options available for dealing with these nondetectable values and integrating this information into pesticide food exposure assessments.
The two extreme options would be:
- to assume that if residues were not detected, they were not present (i.e., residue concentrations are zero); or
- to assume that if residues were not detected (at some limit of detection), they were present at just below that limit of detection.
The first option would lead to the least conservative (i.e., most likely to underestimate the actual average residue level in the ND samples) exposure estimate, since the Agency would be assuming nondetectable residues were actually zero; the second option would result in the most conservative (i.e., least likely to underestimate the actual average residue level in those samples) estimate, since the Agency would be assuming that nondetectable residues were actually present at just below the analytical limit of detection.
The PMRA believes that neither approach reasonably represents reality, particularly in data sets in which many NDs are present. Rather, biological information and empirical residue measurements indicate that residue data sets (including the NDs) are often lognormally distributed. On a theoretical basis, concentrations of pesticides in food crops might be expected to be a random-product process and the Theory of Successive Random Dilutions (SRD) would predict that concentrations of pesticides would be lognormal (Ott 1995). In addition, empirical evidence for a lognormal distribution of pesticides in foods exists from a recent study by the United Kingdom's Ministry of Agriculture, Fisheries and Food (MAFF) in which thousands of individual serving sized samples were analyzed for a variety of pesticides and found to follow in most cases a lognormal distribution (MAFF 1997).
Given the above information, the PMRA recommends (as an initial step in the exposure determination) use of ½ LOD, ½ LOQ or the LLMV, as appropriate, for samples with no detectable residues if it is known or believed that these samples have been treated with a pesticide, according to the following protocol.
- The PMRA generally recommends use of a value of zero for the proportion of the data set corresponding to the percentage of the commodities known not to be treated with pesticide (see "Percentage of Crop Treated" section).
- For the remainder of the data points for pesticide-treated commodities, the PMRA recommends as its preferred approach use of the following assumptions:
  - if a valid LOD exists, use ½ LOD as the assigned value for NDs when conducting food exposure and risk assessments;
  - if an LOD is not available, but a valid LOQ exists, use ½ LOQ for the NDs;
  - if neither an LOD nor an LOQ is available, use the LLMV for the NDs; and
  - if both LOD and LOQ are determined and if nonquantifiable residues are detected between the LOQ and LOD, use ½ LOQ for those measurements.
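The protocol for treated samples can be expressed as a short decision rule. The following is a minimal sketch only. The function name, its parameters and the use of `None` to mark a nondetect or an unavailable limit are assumptions made for illustration and are not part of the policy.

```python
def assigned_value(measured, lod=None, loq=None, llmv=None):
    """Return the residue value (ppm) to use for one treated sample.

    measured: reported concentration in ppm, or None for a nondetect.
    lod, loq, llmv: method limits in ppm, or None if not available.
    """
    if measured is None:
        # Nondetect: prefer 1/2 LOD, then 1/2 LOQ, then the full LLMV.
        if lod is not None:
            return lod / 2
        if loq is not None:
            return loq / 2
        if llmv is not None:
            return llmv
        raise ValueError("no LOD, LOQ or LLMV available for this sample")
    if lod is not None and loq is not None and lod <= measured < loq:
        # Detected but nonquantifiable: use 1/2 LOQ.
        return loq / 2
    # Quantifiable residue: use the reported value.
    return measured
```

For example, under these assumptions `assigned_value(None, lod=0.05)` yields one half of the LOD, while `assigned_value(0.07, lod=0.05, loq=0.10)` yields one half of the LOQ. Samples from untreated commodities are handled separately as true zeros and are outside the scope of this function.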
In general, the PMRA considers that the "replacement" or "substitution" method (replacing treated NDs with ½ LOD or ½ LOQ) will result in reasonable estimates of risk and exposure if the number of NDs is small (e.g., 10-15%). Substitution of ½ LOD or ½ LOQ for nondetectable residues in samples is widely used in the risk assessment community and is advocated by the U.S. EPA (U.S. EPA 1998a) when the appropriate conditions are met. Registrants are encouraged to use the substitution method in these instances, and the PMRA would perform sensitivity analyses in these situations only on a case-by-case basis.
When the number of NDs increases to greater than ca. 10-15% (but is still less than 50%), risk assessments should be performed using the replacement method, but the effect of the substituted values should be assessed by performing a sensitivity analysis and verifying that the relevant risk and exposure estimates are not significantly affected. Such an analysis should be included as part of the risk characterization.
If it is determined that the effect of this substitution is significant, it may be desirable to use statistical methods developed for censored data (as explained in the section, "A Statistical Method for Incorporating Nondetected Pesticide Residues", of this document). When data sets consist of >50% NDs, the handling of NDs should be considered on a case-by-case basis, and no general rule of thumb is possible.
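The tiers described above can be summarized in a small helper. This is a sketch under stated assumptions: the policy gives the lower boundary only approximately ("ca. 10-15%"), so the 15% default used here is an illustrative choice, and a data set with exactly 50% NDs, which the text does not address, is treated as case-by-case. The function name and return labels are hypothetical.

```python
def nd_handling_tier(n_nd, n_total, low_cutoff=0.15):
    """Suggest how NDs might be handled, based on their share of the data set.

    low_cutoff is an assumed value within the approximate 10-15% range.
    """
    if n_total <= 0 or not 0 <= n_nd <= n_total:
        raise ValueError("counts must satisfy 0 <= n_nd <= n_total, n_total > 0")
    fraction = n_nd / n_total
    if fraction <= low_cutoff:
        return "substitution"
    if fraction < 0.5:
        return "substitution with sensitivity analysis"
    return "case-by-case"
```

If the sensitivity analysis in the middle tier shows a significant effect, statistical methods for censored data may be preferred over substitution.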
Additional details concerning this procedure and assignments are provided below.
(1) Policy for NDs when an LOD has been properly determined
The selection of a numerical value to represent NDs in a refined exposure assessment depends on the level of confidence the PMRA has in the supporting documentation of the various method limits under consideration. For the PMRA to have a high level of confidence, the claimed LOD should be demonstrated using chromatograms, calculations and statistics as noted above. Although there are a variety of acceptable techniques that can be used to estimate the LOD or LOQ, one example that would be acceptable is shown in the Appendix to this policy paper. The information provided in this attachment is only an illustrative example. Data submitters are free to use any reasonable and scientifically supportable methodology. In any case, and in accordance with the PMRA's Residue Chemistry Guidelines (DIR98-02) as outlined in the section, "A Statistical Method for Incorporating Nondetected Pesticide Residues", the procedures used by a laboratory to determine the LOD and LOQ should be fully explained and/or copies of any appropriate publications should be submitted with the analytical method description to the Agency.
The PMRA recommends that the actual numerical value used to represent ND residues and to be entered into the acute or chronic AR calculation should be ½ LOD. Particularly in those cases in which acute food risk is only marginally acceptable and ½ LOD is used for a significant portion of the samples, this assumption should be mentioned in the risk characterization and the use of a sensitivity analysis should be considered (see "Sensitivity Analysis" section of this document).
(2) Policy for NDs when only an LOQ has been properly determined
If an appropriate LOD has not been properly determined, PMRA scientists will examine whether an LOQ has been experimentally and statistically demonstrated and if a given sample with ND residues may be adequately represented by ½ LOQ as demonstrated by chromatograms and other information.
The PMRA recommends that the actual numerical value to be entered into the AR calculation should be ½ LOQ.
(3) Policy when neither an LOD nor an LOQ has been properly determined
If neither the LOD nor the LOQ has been properly determined, the full LLMV (lowest concentration at which the method was validated) generally will be used in risk assessment.
The rationale for this policy is that the Agency has less confidence in data samples when an LOD or LOQ cannot be statistically determined or reasonably estimated from the data. In general, if a LLMV is reported instead of an LOQ, it is likely that insufficient analyses were performed and a ½ LOQ value could not be calculated with sufficient statistical rigor and precision to be reliably used in a risk assessment. Accordingly, to assure that actual exposure to pesticides in food will not be underestimated using such data, the PMRA will use the full LLMV for each ND of a treated sample in this situation. The PMRA actively discourages a registrant from choosing to use or report a LLMV if this is to be used as a substitute for a properly determined LOD or LOQ.
However, the PMRA believes that, in many cases, a rigorously determined LLMV (e.g., one in which numerous determinations were made at levels close to the LOQ and appropriate statistical methodologies can be used) can be used to estimate an LOD or LOQ.
In these cases, the PMRA recommends use of the ½ LOD or ½ LOQ default, as appropriate, in risk assessments.
(4) Policy when detectable but nonquantifiable residues are found
If a sample contains detectable, yet nonquantifiable residues, i.e., residues falling between the LOD and the LOQ, the PMRA recommends that such samples typically be represented numerically in the refined exposure assessment as ½ LOQ when assessing both acute and chronic risk.
This is consistent with the USDA Pesticide Data Program's (PDP) policy for reporting these values: all residues detected at >LOD but <LOQ by the PDP program are reported as ½ LOQ. If information is available indicating that most residue values are just above the LOD or just below the LOQ, a decision will be made on a case-by-case basis regarding the appropriate value to assign to NDs. The rationale for selection of a residue value different from ½ LOQ for these commodities should be explained clearly in the risk characterization. If available and clearly supported by raw data (chromatograms, etc.), the analyst's estimate of the residue between the LOD and the LOQ may, at the discretion of the PMRA, be used as a means of further refinement of the estimated exposure. If a significant portion of the residue values was derived via the analyst's estimation of values between the LOD and LOQ, this should be noted in the risk characterization.
Sensitivity analysis
In general, assigning numerical values to NDs as described above is not expected to significantly affect the PMRA's risk estimate. However, the PMRA, under certain circumstances, will perform a sensitivity analysis if it is believed that the substitution of ½ LOD, ½ LOQ, or LLMV values for NDs has significantly affected the outcome of a risk assessment and/or the PMRA's risk decision. That is, if the PMRA risk assessment shows unacceptable risks when ½ LOD values are used for NDs, the PMRA will attempt to demonstrate that the use of ½ LOD has not by itself significantly affected the risk assessment by re-estimating risks with zero substituted for ½ LOD or ½ LOQ.
Conversely, if the risk assessment shows acceptable risk when ½ LOD values are assigned to NDs, we will re-estimate the risks, where appropriate, with the full LOD or LOQ substituted for ½ LOD or ½ LOQ. This latter substitution will never change the estimated exposure (and risk) by more than a factor of 2 (and then only if all crops were considered treated and if all values were ND). If the Agency risk assessment changes substantially as a result of assigning these alternate values, the sensitivity analysis will have demonstrated that the Agency risk assessment is sensitive to assumed concentrations for the NDs. The PMRA may then request that additional data and/or an improved analytical method be developed and submitted. To date, the conduct of these sensitivity analyses has not resulted in a significant change in the upper percentiles of estimated acute exposures.
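A sensitivity analysis of this kind can be sketched by recomputing a summary statistic under each assumption. The example below uses the mean residue as a simple stand-in for the exposure estimate; an actual assessment would rerun the full exposure model. All function names and data values are hypothetical.

```python
def mean_residue(detects, n_treated_nd, nd_value, n_true_zero=0):
    """Mean residue (ppm) with each treated ND set to nd_value."""
    total = sum(detects) + n_treated_nd * nd_value
    return total / (len(detects) + n_treated_nd + n_true_zero)


def sensitivity(detects, n_treated_nd, lod, n_true_zero=0):
    """Mean residue with NDs at zero, 1/2 LOD and full LOD."""
    scenarios = (("zero", 0.0), ("half_lod", 0.5), ("full_lod", 1.0))
    return {
        label: mean_residue(detects, n_treated_nd, factor * lod, n_true_zero)
        for label, factor in scenarios
    }


# Hypothetical data set: two quantified residues and eight treated NDs.
detects = [0.20, 0.40]
result = sensitivity(detects, n_treated_nd=8, lod=0.05)
```

Because the full LOD is twice ½ LOD and the detected values are unchanged, `result["full_lod"]` cannot exceed twice `result["half_lod"]`, consistent with the factor of 2 noted above.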
Use of percentage of crop treated
Notwithstanding the above discussion, the PMRA believes it to be appropriate to use "true zeros" for those NDs that represent untreated crops, and the PMRA continues to support the use of "true zero" for those samples that have not been (or are not expected to have been) treated with the pesticide of interest. Specifically, exposure assessments will generally be performed with the nontreated samples incorporated as "true zeros".
The Agency will determine which "nondetect" samples should be represented by zero in a ratio directly proportional to the percentage of crop not treated. In calculating average residues when a variety of limits of detection exist, the average residue value calculated will incorporate a weighted average of the LODs from treated commodities in which no residues were detected. (A U.S. EPA policy document (March 23, 2000) reported that interlaboratory LOD variation of up to 35 times has been observed for a single chemical/crop combination in one residue monitoring data set.) Such a calculation will not incorporate one half of the overall average LOD from all laboratories. For example, if 70% of a crop is not treated, but 80% of the monitoring samples in a data set are reported as <LOD, then 70% of the samples would be assigned a value of zero, 10% would be assigned ½ LOD, and 20% of the samples would be assigned the reported residue values. If more than one LOD is reported for the samples in the data set, one half of the weighted average of the LODs would be used.
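The allocation in the preceding example can be sketched as follows. The function name and arguments are hypothetical. The sketch also assumes that the number of true zeros cannot exceed the number of NDs, a case the text does not discuss.

```python
def allocate_nondetects(n_total, n_nd, pct_crop_untreated):
    """Split a data set into (true zeros, treated NDs, detects).

    n_total: number of samples; n_nd: number reported as <LOD;
    pct_crop_untreated: percentage of the crop not treated.
    """
    n_untreated = round(n_total * pct_crop_untreated / 100)
    # Assumption: untreated samples are drawn from the NDs only.
    n_true_zero = min(n_untreated, n_nd)
    n_treated_nd = n_nd - n_true_zero  # assigned 1/2 LOD
    n_detect = n_total - n_nd          # assigned reported values
    return n_true_zero, n_treated_nd, n_detect


# 70% of crop untreated, 80 of 100 samples reported as <LOD.
zeros, treated_nds, detects = allocate_nondetects(100, 80, 70)
```

With these inputs the split is 70 true zeros, 10 treated NDs and 20 detects, matching the percentages in the text.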
An illustration of this calculation is shown below. Suppose that 30% of apples are treated with a pesticide (and 70% are therefore not treated), but a PDP survey of 5-lb. (2.3 kg) composite samples shows that 80% (i.e., a total of 240 samples) of the 300 samples collected have ND (not detected or less than detection limit) residues. Three quarters of those PDP ND values have a LOD of 0.05 ppm and one quarter of the ND values have LODs of 0.10 ppm. We wish to calculate the average residue in apples for use in a chronic pesticide food residue assessment. Given this information, we would conclude that 70% of the 300 composite apple samples contain no (or zero) residues, since they were not treated with pesticide. This means that 210 of the 300 composite samples are true zeros (70%). From this, it follows that 210 of the 240 ND values (or 87.5% of the NDs) represent true zeros, with the remaining 30 ND values (or 12.5% of the 240 NDs) representing treated apples with residues at less than the detection limit. To calculate residues in these treated samples, we would assign one half the 0.05 ppm LOD to three quarters of these NDs (representing an expected 22.5 of the 240 ND samples) and one half the 0.10 ppm LOD to the remaining one quarter of these NDs (representing an expected 7.5 of the 240 ND samples). The average residue for use in a chronic assessment, therefore, would be calculated as follows:
[(210 × 0 ppm) + (22.5 × 0.025 ppm) + (7.5 × 0.05 ppm) + ∑ (all >LOD values)] ÷ 300
If the residue data were to be used instead to establish an electronic residue file for use in an acute probabilistic assessment, the file would contain 210 true zeros and 30 values at 0.0313 ppm (i.e., ½ the weighted LOD), with the remaining 60 values represented by their >LOD measurements (at either ½ LOQ or >LOQ, as appropriate).
Similarly, in those cases where it is necessary to construct an electronic residue file for an acute exposure assessment (and the average residue values are not appropriate), the file should be constructed such that the treated ND samples are assigned one half of the weighted average of the LODs of the samples in which no residues were detected. The apple example above illustrates how such a file would be established.
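The apple illustration above can be reproduced with a short calculation. The following Python sketch is illustrative only: the variable and function names are hypothetical, and the 60 detected residue values (all set to 0.2 ppm) are invented placeholders, since the example does not list the measured >LOD values.

```python
# Sketch of the apple example: 300 samples, 70% of the crop not treated,
# 240 NDs, of which 3/4 have an LOD of 0.05 ppm and 1/4 an LOD of 0.10 ppm.
n_samples = 300
fraction_not_treated = 0.70
n_nd = 240
lod_shares = {0.05: 0.75, 0.10: 0.25}  # LOD (ppm) -> share of the ND samples

true_zeros = round(n_samples * fraction_not_treated)   # untreated samples
treated_nd = n_nd - true_zeros                         # treated samples below the LOD
weighted_lod = sum(lod * share for lod, share in lod_shares.items())
half_weighted_lod = weighted_lod / 2

def chronic_average(detected_values):
    """Average residue (ppm) for a chronic assessment."""
    nd_sum = sum(treated_nd * share * (lod / 2) for lod, share in lod_shares.items())
    return (true_zeros * 0.0 + nd_sum + sum(detected_values)) / n_samples

def acute_residue_file(detected_values):
    """Residue file for an acute probabilistic assessment."""
    return ([0.0] * true_zeros
            + [half_weighted_lod] * treated_nd
            + list(detected_values))

# Hypothetical >LOD measurements (placeholders, not taken from the policy).
detected = [0.2] * (n_samples - n_nd)

average_ppm = chronic_average(detected)
residue_file = acute_residue_file(detected)
```

Because one half of the weighted average LOD (0.03125 ppm, reported above as 0.0313 ppm) applied to all 30 treated NDs gives the same sum as the two separate ½ LOD terms, the chronic average and the acute residue file are consistent with each other.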
Considerations related to pesticides having analytes of concern
The LOD and/or LOQ is often not established for all residues of toxicological significance. In some situations, the method may, in fact, be incapable of determining the residues at all. This may particularly be true for multiresidue monitoring methods. For example, the U.S. FDA often reports only residues of the parent compound. CFIA (AAFC) and USDA's PDP often attempt to analyze all residues of toxicological significance; however, there are certain metabolites of concern that are not sought by CFIA or PDP due to analytical difficulty or due to the unavailability or expense of analytical standards. As a result, difficulty arises when attempting to sum the residues of multiple analytes of concern because a numerical limit is not available to assign to nondetectable levels of one or more of the residues of concern.
Such shortcomings may render one or both sources of monitoring data of limited value to the refinement of pesticide food residue exposure estimates unless metabolism studies and other information can be used to establish a ratio between the concentration of one or more analyte(s) to the concentration of toxicologically significant residues not determined by the method. Decisions on how to use such residue data will be made on a case-by-case basis.
Essentially zero residues: Use of zero (or near zero) residue concentrations
A number of instances may arise in which it is appropriate to assume for risk-assessment purposes that residue values so closely approach zero that this value (rather than ½ LOD or ½ LOQ) should instead be used in the exposure assessment. An explanation of the rationale, and further illustration of situations where this might be appropriate, are detailed in a U.S. EPA Standard Operating Procedure (U.S. EPA 1999), from which the following is excerpted:
... [I]t may be appropriate in certain cases to judge that the ND values from the monitoring data are "essentially zero", particularly if a substantial portion of the measured residue values is less than the analytical detection limit (and would therefore ordinarily be replaced by ½ LOD). In these instances, it may be appropriate to introduce a value of zero ppm (or near zero) as a residue value (in place of ½ LOD) for the ND measurements in the risk assessment. This judgement should be made on a case-by-case basis, with the reviewer bringing a wide range of information to bear on proper valuation of the NDs, including the nature of distribution of the values above the detection limit, the percent of the crop which is treated, and information on the processing of commodities before sampling.
For example, information from a radiolabel metabolism study or a field trial conducted at an exaggerated rate may be available, which indicates that residues of concern are present at levels much lower than ½ LOD. Alternatively, theoretical calculations based on mass balance considerations, for example, may demonstrate for a seed-treatment use that resulting residues in the harvested crop would be expected to be much less than ½ LOD. Other factors, perhaps when considered jointly, might warrant consideration in this evaluation and suggest that resulting residues would be near zero. For example, for a blended processed commodity such as corn oil, it might in some instances be appropriate to assume measured ND values from monitoring studies represent real zero (or near-zero) concentrations after consideration of the percentage of the corn crop that is treated, the range of actual application rates and pre-harvest intervals (PHIs) and processing/cooking factors.
For example, if only 5% of the corn crop is treated, residue field trials on the raw agricultural commodity show average residues of 0.1 ppm, and processing of corn into corn oil has been demonstrated to reduce residues 15-fold, it would be reasonable to assume that average residues are less than 0.0003 ppm, and thus, any ND value in a corn oil monitoring survey could be assumed to be less than this calculated average. In many cases, it may be a variety of factors which, when considered together, would lead the risk assessor to support a decision to replace <LOD measurements with zero (or near-zero) values in the risk assessment.
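The arithmetic behind the corn oil example can be written out explicitly. This is a minimal sketch using only the three figures quoted above; the variable names are hypothetical. The product comes to roughly 0.0003 ppm.

```python
# Expected average residue in corn oil, using the figures from the example.
fraction_of_crop_treated = 0.05      # 5% of the corn crop is treated
field_trial_average_ppm = 0.1        # average residue on the raw commodity
processing_reduction_factor = 15     # residues reduced 15-fold in corn oil

expected_average_ppm = (fraction_of_crop_treated
                        * field_trial_average_ppm
                        / processing_reduction_factor)
```

A value this far below typical detection limits is the kind of evidence that, together with other factors, could support replacing <LOD measurements with zero or near-zero values.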
Alternatively, a value of zero may be appropriate to represent "nondetects" for one or more analytes of concern or residues of concern, provided this decision is supported by such information as metabolism studies, data at shorter PHIs, exaggerated rate data, etc. This approach may be appropriate only for certain crops or certain use patterns. On a case-by-case basis, plant or livestock metabolism data, data reflecting exaggerated application rates and/or short PHIs, close examination of the chromatograms, consideration of the analytes determined by the analytical method(s), and other information may be used singly or in conjunction to formulate a weight-of-the-evidence argument in favour of (or against) use of true zero to represent the level of one or more analytes of toxicological concern potentially present in samples denoted as bearing less than LOD/LOQ residues.
This procedure could be particularly important for pesticides having several residues of toxicological concern whereby, using the above information, the chemist gains confidence that only a subset of the terminal residues will be present at normal harvest time; zeros could be used for the other analytes of concern. On an international level, a similar approach is used by the Food and Agriculture Organization/World Health Organization's Joint Meeting on Pesticide Residues in the case of pesticides having a chronic toxicological endpoint.
A statistical method for incorporating nondetected pesticide residues
There may be instances in which a significant portion (e.g., more than 15%) of the residue data set contains nondetectable residues, when a sensitivity analysis reveals an inordinate effect of the ½ LOD or ½ LOQ assumption on the risk decision, or when it is simply decided that a more accurate assessment of residue levels is appropriate. This section of the policy paper describes how residue levels below the LOQ may be estimated using statistical imputation methodologies. Use of such methodologies should produce a more accurate estimate of <LOD and <LOQ residues.
When the appropriate recommended conditions are met (see below), statistical imputation methodologies are useful for predicting the distribution of nondetectable residues below the LOD in cases where some of the residues of the data set are undetectable. Here, statistical imputation refers to imputation procedures for left-censored data. Imputation (ascribing, attributing), in general, can refer to procedures applied to other aspects of pesticide food residue exposure. For example, imputing single-serving residue values from composite samples is one form of imputation (which is perhaps more aptly referred to as a data deconvolution exercise).
Similarly, construction of empirical distribution functions (EDFs) is a form of imputation as well that results in data interpolation. In the context of this paper, the term "imputation" refers to imputing (or "data uncensoring") of left-censored data. When properly employed, such methods can provide a scientifically sound basis for more accurately estimating pesticide food residue exposure and risk than assuming that exposure occurs at ½ LOD or some other single, finite value.
This method is intended to be used chiefly by persons conducting probabilistic human health pesticide food residue exposure assessments for purposes of registration, reregistration, or MRL assessment of pesticides.
Briefly, the methods described below use the information provided by the uncensored portion of the data (i.e., that portion of the data with >LOD values) and the assumed normal (or transformed normal) distribution of the data to calculate a mean and standard deviation which incorporates the data which lie below the detection limit in the "censored" region of the data. This assumption of normality (or lognormality) should be verified prior to use of these methods.
The reference to an "assumed normal distribution" is made to reflect a common statistical convention that one cannot prove a given distribution belongs to a hypothesized family of distributions (e.g., normal, lognormal, Poisson, etc.) but rather can only provide sufficient evidence to suggest that the hypothesized distribution is not inconsistent with actual distribution (analogous to either "rejecting" or "failing to reject" a hypothesis). If there is insufficient evidence to demonstrate that a distribution is not normal, then it is reasonable to refer to it as an "assumed normal" distribution.
Cohen's (1959, 1961) method requires that the distribution be normal (or can be made normal) and that there be only a single LOD or LOQ for all analyses of the same commodity. It will result in an estimated mean (i.e. arithmetic average) concentration that incorporates the <LOD values and can be used in a chronic assessment or in an acute assessment for those commodities for which the use of a mean value is appropriate (e.g., blended commodities). By mentioning several specific methods, the PMRA does not mean to imply that other methods are not appropriate for this task.
Whichever method is selected, the PMRA recommends that the method be adequately supported by both a sufficiently rich data set above the detection limit and a statistically robust methodology for imputing those values. Under the methods presented below, those measured values that lie above the LOD but below the LOQ should be considered as being "semiquantitative".
In contrast to the methodology described in the "Assigning Values to Nondetected/Nonquantified Pesticide Residues" section of this document, in which ½ LOQ is generally used as a default assumption for all values that lie between the LOD and LOQ, the actual measured "semiquantitative" value should instead be used when working with methods for censored data.
Cohen's Method
Cohen's (1959, 1961) method is a technique that can be used to more accurately determine mean residue values from heavily "censored" data sets, i.e., data sets for which a substantial amount of data (e.g., 15-50%) are simply reported as less than a given detection or quantitation limit. Cohen's method is fully described in EPA's Guidance for Data Quality Assessment: Practical Methods for Data Analysis (U.S. EPA 1998a).
The method is designed to be used only for data points that are part of a parent population that is normally distributed or that can be made normal via transformation. Practically, this means that the parent population should have either a normal or lognormal distribution. Thus, prior to using this method, the existence of a normal (or transformed lognormal) parent population should be demonstrated.
It is strongly recommended that the data be graphed on appropriate probability paper and that normality tests (e.g., Shapiro-Wilk) be performed to verify the assumed distribution. Various statistical procedures (with associated examples) which could be used to accomplish this task are available in the U.S. EPA document Guidance for Submission of Probabilistic Human Health Exposure Assessments to the Office of Pesticide Programs, April 11, 1998 (U.S. EPA 1998b).
Additional recommended criteria for use of Cohen's methodology are that not more than 50% of the data set be censored (ideally, less than 20% should be censored) and/or that at least 10 noncensored data points (with 20 or more being strongly desirable) be available. Exceptions to these recommended criteria can be made on a case-by-case basis. However, with respect to the exceptions, it should be remembered that in many cases a more refined estimation procedure such as Cohen's method is likely being used precisely because the insertion of ½ LOD for ND residues resulted in risk above the PMRA's level of concern, while the substitution of 0 ppm for NDs resulted in risks below the PMRA's level of concern.
That is, in many cases, Cohen's method will be used, because the PMRA's risk estimate or resulting decision is very sensitive to assumptions about values to assign to ND residues. Thus, the PMRA is justified in recommending stricter criteria for use of Cohen's method than might normally be used in attempting to estimate a "best" estimate of a mean residue value.
It is important to note that, when using CFIA's monitoring data or USDA's PDP or other monitoring data to calculate an average residue for use in a risk assessment, the percentage of the data set that represents "true zeros" (i.e., not treated) should be eliminated from the data set before considering whether the procedure in this document is applicable.
For example, if 80% of a crop is not treated, but 90% of the PDP values are reported as NDs, the untreated portion of the data should be removed from the data set; the remaining NDs (i.e., 10% of the original sample) would be considered to represent treated commodities that have residues at levels lower than the LOD. Thus, in this case, 50% of the data would be censored (10% of the samples are ND and 10% of the samples are greater than the LOD). Since Cohen's method is designed for use with a distribution that is normal, the logarithms of the data should be used if the data are lognormally distributed with the resulting mean and standard deviation of the (original) untransformed data back-calculated using the following formulae for the mean and standard deviation, respectively:
M_{a} = exp(M_{L} + 0.5s_{L}^{2})
s_{a}^{2} = M_{a}^{2}[exp(s_{L}^{2}) - 1]
where M_{a} is the arithmetic mean of the original (untransformed) residue values, M_{L} is the mean of the logarithms of the residue values, s_{a} is the standard deviation of the original (untransformed) residue values, and s_{L} is the standard deviation of the logarithms of the residue values.
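The two calculations described above (the censored fraction that remains once untreated "true zeros" are removed, and the back-calculation from log-scale statistics) can be sketched as follows. The function names are hypothetical and the code is an illustration under the stated assumptions, not a prescribed implementation.

```python
import math

def censored_fraction_of_treated(fraction_not_treated, fraction_nd):
    """Share of the treated samples that are ND, after the untreated
    portion of the data set (true zeros) has been removed."""
    treated = 1.0 - fraction_not_treated
    treated_nd = max(fraction_nd - fraction_not_treated, 0.0)
    return treated_nd / treated

def back_transform(m_log, var_log):
    """Arithmetic mean M_a and variance s_a^2 of the untransformed data,
    given the mean M_L and variance s_L^2 of the natural logarithms."""
    m_a = math.exp(m_log + 0.5 * var_log)
    var_a = m_a ** 2 * (math.exp(var_log) - 1.0)
    return m_a, var_a

# Example from the text: 80% of the crop not treated, 90% of values ND.
censored = censored_fraction_of_treated(0.80, 0.90)
```

For the 80%/90% example, the function returns a censored fraction of 50%, which sits at the upper bound of the recommended criterion for Cohen's method.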
In general, the criterion that the data be normally (or lognormally) distributed is not expected to present an impediment to the widespread application of this technique. On a theoretical basis, concentrations of pesticides in food crops might be expected to be a random-product process and the Theory of Successive Random Dilutions (SRD) would predict that concentrations of pesticides would be lognormal (Ott 1995). In addition, a fair amount of empirical evidence for a lognormal distribution of pesticides in foods exists from a recent study by the U.K.'s MAFF in which thousands of individual serving sized samples were analyzed for a variety of pesticides and found to follow in many cases a lognormal distribution (MAFF 1997).
Briefly, Cohen's technique for censored samples involves the following steps for lognormally distributed data (derived from Perkins et al. 1990):
- determine N = total sample size;
- determine n = number of quantitated measurements;
- calculate h = (N - n)/N;
- transform the uncensored measurements to logarithms;
- determine ln(LOD) = X_{o};
- determine S_{L}^{2}⁄(X_{L} - X_{o})^{2} = γ, where X_{L} and S_{L}^{2} are the mean and (population) variance of the log-transformed detectable data, respectively;
- using appropriate tables (e.g., in the United States, PMRA (1996) or Perkins et al. (1990)) with h and γ, find λ;
- M_{L} = X_{L} - λ(X_{L} - X_{o});
- s_{L}^{2} = S_{L}^{2} + λ(X_{L} - X_{o})^{2};
- M_{a} = exp(M_{L} + 0.5s_{L}^{2}); and
- s_{a}^{2} = M_{a}^{2}[exp(s_{L}^{2}) - 1].
Cohen's paper (expression 2.5.5) indicates he is using n in the denominator, rather than n - 1. The use of n in the denominator is more commonly associated with a population variance formula, while the use of n - 1 is associated with the sample variance formula.
An example of the use of Cohen's method is shown below and is derived from Gilbert (1987, p. 183).
Concentrations of a pesticide in 10 samples of an agricultural commodity are as follows (in ppm): <0.2, 0.45, 0.60, 0.76, 1.05, 1.12, 1.20, 1.37, 1.69 and 2.06. The LOD is reported at 0.2 ppm. A statistical evaluation (not shown) demonstrates that these values are consistent with a lognormal distribution. Using this information, the following results would be generated:
- N = total sample size = 10
- n = number of quantitated measurements = 9
- h = (N - n)/N = (10 - 9)/10 = 0.1
- The natural logarithms of the pesticide concentrations are (respectively) as follows: ND, -0.7985, -0.5108, -0.2744, 0.04879, 0.1133, 0.1823, 0.3148, 0.5247 and 0.7227.
- X_{o} = ln(LOD) = ln(0.20) = -1.6094
- X_{L} = 0.035 88; S_{L}^{2} = 0.211 93; γ = 0.211 93/(0.035 88 + 1.6094)^{2} = 0.078 29
- λ = 0.1164 (from the tables, using h = 0.1 and γ = 0.078 29)
- M_{L} = 0.035 88 - 0.1164(0.035 88 + 1.6094) = -0.1556
- s_{L}^{2} = 0.211 93 + 0.1164(0.035 88 + 1.6094)^{2} = 0.5270
- M_{a} = exp[-0.1556 + 0.5(0.5270)] = 1.114
- s_{a}^{2} = (1.114)^{2}[exp(0.5270) - 1] = 0.8611
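The worked example above can be checked with a short script. This is a sketch only: the function name and return structure are hypothetical, and the tabulated constant (0.1164, read from published tables using h and γ) is supplied as an input rather than computed, because the tables are not reproduced in this document.

```python
import math

def cohen_lognormal(detects, lod, n_total, table_value):
    """Cohen's adjustment for lognormal data with a single LOD.

    detects     -- measured concentrations above the LOD
    lod         -- the single limit of detection
    n_total     -- total sample size N, including nondetects
    table_value -- constant looked up in published tables from h and gamma
    """
    logs = [math.log(x) for x in detects]
    n = len(logs)
    h = (n_total - n) / n_total
    x_l = sum(logs) / n                                  # mean of log detects
    var_l = sum((v - x_l) ** 2 for v in logs) / n        # population variance (n in denominator)
    x_o = math.log(lod)
    gamma = var_l / (x_l - x_o) ** 2
    m_l = x_l - table_value * (x_l - x_o)                # adjusted log-scale mean
    s2_l = var_l + table_value * (x_l - x_o) ** 2        # adjusted log-scale variance
    m_a = math.exp(m_l + 0.5 * s2_l)                     # arithmetic mean
    s2_a = m_a ** 2 * (math.exp(s2_l) - 1.0)             # arithmetic variance
    return {"h": h, "gamma": gamma, "M_L": m_l, "s2_L": s2_l,
            "M_a": m_a, "s2_a": s2_a}

# Data from the example (one value reported as <0.2 ppm).
detects = [0.45, 0.60, 0.76, 1.05, 1.12, 1.20, 1.37, 1.69, 2.06]
result = cohen_lognormal(detects, lod=0.2, n_total=10, table_value=0.1164)
```

In practice the calculation is done in two passes: h and γ are computed first, the constant is read from the tables, and the remaining quantities follow.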
Estimation of specific values that lie below the detection limit
Cohen's method is appropriate for use in cases where it is sufficient to calculate a mean residue value and where the basic distributional and other requirements are met. In general, the use of a mean value in a risk assessment is appropriate if a chronic analysis is being performed or if it is actually the mean value in an acute analysis which is of interest. In certain instances, it may not be sufficient to simply obtain the mean (and standard deviation) of a data set by use of Cohen's method. For example, it may be desired to perform a Monte Carlo analysis using data from a market basket survey in which single serving sized samples were analyzed and many NDs were obtained, or it may be required to insert data from field trials with many ND values into a Monte Carlo analysis. In these cases, it is the entire set of individual residue values (or their estimates) that is desired and not simply the mean or the values that are greater than the LOD.
In these cases, it may be possible to use the information provided by the uncensored portion of the data to impute those values that lie below the detection limit in the "censored" region of the data. It is important that the nontreated NDs be removed from the distribution prior to performing statistical calculations (as discussed under the Cohen's Method section). Of course, one can instead simply substitute ½ LOD or ½ LOQ, as appropriate, for the NDs in a Monte Carlo analysis. As in the case with Cohen's method used to calculate a mean residue value, those data that are greater than the detection limit should be demonstrated to follow a specific hypothesized distribution; it is this specific distribution that is used to extrapolate residue values into the less than detection limit (or censored) region of the data. The resulting imputed values can then be used directly in a Monte Carlo assessment.
A variety of ways are available to impute values that lie below the detection limit in the censored region of the data (i.e., imputing of left-censored data), and there is an extensive literature on this topic (see References section). The PMRA is not advocating specific ways in which this statistical imputation can or should be done but, rather, simply emphasizing that, whichever methodology is selected, it should be adequately supported. In general, these methods should be used only when it has been demonstrated that the relevant risk estimate (e.g., chronic risk based on mean input values or acute risk based on the distribution of input values and high-end (output) exposure percentiles) or risk-management decision is sensitive to the assumption that ND values are equal to ½ LOD (or ½ LOQ). In general, these techniques will only be used if more than 10-15% of the data are NDs.
One popular method for this imputation procedure is Helsel's Robust Method (Helsel 1990; ILSI 1998). This method can be used to extrapolate a distribution to the region of the censored portion of the data and, hence, generate "replacement values" for those measurements that are simply reported as "below detection limit". As stated by Helsel (1990):
These methods combine observed data above the reporting limit with below-limit values extrapolated, assuming a distributional shape, in order to compute estimates of summary statistics. A distribution is fit to the data above the reporting limit by either MLE or probability plot procedures, but the fitted distribution is used only to extrapolate a collection of values below the reporting limit.
In general, the fitting of the distribution is done by either Maximum Likelihood Estimation (MLE) procedures or by probability plot procedures (which generally require that there be only one censoring point). Roughly speaking, the MLE procedures use, in a complex iterative mathematical optimization procedure, the reported values above the detection limit and the values of the detection limits to estimate the parameters of the family of distribution (e.g., normal, lognormal) which "maximize the likelihood" of observing the data actually observed. Once the values of the defining parameters have been obtained, they are used to generate replacement, or "fill-in", values for the observations below the detection limit. A sample plot in which Helsel's procedure and MLE techniques were used is shown below^{Footnote 1}. The plot used recent single-serving data generated by USDA's PDP.
The probability plot procedures (also referred to as "regression on order statistics" or "regression on expected normal scores"), in contrast, can be easily computed with standard statistical software, which estimates the intercept and slope (representing the mean and standard deviation, respectively) of a line fit to the data above the detection limit. As stated in Helsel (1990):
The robust probability plotting method for a single reporting limit can be computed easily by most commercially available statistics software. Normal scores (NSCORES of Minitab or PROC RANK within SAS, for example) first are computed with all less-thans set to slightly different values all below the reporting limit. Second, a linear regression equation is developed using only the above-limit observations, where log of concentration is the y variable and normal scores the x variable. Estimates for the below-limit data then are extrapolated using this regression equation for normal scores for the below limit data. Finally, extrapolated values are retransformed into units of concentration, combined with above-limit concentration data, and summary statistics computed.
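The probability plotting steps quoted above can be sketched in a few lines of Python. This is a simplified illustration, not Helsel's published procedure: it assumes a single reporting limit with every detected value above it (so the nondetects occupy the lowest ranks), it uses the plotting position i/(N + 1), which is one of several conventions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def ros_fill_in(detects, n_censored):
    """Regression on normal scores for a single reporting limit.

    Returns (imputed_values, intercept, slope), where the intercept and
    slope estimate the log-scale mean and standard deviation.
    """
    detects = sorted(detects)
    n_total = len(detects) + n_censored
    std_normal = NormalDist()
    # Normal scores for all ranks; nondetects take the lowest ranks.
    scores = [std_normal.inv_cdf(i / (n_total + 1)) for i in range(1, n_total + 1)]
    z_censored, z_detected = scores[:n_censored], scores[n_censored:]

    # Least-squares fit of log concentration (y) on normal score (x),
    # using only the above-limit observations.
    y = [math.log(v) for v in detects]
    z_bar = sum(z_detected) / len(z_detected)
    y_bar = sum(y) / len(y)
    slope = (sum((z - z_bar) * (v - y_bar) for z, v in zip(z_detected, y))
             / sum((z - z_bar) ** 2 for z in z_detected))
    intercept = y_bar - slope * z_bar

    # Extrapolate below the limit and retransform to concentration units.
    imputed = [math.exp(intercept + slope * z) for z in z_censored]
    return imputed, intercept, slope

# Hypothetical use with nine detected values and one nondetect.
detects = [0.45, 0.60, 0.76, 1.05, 1.12, 1.20, 1.37, 1.69, 2.06]
imputed, intercept, slope = ros_fill_in(detects, n_censored=1)
combined = imputed + detects   # data set for summary statistics or Monte Carlo input
```

An extrapolated value is not constrained to fall below the reporting limit, so the fitted line should be inspected before the imputed values are used.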
Figure: Plot of United States Department of Agriculture Pesticide Data Program Single Serving Residue Data in Which Censored Data was Extrapolated Using Helsel's and Maximum Likelihood Estimation Procedures
The figure is a sample plot in which Helsel's procedure and Maximum Likelihood Estimation techniques are used to impute values that lie below the detection limit in the censored region of data. The linear regression plot indicates that half the samples in the Pesticide Data Program have detectable residues and that half the samples have residue levels below the limit of detection. The residue levels below the limit of detection have been extrapolated using a regression equation on normal scores, where log concentration is the y variable and the normal score is the x variable. As expected, a normal distribution of the detected and extrapolated residue data is observed in the box plot.
The figure contains three primary graphical elements.
The first element is a 2-dimensional histogram with percentile along the abscissa and the natural logarithm of residue concentration along the ordinate. The ordinate increases from minus 7 at the origin to minus 1 at the top left end of the axis. The top five bars are shaded in a light green, which indicates they are derived from uncensored data. The lower five bars are shaded a darker green to indicate these are imputed or filled-in values calculated by Maximum Likelihood Estimation (MLE) methods. The maximum percentile occurs at an ordinate value of about minus 4, and decreases monotonically above and below this point. A red curve is fit to the histogram to emphasize the symmetry and normality of the distribution for the real and censored data. The maximum ordinate value of the red curve is again at about minus 4, and decreases asymptotically from this point in both directions to minus 1 and minus 7.
The second element is a box-and-whisker plot which graphically depicts the point of central tendency and the upper and lower quantiles of the data in the previously described histogram. The box-and-whisker plot is a visual descriptor, and does not line up precisely with the ordinate of the histogram. The point of central tendency corresponds with minus 4, while the upper and lower quantile boxes are approximately one half of 1.78 (noting that log_{10}(1.78) = 0.25) above and below minus 4. The whiskers extend from approximately minus 1.5 down to minus 6.5, or a span of one half of five on either side of the central value (noting that log_{10}(5) = 0.70).
The third element is a 2-dimensional probability plot with normal scores along the abscissa, from minus 3 at the origin to plus 3 at the right. The ordinate is unlabeled; however, the body text indicates that it describes the logarithm of residue concentration. That is to say, the co-ordinate system is complementary to the previous two elements. A secondary abscissa scale for the percentiles is oriented along the top of this plot. These progress from 0.01 at the top left and increase to 0.95 at the top right. Data above the ordinate value of zero are plotted as black squares and represent the log-transformed values of uncensored residue data (i.e., real, measured values). These occur in the upper right-hand portion of the plot. These transformed data are linearly regressed and used to extrapolate values of the points below the limit of detection (i.e., imputed estimates of censored values), located in the lower left-hand portion of the plot. The estimated values below the ordinate value of zero are plotted as red x's and represent the log-transformed values of censored residue data. The density of both real and imputed values is higher at the mid-point of the plot, and decreases toward the upper right and lower left ends of the regression line. This reflects the density distribution of the data as presented in the histogram (element one). The intercept and slope for the regression equation are derived from the mean and standard deviation, respectively. Upper and lower bounds along the regression line are also plotted as red curves. These are roughly symmetrical about the regression line, with broader bounds at the extreme ends (top right and lower left-hand edges), and narrowest at the mid-point. Overall, the data points (real and extrapolated) and the regression line increase linearly from the lower left portion of the plot to the upper right portion. The real data (above zero) increase their wander from the regression line as the distance from zero increases.
A vertical green bar is located at the mid-point to indicate the zero point along the quantile abscissa corresponds with the 50th percentile.
In either case, the extrapolated values below the detection limit can be combined with the actual values above the detection limit to produce a discrete data set that can be used as input data in a Monte Carlo probabilistic assessment. Details of how this could be performed are available in the literature. Briefly, a distribution (e.g., normal, lognormal) that is defined by a mean and standard deviation could be used to impute those values that lie below the detection limit by using the inverse cumulative distribution function, M^{-1}, where M^{-1}(p) = z score and p = n/(N + 1) (assuming a normal distribution), with n being the rank of the value and N the total number of values. Any imputed ND values used to "fill in" the distribution (i.e., replace the <LOD values with more appropriate single-valued finite estimates) would be calculated as follows:
"Fill-in" value = z score × SD + mean.
For example, suppose that there are 100 data points of which 98 are above the detection limit and 2 are below the detection limit. Further suppose that calculation via Cohen's (or any other) method results in an estimated mean of the distribution of 10.0 and a standard deviation of 2.0. The values would then be ranked, and (since there are 100 total values), the first ND would occupy the 0.01 quantile, and the second ND would occupy the 0.02 quantile.
The corresponding p values would be calculated as p_{1} = 1/(100 + 1) = 0.0099 and p_{2} = 2/(100 + 1) = 0.0198 for the first and second ND values, respectively. Using a normal probability table, one would determine that M^{-1}(p_{1}) = M^{-1}(0.0099) = -2.33 and M^{-1}(p_{2}) = M^{-1}(0.0198) = -2.06. The fill-in values associated with these two z scores are (-2.33 × 2.0) + 10.0 = 5.34 and (-2.06 × 2.0) + 10.0 = 5.88. Thus, 5.34 and 5.88 would be the two fill-in values associated with the two NDs.
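The fill-in calculation in this example can be reproduced without a normal probability table. The sketch below assumes a normal distribution and uses the inverse cumulative distribution function from the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist

def fill_in_values(n_total, n_nd, mean, sd):
    """Fill-in values for the n_nd lowest-ranked (ND) observations."""
    std_normal = NormalDist()            # standard normal, mean 0 and SD 1
    values = []
    for rank in range(1, n_nd + 1):
        p = rank / (n_total + 1)         # p = n/(N + 1)
        z = std_normal.inv_cdf(p)        # z score for that quantile
        values.append(z * sd + mean)     # "fill-in" value = z score x SD + mean
    return values

# 100 data points, 2 NDs, estimated mean 10.0 and standard deviation 2.0.
fills = fill_in_values(n_total=100, n_nd=2, mean=10.0, sd=2.0)
```

Rounded to two decimal places, the two results agree with the 5.34 and 5.88 obtained from the table lookup.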
List of abbreviations
- AAFC: Agriculture and Agri-Food Canada
- ARs: Anticipated Residues
- CFIA: Canadian Food Inspection Agency
- EDF: Empirical Distribution Function
- FDA: Food and Drugs Act (Canada)
- FDR: Food and Drugs Regulations
- FDAR: Food and Drugs Act and Regulation
- FQPA: Food Quality Protection Act (U.S.)
- IDL: Instrument Detection Limit
- IQL: Instrument Quantification Limit
- LLMV: Lower Limit of Method Validation
- LOD: Limit of Detection
- LOQ: Limit of Quantitation
- MAFF: Ministry of Agriculture, Fisheries, and Food (U.K.)
- MDL: Method Detection Limit
- MLE: Maximum Likelihood Estimation
- MRL: Maximum Residue Limit
- NAFTA: North American Free Trade Agreement
- NDs: Nondetects
- ppm: parts per million
- PMRA: Pest Management Regulatory Agency
- PDP: Pesticide Data Program
- PHIs: Preharvest Intervals
- RDF: Residue Distribution File
- SRD: Successive Random Dilutions
- U.S.: United States
- USDA: United States Department of Agriculture
- U.S. EPA: United States Environmental Protection Agency
- U.S. FDA: United States Food and Drug Administration
Appendix I
To date, the PMRA has not issued formal guidance or procedures or made available a list of acceptable methodologies for the estimation of LOD and/or LOQ values for pesticide residue analyses. Due, in part, to the many valid operational definitions of LOD and LOQ and procedures used to estimate these, the PMRA believes it unwise to prescribe any one specific procedure or protocol as a standard universal requirement for pesticide registration submissions. Any reasonable, generally recognized procedure consistent with the aims and requirements of regulatory exposure estimation and risk-assessment practices of the PMRA may be considered and will be evaluated.
The PMRA notes, however, that there may be some confusion in the regulated community with respect to LOD and LOQ issues. This is likely due, in part, to the plethora of definitions for LOD and LOQ, a lack of distinction between the two, organizational preference for one over the other and the proliferation of several synonymous terms such as "limit of determination" or "limit of sensitivity". In a number of instances the PMRA has noted in residue chemistry submissions that these terms have been inappropriately used, used interchangeably or used without supporting documentation and/or information concerning their derivation. In many cases, the PMRA has found that a sample is reported to contain nondetectable residues when, upon further investigation, the proper designation should have been "nonquantifiable", or vice versa.
This confusion over the definition of LOD (or LOQ), and the implicit statistical concepts which are central to its definition, is alluded to by Berthouex and Brown (1994). The method limit of detection or method detection limit (MDL) is based on the ability of a measurement method to determine an analyte in a sample matrix, regardless of its source of origin. Processing the specimen by dilution, extraction, drying, etc. introduces variability, and it is essential that the MDL include this variability.
The MDL is a statistical concept, although it is often thought of as a chemical concept, because it varies from substance to substance and it becomes possible to measure progressively smaller quantities as analytical measurements improve. Nevertheless, the MDL is a statistic that is estimated from the data. As such, it has no scientific meaning until it is operationally defined in terms of a measurement process and a statistical method for analyzing the measurements that are produced. Without a precise statistical definition, one cannot determine a numerical value for the LOD or expect different laboratories to be consistent in how they determine the LOD.
Many definitions have been published. They may differ in detail, but broadly speaking they are all defined in terms of a multiple of the standard deviation of measurements on blank specimens, or, alternatively, on specimens that have a very low concentration of the analyte of interest. All definitions exhibit the same difficulty with regard to how the standard deviation of blank specimens is to be estimated. Without a precise statistical definition, one cannot determine a scientifically plausible value for the limit of detection, expect different laboratories to be consistent in how they determine the LOD or be scientifically honest about declaring that a substance has (or has not) been detected. Beyond the statistical definition, there must be a clear set of operational rules for how this measurement error is to be determined in the laboratory.
Most published definitions are weak with respect to these instructions, which must explain how to estimate the variances and what kind and number of blanks are to be used. Given this confusion, and in the interest of informing registrants and other data submitters of one potential method for LOD/LOQ determination which would fully meet relevant scientific and statistical criteria, an illustrative example is shown below.
The information provided is only illustrative of a technique that a registrant may or may not choose to follow. Registrants are free to use any reasonable and scientifically and statistically supportable methodology. Regardless of whether this specific example methodology or another separate methodology is chosen, the procedures used by a laboratory to determine the LOD and LOQ should be fully explained and/or copies of any appropriate publications should be submitted with the analytical method description to the Agency. In addition, the PMRA expects that adequate supporting documentation (e.g., chromatograms, calculations, etc.) would be included in the submission.
Illustrative example
An analyst wishes to determine the LOD and LOQ for a specific method for measurement of a given pesticide in a given crop matrix. This method may be the proposed enforcement analytical method or simply a method that is used for measurement of residues in crop field trials or market basket surveys. The estimation of the LOQ and LOD of a specific method for a specific pesticide on a specific crop is done in the following two steps.
- The first step is to produce a preliminary estimate of the LOD and LOQ and to verify that a linear relationship between concentration and instrument response exists^{Footnote 2}. These preliminary estimates correspond to what some term the IDL (Instrument Detection Limit) and IQL (Instrument Quantitation Limit), respectively. The matrix of interest will be fortified (spiked) at the estimated LOQ in the next step for the actual estimation of LOD and LOQ of the method.
- The second step is to use the initial estimate of the LOD and LOQ determined in Step 1 to estimate the method detection limit and method quantitation limit in the matrix of interest.
These two steps are described in detail below.
Step 1
The analyst derives a standard curve for the method of interest. In this particular instance, the analyst prepares the standard solutions with the following concentrations of the pesticide of interest (all in ppm): 0.005, 0.010, 0.020, 0.050 and 0.100. For each concentration in the sample solution^{Footnote 3}, the following instrument responses (measured peak height) are recorded.
Concentration (ppm) | Instrument Response (peak height) |
---|---|
0.1 | 206 493 |
0.05 | 125 162 |
0.02 | 58 748 |
0.01 | 32 668 |
0.005 | 17 552 |
To verify that a linear response is seen throughout the tested range, the instrument response is plotted as a function of injected concentration. The results (and associated statistics) are shown in Figure 1. Note from these results that the instrument response appears to be adequately linear throughout the range of tested concentrations (0.005 to 0.100 ppm) and that the R^{2} value from the "Summary of Fit" box is adequate (0.99003). The standard deviation (presented in the "Summary of Fit" box of Figure 1 as the Root Mean Square Error) is 8986.8^{Footnote 4}. The equation which describes this relationship (provided in the "Parameter Estimates" box of Figure 1) is as follows:
Y = 15120 + 1 973 098 × concentration
where Y is the instrument response (peak height).
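The fit reported in Figure 1 can be checked independently of any particular statistics package. The following is a minimal sketch, assuming an ordinary (unweighted) least-squares fit of the five standard-curve points tabulated above; the function and variable names are illustrative and are not part of the original method.

```python
import math

# Standard-curve data from the table above: (concentration in ppm, peak height).
STANDARDS = [
    (0.100, 206493),
    (0.050, 125162),
    (0.020, 58748),
    (0.010, 32668),
    (0.005, 17552),
]


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares about the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    r_squared = 1.0 - ss_res / syy
    # Root mean square error with n - 2 degrees of freedom (two fitted parameters).
    rmse = math.sqrt(ss_res / (n - 2))
    return slope, intercept, r_squared, rmse


slope, intercept, r_squared, rmse = fit_line(STANDARDS)
print(f"slope = {slope:.1f}, intercept = {intercept:.1f}")
print(f"R^2 = {r_squared:.5f}, RMSE = {rmse:.1f}")
```

The printed slope, intercept, R^{2} and root mean square error can be compared directly with the "Parameter Estimates" and "Summary of Fit" entries in Figure 1.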
The estimated LOD and LOQ are calculated as follows (assuming these values are set to be 3 and 10 standard deviations above the blank response, respectively):
1. The peak height at the LOD (Y_{LOD}) is calculated as the intercept plus 3 times the standard deviation, while the peak height at the LOQ (Y_{LOQ}) is calculated as the intercept plus 10 times the standard deviation:
- Y_{LOD} = 15 120 + 3 × (8987) = 42 081
- Y_{LOQ} = 15 120 + 10 × (8987) = 104 990
r = 0.9950
m = 1 973 098.5
b = 15 120
x_{i} | y_{i} | yhat | \|y_{i} - yhat\| | (y_{i} - yhat)^{2} |
---|---|---|---|---|
0.100 | 206 493 | 212 431 | 5938 | 35 260 680 |
0.050 | 125 162 | 113 774 | 11 388 | 129 696 698 |
0.020 | 58 748 | 54 579 | 4169 | 17 380 190 |
0.010 | 32 668 | 34 848 | 2180 | 4 750 400 |
0.005 | 17 552 | 24 982 | 7440 | 55 350 470 |
 | | | ∑ | 242 438 439 |
S_{y/x} = [242 438 439/(5 - 2)]^{½} = 8989.6
where y_{i} is the observed instrument response and yhat is the predicted instrument response given the "best fit" regression.
Figure 1. Statistical Results Using JMP Software
Summary of Fit
- RSquare: 0.99003
- RSquare Adj: 0.986707
- Root Mean Square Error: 8986.837
- Mean of Response: 88124.6
- Observations (or Sum Wgts): 5
Term | Estimate | Std Error | t Ratio | Prob > |t| | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | 15119.954 | 5834.672 | 2.59 | 0.0810 | -3448.891 | 33688.799 |
Concentration | 1973098.5 | 114317.5 | 17.26 | 0.0004 | 1609283.2 | 2336913.9 |
2. These values (peak height at the LOD and peak height at the LOQ) are then used to calculate the concentrations associated with these peak heights as follows:
Y = 15120 + 1 973 098 × concentration
Rearranging,
Concentration = (Y - 15 120) / 1 973 098
Therefore,
LOD = (Y_{LOD} - 15 120) / 1 973 098 = (42 081 - 15 120) / 1 973 098 = 0.014 ppm => 0.014
LOQ = (Y_{LOQ} - 15 120) / 1 973 098 = (104 990 - 15 120) / 1 973 098 = 0.046 ppm => 0.05
Thus, the initial estimated LOD and LOQ are 0.014 and 0.046 ppm, respectively, which correspond to the IDL and IQL.
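The Step 1 arithmetic can be condensed into a few lines. The sketch below assumes the rounded intercept, slope and standard deviation used in the hand calculation above; the function name `concentration_at_k_sd` is illustrative only.

```python
# Fitted calibration line and standard deviation as used in the hand calculation.
INTERCEPT = 15120.0   # peak height at zero concentration
SLOPE = 1973098.0     # peak height per ppm
SD = 8987.0           # root mean square error about the line (peak height)


def concentration_at_k_sd(k, intercept=INTERCEPT, slope=SLOPE, sd=SD):
    """Concentration whose predicted response lies k standard deviations above the intercept."""
    response = intercept + k * sd          # e.g. Y_LOD for k = 3, Y_LOQ for k = 10
    return (response - intercept) / slope  # invert Y = intercept + slope * concentration


lod = concentration_at_k_sd(3)
loq = concentration_at_k_sd(10)
print(f"LOD = {lod:.3f} ppm, LOQ = {loq:.3f} ppm")
```

Because the intercept is added and then subtracted, each estimate reduces to k × SD / slope, so the preliminary LOD and LOQ depend only on the scatter about the line and the sensitivity of the instrument.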
Again, these estimated LODs (or IDLs) and LOQs (or IQLs) are expressed in terms of the solution concentration and not in terms of the matrix concentration. At this stage, the solution concentration (µg/mL solution) should be converted to the effective concentration in the matrix (e.g., µg/g of matrix).
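As a rough illustration of this conversion, the sketch below assumes the simplest case, in which a known mass of matrix is extracted into a known final volume with no further dilution or aliquot factors. The function name and the sample masses and volumes are hypothetical and are not taken from the method described here.

```python
def matrix_concentration(solution_ug_per_ml, final_volume_ml, sample_mass_g):
    """Convert a solution concentration (ug/mL) to a matrix concentration (ug/g).

    Assumes the whole extract of `sample_mass_g` of matrix ends up in
    `final_volume_ml` of solution, with no additional dilution steps.
    """
    if sample_mass_g <= 0 or final_volume_ml <= 0:
        raise ValueError("sample mass and final volume must be positive")
    return solution_ug_per_ml * final_volume_ml / sample_mass_g


# Hypothetical case 1: 10 g of matrix in a 10 mL final extract (1 g/mL),
# so the matrix value is numerically the same as the solution value.
iql_matrix_equal = matrix_concentration(0.046, final_volume_ml=10.0, sample_mass_g=10.0)

# Hypothetical case 2: 5 g of matrix in a 10 mL final extract (0.5 g/mL),
# so the matrix value is twice the solution value.
iql_matrix_dilute = matrix_concentration(0.046, final_volume_ml=10.0, sample_mass_g=5.0)

print(iql_matrix_equal, iql_matrix_dilute)
```

Any real method would also need to account for its own dilution, concentration and aliquot steps, which this sketch deliberately omits.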
Step 2
With the initial estimates of the LOD (or IDL) and LOQ (or IQL) obtained and linearity verified, Step 2 involves estimating the LOD and LOQ in spiked matrix samples. This step uses the estimated instrumental LOQ and the procedure detailed in the U.S. EPA document 40 CFR Part 136, Appendix B, and in the Handbook of Environmental Analysis, which provides a better estimate of the LOQ and verifies that method recoveries are acceptable.
The method calls for the analysis of seven or more (n) untreated control samples spiked at the estimated LOQ. The standard deviation of these samples is measured and the LOD and LOQ are determined as follows:
LOD = t_{0.99} × S
LOQ = 3 × LOD
where t_{0.99} = one-tailed t statistic at the 99% confidence level for n - 1 degrees of freedom and S = standard deviation of the n samples spiked at the estimated LOQ.
# of Replicates (n) | Degrees of Freedom (n - 1) | t_{0.99} | # of Replicates (n) | Degrees of Freedom (n - 1) | t_{0.99} |
---|---|---|---|---|---|
3 | 2 | 6.965 | 13 | 12 | 2.681 |
4 | 3 | 4.541 | 14 | 13 | 2.65 |
5 | 4 | 3.747 | 15 | 14 | 2.624 |
6 | 5 | 3.365 | 16 | 15 | 2.602 |
7 | 6 | 3.143 | 17 | 16 | 2.583 |
8 | 7 | 2.998 | 18 | 17 | 2.567 |
9 | 8 | 2.896 | 19 | 18 | 2.552 |
10 | 9 | 2.821 | 20 | 19 | 2.539 |
11 | 10 | 2.764 | 21 | 20 | 2.528 |
12 | 11 | 2.718 | 22 | 21 | 2.518 |
Concentration Detected (ppm) | % Recovery |
---|---|
0.0397 | 79.4 |
0.0403 | 80.6 |
0.0400 | 80.0 |
0.0360 | 72.0 |
0.0498 | 99.6 |
0.0379 | 75.8 |
0.0388 | 77.6 |
Average concentration: 0.0404 ppm | Average recovery: 80.7% |
Standard deviation: 0.0044 ppm | |
Given that recoveries are adequate at the LOQ (average = 80.7%, range = 72.0-99.6%), the LOD and LOQ for the method are estimated as follows:
LOD = t_{0.99} × S (for 7 - 1 = 6 degrees of freedom)
= 3.143 × 0.0044 ppm
= 0.0138 ppm
LOQ = 3 × LOD
= 3 × 0.0138 ppm
= 0.0414 ppm
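The Step 2 calculation can be sketched in the same way. The example below assumes a fortification level of 0.05 ppm, which is inferred from the tabulated recoveries (for example, 0.0397 ppm at 79.4%) and is not stated explicitly in the table. It looks up t_{0.99} for six degrees of freedom (3.143) from the table of t values above. The names `T_099` and `method_limits` are illustrative.

```python
import statistics

# One-tailed t values at the 99% level, keyed by degrees of freedom
# (a subset of the table of t values given above).
T_099 = {5: 3.365, 6: 3.143, 7: 2.998, 8: 2.896, 9: 2.821}

SPIKE_LEVEL_PPM = 0.05  # assumed fortification level, inferred from the recoveries
detected = [0.0397, 0.0403, 0.0400, 0.0360, 0.0498, 0.0379, 0.0388]


def method_limits(values, t_table=T_099):
    """Return (mean, sd, LOD, LOQ) with LOD = t * sd and LOQ = 3 * LOD."""
    dof = len(values) - 1
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    lod = t_table[dof] * sd
    return statistics.mean(values), sd, lod, 3 * lod


mean, sd, lod, loq = method_limits(detected)
recoveries = [100 * v / SPIKE_LEVEL_PPM for v in detected]
print(f"mean = {mean:.4f} ppm, sd = {sd:.4f} ppm")
print(f"mean recovery = {statistics.mean(recoveries):.1f}%")
print(f"LOD = {lod:.4f} ppm, LOQ = {loq:.4f} ppm")
```

Because the sketch carries the unrounded standard deviation through the calculation, its LOD and LOQ may differ in the last reported digit from a hand calculation that first rounds the standard deviation to 0.0044 ppm.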
References
- Berthouex, P.M., and L.C. Brown. 1994. Statistics for environmental engineers. Ch. 10. Simple methods for analyzing data that are below the limit of detection. Ch. 11. Estimating the mean of censored samples. pp. 81-95.
- Cohen, A.C., Jr. 1959. Simplified estimators for the normal distribution when samples are singly censored or truncated. Technometrics, 1: 217-237.
- Cohen, A.C., Jr. 1961. Tables for maximum likelihood estimates: singly truncated and singly censored samples. Technometrics, 3: 535-541.
- Gilbert, R.O. 1987. Statistical methods for environmental pollution monitoring. John Wiley and Sons, New York.
- Helsel, D.R. 1990. Less than obvious: statistical treatment of data below the detection limit. Environ. Sci. Technol. 24: 1766-1774.
- International Life Sciences Institute (ILSI). 1998. Aggregate exposure assessment. Appendix 3. Analysis of data including observations below certified reporting limits.
- Keith, L.H., et al. 1983. Principles of environmental analysis. Anal. Chem. 55(14): 2210-2218.
- Ott, W.R. 1995. Environmental statistics and data analysis. Ch. 8. Dilution of pollutants. Ch. 9. Lognormal processes. Lewis Publishers, New York. pp. 192-293.
- Perkins, J.L., G.N. Cutter, and M.S. Cleveland 1990. Estimating the mean, variance, and confidence limits from censored (<limit of detection), log-normally distributed exposure data. Am. Ind. Hyg. Assoc. J. 51: 416-419.
- U.K. Ministry of Agriculture, Fisheries, and Food (MAFF). 1997. Data supplied with unit to unit variation of pesticides in fruit and vegetables. March 14. U.K. Ministry of Agriculture, Fisheries, and Food, Pesticides Safety Directorate, London.
- U.S. Environmental Protection Agency (U.S. EPA). 1998a. Guidance for data quality assessment: practical methods for data analysis. EPA QA/G9. QA-97 Version. Office of Research and Development. EPA/600/R-96/084. U.S. Environmental Protection Agency, Washington, D.C.
- U.S. Environmental Protection Agency (U.S. EPA). 1998b. Guidance for submission of probabilistic human health exposure assessments to the Office of Pesticide Programs. Draft document. November 4. U.S. Environmental Protection Agency, Washington, D.C.
- U.S. Environmental Protection Agency (U.S. EPA). 1999. Classification of food forms with respect to level of blending. HED Standard Operating Procedure 99.6 (8/20/99). August 20. U.S. Environmental Protection Agency, Office of Pesticide Programs, Health Effects Division, Washington, D.C.
- U.S. Environmental Protection Agency (U.S. EPA). 2000. Assigning values to nondetected/nonquantified pesticide residues in human health food exposure assessments. U.S. Environmental Protection Agency, Washington, D.C.