Technical Support Center for Monitoring and Site Characterization
As of March 1, 2011, all updates to ProUCL starting with version 4.1.00 can be found at http://www.epa.gov/osp/hstl/tsc/software.htm.
Statistical Software ProUCL 4.0 for Environmental Applications For Data Sets with and without Nondetect Observations
ProUCL version 4.0 includes statistical methods that can be used to estimate exposure point concentration (EPC) terms, not-to-exceed values, and background threshold values (BTVs) for data sets with and without nondetect (ND) observations. ProUCL version 4.0 retains all of the capabilities of ProUCL version 3.0 and addresses various statistical issues arising in exposure and risk assessment studies, background evaluations, and background-versus-site comparison applications. Specifically, most of the statistical methods described and recommended in the Background Guidance Document for CERCLA Sites (EPA, 2002a) and in the Guidance Document to Compute 95% Upper Confidence Limits (EPA, 2002b) have been incorporated into ProUCL version 4.0. ProUCL version 4.0 has statistical methods that can be used to verify the attainment of cleanup standards (EPA, 1989) and to estimate screening levels (EPA, 1996) for data sets with and without NDs. Some of the statistical methods incorporated in ProUCL version 4.0 (e.g., two-sample hypothesis tests, upper prediction limits, and upper tolerance limits) can be used in groundwater (GW) monitoring applications (EPA, 1992).
ProUCL version 4.0 has goodness-of-fit (GOF) tests for normal, lognormal, and gamma distributed data sets with or without NDs. For data sets with NDs, ProUCL version 4.0 can create additional columns to store extrapolated (estimated) values for the NDs obtained using regression on order statistics (ROS) methods, including: normal ROS, gamma ROS, and lognormal ROS (robust ROS) methods. ProUCL version 4.0 can process multiple contaminants (variables) simultaneously in a batch mode. ProUCL version 4.0 also has the capability of processing data by groups (a valid group column should be included in the data file). ProUCL version 4.0 has a couple of simple outlier test procedures, such as the Dixon test and the Rosner test. ProUCL version 4.0 offers useful graphical displays for data sets with or without NDs, including: histograms, multiple quantile-quantile (Q-Q) plots, and side-by-side box plots. The use of graphical displays provides additional insight about information (such as hidden data structures) contained in data sets that may not be revealed by the use of estimates (e.g., 95% upper limits) or test statistics, such as the GOF test statistics or the Rosner test. In addition to providing information about the data distribution (e.g., normal or gamma), Q-Q plots are very useful in identifying potential outliers or the presence of mixture samples (e.g., data from different populations) in a data set. Side-by-side box plots and multiple Q-Q plots are useful to visually compare two or more data sets (groups), such as: site-versus-background contaminant concentrations, surface-versus-subsurface concentrations, contaminant concentrations of groundwater monitoring wells (MWs), and the areas of concern (AOCs) of a potentially polluted site.
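The lognormal ROS idea mentioned above can be illustrated with a minimal sketch. It is not ProUCL's implementation: it assumes a single detection limit lying below every detected value, uses Blom plotting positions (an assumption; other position formulas exist), and the function name `lognormal_ros` is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_ros(detects, n_nondetects):
    """Estimate values for nondetects by lognormal ROS.

    Simplifying assumption: ONE detection limit, below all detected values,
    so the nondetects occupy the lowest ranks of the combined sample.
    Requires at least two distinct detected values.
    """
    n = len(detects) + n_nondetects
    # Normal quantiles at Blom plotting positions for ranks 1..n.
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_nd, z_det = z[:n_nondetects], z[n_nondetects:]
    y = [math.log(v) for v in sorted(detects)]

    # Ordinary least squares of log(detected values) on their normal quantiles.
    z_bar = sum(z_det) / len(z_det)
    y_bar = sum(y) / len(y)
    sxy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z_det, y))
    sxx = sum((a - z_bar) ** 2 for a in z_det)
    slope = sxy / sxx
    intercept = y_bar - slope * z_bar

    # Extrapolate the fitted line to the nondetect quantiles, then back-transform.
    return [math.exp(intercept + slope * q) for q in z_nd]


imputed = lognormal_ros([2.0, 3.1, 4.5, 7.2, 11.0], n_nondetects=3)
print([round(v, 2) for v in imputed])
```

The imputed values would then be stored alongside the detected values (as in the additional columns described above) and used to compute summary statistics. Data sets with multiple detection limits need a more careful assignment of plotting positions than this sketch provides.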
ProUCL version 4.0 has several parametric and nonparametric single-sample and two-sample hypothesis-testing approaches. Single-sample hypothesis tests (e.g., Student’s t-test, the sign test, the Wilcoxon signed rank test, and the proportion test) can be used to compare site mean concentrations (or some site threshold value, such as an upper percentile) with some average cleanup standard, Cs (or a not-to-exceed compliance limit, A0), to verify the attainment of cleanup levels (EPA, 1989; EPA, 2006) after some remediation activities have been performed at the impacted site areas. Two-sample hypothesis tests (e.g., Student’s t-test, the Wilcoxon-Mann-Whitney test, the quantile test, and Gehan’s test) can be used for site-versus-background comparisons, comparing concentrations of site AOCs, and comparisons of contaminant concentrations of two or more MWs. The hypothesis-testing approaches can be used on both uncensored (without NDs) and left-censored (with NDs) data sets. ProUCL version 4.0 also has parametric (e.g., maximum likelihood estimate (MLE), t-statistic, and the gamma distribution), nonparametric (e.g., skewness-adjusted CLT and Kaplan-Meier), and computer-intensive bootstrap (e.g., percentile and BCA) methods to compute upper confidence limits (UCLs), upper prediction limits (UPLs), and upper tolerance limits (UTLs) for both uncensored data sets and data sets with ND observations. Some of the methods, such as the Kaplan-Meier (KM) method and the robust ROS methods, are applicable to left-censored data sets having multiple detection limits.
Please note that ProUCL was developed under the umbrella of the comprehensive statistical software package Scout. The Scout 2008 software program contains all of the methods in ProUCL as well as many more sophisticated methods for a more comprehensive or exploratory statistical analysis of censored or uncensored data (including full intervals). The latest version of Scout 2008 (which contains ProUCL) may be downloaded at: http://www.epa.gov/nerlesd1/databases/datahome.htm
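As a rough illustration of three of the UCL approaches named above, the following sketch computes one-sided 95% UCLs of the mean for an uncensored data set using the CLT, a skewness-adjusted CLT, and the percentile bootstrap. It is a sketch under stated assumptions, not ProUCL's code: the function names and the example data are hypothetical, and the skewness adjustment follows the commonly cited Chen (1995) form, which may differ in detail from what the software implements.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def clt_ucl(x, conf=0.95):
    """UCL of the mean from the central limit theorem (normal quantile)."""
    z = NormalDist().inv_cdf(conf)
    return mean(x) + z * stdev(x) / math.sqrt(len(x))


def adjusted_clt_ucl(x, conf=0.95):
    """Skewness-adjusted CLT UCL (assumed form, after Chen 1995)."""
    n = len(x)
    m, s = mean(x), stdev(x)
    z = NormalDist().inv_cdf(conf)
    # Bias-corrected sample skewness.
    k3 = n * sum((v - m) ** 3 for v in x) / ((n - 1) * (n - 2) * s ** 3)
    return m + (z + k3 * (1 + 2 * z * z) / (6 * math.sqrt(n))) * s / math.sqrt(n)


def percentile_bootstrap_ucl(x, conf=0.95, n_boot=2000, seed=0):
    """UCL as the conf-quantile of means of resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    means = sorted(mean(rng.choices(x, k=n)) for _ in range(n_boot))
    return means[min(n_boot - 1, math.ceil(conf * n_boot) - 1)]


# Hypothetical right-skewed concentrations (illustrative values only).
conc = [1.2, 0.8, 2.5, 3.1, 0.9, 14.0, 1.7, 2.2, 1.1, 4.6]
for f in (clt_ucl, adjusted_clt_ucl, percentile_bootstrap_ucl):
    print(f.__name__, round(f(conc), 3))
```

For right-skewed data such as this, the skewness-adjusted limit is larger than the plain CLT limit, which is the motivation for the adjustment.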
References
USEPA. 1989. Methods for Evaluating the Attainment of Cleanup Standards, Vol. 1, Soils and Solid Media. Publication EPA 230/2-89/042.
USEPA. 1992. Statistical Analysis of Ground-water Monitoring Data at RCRA Facilities. Addendum to Interim Final Guidance. Washington, DC: Office of Solid Waste. July 1992.
USEPA. 1996. Soil Screening Guidance: User’s Guide. Office of Solid Waste and Emergency Response, Washington, DC. EPA/540/R-96/018. April 1996.
USEPA. 2002a. Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA 540-R-01-003-OSWER 9285.7-41. September 2002.
USEPA. 2002b. Calculating Upper Confidence Limits for Exposure Point Concentrations at Hazardous Waste Sites. OSWER 9285.6-10. December 2002.
ProUCL 3.0. 2004. A Statistical Software. National Exposure Research Lab, EPA, Las Vegas, Nevada. October 2004.
USEPA. 2006. Data Quality Assessment: Statistical Methods for Practitioners, EPA QA/G-9S. EPA/240/B-06/003. Office of Environmental Information, Washington, DC. Download from http://www.epa.gov/quality/qs-docs/g9s-final.pdf
Announcing ProUCL Version 4.00.05
ProUCL version 4.00.05 is an upgrade of ProUCL version 4.00.04 (EPA, 2009a). ProUCL version 4.00.05 consists of all of the statistical and graphical methods that have been available in previous ProUCL 4.0 versions, addressing various environmental issues for full data sets (that is, without nondetect (ND) observations), as well as for data sets with NDs or below detection limit observations. Several additions (e.g., a sample size determination module), enhancements (a file module), and modifications (such as adding p-values for the Wilcoxon-Mann-Whitney (WMW) test, also known as the Wilcoxon Rank Sum (WRS), Mann-Whitney-Wilcoxon (MWW), or Mann-Whitney U test) have been made in ProUCL version 4.00.05. Some software bugs (e.g., in the computation of the adjusted gamma UCLs), reported by the users of previous ProUCL 4.0 versions, have been fixed in this latest version of ProUCL. With the inclusion of the sample size determination module, ProUCL version 4.00.05 is a comprehensive statistical software package equipped with statistical methods and graphical tools needed to address the many environmental sampling and statistical issues described in various CERCLA (EPA 2002a, 2002b, 2006) and RCRA (EPA 1989b, 1992, 2002c, 2009b) guidance documents. For data sets with or without nondetect observations, ProUCL version 4.00.05 also provides statistical methods to address reference area and survey unit sampling issues described in the MARSSIM (EPA 2000) document. In addition to sample size determination methods, ProUCL version 4.00.05 offers parametric and nonparametric statistical methods (e.g., the sign test and the WMW (or WRS) test) often used to address statistical issues described in the MARSSIM (EPA 2000) guidance document.
The user-friendly sample size determination module of ProUCL version 4.00.05 has a straightforward procedure to enter the desired or pre-specified decision parameters needed to compute the appropriate sample size(s) for a selected statistical application. The sample size module of ProUCL version 4.00.05 provides sample size determination methods for most of the parametric and nonparametric one-sided and two-sided hypothesis-testing approaches available in the hypothesis testing module of the ProUCL 4.0 versions. Some specific changes made in ProUCL version 4.00.05 are listed as follows.
- File Option. This option has been upgraded to open Excel (*.xls) files by default. In the earlier versions of ProUCL, this option was available only for worksheet (*.wst), output (*.ost), or graphics (*.gst) files; Excel files had to be imported. In this new version, the import option can be used to read multiple worksheets from one Excel file. ProUCL version 4.00.05 can import worksheets until all of the available worksheets are read or until a blank or empty worksheet is encountered.
- Displaying All Menu Options. For the user’s convenience, ProUCL version 4.00.05 now displays all of the available menu options, even before a valid (e.g., nonempty) data file is opened. Nonetheless, a valid data file must be opened before activating a menu option or using a statistical or graphical method (that is, an option or method cannot be used on an empty spreadsheet).
- Sample Size Module. Several parametric (distribution-based) and nonparametric sample size determination formulae, as used and described in various EPA guidance documents (e.g., EPA 1989a, 1989b, 1992, 2000, 2002a, 2002b, 2002c, 2006, and 2009b), have been incorporated into ProUCL version 4.00.05. The inclusion of this module will help users develop DQO-based sampling plans with pre-specified values of the decision error rates (α = Type I, and β = Type II) and with a pre-specified width of the “gray” region, Δ, around the parameter of interest (e.g., the mean concentration or the proportion of the sampled observations that exceed an action level). Basic sample size determination formulae have been incorporated for the sampling of continuous characteristics (such as lead or 226Ra), as well as for attributes (for example, the proportion exceeding a specified threshold). Sample size formulae for the acceptance sampling of discrete objects (e.g., drums) have also been incorporated into this module.
- Adjusted Gamma UCL. A minor software bug in the computation of the adjusted level of significance (the adjusted probability level, β) used for calculating adjusted gamma UCLs has been corrected.
- Computation of Nonparametric Percentiles. There are several ways to compute nonparametric percentiles; also, percentiles obtained using various methods can differ slightly. For graphical displays, ProUCL version 4.00.05 uses the development software, ChartFX. Thus, box plots generated by ProUCL display percentiles (e.g., median and quartiles) as computed by ChartFX. In order to avoid confusion, the percentile algorithm used in ProUCL has been modified so that it now computes and displays comparable percentiles as computed by ChartFX.
- UCL Based Upon the Winsorization Method. In the computation of Winsorized UCLs, the sample standard deviation of the Winsorized data was being used instead of the approximate unbiased estimate of the population standard deviation from the Winsorized data, as detailed in the ProUCL version 4.00.04 Technical Guide. This has been corrected in ProUCL version 4.00.05.
- Displaying K Values Used in UTL. The tolerance factor, K, based on the number of valid observations, the level of the confidence coefficient, and the coverage percentage, is displayed along with the UTL statistics and other relevant input parameters.
- Fixes in the p-Values Associated with the WMW (WRS) Test, the Gehan Test, and the Equality of Variances Test. More efficient algorithms have been incorporated into the ProUCL version 4.00.05 software to compute the p-values of the test statistics for those tests.
- Additional Critical Values Associated with Land's H Statistic. In addition to the 0.90 and 0.95 confidence coefficients, the 0.975, 0.990, and 0.995 confidence levels have been incorporated into the ProUCL version 4.00.05 software to compute the H-UCLs based upon a lognormal distribution.
- Adjustment in the Precision Associated with the Lognormal ROS and the Gamma ROS Methods. The lower bound associated with the lognormal ROS and the gamma ROS extrapolated estimates has been extended from 1e-7 to 1e-10. ProUCL version 4.00.05 issues a warning message when extrapolated ROS estimates lie below 1e-10.
- Terminology Changes. Some terminology changes have been made in the single-sample hypothesis-testing approaches available in the hypothesis testing module. Specifically, the phrase “Compliance Limit” has been replaced by the phrase “Action Level.”
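To make the Sample Size Module item in the list above more concrete, here is a minimal sketch of a one-sided, single-sample t-test sample size calculation of the kind given in EPA DQO guidance, n = s²(z₁₋α + z₁₋β)²/Δ² + 0.5·z₁₋α². Whether ProUCL uses exactly this expression for a given test is an assumption; the function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_one_sample_t(sd, delta, alpha=0.05, beta=0.10):
    """Approximate sample size for a one-sided, single-sample t-test.

    sd    : preliminary estimate of the population standard deviation
    delta : width of the gray region around the action level
    alpha : Type I decision error rate
    beta  : Type II decision error rate
    """
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(1 - beta)
    n = (sd * (z_a + z_b) / delta) ** 2 + 0.5 * z_a ** 2
    return math.ceil(n)  # round up to a whole number of samples


# Hypothetical planning inputs: sd = 25, gray region width = 10.
print(sample_size_one_sample_t(sd=25, delta=10))  # 55
```

Narrowing the gray region or tightening either error rate increases the required number of samples, which is the trade-off a DQO-based sampling plan has to balance.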
References for ProUCL Version 4.00.05
U.S. Environmental Protection Agency (EPA). 1989a. Methods for Evaluating the Attainment of Cleanup Standards, Vol. 1, Soils and Solid Media. Publication EPA/230/2-89/042.
U.S. Environmental Protection Agency (EPA). 1989b. Statistical Analysis of Ground-water Monitoring Data at RCRA Facilities. Interim Final Guidance. Washington, DC: Office of Solid Waste. April 1989.
U.S. Environmental Protection Agency (EPA). 1992. Statistical Analysis of Ground-water Monitoring Data at RCRA Facilities. Addendum to Interim Final Guidance. Washington DC: Office of Solid Waste. July 1992.
U.S. Environmental Protection Agency (EPA), U.S. Nuclear Regulatory Commission, et al. 2000. Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM). Revision 1. EPA/402/R-97/016. Available at http://www.epa.gov/radiation/marssim/ or from http://bookstore.gpo.gov/ (GPO Stock Number for Revision 1 is 052-020-00814-1).
U.S. Environmental Protection Agency (EPA). 2002a. Calculating Upper Confidence Limits for Exposure Point Concentrations at Hazardous Waste Sites. OSWER 9285.6-10. December 2002.
U.S. Environmental Protection Agency (EPA). 2002b. Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA/540/R-01/003-OSWER 9285.7-41. September, 2002.
U.S. Environmental Protection Agency (EPA). 2002c. RCRA Waste Sampling, Draft Technical Guidance – Planning, Implementation and Assessment. EPA/530/D-02/002, 2002.
U.S. Environmental Protection Agency (EPA). 2006. Data Quality Assessment: Statistical Methods for Practitioners, EPA QA/G-9S. EPA/240/B-06/003. Office of Environmental Information, Washington, DC. Download from http://www.epa.gov/quality/qs-docs/g9s-final.pdf
U.S. Environmental Protection Agency (EPA). 2009a. ProUCL Version 4.00.04 User Guide (Draft). EPA/600/R-07/038, February 2009. Download from http://www.epa.gov/nerlesd1/databases/datahome.htm
U.S. Environmental Protection Agency (EPA). 2009b. Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities. Unified Guidance Document (UGD). EPA/530/R-09/007, 2009.
Contact Information for ProUCL 4.00.05
The ProUCL software is developed under the direction of the Technical Support Center (TSC). In November 2007, the direction of the TSC was transferred from Brian Schumacher to Felicia Barnett. Therefore, any comments or questions concerning ProUCL should be addressed to:
Felicia Barnett, (HSTL)
US EPA, Region 4
61 Forsyth Street, S.W.
Atlanta, GA 30303-8960
barnett.felicia@epa.gov
(404) 562-8659
Fax: (404) 562-8439
Download Software and Documents for ProUCL Version 4.00.05
Download Software and Documents for Past Versions of ProUCL
Fact Sheet for ProUCL 4.0 (PDF) (18 pp, 532KB)