Jackknife
Description: You have a single sample that yields a single estimate of some test statistic of interest. To estimate the variance associated with that statistic, a jackknife algorithm resamples from the original sample by eliminating 1 value or case at a time. Because each resample omits a different observation, you can recalculate the test statistic as many times as there are observations in the sample. From these jackknifed estimates you can calculate other statistics of interest, such as the variance.
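As a worked form of that last step, the commonly used jackknife variance estimate can be written as follows. The notation is an assumption introduced here and does not come from the original page: n is the number of observations, \hat{\theta}_{(i)} is the test statistic recalculated with observation i left out, and \hat{\theta}_{(\cdot)} is the mean of those n recalculated values.

```latex
\hat{\theta}_{(\cdot)} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)},
\qquad
\widehat{\mathrm{Var}}_{\mathrm{jack}}(\hat{\theta})
  = \frac{n-1}{n} \sum_{i=1}^{n}
    \left( \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right)^{2}
```

When the statistic is the sample mean, this expression reduces to the familiar s^2/n, where s^2 is the sample variance.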
MAIA example: Pan et al. (1996) developed a mathematical model that used diatoms to predict pH at stream sites. They tested the model's ability to predict pH by regressing predicted pH against measured pH for each site. To avoid circularity in this test, they used a jackknife procedure so that the data used to test the model were not exactly the same data used to build it. For each iteration they omitted the diatom data for 1 stream site and estimated the pH for that site using data from all the other sites. They repeated this process for all the sites, leaving out 1 at a time, and then calculated the r-squared value for the regression of predicted vs. observed pH.
Figure. Measured pH is plotted against pH inferred from the diatoms present at a stream site. Agreement between measured and inferred pH was evaluated with and without jackknifing the data. Results based on jackknifing were most different from estimates without jackknifing for very acidic sites.
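The leave-one-site-out test described in the MAIA example can be sketched roughly as below. This is only an illustration under stated assumptions: a straight-line fit on a single made-up predictor stands in for the diatom model, which is not the model Pan et al. built, and the function names (fit_line, leave_one_out_predictions, r_squared) and all data values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out_predictions(xs, ys):
    """Predict each site's value from a model built without that site."""
    predictions = []
    for i in range(len(xs)):
        # Build the stand-in model from every site except site i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        # Predict the omitted site from its own predictor value.
        predictions.append(a + b * xs[i])
    return predictions


def r_squared(observed, predicted):
    """Squared correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Made-up values for illustration only; not data from Pan et al. (1996).
site_index = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical predictor
measured_ph = [5.1, 5.9, 6.2, 7.0, 7.4, 8.1]  # hypothetical measured pH

inferred_ph = leave_one_out_predictions(site_index, measured_ph)
r2 = r_squared(measured_ph, inferred_ph)
```

Because every prediction comes from a model that never saw the site being predicted, the resulting r-squared is not inflated by reusing the same data to build and test the model.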
How the method works: The jackknife can be viewed as an approximation to a more general resampling method, the bootstrap. Rather than resampling randomly with replacement from the entire sample, as the bootstrap does, the jackknife takes the entire sample except for 1 value and calculates the test statistic of interest. It repeats the process, each time leaving out a different value and recalculating the test statistic, until every value has been left out once.
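The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name jackknife and the sample values are hypothetical, and the variance step uses the standard jackknife scaling factor (n - 1) / n, which the text above does not spell out.

```python
from statistics import mean


def jackknife(sample, statistic):
    """Return (leave-one-out estimates, jackknife variance) for `statistic`."""
    values = list(sample)
    n = len(values)
    if n < 2:
        raise ValueError("the jackknife needs at least two observations")
    # Recalculate the statistic n times, each time omitting one observation.
    estimates = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    center = mean(estimates)
    # Standard jackknife variance: (n - 1) / n times the sum of squared
    # deviations of the leave-one-out estimates from their mean.
    variance = (n - 1) / n * sum((e - center) ** 2 for e in estimates)
    return estimates, variance


# Made-up values for illustration only.
ph_values = [6.8, 7.1, 5.9, 7.4, 6.5]
estimates, variance = jackknife(ph_values, mean)
```

Any function that maps a list of values to a single number can be passed as statistic, for example statistics.median in place of mean.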
Assumptions/limitations: The jackknife assumes that the values came from a sample that was collected randomly and that the observations in the sample are independent.