Van Sickle, J., C.P. Hawkins, D.P. Larsen, and A.T. Herlihy. 2005. A null model for the expected macroinvertebrate assemblage in streams. J. N. Am. Benthol. Soc. 24(1):178-191. WED-04-085

Predictive models such as the River InVertebrate Prediction And
Classification System (RIVPACS) and the AUStralian RIVer Assessment
System (AUSRIVAS) model the natural variation across geographic
regions in the occurrences of macroinvertebrate taxa in data from
streams that are in *reference* condition, i.e., minimally
altered by human-caused stress. The models predict the expected
number of these taxa at any stream site, assuming that site also is
in reference condition. A ratio of observed (O) to expected (E)
taxa (O/E) that differs significantly from 1.0 indicates that
the site is not in reference condition. The standard deviation
(SD) of O/E values estimated for a set of reference sites is a
measure of predictive-model precision, with a small SD indicating
that the model accounts for much of the variability in E that is
associated with natural factors such as stream size and elevation.
We propose a null model for E that assumes fixed occurrence
probabilities for individual taxa across reference sites. The null
model explains none of the variability in E caused by natural
factors, so the SD of its O/E predictions is the upper limit
attainable by any predictive model. We also derive a theoretical
lower limit for SD of O/E that is caused only by replicate-sampling
variation among predictions from a perfect model. Together, the
null-model and replicate-sampling SDs estimate the minimum and
maximum precision, respectively, attainable by any predictive model
for a given set of reference-site data. A predictive model built
from data at 86 reference sites in the Mid-Atlantic Highlands
region, USA, had SD = 0.18 for O/E across those sites, while the
corresponding null model had SD = 0.20, indicating relatively little
gain from the predictive-model effort. In contrast, a model built
from 209 sites in North Carolina, USA, had predictive- and
null-model SDs of 0.13 and 0.28, respectively, indicating that the
North Carolina predictive model had relatively high gain in
precision over the null model. Replicate-sampling SDs of O/E for the
Mid-Atlantic and North Carolina data were 0.09 and 0.11,
respectively, suggesting that the North Carolina predictive model
had little room for further improvement, in contrast to the
Mid-Atlantic model. The precisions of null-model estimates were
lower than those of predictive models, so null models somewhat
underestimated the percentages of 447 and 1773 test assemblages from
the Mid-Atlantic region and North Carolina, respectively, that
differed significantly from reference conditions. The estimates
illustrate how a simple and easily built null model provides a lower
bound for the prevalence of impaired streams within a region.
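
The null-model calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the presence/absence input layout (rows are sites, columns are taxa), and the choice to sum occurrence probabilities over all taxa are hypothetical. The abstract does not say whether the published null model restricts E to a subset of taxa, for example the more common ones.

```python
from statistics import stdev


def null_model_expected(reference):
    """Return (p, E) for a null model with fixed occurrence probabilities.

    reference: list of equal-length 0/1 rows, one row per reference site
    and one column per taxon (hypothetical input layout).
    """
    n_sites = len(reference)
    n_taxa = len(reference[0])
    # Fixed occurrence probability of each taxon: the fraction of
    # reference sites at which that taxon was collected.
    p = [sum(row[j] for row in reference) / n_sites for j in range(n_taxa)]
    # E is the sum of the probabilities, so it is the same at every site.
    return p, sum(p)


def observed_over_expected(sites, expected):
    """O/E per site, where O is the number of taxa observed at the site."""
    return [sum(row) / expected for row in sites]


# Toy presence/absence data, invented purely for illustration.
reference_sites = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

p, expected = null_model_expected(reference_sites)
oe_reference = observed_over_expected(reference_sites, expected)

# SD of O/E across the reference sites: the null-model SD that the
# abstract treats as the upper limit for any predictive model.
null_sd = stdev(oe_reference)
print(f"E = {expected:.2f}, null-model SD of O/E = {null_sd:.2f}")
```

Because E is the sum of the per-taxon occurrence frequencies, the mean O/E across the reference sites equals 1 by construction in this sketch. The same `observed_over_expected` helper could be applied to test-site data using the `expected` value fitted from the reference sites.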