Sampling Bias

Description: Unless the entire population is sampled (for example, every fish in every stream in a region), bias can creep into estimates. Some random error in sample estimates is expected and acceptable. Bias is different: it arises when some factor affects the results in a consistent, systematic way, and it can lead to wrong conclusions. We can control for the factors we know about, such as upland vs. lowland ecoregion, but bias related to unknown factors is harder to spot, because spotting it requires remembering something you forgot to consider in the first place. Randomization protects against bias because a randomized design is less likely to be confounded by a factor that you did not initially consider.
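As a rough illustration of randomization, the sketch below draws sampling sites at random within each known stratum (here, ecoregion), so that the choice of sites cannot follow some unconsidered factor such as ease of access. The site identifiers, the function name `draw_random_sites`, and the sample sizes are all hypothetical and are not taken from any actual survey design.

```python
import random

def draw_random_sites(sites_by_stratum, n_per_stratum, seed=None):
    """Pick n_per_stratum sites at random, without replacement, from each stratum."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return {
        stratum: rng.sample(sites, n_per_stratum)
        for stratum, sites in sites_by_stratum.items()
    }

# Hypothetical candidate sites, grouped by a factor we already know about.
candidate_sites = {
    "upland": [f"U{i:02d}" for i in range(1, 21)],
    "lowland": [f"L{i:02d}" for i in range(1, 21)],
}

selected = draw_random_sites(candidate_sites, n_per_stratum=5, seed=1)
for stratum, sites in selected.items():
    print(stratum, sorted(sites))
```

Stratifying handles the factor we know about (ecoregion), while the random draw within each stratum is what guards against the factors we did not think of.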
Simple example: Suppose you are collecting invertebrates from streams in a region with mountains and basins. Stream water in the mountains is much colder, so you spend less time with your hands in the water collecting invertebrates there. You might then conclude that mountain streams hold fewer invertebrates, when the lower counts actually reflect less intense sampling effort. That conclusion is biased.
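The effect of unequal effort in the simple example can be sketched with made-up numbers. The counts, the collecting times, and the helper `catch_per_unit_effort` below are hypothetical, and the calculation assumes that catch scales roughly in proportion to time spent collecting, which may not hold in practice.

```python
def catch_per_unit_effort(count, minutes):
    """Return organisms collected per minute of sampling effort."""
    if minutes <= 0:
        raise ValueError("sampling time must be positive")
    return count / minutes

# Hypothetical samples: less time was spent in the cold mountain stream.
samples = {
    "mountain": {"count": 30, "minutes": 10},
    "basin": {"count": 90, "minutes": 30},
}

for name, s in samples.items():
    print(name, "raw count:", s["count"])

cpue = {
    name: catch_per_unit_effort(s["count"], s["minutes"])
    for name, s in samples.items()
}
for name, rate in cpue.items():
    print(name, "per minute:", rate)
```

With these numbers the raw counts differ threefold, but the catch per minute is the same in both streams. Dividing by effort is only an after-the-fact correction; fixing the sampling effort in the field protocol avoids the problem at its source.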
MAIA example: In the process of developing a regional fish IBI (Stoddard, pers. comm. and McCormick, et al. [in review]), two factors were found to influence the number of benthic fish species found at a site: human disturbance (which the authors wanted to assess) and watershed area (which was not of interest for assessment purposes). If ignored, watershed size would bias any conclusions about the condition of the fish assemblage, because small watersheds naturally support fewer benthic species and would therefore appear more degraded than larger watersheds. The authors controlled for the influence of watershed size before testing metrics for correlation with human disturbance.
Figure: The number of benthic fish species increased with the size of the watershed for least disturbed, or reference, stream sites.
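The source does not say how the authors controlled for watershed size in the MAIA example. One common approach, shown here purely as an assumed illustration and not as the authors' method, is to fit a line to species richness against log watershed area at reference sites and then score each test site by its residual from that line. All data values and function names below are invented for the sketch.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def area_adjusted_richness(area_km2, n_species, slope, intercept):
    """Observed richness minus the richness expected for this watershed area."""
    expected = intercept + slope * math.log10(area_km2)
    return n_species - expected

# Hypothetical reference sites: richness rises with watershed area.
ref_area_km2 = [1, 10, 100, 1000]
ref_species = [2, 4, 6, 8]

slope, intercept = fit_line([math.log10(a) for a in ref_area_km2], ref_species)

# Two hypothetical test sites with different raw richness.
small_site = area_adjusted_richness(1, 2, slope, intercept)
large_site = area_adjusted_richness(100, 3, slope, intercept)
print("small watershed residual:", round(small_site, 2))
print("large watershed residual:", round(large_site, 2))
```

In this made-up case the small watershed has fewer species in raw terms but matches the reference expectation for its size, while the larger watershed falls well below its expectation. Residuals like these, rather than raw counts, would then be compared with human disturbance.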