CADDIS Volume 4: Data Analysis

Getting Started

Matching Data
  • Authors: S.B. Norton, L. Alexander, S.M. Cormier, G.W. Suter II,
    P. Shaw-Allen, L.L. Yuan

When relationships between two or more variables are analyzed, the data must be appropriately matched, and the process for matching data and interpreting results must be documented.

Here, matched data are defined as a set of biological and environmental measurements taken at the same time and place. When multiple locations are compared (e.g., impaired and reference sites), samples must also be taken at the same time across locations. Figure 3, which plots seasonal shifts in stream temperature in one geographic region, illustrates why unmatched samples may not be valid for causal analysis. Temperature measurements taken in spring cannot be paired with biological measurements taken in summer from the same site, because both temperature and community structure are likely to shift between seasons. For the same reason, samples taken in different seasons cannot be used to compare conditions across otherwise similar sites. Temporal and spatial matching becomes more complicated when environmental variables differ in stability and scale, and in the modes of action by which they affect organisms. At the simplest level, however, matched observations of environmental conditions and biological responses reflect conditions at the same point in time and space.

EMAP sample sites
Figure 3. Histograms of stream temperatures from samples taken in spring (green) and summer (red).
Source: U.S. EPA.
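
The simplest form of matching described above, same site and same time, can be sketched in a few lines of Python. This is a minimal illustration, not a CADDIS tool: the function name `match_samples`, the record fields, the 7-day window, and all sample values are hypothetical and chosen only to show the idea.

```python
from datetime import date

def match_samples(bio, env, max_days=0):
    """Pair each biological sample with the environmental sample from the
    same site that is closest in time, if one lies within max_days."""
    matched = []
    for b in bio:
        # Keep only environmental samples from the same site and time window.
        candidates = [
            e for e in env
            if e["site"] == b["site"]
            and abs((e["date"] - b["date"]).days) <= max_days
        ]
        if candidates:
            best = min(candidates, key=lambda e: abs((e["date"] - b["date"]).days))
            matched.append((b, best))
    return matched

# Hypothetical records: summer biological samples at two sites.
bio = [
    {"site": "A", "date": date(2004, 7, 15), "taxa_richness": 22},
    {"site": "B", "date": date(2004, 7, 16), "taxa_richness": 9},
]
# Hypothetical temperature records: site B was measured only in spring.
env = [
    {"site": "A", "date": date(2004, 4, 10), "temp_c": 9.5},
    {"site": "A", "date": date(2004, 7, 15), "temp_c": 18.2},
    {"site": "B", "date": date(2004, 4, 12), "temp_c": 10.1},
]

pairs = match_samples(bio, env, max_days=7)
for b, e in pairs:
    print(b["site"], b["date"], b["taxa_richness"], e["temp_c"])
```

With these assumed records, only site A yields a matched pair. Site B is left unmatched rather than paired with its spring temperature, which is the behavior the text argues for.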

Spatial heterogeneity and temporal stability should also be considered when deciding how closely data must be matched. For example, large woody debris occurs in localized areas and changes relatively little over time. In the absence of other disturbances, large woody debris need not be re-sampled as frequently as variables such as total suspended solids, which may vary over time and under different flow conditions at a site. Similarly, land cover data taken from national land cover databases need not be matched as closely in time to biological data as water chemistry parameters measured at a site.
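
One way to record such decisions is a per-variable matching tolerance. The sketch below assumes a hypothetical table of maximum allowable gaps in days; the specific numbers and the names `MAX_GAP_DAYS` and `is_matched` are illustrative only and would need to be justified for the system being studied.

```python
from datetime import date

# Hypothetical tolerances (days) reflecting how quickly each variable changes.
MAX_GAP_DAYS = {
    "total_suspended_solids": 1,    # varies with flow; match closely
    "large_woody_debris": 365,      # changes relatively little over time
    "land_cover": 5 * 365,          # national databases; slow to change
}

def is_matched(variable, bio_date, env_date, tolerances=MAX_GAP_DAYS):
    """Return True if the environmental measurement is close enough in time
    to the biological sample for the given variable."""
    gap = abs((env_date - bio_date).days)
    return gap <= tolerances[variable]

bio_date = date(2004, 7, 15)   # hypothetical biological sampling date
env_date = date(2004, 4, 10)   # hypothetical environmental sampling date

for variable in MAX_GAP_DAYS:
    print(variable, is_matched(variable, bio_date, env_date))
```

Keeping the tolerances in one table also documents the matching process, so that the reasoning behind each pairing can be reviewed later.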

Relevant spatial and temporal scales should also be considered when deciding how to match data. The mechanism by which a stressor exerts its effect determines the appropriate temporal scale. For example, a "grab sample" of instantaneous stream temperature collected at the same time as a biological sample may be less relevant than the seasonal average stream temperature. Dissolved oxygen, on the other hand, is best measured when it reaches its diurnal extremes to determine whether critical concentrations occur. Diurnal cycles also may be present in concentrations of stressors such as metals (Nimick et al. 2003) and nutrients (Scholefield et al. 2005). The potential for time lags between exposure and effects should be considered as well. For example, if a stressor, such as a diversion of water flow, prevents salmon from reaching the sea on their out-migration, the effect (i.e., destruction of the salmon run) may not be observed for several years.
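
The three scale choices discussed above (a seasonal average, a diurnal extreme, and a lagged response) can be sketched as follows. All readings, the 5.0 mg/L criterion, the 3-year lag, and the function names are hypothetical and serve only to illustrate how the exposure summary depends on the mechanism.

```python
from statistics import mean

DO_CRITERION_MG_L = 5.0  # hypothetical critical concentration

def falls_below_criterion(do_values, criterion=DO_CRITERION_MG_L):
    """Check the diurnal minimum, not a single reading, against the criterion."""
    return min(do_values) < criterion

def lagged_pairs(stressor_by_year, response_by_year, lag_years):
    """Pair the stressor in year y with the response observed in y + lag_years."""
    return [
        (year, state, response_by_year[year + lag_years])
        for year, state in sorted(stressor_by_year.items())
        if year + lag_years in response_by_year
    ]

# Hypothetical summer temperature readings (deg C) at one site.
summer_temps_c = [17.0, 19.5, 21.0, 18.5]
seasonal_mean_c = mean(summer_temps_c)

# Hypothetical dissolved oxygen readings (mg/L) keyed by hour of day.
do_by_hour = {2: 5.1, 5: 4.2, 8: 6.0, 14: 9.3, 17: 8.8, 22: 6.4}
midday_grab = do_by_hour[14]

# Hypothetical flow status and returning-adult counts, with an assumed 3-year lag.
flow_status = {1999: "open", 2000: "diverted", 2001: "diverted"}
returning_adults = {2002: 480, 2003: 35, 2004: 20}

print(seasonal_mean_c)
print(midday_grab < DO_CRITERION_MG_L, falls_below_criterion(do_by_hour.values()))
print(lagged_pairs(flow_status, returning_adults, lag_years=3))
```

In this assumed data set, the midday grab sample alone would suggest that dissolved oxygen is adequate, while the early-morning minimum falls below the criterion. Likewise, the diversion is paired with returns observed three years later rather than with counts from the same year.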
