Premise 10 - Graphs Reveal Biological Responses
FROM "Restoring Life in Running Waters" by James R. Karr and Ellen W. Chu. (Reprinted with permission from Island Press)
"Often the most effective way to describe, explore, and summarize a set of numbers (even a very large set) is to look at pictures of those numbers... [Of] all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful" (Tufte 1983: 9; see also Tufte 1990, 1997). Tufte's message is nowhere more important than in the display, interpretation, and communication of biological monitoring data.
Figure 11: Example of two hypothetical metrics plotted against a gradient of human influence. Here, statistical correlation and graphical analysis agree; metric A is a good indicator, and metric B is not. (Compare Figure 12.)
Graphs reveal the biological responses important for evaluating metrics more clearly than do strictly statistical tools. They exploit "the value of graphs in forcing the unexpected" (Mosteller and Tukey 1977) on whoever looks at them, including researchers, who must then confront and explain the pattern in those graphs. For samples where the relationship between human influence and biological response is strong, statistics and graphs agree (Figure 11). In other cases, meaningful biological patterns can be lost by excessive dependence on the outcome of menu-driven statistical tests. Statistical correlation can miss an important relationship if the x-variable (e.g., percentage of area logged) is measured with low precision or if additional factors beyond those plotted on the x-axis influence metric values but are not included in the statistical analysis.
Figure 12: Hypothetical relationships between human influence and candidate biological metrics (from Fore et al. 1996). Metric A is more strongly correlated with resource condition (or r² is higher if using regression) than Metric B, initially suggesting that it is a better metric. But comparing the two metrics' ability to distinguish between minimally disturbed sites (denoted by plus signs) and severely degraded sites (open boxes; ranges noted by arrows) shows that Metric B is actually a more effective measure of biological condition despite its smaller statistical correlation. (Compare Figure 11.)
In Figure 12, for example, we plot two different aspects of biological condition against one measure of human influence, such as the percentage of upstream watershed that has been logged. Sites are assigned a plus or minus based on that measure and other aspects of human influence that are visible and documented but not plotted on the same graph. In forested watersheds, these other aspects might include whether roads were near or far from the stream channel, time since logging, or traits unique to particular watersheds. In some cases, such interacting factors may have degraded biological condition (roads near the stream channel would worsen logging's effects), or they may have allowed good conditions to persist (roads on distant ridges have less effect on streams). The distribution of pluses and minuses in Figure 12 illustrates the fallacy of assuming that a biological metric says nothing about condition because it does not correlate strongly with a single surrogate of that condition, as researchers perennially assume when a biological measure does not correlate with some measure of chemical pollution. Rather, we should conclude that the surrogate is not capturing significant components of human influence and look more closely for the biological explanations behind the data.
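The contrast between correlation and discriminating power can be made concrete with a small sketch. Everything in it is invented for illustration: the site values, the class labels ("min" for minimally disturbed, "deg" for severely degraded, "mid" for everything else), and the helper names are assumptions, not data or methods from Fore et al. The sketch computes a plain Pearson correlation of each metric against percentage logged, then checks whether the ranges of the two extreme site classes overlap.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def group_range(values, labels, group):
    """(min, max) of the metric values at sites carrying the given label."""
    picked = [v for v, lab in zip(values, labels) if lab == group]
    return min(picked), max(picked)

def ranges_overlap(a, b):
    """True if the closed intervals a and b share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical sites. The class labels reflect percentage logged AND other
# documented influences (roads, time since logging), so they do not simply
# follow the x-axis.
pct_logged = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90]
site_class = ["min", "min", "mid", "deg", "min", "mid", "deg", "mid", "deg", "deg"]
metric_a = [20, 18, 17, 15, 14, 11, 9, 8, 5, 3]   # tracks percentage logged closely
metric_b = [12, 11, 8, 3, 12, 7, 4, 6, 2, 3]      # tracks overall site condition

r_a = pearson_r(pct_logged, metric_a)
r_b = pearson_r(pct_logged, metric_b)

a_overlap = ranges_overlap(group_range(metric_a, site_class, "min"),
                           group_range(metric_a, site_class, "deg"))
b_overlap = ranges_overlap(group_range(metric_b, site_class, "min"),
                           group_range(metric_b, site_class, "deg"))

print(f"Metric A: r = {r_a:.2f}, site classes overlap: {a_overlap}")
print(f"Metric B: r = {r_b:.2f}, site classes overlap: {b_overlap}")
```

In this invented data set, Metric A has the stronger correlation with percentage logged, yet its values at minimally disturbed and severely degraded sites overlap, while Metric B separates the two classes cleanly. A range check of this kind supplements the graph; it does not replace looking at it.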
Not all aspects of human influence can be easily captured in a single graph or statistical test. When a number of variables influence condition, a single plot against one dimension of human influence will not tell the whole story (Figure 13); neither will a single statistical test. Graphs force us to search for insights that rote application of statistical tests cannot discover.
Figure 13: Taxa richness of Trichoptera plotted against the percentage of watershed area that was logged for 32 stream sites in southwestern Oregon. Metric correlation (Spearman's rho) was not significant because, alone, the percentage of area logged was an inaccurate measure of human influence; other factors, such as type of logging, presence of roads, and other human influences, were not included. When these other human influences were considered to identify minimally disturbed sites (denoted by plus signs) and severely degraded sites (open boxes), the response of Trichoptera taxa richness visibly distinguished between different degrees of human disturbance.
Statistical correlation can also miss important biological patterns when the distribution of the data (e.g., Figure 14) does not lend itself to tests based on standard correlation techniques that detect only linear relationships. Yet nonlinear patterns are common in field data (Figure 15). Consider the plots in Figure 16, for example. The points fall into a wedge-shaped distribution, whose scatter shows little or no statistical significance but can be interpreted biologically. The upper bound of each plot is the hypotenuse of a right triangle (the maximum species richness line) that defines the number of species expected in minimally disturbed streams as a function of stream size (Fausch et al. 1984). The plots illustrate what Thomson et al. (1996) term a "factor ceiling distribution" (see also Blackburn et al. 1992 and Scharf et al. 1998 on ecological inferences from the edges of scatter diagrams). In this case, the ceiling (maximum species richness) is defined by the evolution of the regional biota. Generally, at sites where the number of fish species falls below the ceiling, some human activity in the adjacent or upstream watershed has reduced the number of species present; alternatively, sampling might have been inadequate, "dragging" species richness below the line.
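One simple way to approximate such a ceiling numerically is sketched here: take the highest richness observed in each stream-size class and fit a least-squares line through those maxima. This is an illustrative stand-in, not necessarily how Fausch et al. (1984) drew their maximum species richness line, and the site data, the use of an integer stream-size class as the x-variable, and the function names are all assumptions.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ceiling_line(sites):
    """Fit a line through the maximum richness seen in each size class."""
    best = {}
    for size, richness in sites:
        best[size] = max(richness, best.get(size, richness))
    sizes = sorted(best)
    return least_squares_line(sizes, [best[s] for s in sizes])

# Hypothetical (stream-size class, number of fish species) pairs.
sites = [
    (1, 3), (1, 5), (1, 2),
    (2, 8), (2, 4), (2, 9),
    (3, 12), (3, 13), (3, 6),
    (4, 17), (4, 10), (4, 15),
    (5, 21), (5, 20), (5, 9),
]

slope, intercept = ceiling_line(sites)

# Shortfall: how far each site falls below the species richness expected
# for a minimally disturbed stream of the same size.
shortfalls = [(size, richness, slope * size + intercept - richness)
              for size, richness in sites]

for size, richness, gap in shortfalls:
    print(f"size class {size}: {richness} species, {gap:.1f} below the ceiling")
```

Sites with a large shortfall are the ones to examine for human influence in the adjacent or upstream watershed, or for inadequate sampling. The fitted line is only a starting point for that inspection.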
Figure 14: Hypothetical relationship between human influence and Metric A. Statistical correlation (Spearman's rho) is not significant, yet the graphic pattern strongly suggests a biological response. At low levels of human influence, Metric A is not a reliable indicator of biological condition, but where human disturbance is high, the metric does respond.
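A pattern like the one in Figure 14 can be imitated with invented numbers. The sketch below computes Spearman's rho by ranking both variables (ties receive their average rank) and correlating the ranks, then compares the highest metric value in the low-influence and high-influence halves of the gradient. The data, the even split of the gradient, and the function names are assumptions made for illustration only.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks of values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical sites ordered along a gradient of human influence (1 = least).
influence = list(range(1, 13))
metric = [1, 9, 0, 7, 2, 10, 1, 0, 2, 1, 0, 1]

rho = spearman_rho(influence, metric)
low_half = metric[:6]    # sites with the least human influence
high_half = metric[6:]   # sites with the most human influence

print(f"Spearman's rho = {rho:.2f}")
print(f"highest value, low influence:  {max(low_half)}")
print(f"highest value, high influence: {max(high_half)}")
```

For these invented numbers the rank correlation is weak (roughly -0.3), yet the metric never rises above 2 where influence is high, while it ranges from 0 to 10 where influence is low. The single coefficient hides a contrast that a plot shows at a glance.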
Figure 15: Relative abundance (percentage of total) of individuals belonging to tolerant taxa in samples of benthic invertebrates from 65 Japanese streams ranked according to intensity of human influence (see Figures 5 and 6). (Data provided by E. M. Rossano.)
Figure 16: Number of fish species in relation to stream size (top) and watershed area (bottom); each point represents a site. The maximum species richness line through the highest points on each graph defines the number of species expected in minimally disturbed streams or watersheds. Points below that line represent sites where human activity has reduced the number of species present (from Fausch et al. 1984).
Graphs highlight idiosyncrasies in data distributions that, when examined closely, may provide insight into the causes of a particular biological pattern. At one extreme, outlying points on a graph may offer key insights about the complex influence of human activities in watersheds. The researcher can then explore what unique situations at those sites cause them to appear as outliers.
Even the spread of data can offer insights, as illustrated by the large range in B-IBIs at sites with 20% to 30% impervious area shown in Figure 17. Sites with high mayfly taxa richness (B and C) lie in reaches of two streams with relatively intact riparian corridors and wetlands. The site with low mayfly taxa richness (A) is located in a stream that receives fine material from an old coal mine. Sites A, B, and C had unique characteristics that were best understood by examining their specific contexts, not by applying a regression or correlation analysis. Finding these patterns then led to subsequent studies in the same and in other places to determine if those patterns were more general.
Figure 17: Average taxa richness of Ephemeroptera plotted against percentage of impervious area surrounding Puget Sound lowland streams (from Kleindl 1995). Site A, Coal Creek, had fewer Ephemeroptera than expected. This site has an active mine in its headwaters, and Ephemeroptera are known to be sensitive to mine wastes. Sites B and C had relatively intact riparian areas (wetlands).
Graphs also illustrate variation in behavior among taxa in response to a specific disturbance (Figure 18). For example, numbers of taxa for three orders of insects (stoneflies, mayflies, and caddisflies) declined downstream of the outflow from a streamside sludge pond in the Tennessee Valley, but the magnitude of change varied among the taxa (see also Premise 14). The same graph also reveals the direction and magnitude of change along a longitudinal transect down the stream.
Figure 18: Taxa richness of mayflies, stoneflies, and caddisflies for sites along the North Fork Holston River in the Tennessee Valley in 1976 (from Kerans and Karr 1994). Arrow indicates the position of the streamside sludge pond. Taxa richness for all three orders declines at the sludge pond and slowly recovers at sites downstream.
Graphs may sometimes allow researchers to avoid naive application of elaborate multivariate techniques (Beals 1973). Principal components analysis, the most often used ordination technique (James and McCulloch 1990), defines statistically orthogonal factors, which may or may not be independent biologically; interpreting the results can therefore be complicated (Goodall 1954). Graphs can be superior to methods that focus on maximizing the variance extracted because they reveal ecological rather than mathematical associations, a more appropriate criterion for organizing and understanding complex information (Beals 1973).
Complex ecological situations require unusual analytical means. Graphs can often be ecologists' most useful tools, permitting the exploration of ecological data "before, after, and beyond the application of 'standard analyses'" (Augspurger 1996). Rather than choose an inappropriately linear statistical model before plotting their data, ecologists should exploit the power of graphs for "reasoning about quantitative information" (Tufte 1983), and then choose and apply appropriate statistics. It is myopic to be a slave of standard statistical rules and procedures just as it would be to avoid statistics altogether.
References
Augspurger, C. 1996. Editor's note. Ecology 77: 1698.
Beals, E. W. 1973. Ordination: Mathematical elegance and ecological naivete. J. Ecol. 61: 23-35.
Blackburn, T. M., J. H. Lawton, and J. N. Perry. 1992. A method for estimating the slope of upper bounds of plots of body size and abundance in natural animal assemblages. Oikos 65: 107-112.
Fausch, K. D., J. R. Karr, and P. R. Yant. 1984. Regional application of an index of biotic integrity based on stream fish communities. Trans. Am. Fish. Soc. 113: 39-55.
Fore, L. S., J. R. Karr, and L. L. Conquest. 1994. Statistical properties of an index of biotic integrity used to evaluate water resources. Can. J. Fish. Aquat. Sci. 51: 1077-1087.
Goodall, D. W. 1954. Objective methods for the classification of vegetation. III. An essay in the use of factor analysis. Aust. J. Bot. 2: 304-324.
James, F. C., and C. E. McCulloch. 1990. Multivariate analysis in ecology and systematics: Panacea or Pandora's box? Annu. Rev. Ecol. Syst. 21: 129-166.
Kerans, B. L., and J. R. Karr. 1994. A benthic index of biotic integrity (B-IBI) for rivers of the Tennessee Valley. Ecol. Appl. 4: 768-785.
Kleindl, W. J. 1995. A benthic index of biotic integrity for Puget Sound lowland streams, Washington, USA. MS thesis, University of Washington, Seattle.
Mosteller, F., and J. W. Tukey. 1977. Data Analysis and Regression. Addison-Wesley, Reading, MA.
Scharf, F. S., F. Juanes, and M. Sutherland. 1998. Inferring ecological relationships from the edges of scatter diagrams: Comparison of regression techniques. Ecology 79: 448-460.
Tufte, E. R. 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT.
Tufte, E. R. 1990. Envisioning Information. Graphics Press, Cheshire, CT.
Tufte, E. R. 1997. Visual Explanations. Graphics Press, Cheshire, CT.