Chapter 9 (Part B): Biological Data Analysis
Step 1. Classify the Stream Resource
Site classification provides a framework for organizing and interpreting natural variability among streams; ecoregions are a principal example of a classification framework (Omernik 1995). However, classification variables can be at a coarser or finer scale than ecoregions or subecoregions, such as elevation and drainage area. Elevation was determined to be an important classification variable in montane regions of the country (Barbour et al. 1992, 1994, Spindler 1996). Spindler (1996) found that benthic data adhered more closely to elevation than to ecoregions. Ohio EPA (1987) found that stream size (or drainage area) was a covariate and not a determinant of stream classes. The number of fish species increased with stream size (Figure 9-3).
Classification is best accomplished with reference sites that reflect the most natural and representative condition of the region. Candidate reference sites that are based on minimally degraded physical habitat and water chemistry are used as the basis for stream classification. Quantitative criteria for reference sites aid in a consistent framework for selection. An example of quantitative criteria for identifying reference sites in a statewide study for Maryland (Roth et al. 1997) is presented below (a reference site must meet all 12 criteria):
Sites are initially classified according to distinctive geographic, physical, or chemical attributes. Refinement and confirmation of the site classes is accomplished using the biological data (Figure 9-4). Classification is used to determine whether the sampled sites should be placed into specific groups that will minimize variance within groups and maximize variance among groups. As an example, 3 ecoregionally based delineations (bioregions) were effective at partitioning the variability among reference sites in Florida (Figure 9-5). Components of Step 1 include:
Step 2. Identify Potential Measures For Each Assemblage
Metrics allow the investigator to use meaningful indicator attributes in assessing the status of assemblages and communities in response to perturbation. A metric is defined as a characteristic of the biota that changes in some predictable way with increased human influence (Barbour et al. 1995). For a metric to be useful, it must be (1) ecologically relevant to the biological assemblage or community under study and to the specified program objectives; and (2) sensitive to stressors, providing a response that can be discriminated from natural variation. The purpose of using multiple metrics to assess biological condition is to aggregate and convey the information available regarding the elements and processes of aquatic communities. All metrics that have ecological relevance to the assemblage under study and that respond to the targeted stressors are potential metrics for testing. From this "universe" of metrics, some will be eliminated because of insufficient data or because the range of values is not sufficient for discrimination between natural variability and anthropogenic effects. The aim of this step is to identify the candidate metrics that are most informative and therefore warrant further analysis. The potential measures that are relevant to the ecology of streams within the region or state should be selected to ensure that various aspects of the elements and processes of the aquatic assemblage are addressed. Representative metrics should be selected from each of 4 primary categories: (1) richness measures for diversity or variety of the assemblage; (2) composition measures for identity and dominance; (3) tolerance measures that represent sensitivity to perturbation; and (4) trophic or habit measures for information on feeding strategies and guilds. Karr and Chu (1999) suggest that measures of individual health be used to supplement other metrics. 
Karr has expanded this concept to include metrics that reflect landscape-level attributes, thus providing a more comprehensive multimetric approach to ecological assessment (Karr et al. 1987). See Table 9-1. Potential metrics that have been useful for periphyton, benthic macroinvertebrates, and fish are summarized in Chapters 6, 7, and 8, respectively. Components of Step 2 include:
Step 3. Select Robust Measures
Core metrics are those that will discriminate between good and poor quality ecological conditions. It is important to understand the effects of various stressors on the behavior of specific metrics. Metrics that are responsive to specific pollutants or stressors, where the response is well-characterized, are most useful as a diagnostic tool. Core metrics are those that represent diverse aspects of structure, composition, individual health, or processes of the aquatic biota. Together they form the foundation for a sound, integrated analysis of the biotic condition to judge attainment of biological criteria.
Discriminatory ability of biological metrics can be evaluated by comparing the distribution of each metric at a set of reference sites with the distribution of the same metric at a set of "known" stressed sites (defined by physical and chemical characteristics) within each site class. If there is minimal or no overlap between the distributions, then the metric can be considered to be a strong discriminator between reference and impaired conditions (Figure 9-6). As was done with candidate reference sites (see Step 1), criteria are established to identify a population of "known" stressed sites based on physical and chemical measures of degradation. An example set of criteria established for Maryland streams (Roth et al. 1997), for which failure identified a site as stressed for the purpose of testing discriminatory power, is as follows:
Step 3 can be separated into 2 elements that correspond to discrimination of core metrics (element 1) and determination of biological/physicochemical associations (element 2). Components of these elements include:
Element 1 Select core measures that are best for discriminating degraded condition
Element 2 Determine the associations/linkages between candidate biological and physicochemical measures
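The discrimination test described in Step 3 can be sketched in a few lines. The text only asks for "minimal or no overlap" between the reference and stressed distributions; the sketch below assumes one simple way to operationalize that, namely checking whether the interquartile ranges of the two groups overlap. The function names and the metric values are hypothetical and are not taken from the Maryland or any other dataset.

```python
from statistics import quantiles


def iqr(values):
    """Return the (25th percentile, 75th percentile) of a sample."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def iqr_overlap(reference, stressed):
    """True if the interquartile ranges of the two samples overlap."""
    ref_lo, ref_hi = iqr(reference)
    str_lo, str_hi = iqr(stressed)
    return ref_lo <= str_hi and str_lo <= ref_hi


# Hypothetical EPT taxa counts within one site class (illustrative only).
reference_ept = [18, 20, 22, 19, 24, 21, 23]
stressed_ept = [6, 9, 4, 11, 8, 7, 10]

if iqr_overlap(reference_ept, stressed_ept):
    print("Interquartile ranges overlap: weak discriminator")
else:
    print("No interquartile overlap: candidate core metric")
```

A metric that passes this check in every site class would be carried forward as a candidate core metric; one that fails would need closer inspection (for example with box plots such as those in Figure 9-6) before being dropped.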
Step 4. Determine the best aggregation of core measures for indicating status and change in condition
The purpose of an index is to provide a means of integrating information from the various measures of biological attributes (or metrics). Metrics vary in their scale--they are integers, percentages, or dimensionless numbers. Prior to developing an integrated index for assessing biological condition, it is necessary to standardize core metrics via transformation to unitless scores. The standardization assumes that each metric has the same value and importance (i.e., they are weighted the same), and that a 50% change in one metric is of equal value to assessment as a 50% change in another. Where possible, the scoring criterion for each metric is based on the distribution of values in the population of sites, which include reference streams; for example, the 95th percentile of the data distribution is commonly used (Figure 9-7) to eliminate extreme outliers. From this upper percentile, the range of the metric values can be standardized as a percentage of the 95th percentile value, or other (e.g., trisected or quadrisected), to provide a range of scores. Those values that are closest to the 95th percentile would receive higher scores, and those having a greater deviation from this percentile would have lower scores. For those metrics whose values increase in response to perturbation (see Table 7-2 for examples of "reverse" metrics for benthic macroinvertebrates) the 5th percentile is used to remove outliers and to form a basis for scoring.
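As a rough illustration of the percentile-based standardization just described, the following sketch scores one metric that decreases with perturbation and one "reverse" metric that increases with it. The text does not give a formula for reverse metrics, so the one used here (distance below the observed maximum, scaled by the distance between the maximum and the 5th percentile) is an assumption, as are the function names and all data values.

```python
from statistics import quantiles


def percentile(values, p):
    """p-th percentile (p = 1..99), interpolating linearly between data points."""
    return quantiles(values, n=100, method="inclusive")[p - 1]


def score_decreasing(value, all_values):
    """Score a metric that declines with stress as a % of the 95th percentile."""
    best = percentile(all_values, 95)
    return max(0.0, min(100.0, 100.0 * value / best))


def score_increasing(value, all_values):
    """Score a 'reverse' metric (one that rises with stress).

    Assumed formula: (max - value) / (max - 5th percentile) * 100.
    Assumes the metric actually varies among sites (max > 5th percentile).
    """
    best = percentile(all_values, 5)
    worst = max(all_values)
    return max(0.0, min(100.0, 100.0 * (worst - value) / (worst - best)))


# Hypothetical metric values across all sites in one stream class.
taxa_richness = [10, 20, 30, 40, 50]      # decreases with perturbation
percent_tolerant = [10, 20, 30, 40, 50]   # increases with perturbation

print(score_decreasing(24, taxa_richness))     # site with 24 taxa
print(score_increasing(31, percent_tolerant))  # site with 31% tolerant organisms
```

Scores are capped at 0 and 100 so that sites beyond the chosen percentile do not receive scores outside the scale, which is the practical effect of using the 95th (or 5th) percentile to exclude extreme outliers.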
Alternative methods for scoring metrics, as illustrated in Figure 9-7, are currently in use in various parts of the US for multimetric indexes. A "trisection" of the scoring range has been well-documented (Karr et al. 1986, Ohio EPA 1987, Fore et al. 1996, Barbour et al. 1996b). A "quadrisection" of the range has been found to be useful for benthic assemblages (DeShon 1995, Maxted et al. in press). More recent studies are finding that a standardization of all metrics as percentages of the 95th percentile value yields the most sensitive index, because information from the component metrics is retained (Hughes et al. 1998). Unpublished data from statewide databases for Idaho, Wyoming, Arizona, and West Virginia support this third alternative for scoring metrics. Ideally, a composite of all sites representing a gradient of conditions is used. This situation is analogous to a determination of a dose/response relationship and depends on the ability to incorporate both reference and non-reference sites. Aggregation of metric scores simplifies management and decision making so that a single index value is used to determine whether action is needed. Biological condition of waterbodies is judged based on the summed index value (Karr et al. 1986). If the index value is above a criterion, then the stream is judged as "optimal" or "excellent" in condition. The exact nature of the action needed (e.g., restoration, mitigation, pollution enforcement) is not determined by the index value, but by analyses of the component metrics, in addition to the raw data and integrated with other ecological information. Therefore, the index is not the sole determinant of impairment and diagnostics, but when used in concert with the component information, strengthens the assessment (Barbour et al. 1996a). Components of Step 4 include:
Step 5. Index thresholds for assessment and biocriteria
The multimetric index value for a site is a summation of the scores of the metrics and has a finite range within each stream class and index period depending on the maximum possible scores of the metrics (Barbour et al. 1996c). This range can be subdivided into any number of categories corresponding to various levels of impairment. Because the metrics are normalized to reference conditions and expectations for the stream classes, any decision on subdivision should reflect the distribution of the scores for the reference sites. For example, division of the Wyoming benthic IBI range (aggregation of metric scores) within each stream class provides 5 ordinal rating categories for assessment of impairment (Stribling et al. 1999, Figure 9-8). The 5 rating categories are used to assess the condition of both reference and non-reference sites. As expected, most of the reference sites should be rated as good or very good in biological condition. However, a few reference sites may sporadically be rated as poor on individual collection dates. If a "reference" site consistently receives a fair or poor rating, then the site should be re-evaluated as to its proper assignment. Putative reference sites may be rated "poor" for several reasons:
An understanding of variability is necessary to ensure that sites that are near the threshold are rated with known precision (discussed in more detail in Chapter 4). To account for variance associated with measurement error in an assessment, replication is required. The first step is to estimate the standard deviation of repeated measures of streams. The standard deviation is calculated as the root mean square error (RMSE) of an analysis of variance (ANOVA), where the sites are treatments in the ANOVA. As an example, the question of precision was tested for the Wyoming Benthic IBI scores in the stream classes. This study showed that the 95% confidence interval (CI) around a single sample is ±8 points, on a scale of 100 (Table 9-2). What if a single site was sampled with no replication and found to be points below the biocriterion? The rightmost column (Table 9-2) shows that a triplicate sample is required for a 95% CI less than 5 points. These conclusions make 3 assumptions:
Components of Step 5 include:
Table 9-2. Statistics of repeated samples in Wyoming and the detectable difference (effect size) at 0.10 significance level. The index is on a 100 point scale (taken from Stribling et al. 1999).
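The precision calculation described in Step 5 can be sketched as follows. The root mean square error is computed from repeated samples with sites as the ANOVA treatments, and the half-width of the confidence interval for a mean of n replicates is then taken as z × RMSE / √n. The normal-approximation multiplier z = 1.96 is an assumption made here for illustration, and the replicate scores are hypothetical rather than the Wyoming data.

```python
from math import sqrt


def rmse_from_replicates(site_scores):
    """RMSE of a one-way ANOVA in which each site is a treatment.

    site_scores maps a site name to its list of repeated index scores.
    """
    ss_within = 0.0   # within-site sum of squares
    df_within = 0     # within-site degrees of freedom
    for scores in site_scores.values():
        site_mean = sum(scores) / len(scores)
        ss_within += sum((x - site_mean) ** 2 for x in scores)
        df_within += len(scores) - 1
    return sqrt(ss_within / df_within)


def ci_half_width(rmse, n_replicates, z=1.96):
    """Half-width of the CI around the mean of n replicate samples.

    z = 1.96 (95% CI, normal approximation) is an assumption of this sketch.
    """
    return z * rmse / sqrt(n_replicates)


# Hypothetical repeated index scores (100-point scale) at three sites.
replicates = {"site_A": [70, 74], "site_B": [55, 61], "site_C": [82, 78]}

rmse = rmse_from_replicates(replicates)
for n in (1, 2, 3):
    print(n, round(ci_half_width(rmse, n), 2))
```

Under the same approximation, a single-sample interval of ±8 points shrinks to about 8/√3 ≈ 4.6 points for a triplicate sample, which is consistent with the statement above that three replicates are needed for a 95% CI of less than 5 points.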
Once the framework for bioassessment is in place, conducting bioassessments becomes relatively straightforward. Either a targeted design that focuses on site-specific problems or a probability-based design, which has a component of randomness and is appropriate for 305(b), area-wide, and watershed monitoring, can be done efficiently. Routine monitoring of reference sites should be based on a random selection procedure, which will allow cost efficiencies in sampling while monitoring the status of the reference condition of a state's streams. Potential reference sites of each stream class would be randomly selected for sampling, so that an unbiased estimate of reference condition can be developed. A randomized subset of reference sites can be resampled at some regular interval (e.g., a 4 year cycle) to provide information on trends in reference sites. A reduced effort in monitoring reference sites allows more investment of time into assessing other stream reaches and problem sites. Through use of Geographical Information System (GIS) and station location codes, assessment sites throughout the state can be randomly selected for sampling as is being done for the reference sites. This procedure will provide a statistically valid means of estimating attainment of aquatic life use for the state's 305(b) reporting. In addition, the multimetric index will be helpful for targeted sampling at specific problem areas and judging biological condition with a procedure that has been calibrated regionally (Barbour et al. 1996c). To evaluate possible influences on the biological condition of sites, relationships among total bioassessment scores and physicochemical variables can be investigated. These relationships may indicate the influence of particular categories of stressors on the biological condition of individual sites. 
For example, a strong negative correlation between total bioassessment score and embeddedness would suggest that siltation from nonpoint sources could be affecting the biological condition at a site. Considerations relevant to assessment and diagnostics of biological condition are as follows:
Discriminant analysis may be used to develop a model that will divide, or discriminate, observations among two or more predetermined classes. Output of discriminant analysis is a function that is a linear combination of the input variables, and that obtains the maximum separation (discrimination) among the defined classes. The model may then be used to determine class membership of new observations. Thus, given a set of unaffected reference sites, and a set of degraded sites (due to toxicity, low DO, or habitat degradation), a discriminant function model can identify variables that will discriminate reference from degraded sites. Developing biocriteria with a discriminant model requires a training data set to develop the discriminant model, and a confirmation data set to test the model. The training and confirmation data may be from the same biosurvey, randomly divided into two, or they may be two consecutive years of survey data, etc. All sites in each data set are identified by degradation class (e.g., reference vs stressed) or by designated aquatic life use class. To avoid circularity, identification of reference and stressed, or of designated use classes, should be made from non-biological information such as quality of the riparian zone and other habitat features; presence of known discharges and nonpoint sources, extent of impervious surface in the watershed, extent of land use practices, etc. One or more discriminant function models are developed from the training set, to predict class membership from biological data. After development, the model is applied to the confirmation data set to determine its performance: The test determines how well the model can assign sites to classes, using independent data that were not used to develop the model. More information on discriminant analysis is in any textbook on multivariate statistics (e.g., Ludwig and Reynolds 1988, Jongman et al. 1987, Johnson and Wichern 1992). 
An example of this approach is the hierarchical decision-making technique used by Maine DEP. It begins with statistical models (linear discriminant analysis) to make an initial prediction of the classification of an unknown sample by comparing it to characteristics of each class identified in the baseline database (Davies et al. 1993). The output from analysis by the primary statistical model is a list of probabilities of membership for each of four groups designated as classes A, B, C, and nonattainment (NA) of Class C (Table 9-3). Subsequent models are designed to distinguish between a given class and any higher classes as one group, and any lower classes as a second group. One or more discriminant models to predict class membership are developed from the training set. The purpose of the discriminant analysis here is not to test the classification (the classification is administrative rather than scientific), but to assign test sites to one of the classes. Stream biologists from Maine DEP assigned a training set of streams to four life use classes. In operational assessment, sites are evaluated with the two-step hierarchical models. The first stage linear discriminant model is applied to estimate the probability of membership of sites into one of the four classes (A, B, C, or NA). Second, the series of two-way models are applied to distinguish the membership between a given class and any higher classes, as one group. The model uses 31 quantitative measures of community structure, including the Hilsenhoff Biotic Index, Generic Species Richness, EPT, and EP values. Monitored test sites are then assigned to one of the four classes based on the probability of that result, and uncertainty is expressed for intermediate sites. The classification can be the basis for management action if a site has gone down in class, or for reclassification to a higher class if the site has improved.
Table 9-3. Maine's water quality classification system for rivers and streams, with associated biological standards (taken from Davies et al. 1993).
Maine biocriteria thus establish a direct relationship between management objectives (the three aquatic life use classes and nonattainment) and biological measurements. The relationship is immediately viable for management and enforcement as long as the aquatic life use classes remain the same. If the classes are redefined, a complete reassignment of streams and a review of the calibration procedure would be necessary. This approach is detailed by Davies et al. (1993).
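The training/confirmation workflow for a discriminant model can be illustrated with a deliberately reduced sketch: two classes (reference vs stressed) and two biological variables, fitted with a two-class Fisher linear discriminant written in plain Python. This is not the Maine DEP model, which uses four classes, many more variables, and membership probabilities; the function names, the choice of variables, and all data values are hypothetical.

```python
def mean_vector(rows):
    """Mean of each of the two variables over a list of (x1, x2) rows."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def pooled_covariance(groups):
    """Pooled within-class 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    df = 0
    for rows in groups:
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        df += len(rows) - 1
    return [[s[i][j] / df for j in range(2)] for i in range(2)]


def fit_lda(reference, stressed):
    """Return (weights, cutoff) of a two-class linear discriminant function."""
    m_ref, m_str = mean_vector(reference), mean_vector(stressed)
    (a, b), (c, d) = pooled_covariance([reference, stressed])
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [m_ref[0] - m_str[0], m_ref[1] - m_str[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff is the discriminant score midway between the two class means.
    cutoff = sum(w[i] * (m_ref[i] + m_str[i]) / 2 for i in range(2))
    return w, cutoff


def classify(model, site):
    """Assign a site to the class on its side of the cutoff."""
    w, cutoff = model
    score = w[0] * site[0] + w[1] * site[1]
    return "reference" if score > cutoff else "stressed"


# Hypothetical training set: (taxa richness, % tolerant organisms).
# Class labels are assumed to come from non-biological information.
train_reference = [(30, 10), (28, 15), (33, 8), (27, 12)]
train_stressed = [(12, 60), (15, 55), (10, 70), (17, 48)]

# Hypothetical confirmation set, not used to fit the model.
confirmation = [((29, 11), "reference"), ((14, 58), "stressed"),
                ((31, 14), "reference"), ((11, 65), "stressed")]

model = fit_lda(train_reference, train_stressed)
correct = sum(classify(model, site) == label for site, label in confirmation)
accuracy = correct / len(confirmation)
print("confirmation accuracy:", accuracy)
```

The key design point carried over from the text is the separation of data: the discriminant function is built only from the training sites, and its performance is judged only on confirmation sites that played no part in fitting it.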
RIVPACS and its derivative, AusRivAS (Australian Rivers Assessment System) are empirical (statistical) models that predict the aquatic macroinvertebrate fauna that would be expected to occur at a site in the absence of environmental stress (Simpson et al. 1996). The AusRivAS models predict the invertebrate communities that would be expected to occur at test sites in the absence of impact. A comparison of the invertebrates predicted to occur at the test sites with those actually collected provides a measure of biological impairment at the tested sites. The predicted taxa list also provides a "target" invertebrate community to measure the success of any remediation measures taken to rectify identified impacts. The type of taxa predicted by the AusRivAS models may also provide clues as to the type of impact a test site is experiencing. This information can be used to facilitate further investigations; for example, the absence of predicted Leptophlebiidae may indicate an impact on a stream from trace metal input. These models are the primary ecological assessment analysis techniques for Great Britain (Wright et al. 1993) and Australia (Norris 1995). The models are based on a stepwise progression of multivariate and univariate analyses and have been developed for several regions and various habitat types found in lotic systems. Regional applications of the AusRivAS model, in particular, have been developed for the Australian states and territories (Simpson et al. 1996), and for streams in the Sierra and Cascade mountain ranges in California (Hawkins and Norris 1997). Users of these models claim that rapid turnaround of results is possible and that output can be tailored for a range of users including community groups, managers, and ecologists. These attributes make RIVPACS and AusRivAS likely candidate analysis techniques for rapid bioassessment programs. 
Although the same procedures are used to build all AusRivAS models, each model is tailored to specific regions (or states) to provide the most accurate predictions for the season and habitat sampled. The stream habitats for which these models have been applied include the edge/backwater, main channel, riffle, pool, and macrophyte stands. The multihabitat sampling techniques used in many RBP programs have not yet been tested with a RIVPACS model. The models can be constructed for a single season, or data from several seasons may be combined to provide more robust predictions. To date the RIVPACS/AusRivAS models have only been developed for the benthic assemblage. Discussion of RIVPACS and AusRivAS is taken from the Australian River Assessment System National River Health Program Predictive Model Manual by Simpson et al. (1996). As is the case with the multimetric approach, a more thorough treatment of the RIVPACS/AusRivAS models can be obtained by referring to the citations of the supporting documentation provided in this discussion.
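The comparison of predicted and collected taxa described above is commonly summarized as an observed/expected (O/E) ratio. The sketch below assumes that a predictive model has already produced a probability of capture for each taxon at the test site, and it counts only taxa with a probability of at least 0.5; that threshold, the function name, and all probabilities and taxa lists are assumptions made for illustration, not output from an actual RIVPACS or AusRivAS model.

```python
def observed_over_expected(capture_prob, observed_taxa, threshold=0.5):
    """Return (O/E ratio, list of predicted taxa that were not collected).

    capture_prob maps taxon -> predicted probability of capture at the site.
    Only taxa with probability >= threshold are counted (assumed convention).
    """
    predicted = {t: p for t, p in capture_prob.items() if p >= threshold}
    expected = sum(predicted.values())                      # E
    observed = sum(1 for t in predicted if t in observed_taxa)  # O
    missing = sorted(t for t in predicted if t not in observed_taxa)
    return observed / expected, missing


# Hypothetical model output for one test site.
capture_prob = {
    "Leptophlebiidae": 0.9,
    "Baetidae": 0.8,
    "Chironomidae": 1.0,
    "Elmidae": 0.6,
    "Gripopterygidae": 0.7,
    "Simuliidae": 0.3,   # below threshold, ignored in O and E
}
# Hypothetical taxa actually collected at the site.
collected = {"Baetidae", "Chironomidae", "Elmidae", "Simuliidae"}

oe_ratio, missing_taxa = observed_over_expected(capture_prob, collected)
print(round(oe_ratio, 2), missing_taxa)
```

A ratio near 1 indicates that the collected fauna resembles the predicted fauna, while lower values indicate missing taxa. The list of missing taxa is the part that supports diagnosis, as in the Leptophlebiidae example in the text.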