

Comparing Samples


Description: Different tests allow you to compare either the means of 2 groups, or the variances, or the overall shape of the distributions. Most often, you might ask whether some variable of interest is higher (or lower) for 1 group of cases than for another group.

Simple example: Suppose you have multimetric index values for sites upstream and downstream of a wastewater treatment plant. You could compare the mean index values for the 2 groups of sites using a 2-sample test; the t-test, for example, would be appropriate.

MAIA example: Klemm et al. (in review) tested invertebrate metrics by comparing ranges for impaired and reference sites. Rather than using a statistical test, they used a graphical equivalent. Box plots for each group of sites showed the 25th and 75th percentiles, called the quartiles. The span between these 2 quartiles is the interquartile range. They used the overlap of the interquartile ranges of the 2 groups of sites to judge whether metrics were different for impaired vs. reference sites. Differences that are easy to see on a graph are usually statistically significant as well.
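The overlap check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure Klemm et al. actually ran: the function names are hypothetical, the metric values are made up, and the "inclusive" quartile method is an assumption (plotting programs differ in how they compute quartiles).

```python
from statistics import quantiles


def interquartile_range(values):
    """Return the (25th percentile, 75th percentile) pair for a list of numbers."""
    # n=4 splits the data into quartiles; the middle cut point is the median.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def iqr_overlap(group_a, group_b):
    """Return True if the interquartile ranges of the two groups overlap."""
    a_lo, a_hi = interquartile_range(group_a)
    b_lo, b_hi = interquartile_range(group_b)
    return a_lo <= b_hi and b_lo <= a_hi


# Made-up metric values for illustration only (not data from Klemm et al.).
reference = [42, 48, 51, 55, 60, 63, 67]
impaired = [10, 14, 18, 21, 25, 29, 33]

print(interquartile_range(reference))
print(interquartile_range(impaired))
print(iqr_overlap(reference, impaired))
```

With these invented values the two interquartile ranges do not overlap, which is the situation in which a metric would be judged to differ between the 2 groups of sites.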


Figure: Percent Ephemeroptera and Plecoptera were significantly different at reference and impaired stream sites; percent chironomids and dominant taxon did not differ.

Diatom example: Comparing diatom index values at reference vs. test sites. A multimetric index should be tested against measures of human disturbance that are defined independently of the biological data.

Reference and test sites were defined for testing invertebrate metrics by Klemm et al. (manuscript). Reference sites were characterized by low sulfate, high acid neutralizing capacity, good riparian condition, and low phosphorus, nitrogen and chloride.

The diatom index was higher for reference than test sites, indicating better biological condition at reference sites. The two groups were significantly different when compared with a t-test.

Figure: Diatom index values for reference and test sites.

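How the method works: A t-test averages the values for each group of cases and calculates the variance, that is, how much the values differ from their group average. It then compares the difference between the 2 group averages with that variability. If the difference is large relative to the variability, so that values for the 2 groups overlap very little, the groups are deemed significantly different.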
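As a rough sketch of that calculation, the Python below computes a 2-sample t statistic from group means and variances. It assumes the classic pooled-variance (equal variance) form of the test, which the text does not specify; the function name and the index values are hypothetical. It stops at the statistic and degrees of freedom; judging significance would still require comparing the result with a t distribution, for example in a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance 2-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Sample variance: average squared deviation from the group mean (n - 1 divisor).
    var_a, var_b = variance(group_a), variance(group_b)
    # Pool the two variances, weighting each by its degrees of freedom.
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Made-up index values for illustration only.
upstream = [3, 4, 5, 6, 7]
downstream = [1, 2, 3, 4, 5]

t_value, df = pooled_t_statistic(upstream, downstream)
print(t_value, df)
```

The further the statistic is from 0, the less the 2 groups overlap relative to their spread.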

Assumptions/alternatives: If the data are approximately normally distributed, a t-test is appropriate for a 2-sample comparison; otherwise, the nonparametric equivalent is the Mann-Whitney U-test. The data could also be put into an ANOVA model for 2 samples, which is mathematically equivalent to a t-test. You could also use a regression model with group membership coded as a "dummy" variable, which has a value of 1 for each case in the first group and 0 for each case in the second group. All of these tests would give similar results in most cases.
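Two of these alternatives are simple enough to sketch directly. The Python below is an illustration under stated assumptions, not a full test: the function names and data are hypothetical, the Mann-Whitney function returns only the U statistic (counting ties as one half) with no p-value, and the regression is an ordinary least-squares fit written out by hand.

```python
from statistics import mean


def mann_whitney_u(group_1, group_2):
    """Return the U statistic for group_1: pairs where group_1 exceeds group_2."""
    u = 0.0
    for a in group_1:
        for b in group_2:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5  # a tied pair counts as half
    return u


def dummy_regression(group_1, group_2):
    """Fit y = intercept + slope * x, with x = 1 for group_1 and x = 0 for group_2."""
    x = [1] * len(group_1) + [0] * len(group_2)
    y = list(group_1) + list(group_2)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up index values for illustration only.
group_1 = [3, 4, 5, 6, 7]
group_2 = [1, 2, 3, 4, 5]

print(mann_whitney_u(group_1, group_2))
slope, intercept = dummy_regression(group_1, group_2)
print(slope, intercept)
```

In the dummy-variable regression, the intercept is the mean of the second group and the slope is the difference between the 2 group means, which is why testing the slope amounts to the same comparison as the t-test.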

All of these tests are fairly robust, that is, it's ok if the distributions aren't quite normal. The only violation that can really cause problems is if 1 of the samples is much more variable than the other; in that case, the results of significance testing are less reliable.
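One quick way to look for that problem before testing is to compare the sample variances of the 2 groups. The sketch below is only a diagnostic aid with a hypothetical function name and made-up data; it does not set a cutoff for how large a ratio is too large, since the text gives none.

```python
from statistics import variance


def variance_ratio(group_a, group_b):
    """Return the larger sample variance divided by the smaller one."""
    var_a, var_b = variance(group_a), variance(group_b)
    return max(var_a, var_b) / min(var_a, var_b)


# Made-up values for illustration only: the second group is far more spread out.
tight_group = [1, 2, 3, 4, 5]
wide_group = [0, 5, 10, 15, 20]

ratio = variance_ratio(tight_group, wide_group)
print(ratio)
```

A ratio near 1 means the groups are about equally variable; a ratio far above 1 is the case in which significance tests are less reliable.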
