Documentation for the National WSA Databrowser
The Wadeable Streams Assessment (WSA) is a first-ever statistically-valid study of the biological condition of small streams throughout the U.S. It establishes a national baseline against which results from future studies can be compared. This information will help us evaluate the success of our national efforts to protect and restore water quality.
The WSA contains data from EPA regional surveys that were designed to collect data that are representative of conditions throughout the sampled area. Random site selection ensures that summary statistics derived from the data are unbiased and can be used to guide planning and management efforts at a regional scale.
This databrowser displays summary statistics for measures of biological condition, water chemistry, and physical habitat for wadeable streams. Users may test for correlation among response variables and human influence. Users may also subset the data and compare summary statistics and patterns of correlation according to year, state, stream order, or other factors.
Data Sources
Data for the National WSA databrowser were obtained from the EPA Office of Water's WSA data page:
The Wadeable Streams Assessment (WSA) is a first-ever statistically-valid survey of the biological condition of small streams throughout the U.S. The U.S. Environmental Protection Agency (EPA) worked with the states to conduct the assessment in 2004-2005.
The data available from the Office of Water site were processed and combined into a single file, WSA_Data.csv, which contains 1289 records (one per unique site), each with 270 columns of data and metadata. See the Variables page for a complete list of the variables used in the databrowser.
Statistical Analysis
An overview of probabilistic survey methods is given on the Aquatic Resources Monitoring site. Details of the analyses are given on the Monitoring Design and Analysis page. Additional information is available on the Office of Water WSA Background Materials page.
Technical Details
References
- Wadeable Streams Assessment: a Collaborative Survey of the Nation’s Streams. EPA 841-B-06-002. December 2006.
- Wadeable Streams Assessment (WSA) Background Materials
Interpreting Results
Map above box plots
Site locations are shown for three states in the Pacific Northwest (see figure below). High, medium, and low values of the selected variable (EPT taxa richness in this case) are indicated by different symbols on the map. Because 2001 was selected as the focus category, sites sampled in other years are marked by an x.
The box plots below the map show the values for EPT taxa richness according to year. Each category, or year, has two box plots. The transparent box plots represent unweighted results, that is, just the data values with no statistical adjustments made for the probability of site selection. The blue box plots represent weighted results, that is, after adjustments are made according to each site’s probability of being included in the sample.
Box plots for weighted and unweighted estimates of EPT taxa richness were very similar within each year, indicating that the random survey design provided a representative sample of stream sites in this region.
In each box plot, the whiskers end at the 5th and 95th percentiles, and the ends of the box mark the 25th and 75th percentiles. The bar in the middle is the median, or 50th percentile. Outliers, defined as values lying more than 1.5 times the interquartile range from the box, are marked individually.
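The difference between the unweighted and weighted box plots can be illustrated with a minimal Python sketch. It is not the databrowser's own code: the site values and weights are invented, the function names are hypothetical, and the percentile rule used here (the smallest value at which the cumulative share of weight reaches the requested percentage) is only one of several common conventions.

```python
def weighted_percentile(values, weights, pct):
    """Return the smallest value at which the cumulative share of weight reaches pct percent."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        # Compare without dividing so exact thresholds are not lost to rounding.
        if cumulative * 100.0 >= pct * total:
            return value
    return pairs[-1][0]


def box_plot_summary(values, weights=None):
    """Percentiles for the whisker ends (5, 95), box ends (25, 75) and median (50)."""
    if weights is None:
        weights = [1.0] * len(values)  # unweighted: every site counts equally
    return {p: weighted_percentile(values, weights, p) for p in (5, 25, 50, 75, 95)}


# Hypothetical EPT taxa richness values for ten sites (not WSA data).
ept_richness = [4, 8, 10, 12, 15, 18, 20, 22, 25, 30]
# Hypothetical survey weights: each of the last five sites represents three
# times as much stream length as each of the first five.
site_weights = [1, 1, 1, 1, 1, 3, 3, 3, 3, 3]

unweighted = box_plot_summary(ept_richness)
weighted = box_plot_summary(ept_richness, site_weights)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this invented case the weighted median is higher than the unweighted median because the heavily weighted sites have higher values. When the two summaries nearly coincide, weighting has little effect on the result.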
Map above CDF plots
The map shows the same information as in the previous example. The box plots are replaced with plots of the cumulative distribution function (CDF) for EPT taxa richness. To read the CDF, find a particular value of the variable on the x-axis, draw a line straight up to the CDF curve, and then read across to the y-axis on the left. The y-value is the percentage of sites with EPT taxa richness less than or equal to the value selected on the x-axis.
CDFs are based only on "weighted" values. For example, a site in the Wenatchee Basin would contribute proportionately less to the overall calculation of the weighted mean than would a site in Idaho. This is because stream sites in the Wenatchee Basin had a higher probability of being included in the survey design; therefore, they represent fewer stream miles than a site randomly selected from the entire state of Idaho.
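The effect of weighting on a CDF can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code behind the databrowser: each site's weight is taken to be the inverse of its inclusion probability, and the richness values, inclusion probabilities, and function names are all hypothetical.

```python
def weighted_cdf(values, inclusion_probs):
    """Return (value, cumulative percent of total weight) pairs, smallest value first."""
    # Assumption: weight = 1 / inclusion probability, so a site that was more
    # likely to be selected represents less of the stream resource.
    weights = [1.0 / p for p in inclusion_probs]
    total = sum(weights)
    cdf = []
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        cdf.append((value, 100.0 * cumulative / total))
    return cdf


def percent_at_or_below(cdf, x):
    """Read the CDF at x: percent of total weight with values <= x."""
    pct = 0.0
    for value, cumulative_pct in cdf:
        if value <= x:
            pct = cumulative_pct
    return pct


# Hypothetical data: four sites from a densely sampled basin (high inclusion
# probability) and two sites drawn from a sparsely sampled statewide frame.
richness = [5, 9, 12, 14, 20, 24]
inclusion_probs = [0.5, 0.5, 0.5, 0.5, 0.125, 0.125]

cdf = weighted_cdf(richness, inclusion_probs)
for value, pct in cdf:
    print(f"richness <= {value:2d}: {pct:5.1f}% of weighted total")

weighted_share = percent_at_or_below(cdf, 14)
unweighted_share = 100.0 * sum(1 for v in richness if v <= 14) / len(richness)
print(f"weighted: {weighted_share:.1f}%  unweighted: {unweighted_share:.1f}%")
```

Here the four densely sampled sites make up two thirds of the sample but only one third of the total weight, so the weighted CDF rises more slowly over their range of values than an unweighted tally would.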
Scatter plot
The dependent variable (y-axis) is plotted against the independent variable (x-axis). Spearman’s correlation coefficient (r) and best-fit regression lines are shown separately for each category.
For the example plot below, EPT taxa richness declines as the percentage of substrate in silt and finer categories increases. The relationship was similar and consistent across all years.
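For readers who want to reproduce this kind of summary, the following Python sketch computes Spearman's rank correlation and a fitted line separately for each category. It rests on assumptions: the regression is taken to be an ordinary least-squares fit, the per-year numbers are invented, and the function and variable names are hypothetical rather than taken from the databrowser.

```python
import math


def average_ranks(xs):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def sums_of_squares(xs, ys):
    """Return mean of x, mean of y, and the centered sums Sxy, Sxx, Syy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy


def spearman(xs, ys):
    """Spearman's coefficient: the Pearson correlation of the ranks."""
    _, _, sxy, sxx, syy = sums_of_squares(average_ranks(xs), average_ranks(ys))
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of an ordinary least-squares fit (an assumption here)."""
    mx, my, sxy, sxx, _ = sums_of_squares(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data by year: (percent fine substrate, EPT taxa richness).
by_year = {
    2000: ([5, 10, 20, 40, 60], [28, 25, 18, 12, 6]),
    2001: ([8, 15, 30, 45, 70], [26, 27, 15, 14, 5]),
}

results = {}
for year, (pct_fines, ept) in by_year.items():
    rho = spearman(pct_fines, ept)
    slope, intercept = least_squares_line(pct_fines, ept)
    results[year] = (rho, slope, intercept)
    print(f"{year}: Spearman r = {rho:.2f}, slope = {slope:.3f}, intercept = {intercept:.1f}")
```

A negative coefficient and a negative slope in every category correspond to the pattern described above, in which richness declines as fine substrate increases.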