Selecting an Analysis Approach

  • How Can I Use My Data?
  • Establishing Differences from Expectations
  • Estimating Stressor-Response Relationships
Helpful Links
  • Analyzing Trait Data
  • Spatial/Temporal Co-occurrence
  • Stressor-Response from the Field
  • Stressor-Response from Other Field Studies
  • Verified Prediction

How Can I Use My Data?

The Stressor Identification approach (described in Volume 1) does not require a minimum data set. Existing data often are sufficient to determine the cause of impairment. When data are insufficient to support an assessment, Stressor Identification can be used to identify priority data needs. When available, you also can use larger monitoring datasets for quantitative causal analysis.
 

Analysis of State and Regional Monitoring Data

Analysis of state and regional monitoring data can inform two questions relevant to causal assessment:

Do Environmental Conditions or Biological Characteristics at a Test Site Differ from Expectations?
In most causal assessments, we start with information that observed biota at a test site are impaired. Often, this means the biota differ from reference expectations for that site. Data analysis can often determine whether certain stressors also differ from reference expectations.

Answering this basic question can support evaluation of spatial/temporal co-occurrence. To establish co-occurrence, we assess whether a stressor is present when the biological effect is observed. We can do this by comparing stressor levels at impaired and reference sites.
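As a rough illustration of this comparison, the sketch below summarizes stressor levels at impaired and reference sites. The function name `compare_stressor_levels` and the percent fine sediment values are hypothetical and were invented for this example. Using the reference maximum as the benchmark is one simple assumption, not a CADDIS recommendation.

```python
from statistics import median


def compare_stressor_levels(impaired, reference):
    """Summarize how stressor levels at impaired sites compare with reference sites."""
    ref_max = max(reference)
    # Count impaired sites whose stressor level exceeds anything seen at reference sites.
    n_above = sum(1 for value in impaired if value > ref_max)
    return {
        "impaired_median": median(impaired),
        "reference_median": median(reference),
        "fraction_above_reference_max": n_above / len(impaired),
    }


# Hypothetical percent fine sediment values, invented for illustration only.
reference_sites = [8.0, 10.0, 12.0, 9.0, 11.0]
impaired_sites = [22.0, 30.0, 11.0, 27.0]

summary = compare_stressor_levels(impaired_sites, reference_sites)
print(summary)
```

A high fraction of impaired sites above the reference range would be consistent with co-occurrence, although it does not by itself show that the stressor caused the effect.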
 
This same question often underlies evaluation of the verified prediction type of evidence. We hypothesize that a particular stressor caused the observed impairment. Then, based on this hypothesis, we make a prediction regarding biological characteristics at the impaired site. Suppose increased bedded sediment is a possible stressor. We might predict that clinger taxa are reduced at the impaired site (see Analyzing Trait Data under Helpful Links). We then might assess whether clinger taxa richness differs from reference expectations at the impaired site.

What is the Relationship Between a Stressor and a Response in a Particular Region?
Stressor-response relationships can provide an estimate of effect magnitude for a given stressor level. Different statistical approaches can be used to estimate stressor-response relationships with varying degrees of confidence. Stressor-response relationships may be derived using data from the case or from other field studies. Each of these represents a different type of evidence, and each may raise different issues. For example, other field studies may require consideration of covariates across the larger study area.

Helpful Links
  • Controlling for Natural Variation
  • Tests of Significant Difference
  • Variable Distributions (Box Plots)

Establishing Differences from Expectations

Establishing that biological or environmental characteristics at a test site differ from expected values is a key analysis for causal assessment. Expectations regarding site characteristics can be based on a single reference site (e.g., upstream of the test site) or on a set of comparable, regional reference sites. Analytical approaches range from a simple comparison of measurements to formal statistical tests.

  • Comparison of Values
    If only a single measurement is available at the test site and at the reference site, then one can only compare these two values. Interpretation of whether the difference between the two values is meaningful requires an understanding of the inherent variability of the measurements and an understanding of ecologically meaningful differences in value.
  • Box Plots
    Box plots graphically represent the distribution of a set of samples, providing a visual means of assessing whether a test site value deviates from the range of conditions observed at a single reference site or a set of similar reference sites.
  • Tests of Significant Difference
    Given enough data from different samples at a reference site, or data from several different reference sites, one can explicitly calculate the probability that the observation from the test site could have been observed at the reference site(s). A low probability (e.g., less than 5%) would suggest that the observation at the test site likely differs from expectations defined by reference sites.
  • Natural Variations
    In cases where expectations are defined using a set of regional reference sites, it is likely that natural variations will be present in the characteristic that is being compared. Analyzing natural variation can help distinguish differences between test and reference sites that are meaningful.
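The box plot and significance test approaches above can be sketched with a few lines of Python. This is a minimal illustration under stated assumptions: the clinger taxa richness values are hypothetical, the function names are invented for this example, the whisker fences use the common 1.5 × IQR convention, and the probability is a simple one-sided rank-based estimate that treats the test observation as exchangeable with the reference observations. Other tests may suit a given data set better.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return quartiles and the 1.5 * IQR fences used in a conventional box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


def lower_tail_probability(test_value, reference_values):
    """One-sided rank-based probability of a value this low among reference sites.

    The +1 terms count the test observation itself, so the estimate is never zero.
    """
    n_at_or_below = sum(1 for value in reference_values if value <= test_value)
    return (n_at_or_below + 1) / (len(reference_values) + 1)


# Hypothetical clinger taxa richness at 20 regional reference sites.
reference_richness = [14, 16, 15, 18, 17, 13, 19, 16, 15, 17,
                      20, 14, 18, 16, 15, 17, 19, 16, 18, 15]
test_site_richness = 6  # hypothetical test site observation

summary = box_plot_summary(reference_richness)
p_value = lower_tail_probability(test_site_richness, reference_richness)

print("Below lower fence:", test_site_richness < summary["lower_fence"])
print("Lower-tail probability:", round(p_value, 4))
```

With only 20 reference observations, the smallest probability this estimate can return is 1/21, so small reference sets limit how strongly a difference can be demonstrated.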
Helpful Links
  • Autocorrelation
  • Classification and Regression Trees
  • Confounding
  • Correlation Analysis
  • Exploratory Data Analysis
  • Interpreting Statistics
  • Multivariate Approaches
  • Propensity Score Analysis
  • Quantile Regression
  • Regression Analysis
  • Scatterplots
  • Stressor-Response From the Field
  • Stressor-Response From Other Field Studies
  • Variable Distributions

Estimating Stressor-Response Relationships

Stressor-response relationships estimated from field data can potentially inform two types of evidence: stressor-response from the case and stressor-response from other field studies.

Stressor-Response From the Field (Using Data from the Case)

For this type of evidence, an association in which the magnitude of the biological response decreases as stressor levels decrease in measurements collected from the same stream would be consistent with a causal relationship. This relationship between stressor and response can be shown simply with a scatterplot. In cases in which the variability in the measured response data is too high to discern a response, a regression fit to the data may help assess whether biological response changes as hypothesized.
 
More information about analytical tools used to support this type of evidence can be found on the pages describing scatterplots and regression analyses.
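As a minimal sketch of fitting a regression to data from the case, the example below estimates an ordinary least-squares line through paired stressor and response measurements. The function name `fit_line` and the measurements (percent fine sediment and clinger taxa richness from several locations on one stream) are hypothetical. A straight line is assumed only for simplicity; the true relationship may not be linear.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired measurements from the same stream.
percent_fines = [5, 10, 15, 20, 25, 30]
clinger_richness = [18, 17, 13, 14, 9, 8]

slope, intercept = fit_line(percent_fines, clinger_richness)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A negative slope here would be consistent with the hypothesis that clinger richness declines as bedded sediment increases. The fitted line summarizes the trend but does not rule out other explanations.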
 

Stressor-Response From Other Field Studies

For this type of evidence, we use data collected from a larger study area to quantify the effects of the stressor on the biological response. Accurate estimates of effects can be difficult to obtain because of the strong possibility of covarying factors in field-collected data. In many cases, a more attainable analysis goal may be to simply determine whether the stressor of interest causes effects in the biological response.
 
A methodical approach to analysis can be helpful, including the following steps:
  1. Explore Associations Between Variables in the Data Set
    1. Identify extreme observations and autocorrelation that may influence estimates of relationships.
    2. Calculate correlations and view scatterplots to reveal relationships between pairs of variables.
    3. Use multivariate approaches to reveal relationships among groups of variables.
    4. Identify variables that may confound the estimated relationship.

  2. Estimate Effects
    1. Classification and regression trees can suggest possible discontinuities in relationships of interest.
    2. Regression analysis provides an estimate of the mean relationship between the biological response and stressor of interest. In some cases, the effects of possible confounding variables can be controlled by including them in the regression model, but estimates of effect may be unreliable when variables covary too strongly.
    3. Quantile regression provides a way to estimate the upper bound of the relationship between a stressor and a biological response. Under certain assumptions, this upper bound may provide a reasonably accurate estimate of the stressor-response relationship.
    4. Propensity score analysis provides a powerful means of controlling for the effects of covarying variables and of accurately estimating effects.

  3. Interpret Results
    1. Have most potential confounders been treated in the analysis?
    2. What do the significance test results mean?
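Parts of the steps above can be sketched in code. The example below checks the correlation between a stressor and a possible confounder, then compares a naive regression slope with a slope adjusted for that confounder. The adjustment removes the linear effect of the confounder from both the stressor and the response and regresses the residuals on each other, which gives the same stressor coefficient as including the confounder in a two-variable linear regression. All function names are hypothetical, and the data are synthetic: the response was constructed as 20 - 0.2 × stressor - 0.5 × confounder with no noise, so the true stressor effect is known to be -0.2.

```python
from math import sqrt


def _centered_sums(x, y):
    """Return sums of squares and cross-products about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    _, _, sxx, syy, sxy = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    _, _, sxx, _, sxy = _centered_sums(x, y)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after removing its least-squares linear dependence on x."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def adjusted_slope(stressor, response, confounder):
    """Stressor slope after removing the confounder's linear effect from both variables."""
    stressor_resid = residuals(confounder, stressor)
    response_resid = residuals(confounder, response)
    return ols_slope(stressor_resid, response_resid)


# Synthetic data with known coefficients, for illustration only.
confounder = [1, 2, 3, 4, 5, 6]
stressor = [2, 3, 7, 6, 10, 12]
response = [20 - 0.2 * s - 0.5 * c for s, c in zip(stressor, confounder)]

r_stressor_confounder = pearson_r(stressor, confounder)
naive = ols_slope(stressor, response)
adjusted = adjusted_slope(stressor, response, confounder)

print(f"correlation(stressor, confounder) = {r_stressor_confounder:.3f}")
print(f"naive slope = {naive:.3f}, adjusted slope = {adjusted:.3f}")
```

In this noise-free example the adjusted slope recovers the known effect, while the naive slope overstates it because the stressor and confounder covary. With real, noisy field data and strongly covarying variables, the adjusted estimate can be unreliable, as noted in the regression analysis step.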

Volume 4: Authors

Last updated on February 13, 2025