Using R to Compute Weighted Average Inferences


How to Compute Weighted Average Inferences

Helpful Links
Topics In R Scripts
  • Overview
  • Download Scripts and Sample Data
  • Loading Data
  • Central Tendencies
  • Environmental Limits
  • Parametric Regressions
  • Non-Parametric Regressions
  • Significance Tests
  • Area Under the ROC Curve
  • Curve Shape
  • Weighted Average Inference
  • Estimate Taxon-Environment Relationships Using taxon.env()
Other Pages And Websites
  • Weighted Average Inferences

To compute a weighted average inference, we first compute central tendencies for all taxa using the regional EMAP-West data (site.species) and then use those central tendencies to assess test sites in a data set collected from western Oregon (site.species.or). Before beginning, make sure that you have downloaded both the EMAP-West and Oregon data (see Download Scripts and Sample Data in the Helpful Links box) and merged the environmental and biological data.
 

Next, identify and save the names of taxa that are found in both data sets.

# Compare taxa names in tolerance value and assessment data.
# Make sure all taxa names are in capital letters only
names.tv <- toupper(names(site.species)[-1])
names.assess <- toupper(names(site.species.or)[-1])

# Find taxa names that occur in both datasets.
# intersect() returns each shared name once, even if a
# name happens to be repeated within one dataset.
names.match <- intersect(names.tv, names.assess)

print("Taxa in both databases")
print(sort(names.match))
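
As a quick check of the matching step, the sketch below applies the same idea to two small, made-up data frames. The objects toy.west and toy.or are hypothetical stand-ins for site.species and site.species.or, each assumed to hold a site identifier in the first column. It uses base R's intersect() to find shared names and setdiff() to list assessment taxa that have no counterpart in the regional data.

```r
# Hypothetical toy data: first column is a site ID, the rest are taxa.
toy.west <- data.frame(id = 1:3,
                       Baetis = c(5, 0, 2),
                       Epeorus = c(1, 3, 0),
                       Zapada = c(0, 4, 6))
toy.or <- data.frame(id = 1:2,
                     BAETIS = c(2, 7),
                     Zapada = c(1, 0),
                     Hydropsyche = c(3, 3))

# Standardize case, dropping the ID column
toy.names.tv <- toupper(names(toy.west)[-1])
toy.names.assess <- toupper(names(toy.or)[-1])

# Taxa present in both data sets
toy.match <- sort(intersect(toy.names.tv, toy.names.assess))
print(toy.match)

# Taxa in the assessment data with no match in the regional data
toy.unmatched <- setdiff(toy.names.assess, toy.names.tv)
print(toy.unmatched)
```

Listing the unmatched taxa can help spot spelling or naming differences between the two data sets before any tolerance values are applied.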
    


To apply assessment tools, we need to compute central tendencies for as many taxa as possible. To do this, expand the list of taxa to include all taxa that occur in at least 20 sites in the EMAP-West data set. (The 20-site limit is imposed to avoid overfitting a model to a rare taxon.)


# Get names of all taxa in the data set
taxa.names.init <- names(site.species)[-1]

# Compute the number of occurrences of each taxon
getocc <- function(x) sum(x>0)
numocc <- apply(site.species[, taxa.names.init], 2, getocc)

# Save all taxa names that occur in at least 20 sites
taxa.names <- taxa.names.init[numocc >=  20]
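
The occurrence filter can be tried on a small, made-up table to see which taxa it keeps. In this sketch, toy.species and min.sites are hypothetical names, and the threshold is lowered to 2 only so that the toy data show an effect. It uses colSums() on a logical comparison, which counts the same presences as the apply() call above.

```r
# Hypothetical toy data: 4 sites, 3 taxa (abundances)
toy.species <- data.frame(id = 1:4,
                          TAXON.A = c(3, 0, 1, 2),
                          TAXON.B = c(0, 0, 5, 0),
                          TAXON.C = c(1, 1, 1, 1))
toy.taxa <- names(toy.species)[-1]

# Number of sites at which each taxon is present (abundance > 0)
toy.numocc <- colSums(toy.species[, toy.taxa] > 0)

# Keep taxa found at a minimum number of sites.
# min.sites = 2 is used only for this toy example;
# the script above uses 20.
min.sites <- 2
toy.keep <- toy.taxa[toy.numocc >= min.sites]
print(toy.numocc)
print(toy.keep)
```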
    
Now, recompute central tendencies for the expanded list of taxa by running the central tendencies script again (see Central Tendencies in the Helpful Links box). Make sure you run the script for all taxon names identified above. Depending on the number of taxa selected, this may take some time.
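
For orientation, the sketch below shows one common way to compute a weighted-average central tendency: the mean of an environmental variable across sites, weighted by the taxon's abundance at each site. It is not the Central Tendencies script itself. The toy data, the temp column, and the object names (toy.df, toy.WA) are assumptions for illustration only. The result is a vector indexed by taxon name, which is the form used by WA[names.match] later on this page.

```r
# Hypothetical merged data: temperature plus abundances of two taxa
toy.df <- data.frame(temp = c(10, 15, 20),
                     TAXON.A = c(4, 1, 0),
                     TAXON.B = c(0, 2, 2))
toy.taxa <- c("TAXON.A", "TAXON.B")

# Weighted average for each taxon:
#   sum(abundance * temp) / sum(abundance)
toy.WA <- sapply(toy.taxa, function(tx) {
  weighted.mean(toy.df$temp, w = toy.df[[tx]])
})
print(toy.WA)
```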
 

Continuous tolerance values (e.g., weighted averages or optima) can be classified into tolerance categories, but it is preferable to use them directly in a mean tolerance value metric: the abundance-weighted mean of the tolerance values of the taxa observed at a site. The following script assumes that weighted averages have been computed for all taxa listed in names.match and stored in a vector WA indexed by taxon name. Other tolerance values can be substituted for WA in the line that computes mean.tv.

# Only select taxa for which tolerance values
#   have been computed. 
mat1 <- as.matrix(dfmerge.or[, names.match])
            
# First get total abundance
tot.abn <- apply(mat1, 1, sum)        
            
# Use matrix multiplication to compute the sum of all 
# observed tolerance values, and then divide by total 
# abundance to get the mean tolerance value.
mean.tv <- (mat1 %*% WA[names.match])/tot.abn        

plot(dfmerge.or$temp, mean.tv, xlab = "Temperature", 
     ylab = "Mean tolerance value")
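
The calculation above can be checked by hand on a small example. The sketch below uses made-up abundances and tolerance values (toy.abn and toy.WA are hypothetical) and follows the same matrix-multiplication approach. It also marks sites with zero total abundance as NA, a guard that the script above does not include and that may or may not be appropriate for a given data set.

```r
# Hypothetical abundances at three test sites (rows) for two taxa.
# matrix() fills by column: TAXON.A = 2, 0, 0; TAXON.B = 2, 4, 0.
toy.abn <- matrix(c(2, 0, 0,
                    2, 4, 0),
                  nrow = 3,
                  dimnames = list(c("site1", "site2", "site3"),
                                  c("TAXON.A", "TAXON.B")))

# Hypothetical weighted-average tolerance values (named vector)
toy.WA <- c(TAXON.A = 11, TAXON.B = 17.5)

# Total abundance per site
toy.tot <- rowSums(toy.abn)

# Abundance-weighted mean tolerance value per site
toy.mean.tv <- as.vector(toy.abn %*% toy.WA[colnames(toy.abn)]) / toy.tot

# Sites with no matched taxa give 0/0 = NaN; mark them as NA
toy.mean.tv[toy.tot == 0] <- NA
print(toy.mean.tv)
```

Indexing toy.WA by colnames(toy.abn) keeps the tolerance values in the same order as the matrix columns, which the matrix product relies on.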

Last updated on February 7, 2025