Causal Analysis/Diagnosis Decision Information System (CADDIS)

Using R for Parametric Regression

How to Fit Parametric Regressions

Helpful Links
Topics In R Scripts
  • Overview
  • Download R Scripts and Sample Data
  • Loading Data
  • Central Tendencies
  • Environmental Limits
  • Parametric Regressions
  • Non-Parametric Regressions
  • Significance Tests
  • Area Under the ROC Curve
  • Curve Shape
  • Weighted Average Inference
  • Estimate Taxon-Environment Relationships Using taxon.env()
Other Pages And Websites
  • Parametric Regression

Single variable parametric regressions for presence/absence of different taxa (see Parametric Regression page, Equation 3) are computed using the generalized linear model (GLM) function in R.
 
First make sure that you have loaded the sample biological and environmental data and merged them into a single data frame called dfmerge (see Download Scripts and Sample Data in the Helpful Links box).
 
Also make sure that you have selected the taxa for which you wish to fit regression models and saved them in the vector taxa.names (see the description in the R Scripts for Central Tendencies page in the Helpful Links box).
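
Before fitting, it can help to confirm that both objects exist and agree with each other. The following is a minimal sketch only: check_inputs() is a hypothetical helper, not part of the CADDIS scripts, and the small data frame and taxon names are made up to stand in for dfmerge and taxa.names.

```r
# Hypothetical helper (not part of the CADDIS scripts): confirm that the
# merged data frame contains the explanatory variable and every selected taxon.
check_inputs <- function(df, taxa, env.var = "temp") {
  stopifnot(is.data.frame(df), is.character(taxa), length(taxa) > 0)
  missing.cols <- setdiff(c(env.var, taxa), names(df))
  if (length(missing.cols) > 0) {
    stop("Columns not found in data frame: ",
         paste(missing.cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up stand-in for dfmerge; the taxon names are illustrative only.
toy <- data.frame(temp = c(10.2, 14.8, 19.5),
                  TAXON.A = c(3, 0, 1),
                  TAXON.B = c(0, 0, 2))
ok <- check_inputs(toy, c("TAXON.A", "TAXON.B"))
```

With the real data, the equivalent call would be check_inputs(dfmerge, taxa.names), which stops with a message listing any columns that are missing.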
 

The following script fits one model per taxon and stores the fitted models in a single list.

# Create storage list for models
modlist.glm <- as.list(rep(NA, times = length(taxa.names)))

for (i in seq_along(taxa.names)) {
  # Create a logical vector that is TRUE if the taxon is present and
  # FALSE if the taxon is absent.
  resp <- dfmerge[, taxa.names[i]] > 0

  # Fit the regression model and store the result in the list.
  # glm() fits Generalized Linear Models. Here, poly(temp, 2)
  # specifies that the model is fit using a second-order polynomial
  # of the explanatory variable, and family = "binomial" specifies
  # that the response variable is distributed binomially.
  modlist.glm[[i]] <- glm(resp ~ poly(temp, 2), data = dfmerge,
                          family = "binomial")

  # Print out summary statistics for each model
  print(summary(modlist.glm[[i]]))
}
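
The loop above stores models by position, so a model is retrieved by its index in taxa.names. As a rough, self-contained illustration of the same fit, the sketch below uses simulated data in place of dfmerge (the data frame sim, the taxon names, and the response shapes are all assumptions made for this example) and names each list element by taxon so that a model can be retrieved by name.

```r
set.seed(1)
n <- 300

# Simulated stand-in for dfmerge: a temperature gradient and two taxa.
sim <- data.frame(temp = runif(n, 5, 25))
# Illustrative unimodal response that peaks near 15 degrees.
sim$TAXON.A <- rbinom(n, 1, plogis(1 - 0.08 * (sim$temp - 15)^2))
# Illustrative response that increases with temperature.
sim$TAXON.B <- rbinom(n, 1, plogis(-3 + 0.2 * sim$temp))

sim.taxa <- c("TAXON.A", "TAXON.B")

# Fit one quadratic logistic regression per taxon.
sim.mods <- lapply(sim.taxa, function(tx) {
  resp <- sim[, tx] > 0
  glm(resp ~ poly(temp, 2), data = sim, family = "binomial")
})
names(sim.mods) <- sim.taxa

# One row of coefficients per taxon.
coef.table <- t(sapply(sim.mods, coef))
print(coef.table)
```

Because poly() uses orthogonal polynomials by default, these coefficients are not on the scale of temp and temp^2; poly(temp, 2, raw = TRUE) would give coefficients on that scale with the same fitted probabilities.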

To plot the model results (similar to those shown on the Parametric Regression page, Figure 5), run the following script.

# Specify 3 square plots per page, arranged in one row
par(mfrow = c(1,3), pty = "s")
for (i in seq_along(taxa.names)) {

  # Compute the predicted response on the link (logit) scale
  # and the standard errors about this prediction.
  predres <- predict(modlist.glm[[i]], type = "link", se.fit = TRUE)

  # Compute approximate upper and lower 90% confidence limits
  up.bound.link <- predres$fit + 1.65*predres$se.fit
  low.bound.link <- predres$fit - 1.65*predres$se.fit
  mean.resp.link <- predres$fit

  # Convert from logit transformed values to probability.
  # plogis(x) is equivalent to exp(x)/(1 + exp(x)).
  up.bound <- plogis(up.bound.link)
  low.bound <- plogis(low.bound.link)
  mean.resp <- plogis(mean.resp.link)

  # Sort the environmental variable.
  iord <- order(dfmerge$temp)        

  # Define bins to summarize observational data as
  # probabilities of occurrence
  nbin <- 20

  # Define bin boundaries so each bin has approximately the same
  # number of observations. nbin bins require nbin + 1 boundaries.
  cutp <- quantile(dfmerge$temp,
                   probs = seq(from = 0, to = 1, length.out = nbin + 1))
  # Compute the midpoint of each bin
  cutm <- 0.5*(cutp[-1] + cutp[-(nbin + 1)])

  # Assign a factor to each bin
  cutf <- cut(dfmerge$temp, cutp, include.lowest = TRUE)
    
  # Compute the mean of the presence/absence data within each bin.
  vals <- tapply(dfmerge[, taxa.names[i]] > 0, cutf, mean)
                        
  # Now generate the plot
  # Plot binned observational data as symbols.
  plot(cutm, vals, xlab = "Temperature", 
       ylab = "Probability of occurrence", ylim = c(0,1),
       main = taxa.names[i])        
  # Plot mean fit as a solid line.        
  lines(dfmerge$temp[iord], mean.resp[iord])
  # Plot confidence limits as dotted lines.                
  lines(dfmerge$temp[iord], up.bound[iord], lty = "dotted")
  lines(dfmerge$temp[iord], low.bound[iord], lty = "dotted")    
} 
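
The plotting script evaluates the fitted curve at the observed temperatures. As a variation, the sketch below applies the same interval calculation over an evenly spaced temperature grid by passing newdata to predict(). It is an illustration under stated assumptions: the data are simulated, the names sim, fit, and grid are made up for this example, and qnorm(0.95) is used in place of the rounded multiplier 1.65.

```r
set.seed(2)
n <- 300

# Simulated stand-in for one taxon in dfmerge.
sim <- data.frame(temp = runif(n, 5, 25))
sim$present <- rbinom(n, 1, plogis(1 - 0.08 * (sim$temp - 15)^2)) > 0

fit <- glm(present ~ poly(temp, 2), data = sim, family = "binomial")

# Evenly spaced grid spanning the observed temperature range.
grid <- data.frame(temp = seq(min(sim$temp), max(sim$temp),
                              length.out = 100))

# Predictions and standard errors on the link (logit) scale.
pr <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)

# Two-sided 90% limits use the 95th percentile of the standard normal.
z <- qnorm(0.95)

# Back-transform to probabilities; the limits stay within [0, 1]
# because they are computed before the transformation.
grid$mean.resp <- plogis(pr$fit)
grid$low.bound <- plogis(pr$fit - z * pr$se.fit)
grid$up.bound <- plogis(pr$fit + z * pr$se.fit)

plot(grid$temp, grid$mean.resp, type = "l", ylim = c(0, 1),
     xlab = "Temperature", ylab = "Probability of occurrence")
lines(grid$temp, grid$up.bound, lty = "dotted")
lines(grid$temp, grid$low.bound, lty = "dotted")
```

Computing the limits on the link scale and then transforming them, as the script above also does, avoids limits that fall below 0 or above 1.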

Last updated on February 7, 2025