Using R to Compute the Area Under the ROC Curve


How to Compute the Area Under the ROC Curve

Helpful Links
Other Pages And Websites
  • Assessing Model Fit

The area under the receiver operating characteristic (ROC) curve indicates how accurately a model classifies sites into those where the taxon is present and those where it is absent (see the Assessing Model Fit page in the Helpful Links box).
 

To compute the area under the ROC curve for a model, consider a pair of sites in which the species of interest is present at one site and absent at the other. We would expect the regression model to predict a greater probability of occurrence at the site where the species is present than at the site where it is absent. The area under the ROC curve is equivalent to the proportion of all such pairwise comparisons in which this condition is satisfied. The following script performs this computation for each taxon.

# Define storage vector for ROC
roc <- rep(NA, times = length(taxa.names))

for (i in seq_along(taxa.names)) {
  # Compute mean predicted probability of occurrence
  predout <- predict(modlist.glm[[i]], type = "response")

  # Generate logical vector corresponding to presence/absence
  resp <- dfmerge[, taxa.names[i]] > 0

  # Divide predicted probabilities into sites where
  # species is present ("x") and sites where the species is
  # absent ("y").
  x <- predout[resp]
  y <- predout[! resp]

  # Now perform all pairwise comparisons of x vs. y
  # and store results in a matrix
  rocmat <- matrix(NA, nrow = length(x), ncol = length(y))
  for (j in seq_along(x)) {
    rocmat[j,] <- as.numeric(x[j] > y)
  }

  # Summarize all comparisons to compute area under ROC
  roc[i] <- sum(rocmat)/(length(x)*length(y))
}
names(roc) <- taxa.names
print(roc)
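
The pairwise comparison in the script can be checked by hand on a small case. The sketch below uses made-up predicted probabilities (the values in x and y are hypothetical and are not taken from the sample data) and builds the same present-by-absent comparison matrix with outer() instead of a loop. It also computes the equivalent rank-based (Mann-Whitney) form as a cross-check; the variable names auc.pairs and auc.rank are introduced here for illustration only.

```r
# Hypothetical predicted probabilities of occurrence (illustration only)
x <- c(0.9, 0.6, 0.4)        # sites where the taxon is present
y <- c(0.5, 0.3, 0.2, 0.1)   # sites where the taxon is absent

# outer() compares every element of x with every element of y,
# giving a length(x) by length(y) logical matrix
rocmat <- outer(x, y, FUN = ">")

# Proportion of present/absent pairs in which the present site
# has the greater predicted probability
auc.pairs <- sum(rocmat) / (length(x) * length(y))

# Rank-based form: sum the ranks of the "present" predictions in the
# pooled sample, subtract the minimum possible rank sum, and scale
r <- rank(c(x, y))
auc.rank <- (sum(r[seq_along(x)]) - length(x) * (length(x) + 1) / 2) /
  (length(x) * length(y))

print(auc.pairs)
print(auc.rank)
```

Here 11 of the 12 pairs satisfy the condition, so both values equal 11/12. The two forms agree only when no "present" prediction ties with an "absent" prediction: the strict comparison x > y counts a tie as a failure, whereas rank() assigns tied values their average rank, which gives each tie half credit.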

Causal Analysis/Diagnosis Decision Information System (CADDIS)

Last updated on February 13, 2025
United States Environmental Protection Agency
