CADDIS Volume 4: Data Analysis
Basic Analyses
Quantile Regression
Authors: P. Shaw-Allen, G.W. Suter II, S.M. Cormier, L.L. Yuan
Quantile regression models the relationship between a specified conditional quantile (or percentile) of a dependent (response) variable and one or more independent (explanatory) variables (Cade and Noon 2003). As with mean regression, the relationship is often assumed to be a straight line (Figure 1).
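For the straight-line case, the model and its usual fitting criterion can be written out as below. The notation (tau for the chosen quantile, beta for the coefficients, rho for the loss function) is conventional but is not taken from this page, so treat the symbols as assumptions.

```latex
Q_{\tau}(y \mid x) = \beta_0(\tau) + \beta_1(\tau)\, x, \qquad 0 < \tau < 1

(\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0,\, \beta_1} \sum_{i=1}^{n} \rho_{\tau}\bigl(y_i - \beta_0 - \beta_1 x_i\bigr),
\qquad
\rho_{\tau}(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
```

With tau = 0.5 the loss is symmetric and the fit is a median regression. With tau = 0.9 positive residuals are penalized nine times as heavily as negative ones, which pushes the line toward the upper part of the scatter.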
How do I run a quantile regression analysis?
A quantile regression tool is available in CADStat. Tools for quantile regression are less readily available than tools for ordinary linear regression, although algorithms like the one used in CADStat are available in R. Among commercial statistical packages, quantile regression is now available in newer versions of SAS/STAT. Blossom, a freestanding (and free) statistical package, also fits quantile regressions and is available from the U.S. Geological Survey.
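To show what such a tool computes, the following is a minimal Python sketch for a single explanatory variable. It is not the algorithm used by CADStat, R, SAS, or Blossom. It relies on the fact that, with one predictor, a best-fitting quantile line passes through at least two data points, so it simply tries every pair. The function names and the data values are hypothetical, and the approach is only practical for small data sets.

```python
from itertools import combinations


def check_loss(residual, tau):
    """Asymmetric absolute loss minimized by quantile regression."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def fit_quantile_line(x, y, tau):
    """Return (intercept, slope) minimizing total check loss for quantile tau.

    Exhaustive search over lines through pairs of observations with
    distinct x values (illustrative only; the cost grows roughly as n cubed).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(
            check_loss(yk - (intercept + slope * xk), tau)
            for xk, yk in zip(x, y)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two observations with distinct x values")
    return best[1], best[2]


# Made-up stressor (x) and biological response (y) values, for illustration.
stressor = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90]
response = [22, 9, 19, 7, 15, 12, 4, 9, 3, 5]
intercept, slope = fit_quantile_line(stressor, response, tau=0.9)
print("90th percentile line:", intercept, "+", slope, "* x")
```

For real data sets, a dedicated package is the better choice, since it will also report standard errors and p-values.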
What do quantile regression results mean?
As with mean regression, programs generally provide estimated values for the coefficients along with their standard errors and p-values (see discussion of interpreting p-values). A measure of the degree to which the model accounts for observed variability in the response, relative to a constant null model, may also be calculated; this measure is similar to R2 in mean regression. It is generally useful to plot the data and superimpose the fitted line (Figure 1).
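One common goodness-of-fit measure of this kind compares the total check loss of the fitted line with that of a constant fitted at the same quantile. The sketch below assumes that definition (often attributed to Koenker and Machado); it may not be the exact statistic a given program reports, and the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss minimized by quantile regression."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def total_loss(y, fitted, tau):
    """Sum of check losses over all observations."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))


def pseudo_r1(x, y, intercept, slope, tau):
    """1 - (loss of fitted line) / (loss of best constant) at quantile tau."""
    fitted = [intercept + slope * xi for xi in x]
    model_loss = total_loss(y, fitted, tau)
    # The best constant is always attained at one of the observed y values,
    # so the null model can be found by trying each of them.
    null_loss = min(total_loss(y, [c] * len(y), tau) for c in set(y))
    if null_loss == 0:
        raise ValueError("response is constant; the measure is undefined")
    return 1.0 - model_loss / null_loss
```

A value near 0 means the fitted line does little better than a horizontal line at the chosen quantile. A value near 1 means the line accounts for most of the weighted deviation at that quantile.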
How do I use quantile regression in causal analysis?
Quantile regression can be used to help describe stressor-response relationships. It provides a means of estimating the location of the upper boundary of a scatter plot (e.g., the 90th percentile line in Figure 1). Using this upper boundary assumes that the wedge shape often observed in scatter plots of biological metrics arises because other stressors co-occurring with the modeled stressor cause additional negative effects on the biological response.
Interpretation of the results of quantile regressions in causal analysis is based on the proximity of observations from the impaired site to this upper boundary. These interpretations are qualitative and comparative. In the example shown in Figure 2, data from the impaired site (open red circles) are plotted on scatter plots comparing regional EPT richness with two candidate stressors (increased percent sand/fines and increased total nitrogen). Because the plots show the impaired site closer to the upper boundary of the percent sand/fines relationship than to that of the total nitrogen relationship, we might conclude that percent sand/fines exerts a stronger influence on the observed EPT richness at the site in question. This analysis could support the case for percent sand/fines as the cause of the observed impairment and weaken the case for total nitrogen.
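Although the comparison is qualitative, it can help to put rough numbers on how far the site sits below each boundary. The sketch below is one hypothetical way to do that. The function name, the boundary coefficients, and the site values are all invented for illustration and do not come from Figure 2.

```python
def boundary_gap(site_x, site_y, intercept, slope):
    """Upper-boundary prediction at the site's stressor level minus the
    response observed at the site. Smaller gaps mean the site lies closer
    to the boundary."""
    return (intercept + slope * site_x) - site_y


# Hypothetical impaired-site observation.
site_ept_richness = 6

# Hypothetical 90th percentile lines (intercept, slope) for each stressor.
sand_fines_gap = boundary_gap(70, site_ept_richness, intercept=25.0, slope=-0.25)
nitrogen_gap = boundary_gap(1.0, site_ept_richness, intercept=24.0, slope=-4.0)

closer = "percent sand/fines" if sand_fines_gap < nitrogen_gap else "total nitrogen"
print("Site lies closer to the upper boundary for:", closer)
```

Such gaps are only a reading aid for the plots. They do not replace the comparative judgment described above, and they say nothing about the uncertainty of the fitted boundary.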
More information
Technical details for quantile regression are available here.
