Technical Information for Implementing Designs
Key to supporting probability survey designs is the site selection process. This process requires a spatial representation of the target population (the Sample Frame) and a site selection methodology that can include randomization, spatial balance, stratification, and equal or unequal weighting. These are required for random tessellation stratified survey designs.
Target Population Identification
During the Design Process an explicit definition of the target population is developed to meet the monitoring objectives. A precise, clear and well understood target population definition is a prerequisite to implementing the site selection processes.
Sample Frame Preparation
Several potential Frame Materials are generally evaluated in the process of Sample Frame development. Usual sources of frame materials for a stream monitoring program include:
- RF3 and NHD GIS coverages (Based on 1:100,000 maps) [see EPA OWOW and USGS pages]
- Digital Elevation Maps (DEMs)
- Hydrological Unit Codes (HUCs), River Basins, Watersheds, etc.
- Stream Segment codes: perennial, intermittent, natural, constructed, Strahler Order, etc.
- Political and Ecological Boundaries: States, EPA Regions, Ecoregions, Coastal Provinces, etc.
- Special Interest Areas: National Parks, Federal Lands, ESA Species Ranges, etc.
- Land ownership (public, private, etc.)
- Land use and Land cover, including cities, roads, etc.
Selection of Grid for Generalized Random Tessellation Stratified (GRTS) Designs (see Stevens & Olsen, 2004)
- Create GIS Coverage for boundary of study area (as a unit square)
- Conduct Quadrant-Recursive Partitioning of the Unit Square to:
- Randomly order 0, 1, 2, 3 and assign to each recursive subdivision of the unit square
- Continue the process until each grid cell has a high probability of containing at most one point
- Ensure grid cells are small enough to be included in small regions of the study area
- Ensure grid cells are small enough that cells along borders primarily contain study area
- Create auxiliary coverages for sub-population areas, special interest areas, intensification areas, strata (i.e., ecoregions, river basins, etc.)
- Additional details and links to algorithms and GIS programs are available at Discrete Grids.
- Identify all target population units in the grid and their associated two dimensional address
- Construct sampling line using the randomized hierarchical addresses
- Select a systematic sample with a random start from sampling line
- Place sample in reverse hierarchical address order
- Order the sites from 1 to n
- Create base 4 address for site numbers
- Reverse base 4 address
- Sort by reverse base 4 address
- Renumber sites in RHO
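The reverse hierarchical ordering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by any particular package: the function names are hypothetical, sites are indexed from 0 rather than 1, and the sites are assumed to already be listed in sampling-line order.

```python
def to_base4(n, width):
    """Return the base-4 digits of n as a string, zero-padded to `width`."""
    digits = []
    for _ in range(width):
        digits.append(str(n % 4))
        n //= 4
    return "".join(reversed(digits))


def reverse_hierarchical_order(n_sites):
    """Return 0-based site indices sorted by their reversed base-4 address."""
    # Number of base-4 digits needed to address every site.
    width = 1
    while 4 ** width < n_sites:
        width += 1
    # Reverse each base-4 address, then sort the sites by the reversed address.
    keyed = [(to_base4(i, width)[::-1], i) for i in range(n_sites)]
    keyed.sort()
    return [i for _, i in keyed]


# Position k in the returned list holds the original index of the site
# that receives the new number k.
rho = reverse_hierarchical_order(8)
```

Consecutive sites in the resulting list come from different parts of the original ordering, which is what allows any initial run of the list to remain spread across the study area.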
This sequence of operations produces a spatially-balanced randomized ordered list of sample sites for the entire region or sub-region.
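The quadrant-recursive partitioning earlier in this sequence can be illustrated with a simplified Python sketch. It assumes the study area has already been scaled to the unit square, and the function name, the `perms` dictionary, and the fixed number of levels are all hypothetical choices made for illustration. Each cell receives its own random ordering of 0, 1, 2, 3 for its four sub-quadrants, and a point's address is the sequence of digits met while descending to its cell.

```python
import random


def hierarchical_address(x, y, levels, perms, rng):
    """Return the randomized base-4 address of point (x, y) in the unit square.

    `perms` caches one random ordering of 0..3 per cell, keyed by the address
    prefix of that cell, so every point in a cell sees the same ordering.
    """
    address = []
    x0, y0, size = 0.0, 0.0, 1.0
    for _ in range(levels):
        size /= 2.0
        qx = 1 if x >= x0 + size else 0
        qy = 1 if y >= y0 + size else 0
        quadrant = 2 * qy + qx  # fixed geometric index of the sub-quadrant
        prefix = tuple(address)
        if prefix not in perms:
            order = [0, 1, 2, 3]
            rng.shuffle(order)  # random ordering for this cell's subdivisions
            perms[prefix] = order
        address.append(perms[prefix][quadrant])
        # Move the cell origin into the chosen sub-quadrant.
        x0 += qx * size
        y0 += qy * size
    return tuple(address)


rng = random.Random(0)
perms = {}
points = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.6, 0.3)]
# Sorting by address gives the order of points along the sampling line.
line_order = sorted(points, key=lambda p: hierarchical_address(p[0], p[1], 4, perms, rng))
```

Sharing one `perms` dictionary across all points is the design choice that keeps the addresses consistent: two points in the same cell always agree on every digit down to that cell.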
Unequal Probability Implementation
Two generic situations occur. The first is the use of an auxiliary variable to sample proportional to size. Examples are Strahler order categories or lake area categories. The second is when sub-regions of a study are designated to have intensified sampling. An example is a state-wide sample plus an intensified sampling effort in a specific river basin or ecoregion within the study region.
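For the category-based situation, one plausible way to express the unequal probabilities is a per-category multiplier equal to the desired number of sites in a category divided by the total frame length in that category. The sketch below assumes this rule; the function name, the category labels, and all numbers are hypothetical and serve only to show the arithmetic.

```python
def category_multipliers(frame_length, desired_n):
    """Return a multiplier per category: desired sites / total frame length."""
    multipliers = {}
    for category, length in frame_length.items():
        if length <= 0:
            raise ValueError("frame length must be positive for " + category)
        multipliers[category] = desired_n[category] / length
    return multipliers


# Hypothetical frame summary: stream length (km) by Strahler order category.
frame_length = {"1st": 800.0, "2nd": 150.0, "3rd+": 50.0}
# Hypothetical design requirement: 10 sites expected in each category.
desired_n = {"1st": 10, "2nd": 10, "3rd+": 10}

multipliers = category_multipliers(frame_length, desired_n)
# Under this rule the multiplied lengths sum to the total sample size.
stretched_total = sum(multipliers[c] * frame_length[c] for c in frame_length)
```

Under this assumption, rare categories such as large streams receive large multipliers, so they are sampled more densely per unit length than common categories.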
Decisions must be made based on the design requirements and summaries of the sample frame data. The results are multipliers to be applied to units in the sample frame. These are necessary in the construction of the final units to be used in the site selection algorithm.
Process for Site Selection
Steps in site selection include the following (after completion of Step 3: one-dimensional representation of target population units):
- If the design specifies equal probabilities of selection, skip to number 3
- Multiply the unit length for each target population unit by its appropriate multiplier, which is proportional to the intended probability of selection, to "stretch" the line segment in proportion to that multiplier, i.e. those units with the highest probability of selection have line segments greater than 1.0.
- Divide the length of the weighted line by the desired sample size to get the length of the sampling interval. Select a random starting point within the first interval. Select that unit and each subsequent unit along the weighted line, separated by the sampling interval
- This creates a hierarchically randomized, systematically selected, spatially balanced master sample. It includes assignment of new ordered sample identification; see Step 4 (above) for the process of creating Reverse Hierarchical Order (RHO).
- Selection of base sample and over sample from master sample. Currently the over sample is a positive integer multiple of the base sample size. Hence the base sample selection is a systematic subsample of the master sample. It is possible to generalize this to any size of over sample if reverse hierarchical address ordering is used in the selection process.
- Assign base sample and over sample to panels. Over sample is generically assigned to panel 0. Base sample is assigned to panels systematically, cyclically assigning base sample sites to panels 1 through p until all sites are assigned to a panel.
- Apply reverse hierarchical address ordering to the base sample (ignoring panel assignment) and assign new sample identification.
- Apply reverse hierarchical address ordering to the over sample and assign new sample identification.
- Select nested subsamples within panels using reverse hierarchical address ordering. Note that this does not change the sample ordering assigned to a site in step 4.
- Assign analysis weights to sample sites. The design could result in multiple weights for sample sites. The two situations are designs with nested subsampling and designs with intensive study areas and core study-wide sample. In the latter case the weights will be conditional on achieved sample sizes.
- Create initial design file. It is critical that this file present sites in an order that makes them easy to use for those who implement the design. The over sample will appear at the end of the file, presented in the reverse hierarchical address ordering defined in step 5. The base design sites will be ordered by panel, then by NestID, then by the reverse hierarchical address ordering defined in step 5.
- Finally, determine geographic coordinates for the selected sites and create the Design File.
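Two of the steps above, systematic selection with a random start along the stretched line and cyclic assignment of base sites to panels, can be sketched as follows. This is a simplified illustration under stated assumptions: units are supplied in sampling-line order, each unit carries a multiplier proportional to its intended selection probability, and the function names and example numbers are hypothetical.

```python
import random


def systematic_sample(lengths, multipliers, n, rng):
    """Select n unit indices along the stretched line, random start included."""
    stretched = [length * m for length, m in zip(lengths, multipliers)]
    interval = sum(stretched) / n          # length of the sampling interval
    start = rng.random() * interval        # random start in the first interval
    targets = [start + k * interval for k in range(n)]
    selected = []
    upper = 0.0
    i = 0
    for unit, segment in enumerate(stretched):
        upper += segment
        # Every target falling inside this unit's segment selects the unit.
        while i < n and targets[i] < upper:
            selected.append(unit)
            i += 1
    # Guard against floating-point round-off at the very end of the line.
    while i < n:
        selected.append(len(stretched) - 1)
        i += 1
    return selected


def assign_panels(n_base, n_over, n_panels):
    """Cycle base sites through panels 1..n_panels; over sample gets panel 0."""
    base = [(k % n_panels) + 1 for k in range(n_base)]
    return base + [0] * n_over


rng = random.Random(1)
sites = systematic_sample([1.0] * 10, [1.0] * 10, 5, rng)
panels = assign_panels(n_base=5, n_over=2, n_panels=3)
```

In this sketch a unit whose stretched segment is longer than the sampling interval can be selected more than once; for linear resources such as streams, each selection would correspond to a different point along that unit.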
The software package psurvey.design, which automates the design process, is available. Illustrative examples, with input, R script, output files, and GIS maps, are provided to facilitate implementation of GRTS designs. Additional software packages required include R and ArcGIS.