The U.S. EPA Reach File Version 3.0 Alpha Release (RF3-Alpha) Technical Reference

First Edition December 1994


A special thanks to those who contributed their valuable time and ideas to the development of this document: consultants Lucinda D. McKay, Sue A. Hanson, Robert C. Horn, Richard A. Dulaney, and Alan W. Cahoon, Mark V. Olsen of the Office of Research and Development's Environmental Monitoring Systems Laboratory in Las Vegas, NV (EMSL-LV), and Thomas G. Dewald of the Office of Wetlands, Oceans, and Watersheds (OWOW).

This document was prepared by Horizon Systems Corporation under sub-contract to Tetra Tech, Inc. as part of EPA contract number 68-C3-0303.


  1. Introduction
  2. History of the U.S. EPA Reach Files
    1. Reach File Version 1.0 (RF1)
    2. Reach File Version 2.0 (RF2)
    3. Reach File Version 3.0 (RF3)
  3. Description of the RF3 Production Process
    1. Compilation
    2. Validation
      1. Assessment Phase
      2. Revision Phase
      3. RF3-Alpha Updates
  4. RF3-Alpha "Things To Do" List
  5. Technical Description of RF3-Alpha
    1. Summary Description
    2. Data Entities
    3. Attributes
      1. Reach Entity Attributes
      2. Coordinate Entity Attributes
      3. Reach Entity - Coordinate Entity Relate Attributes
  6. RF3-Alpha Export Files
    1. ARC/INFO Export Files
    2. PCRF3 Export Files
  7. Processing RF3-Alpha Data Using ARC/INFO
    1. Creation of an ARC/INFO Coverage
    2. Polygon Topology
    3. Network Traversal Through RF3-Alpha
    4. Navigation Via RF3-Alpha Attributes
    5. ARC/INFO Limitations To Know
    6. RF3-Alpha Update ARC Macro Language (AML) Routines
  8. Appendix A: RF3-Alpha Attribute Descriptions
  9. Appendix B: RF3-Alpha Structure Record Layout
  10. Appendix C: RF3-Alpha Coordinate Record Layout


The U.S. Environmental Protection Agency's (EPA) Reach Files are a series of hydrographic databases of the surface waters of the continental United States and Hawaii. The structure and content of the Reach File databases were created expressly to establish hydrologic ordering, to perform hydrologic navigation for modeling applications, and to provide a unique identifier for each surface water feature, i.e., reach codes.

A key characteristic of the Reach Files is the set of attributes that define the connected stream network. These attributes provide connectivity regardless of the presence or absence of topologic continuity in the digital linework. Flow direction is inherent in the connectivity attributes. This attribute-level connectivity enables the Reach Files to provide hydrologic ordering of stream locations using reach codes (what is upstream and downstream of a given point in the stream network) as well as network navigation proceeding in either the upstream or downstream direction.
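The idea of attribute-level navigation can be illustrated with a minimal sketch. The reach codes, the single "downstream" pointer, and the function names below are hypothetical and chosen only for illustration; they are not the RF3-Alpha attribute layout, which is described later in this reference.

```python
from collections import defaultdict

# Hypothetical reach table: reach code -> downstream reach code (None at the outlet).
# No coordinates are needed; connectivity lives entirely in the attributes.
DOWNSTREAM = {
    "R1": None,   # outlet
    "R2": "R1",
    "R3": "R1",
    "R4": "R2",
    "R5": "R2",
}

def navigate_downstream(reach, downstream=DOWNSTREAM):
    """Follow downstream pointers from `reach` until the outlet is passed."""
    path = []
    current = downstream[reach]
    while current is not None:
        path.append(current)
        current = downstream[current]
    return path

def navigate_upstream(reach, downstream=DOWNSTREAM):
    """Collect every reach that drains into `reach` by inverting the pointers."""
    upstream_of = defaultdict(list)
    for code, ds in downstream.items():
        if ds is not None:
            upstream_of[ds].append(code)
    found, stack = [], [reach]
    while stack:
        for up in upstream_of[stack.pop()]:
            found.append(up)
            stack.append(up)
    return sorted(found)
```

Because flow direction is carried by the pointer itself, both traversals work even if the digitized lines do not touch one another.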

This technical reference primarily addresses the third version of the Reach File (RF3). A preliminary copy of RF3, known as RF3-Alpha, is available for use. This reference contains a brief history of the Reach File, a description of the ongoing project that will produce the RF3-Final dataset, a technical description of the RF3-Alpha file, and additional material that will help users to evaluate and apply RF3-Alpha appropriately.

RF3-Alpha data have not been validated. Given the nature of the shortcomings that have been identified in the RF3-Alpha data, and the re-design work that is being incorporated into RF3 validation to support GIS applications, a conservative approach is recommended when processing and applying these data. The final, validated RF3 will provide a much improved data product. In the meantime, access to the provisional Alpha data, accompanying documentation, and technical support is provided through the Office of Water's (OW) STORET User Assistance Group. STORET, EPA's national water quality data system, is currently undergoing a major re-design to address evolving user requirements and technology advancements including GIS. Both STORET and RF3 will play integral roles in EPA's future water quality data collection, analysis, and reporting activities.

History of the U.S. EPA Reach Files

The Reach File was first conceived in the 1970s with a proof-of-concept file, known as Reach File Version 1.0 Alpha (RF1A), completed in 1975. The first full implementation, referred to as Reach File Version 1.0 (RF1), was completed in 1982. The source for RF1 was the U.S. Geological Survey's (USGS) 1:250,000 scale hydrography that had been photo-reduced to a scale of 1:500,000 by the National Oceanic and Atmospheric Administration (NOAA). RF1 consists of approximately 68,000 reach segments comprising 650,000 miles of stream. While RF1 still supports broad-based national applications, the need to provide a complementary and more detailed hydrologic network motivated the development of Reach File Version 2.0 (RF2) in the late 1980's. RF2 was created by using the Feature File of the USGS Geographic Names Information System (GNIS) to add one new level of reach segments to RF1. RF2 contains 170,000 stream segments. Shortly thereafter, widespread interest in providing a more comprehensive, nationally consistent hydrologic database led to the development of the Reach File Version 3-Alpha (RF3-Alpha).

RF3 production is described as a two-step process involving the compilation of spatial and attribute data from a variety of different sources and the validation of the resulting file to ensure the integrity of the reach codes and hydrologic connectivity. The compilation of the data from RF1 and RF2, the USGS Geographic Names Information System (GNIS) database, and the 1988 USGS 1:100,000 scale Digital Line Graph (DLG) hydrography produced the interim database known as RF3-Alpha. RF3-Alpha contains nearly 3,200,000 individual hydrographic features (reaches) and over 93,000,000 coordinate points. RF3-Alpha is now undergoing the validation processing that will produce the RF3-Final database. As part of RF3-Alpha validation, EPA is coordinating closely with USGS to synchronize RF3-Alpha feature definitions and linework with the hydrologic component of the new USGS Digital Line Graph Enhanced (DLG-E) product. This collaborative effort will minimize duplicative work by enabling both organizations to share corrections and enhancements to their respective data sets.

RF3-Alpha has been available to a limited user community since early in 1993. EPA programs have utilized RF3-Alpha in ARC/INFO, EPA's standard Geographic Information System (GIS), on UNIX workstations, in custom software on the EPA IBM mainframe, and in ARC/VIEW (a query and display tool for ARC/INFO data) and custom software on Personal Computers (PCs). The unique reach code assigned to each reach has been used to link a number of EPA national databases to surface waters, e.g., STORET Water Quality Sampling Sites, Municipal and Industrial Facility Discharges, and Drinking Water Intakes. Any site within these linked databases can be associated with a specific location on a surface water feature: reservoir, lake, stream, wide river, coast line or other feature. The Reach File has also been used by the USGS and NOAA as the hydrography backbone for several of their programs. USGS uses RF3-Alpha in their National Water Quality Assessment Program (NAWQA) while NOAA uses RF3-Alpha to provide feature identifiers in their Flash Flood Prediction Models. The Subcommittee on Spatial Water Data of the Federal Geographic Data Committee (FGDC) is presently considering the RF3 for two purposes: (1) using the Reach Code as the national standard surface water feature identifier and (2) adopting the RF3 model for the hydrographic framework dataset of the National Spatial Data Infrastructure.

Reach File Version 1.0 (RF1)

Reach File Version 1.0 Alpha (RF1A) was designed in 1973, funded for development in 1974, and completed in 1975, as the reach file proof-of-concept. The surface water features were digitized by EPA from stable base acetate copies of the two-part USGS wall map that has a scale of 1:2,500,000. This first version of the reach file was a single national coverage used for database design testing and demonstration. It used natural watersheds which terminate at the confluences of large streams and at estuarine outlets. Its design was found to be well-suited for the integration of surface water databases and for hydrologically ordered retrievals of any data that might be indexed to it along river courses or through open water bodies.

Spatial data for RF1A were captured as latitude and longitude coordinate vectors representing the traces of stream reaches and shoreline reaches. The topological relationships among reaches were derived from the digital traces using largely user assisted interactive graphics software prepared specifically for the project. Each reach was assigned a unique reach code to be used as a reach identifier. These numbers were assigned without regard to hydrologic order because it was felt it would be unwise to burden the identifier with meaning that would likely need to be compromised to accommodate additional reaches within the database in the future. The reach identifiers were unique numbers within large watersheds. There were several hundred large watersheds covering the contiguous US and Alaska which ranged in size from less than 100 reaches to several hundred reaches. The reach code used in RF1A consisted of a nationally unique 6-digit major watershed number and a 6-digit reach code unique within the watershed.

In 1978, work was begun to build the Reach File into a fully functional database on the EPA's IBM mainframe computer. This work was undertaken in support of the EPA's effluent guidelines development process in the Office of Water. The approach was to create a file which was integrated with STORET, EPA's National Water Quality database, and yet was exportable to other agencies for the purpose of enhancing water data integration on both a national and local scale. The resulting database was completed as Reach File Version 1.0 (RF1) in 1982.

RF1 was digitized by EPA from aeronautical charts prepared by NOAA. A primary objective of the NOAA charts was to provide the best hydrography available in a single map series so that it could serve low altitude aircraft navigation. NOAA initially developed these charts from photographic copies of stable-base scribe coat masters of the USGS maps which have a scale of 1:250,000. Prior to the development of these aeronautical charts, the USGS 1:250,000 scale maps held the most up-to-date and detailed national hydrographic coverage of any national map series.

Although the NOAA charts retained all the hydrography shown on the USGS 1:250,000 scale maps, there was a need to improve these coverages because there was a significant disparity in the density of stream coverage from one USGS map to the next. Therefore, NOAA updated their copies of these maps to include more streams and waterbodies. The updates were based on remote sensing imagery and emphasis was placed on providing more hydrographic features on those maps which were "under par" relative to adjacent maps. This provided a more balanced hydrographic coverage for the nation as a whole than was available from any other source. Additionally, NOAA updated these maps on a continuing basis so that they would be current with changes, such as the construction of new reservoirs in many areas of the nation.

The NOAA aeronautical charts provided, by far, the most detailed complete national hydrographic coverage of any set of maps. They were photographically reduced to a scale of 1:500,000 yet they retained all of the USGS 1:250,000 scale hydrography and all of the NOAA hydrographic updates. Therefore, they became the maps of choice for the Reach File. They were optically scanned (to eliminate manual digitization errors), edge-matched, and transformed into a single contiguous vector database by EPA during the period of 1978-1982.

The resulting database, RF1, includes information on single line streams, a few double line wide rivers, a number of lakes and reservoirs and most estuaries. All of these features were represented by lineal constructs comprising either hydraulic transport reaches or shoreline reaches. Those representing hydraulic transport paths were referred to as transport reaches, whereas those representing the banks of wide rivers, lakes, and estuaries were referred to as shoreline reaches. Areal features such as lakes, reservoirs, wide rivers, and estuaries were represented by shoreline reaches and artificially constructed transport reaches which ran through the areal reach.

Reach File Version 2.0 (RF2)

An intermediate update, known as Reach File Version 2.0 (RF2), was constructed during the late 1980's to aid in building a much more detailed version (RF3) that would be based on the then upcoming USGS 1:100,000 Scale Digital Line Graph (DLG) database. RF2 was developed by overlaying, onto RF1, the coordinates of the hydrographic feature names extracted from the USGS Geographic Names Information System (GNIS), Version I.

The primary improvement to RF1 as a result of the GNIS overlay process was the addition of one new level of reaches and their names. RF2 included new reaches only when they discharged directly into an original RF1 reach. However, even more valuable than the improved density in RF2 were the lessons learned about the behavior of spatial datasets with disparate coordinate sources in what was essentially a blind overlay operation.

The RF2 development project essentially doubled the number of reaches in the Reach File, but the much awaited digital traces from the 1:100,000 scale DLG database were not available at the time RF2 was scheduled for completion. Consequently, RF2 was released, in 1988, with the sparse coordinates obtained from GNIS. Nevertheless, RF2 was a very valuable Reach File for many water quality reporting needs within EPA. From 1988 until RF3-Alpha's initial release in 1992, the Office of Water's National Water Quality Report program, mandated under Clean Water Act (CWA) Section 305(b), used RF2 to standardize the reporting on state priority waterbodies.

Reach File Version 3.0 (RF3)

The successful applications of RF1 and RF2 created the need and desire for a more comprehensive hydrologic database. To satisfy this need, the Reach File Version 3.0 (RF3) development project was begun in the fall of 1988 when the 1:100,000 scale DLG data became available. RF3 is being developed by EPA's Office of Water to provide a nationally consistent database to promote comparability for national, regional, and state reporting requirements such as those found in 305(b) and other sections of the Clean Water Act.

The specific goals for RF3 are to:

RF3-Alpha, the initial output of the RF3 production process, was constructed from four data sources:

The RF3 compilation was performed on a catalog unit (CU) basis. The DLG3 and GNIS files were divided into CU-based subsets using the 1:2,000,000 scale Catalog Unit Boundaries. The first step was to network the DLG3. This process put the DLG3 lines in a hydrologic network, assigned temporary "reach" identifiers and built a preliminary set of navigation attributes. The major and minor attribute codes contained in the DLG3 data were used to determine feature types and to assist in building the hydrologic network. No attempt was made to correct any errors that existed in the DLG3 attributes and, therefore, these errors are reflected in RF3-Alpha. After the downstream start point(s) were visually selected, an endpoint-to-endpoint method was used to find and connect all possible DLG lines into the network. DLG attribute codes were used to distinguish between single line streams, wide rivers, and lakes. Miscellaneous hydrography, such as point features, ditches, and canals, was not networked; however, most of these features were included in RF3-Alpha and given reach codes. The next step in compilation was to overlay the RF2 onto the DLG network and transfer the RF2 reach codes, names, stream levels, and navigation attributes to the DLG network. The inter-CU reach connectivity was of particular importance because all of RF3-Alpha's inter-CU connections originated from RF1/RF2.
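The endpoint-to-endpoint networking step described above can be sketched as a breadth-first walk that starts at a chosen downstream point and works upstream through shared endpoints. This is an illustrative reconstruction, not the compilation software itself: the function name, the line identifiers, and the use of exact coordinate equality to match endpoints are all assumptions.

```python
from collections import defaultdict, deque

def network_lines(lines, start_point):
    """Walk upstream from a downstream start point, connecting lines end to end.

    `lines` maps a line id to its two endpoints (coordinate tuples).
    Returns {line_id: (upstream_end, downstream_end)} for every line reachable
    through shared endpoints. Lines that cannot be reached are left out,
    i.e. they are not networked.
    """
    # Index the lines by the endpoints they touch.
    touching = defaultdict(list)
    for line_id, (a, b) in lines.items():
        touching[a].append(line_id)
        touching[b].append(line_id)

    oriented = {}
    queue = deque([start_point])
    while queue:
        node = queue.popleft()
        for line_id in touching[node]:
            if line_id in oriented:
                continue
            a, b = lines[line_id]
            far_end = b if a == node else a
            oriented[line_id] = (far_end, node)  # flow runs far_end -> node
            queue.append(far_end)
    return oriented

# Hypothetical sample data: a main stem, two tributaries, and an isolated ditch.
SAMPLE_LINES = {
    "main": ((0, 0), (0, 1)),
    "trib_a": ((0, 1), (-1, 2)),
    "trib_b": ((0, 1), (1, 2)),
    "ditch": ((5, 5), (6, 6)),
}
NETWORK = network_lines(SAMPLE_LINES, (0, 0))
```

In this sample the ditch shares no endpoint with the rest of the linework, so it stays outside the network, which mirrors the treatment of miscellaneous hydrography described above. Loops and divergences would need extra handling that this sketch omits.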

After the RF2 overlay, the GNIS file was overlaid to add names for the new RF3-Alpha reaches. This was similar to the RF2 overlay, except that it was performed on a named-feature-by-named-feature basis. The names assigned to reaches in RF1 and RF2 were not changed during this overlay. Only reaches that were newly added by DLG hydrography were candidates to receive names.

RF3-Alpha is now complete for 45 of the 48 contiguous states and Hawaii. In its present form, RF3-Alpha includes nearly 3,200,000 reaches representing streams, wide rivers, reservoirs, lakes, a variety of miscellaneous hydrographic features, and the coastal shorelines for the Atlantic and Pacific Oceans, the Great Lakes, the Gulf of Mexico and the Hawaiian Islands.

Description of the RF3 Production Process

As described above, the RF3 Production Process began in 1988 with acquisition of the DLG3 data from USGS, followed by 4 years of production effort which created the first test version, known as RF3-Alpha. RF3 was developed to provide a nationally consistent database to promote comparability for national and regional reporting requirements such as those found in 305(b) and other sections of the Clean Water Act. Other agencies have also identified a need for a nationally consistent database of surface waters which can be satisfied by RF3.

The RF3 Production Process (see Figure 1) consists of two parts: Compilation and Validation. The Validation Part, in turn, contains two phases: Assessment and Revision. After a test period which began in late 1992 and resulted in extensive feedback from users, EPA is now finalizing the Assessment Phase, performing the Revision Phase, and documenting the data content in preparation for final release of the file. The RF3 production parts, phases, and steps are detailed in the following discussion.

Figure 1. Reach File Production Process Version 3.0 (RF3)


RF3 compilation took the RF1/RF2 datasets, the USGS GNIS I file, and the 1988 USGS DLG3 hydrography file and put them together "spatially" to create the RF3-Alpha file. The original intent was to create an EPA mainframe capability for geo-referencing water data and for water quality modeling. The compilation process was performed on an IBM 3090 and on PCs and was, except for very cursory visual review, a "blind" procedure. The average person-time per CU was about 3 hours from start to finish including downloading the input datasets from the mainframe, automated compilation procedures, and uploading the output datasets back to the mainframe.

Compilation was completed in early 1993 and the preliminary dataset, RF3-Alpha, was released to a limited user community for QA and feedback. The EPA 305(b) and EMAP programs have used this initial file as their spatial dataset for surface waters. Based on user feedback, the "Things To Do" list for RF3-Alpha was established (see Section 4). Some of the early users applied GIS technology to RF3-Alpha and their requirements were very different from the original objectives, so the "Things To Do" list grew substantially in response to the new objectives introduced by the GIS user community.


Assessment Phase

The Production Process is now in the Assessment Phase of Validation. This phase consists of gathering "hard" documentation of user feedback and performing many automated QA/QC checks on the data. The first objective of Assessment is to learn as much as possible about the problems in the data and develop approaches to correct them. Recognizing that some problems will not be correctable during this centralized validation process due to the lack of local knowledge, the second objective is to create documentation datasets for each catalog unit (CU) that contain CU-level and feature-level information about the data content. These "metadata" will be bundled with the spatial data comprising RF3 to provide users with an understanding of the current data lineage. With this information in hand, those users that are more knowledgeable about their local waters will be better prepared to correct and enhance RF3.

Anticipating that substantial changes to RF3-Alpha will occur in the course of addressing the "Things To Do" list, the Assessment Phase presents an opportune time to also assess the design of the RF3-Alpha feature representation and data structures. The original design was primarily inherited from RF1, which was developed to support mainframe applications, and needs to be re-evaluated to address evolving user requirements that are being driven by the recent advances in GIS technology. A new RF3 design for feature representation and data structures is currently under development.

Based on the results of the Assessment Phase of RF3-Alpha Validation, the activities for the Revision Phase will be established.

Revision Phase

The Revision Phase will occur in three steps, depicted in the boxes labeled "Step 1", "Step 2", and "Step 3" running top to bottom along the right side of the diagram in Figure 2.

Figure 2. Revision Process

Step 1 contains a mainframe process which will fix various data problems and restructure the file based on the new design for feature representation and data structures. Some of the corrections and enhancements being considered in this step are:

The interim RF3 dataset created in Step 1 will be input to Step 2 which will consist of GIS operations. Step 2 will perform a variety of "blind" GIS actions on the data, including the replacement of the RF3-Alpha linework with the latest DLG linework (1990) and adding all DLG attributes to RF3-Alpha. Finally, the "blind" pass will compose a workspace ready for a visual GIS pass. The visual GIS activities will perform thorough QA/QC of the work performed in the mainframe step and in the GIS "blind" pass and will make revisions, as needed.

The output dataset(s) from Step 2 will be input to Step 3 during which a final revision of RF3 will be made on the mainframe. The following changes will be implemented in Step 3:

Once completed, RF3-Final will be accessible via a variety of distribution mechanisms including the Internet.

RF3-Alpha Updates

The RF3-Alpha Validation is currently underway. During this period, RF3-Alpha data for small geographic areas is available for users to familiarize themselves with reach file concepts and conventions. Users are encouraged to provide feedback on their use and evaluation of the data.

It is not EPA's intention to incorporate user updates to the provisional RF3-Alpha data. Update specifications and routines are planned to support the standardized modification of RF3-Final. Users interested in modifying RF3-Alpha prior to the release of RF3-Final are encouraged to consider tagging their modifications. Should they later decide to apply the RF3-Alpha updates to RF3-Final, this tagging will allow them to more easily extract and carry forward the modifications. The resulting RF3-Final updates can then be provided for incorporation in the RF3-Master file. See Section 7.6 for information on existing prototype RF3-Alpha Update ARC Macro Language (AML) Routines.

EPA is particularly interested in corrections relating to stream names, stream levels, and hydrologic connectivity. This collection of attributes, along with the reach codes themselves, represents the primary value added information found in the RF3. In anticipation of the early need to incorporate updates for this type of information from state and local users, who have more familiarity with the waters in their respective areas, the initial set of update specifications and routines will focus on these types of modifications.

RF3-Alpha "Things To Do" List

The assessment of RF3-Alpha has resulted in a variety of redesign and data content improvement issues. The most significant of these, which are listed below, are discussed in this section.

  1. Incorporate Latest DLG Features, Linework and Attributes
  2. Correct Open Water Shoreline Reach Codes
  3. Correct Networking at Divergences
  4. Correct Connectivity of Headwater Lakes
  5. Add Centerlines
  6. Correct Inter-CU Connections
  7. Migrate Reaches and Isolated Networks From Incorrect CUs to Correct CUs
  8. Expand Reach Code, Redelineate Reaches and Improve Navigation Attributes
  9. Remove DLG "Neat" Lines
  10. Enhance Both Stream and Open Water Names
  11. Remove Zero Length Reaches
  12. Alter Open Water Navigation Attributes
  13. Correct Problems with Original DLG3 Data - Quad Overlays and Shifted Quads
  14. Improvement of the Usability in a GIS
  15. Update Version and Source Control

    *** Density Variations

Each of these potential improvements is described in the following discussion.

  1. Incorporate Latest DLG Features, Linework and Attributes

    This enhancement is designed to address a variety of problems/short-comings of the RF3-Alpha data and also to synchronize RF3-Alpha and the latest version of the USGS DLG hydrologic data. This synchronization will promote the utility and shared maintenance of both files. This task will accomplish the following corrections and changes:

    • The current RF3-Alpha coordinates are stored in latitude/longitude decimal degrees, with a precision of 0.0001 (+/- approximately 35 feet, or about 10 meters). During RF3 compilation, the coordinates were inadvertently truncated resulting in a south-east shift in the data ranging from 0 to 35 feet. This shift will be removed by replacing current RF3-Alpha linework, which was based upon 1988 DLG linework, with the more current 1990 DLG linework.
    • The original RF3 design goal was focused on producing a national dataset of connected hydrography. Less emphasis was placed on capturing miscellaneous, unconnected, hydrographic features. RF3-Alpha contains approximately 95-98% of the original 1988 DLG hydrography. The synchronization of RF3-Alpha and DLG accomplished during the RF3 validation will provide RF3-Final users with access to the complete set of DLG data.
    • To ensure future correspondence between DLG (UTM to 2 decimal digits) and RF3 linework, coordinates will be stored at a precision of 0.0000001 degrees in RF3-Final.
    • During RF3 compilation, DLG lines were concatenated to construct reaches according to "confluence to confluence" definition. Due to this concatenation, DLG line and area attributes can no longer be associated with the original DLG lines and areas. The re-design of the Reach File will accommodate the linkage of DLG attributes at the sub-feature level.
    • RF3-Alpha currently only carries two line attributes and two area attributes from the DLG. As noted above, RF3/DLG synchronization will provide access to the complete set of DLG attributes from RF3-Final.
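The size and direction of the truncation shift noted in the first bullet can be checked with a short calculation. The sketch below assumes that signed decimal degrees were truncated toward zero at four decimal places, which would move a point in the northern and western hemispheres south and east; the source states only that the shift is south-east and ranges from 0 to 35 feet. The conversion constants are rough averages and the sample coordinate is hypothetical.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # rough average length of one degree of latitude
FEET_PER_METER = 3.28084

def truncate(value, places=4):
    """Drop digits beyond `places` decimals, moving the value toward zero."""
    factor = 10 ** places
    return math.trunc(value * factor) / factor

def shift_feet(lat, lon, places=4):
    """Return the (northward, eastward) displacement in feet caused by truncation."""
    t_lat, t_lon = truncate(lat, places), truncate(lon, places)
    north_m = (t_lat - lat) * METERS_PER_DEGREE_LAT
    east_m = (t_lon - lon) * METERS_PER_DEGREE_LAT * math.cos(math.radians(lat))
    return north_m * FEET_PER_METER, east_m * FEET_PER_METER

# Hypothetical point in the conterminous United States (north latitude, west longitude).
NORTH_FT, EAST_FT = shift_feet(38.12349, -77.12349)

# One step of 0.0001 degrees of latitude, expressed in feet.
STEP_FT = 0.0001 * METERS_PER_DEGREE_LAT * FEET_PER_METER
```

Under these assumptions the northward component is negative (a move south), the eastward component is positive, and neither exceeds one step of 0.0001 degrees, which is roughly 36 feet of latitude.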
  2. Re-number Open Water Shoreline Reach Codes

    In RF3-Alpha, the two bottom-most shoreline reaches in an open water have the reach codes CU-SEG-MI and CU-SEG-(MI+0.01), respectively. The intent of this numbering scheme was to uniquely identify these two reaches as the most downstream reaches of the open water. During a subsequent effort to define update procedures for RF3, it was determined that this coding scheme made it difficult to sub-divide the most downstream, right-bank shoreline due to insufficient precision in the MI value to accommodate further subdivision. As part of the overall restructuring of the reach code, these shoreline reaches will be renumbered to more effectively accommodate subsequent sub-division.
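The subdivision problem can be seen with a small calculation. The sketch below assumes, as the "MI+0.01" convention suggests, that the MI value is held to two decimal places; the helper name and the sample values are hypothetical.

```python
from decimal import Decimal

STEP = Decimal("0.01")  # smallest MI increment implied by the "MI+0.01" convention

def midpoint_mi(lower, upper):
    """Return an MI value strictly between lower and upper at 0.01 precision.

    Returns None when no such value exists, i.e. the reach cannot be
    subdivided without adding precision to MI.
    """
    mid = ((lower + upper) / 2).quantize(STEP)
    return mid if lower < mid < upper else None

# Two shoreline reaches numbered MI and MI+0.01 leave no room for a new code.
NO_ROOM = midpoint_mi(Decimal("0.00"), Decimal("0.01"))

# Reaches spaced further apart can still be split.
HAS_ROOM = midpoint_mi(Decimal("0.00"), Decimal("1.50"))
```

The first call finds no available value, which is the situation the renumbering is meant to avoid.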

  3. Correct Networking at Divergences

    The complex hydrography associated with multiple divergences occasionally "confused" the RF3 compilation software resulting in the assignment of erroneous connectivity attributes. In a few instances, mainstem reaches point downstream to the divergent reach as BOTH the downstream reach AND the divergent reach. The true downstream reach points correctly back to the upstream reach. These connectivity errors will be repaired as part of RF3 validation.

  4. Correct Connectivity of Headwater Lakes

    Headwater lakes in RF3-Alpha do not always point to their downstream reach (the downstream reach points correctly to the headwater reach). These connectivity errors will be repaired as part of RF3 validation.
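The connectivity errors described in items 3 and 4 lend themselves to simple automated screening. The following is a minimal sketch under assumed field names ("downstream", "divergent", "upstream") and made-up reach identifiers; it is not the validation software and does not use the actual RF3-Alpha attribute names.

```python
# Hypothetical connectivity records keyed by reach identifier.
REACHES = {
    # Item 3 symptom: the same reach is named as both downstream and divergent.
    "A": {"downstream": "C", "divergent": "C", "upstream": []},
    # Item 4 symptom: a headwater lake with no pointer to its downstream reach.
    "B": {"downstream": None, "divergent": None, "upstream": []},
    "C": {"downstream": None, "divergent": None, "upstream": ["A"]},
    # D correctly points back up to lake B, but B does not point down to D.
    "D": {"downstream": None, "divergent": None, "upstream": ["B"]},
}

def duplicated_divergence(reaches):
    """Reaches whose divergent pointer repeats their downstream pointer."""
    return sorted(
        code for code, r in reaches.items()
        if r["divergent"] is not None and r["divergent"] == r["downstream"]
    )

def missing_downstream_pointer(reaches):
    """Pairs (upstream, downstream) where only the downstream side records the link."""
    flagged = []
    for code, r in reaches.items():
        for up in r["upstream"]:
            if reaches[up]["downstream"] != code:
                flagged.append((up, code))
    return sorted(flagged)
```

A check of this kind only flags candidates; deciding which pointer is correct still requires inspecting the linework.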

  5. Add Centerlines

    Many users interested in using RF3 for modeling would like it to include transport paths through open waters. A variety of "blind" and "user-assisted" centerlining algorithms are being considered for creating open water transport paths in the RF3-Final dataset.

  6. Correct Inter-CU Connections

    Inter-CU connectivity was built directly from RF2. An error was introduced in the connectivity attributes when the RF2 reach which pointed to the upstream CU was subdivided by new tributaries during RF3 compilation processing. These connectivity errors will be repaired as part of RF3 validation.

  7. Migrate Reaches and Isolated Networks From Incorrect CUs to Correct CUs

    The compilation of RF3-Alpha was performed primarily on a CU-basis using the 1:2,000,000 scale CU boundaries. The CU boundaries were used to "clip" a CU's worth of data from the 1:100,000 scale DLG hydrography. Due to the spatial accuracy of the boundaries and the hydrography, the "clipping" sometimes resulted in headwaters (both individual reaches and small isolated networks of reaches) being erroneously incorporated into an adjacent CU. By using the more current and accurate 1:250,000 scale CU boundaries and the topologic connectivity of the RF3-Alpha linework, these headwaters reaches will be migrated back into the correct CU. For isolated hydrographic features that are not hydrologically connected, a predominance rule will be used to assign them to the 1:250,000-scale CU they fall within.
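The source does not define the predominance rule in detail. One plausible reading, sketched below, assigns an isolated feature to the CU that contains the largest share of its length. The use of length as the measure, the tie-break, the function name, and the CU labels are all assumptions made for illustration.

```python
def assign_by_predominance(length_by_cu):
    """Return the CU holding the largest share of a feature's length.

    `length_by_cu` maps a CU label to the length of the feature inside that CU.
    Ties resolve to the lowest CU label (an arbitrary, assumed tie-break).
    """
    if not length_by_cu:
        raise ValueError("feature does not intersect any CU")
    return min(length_by_cu, key=lambda cu: (-length_by_cu[cu], cu))

# Hypothetical feature straddling two CU boundaries.
ASSIGNED = assign_by_predominance({"CU-A": 0.4, "CU-B": 1.7})
```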

  8. Expand Reach Code, Redelineate Reaches and Improve Navigation Attributes

    Since version 1, the Reach File has provided a unique, permanent feature identifier, known as the reach code, as one of its most important attributes. 100% of the reach codes that appeared in RF1 were retained in RF2 and approximately 83% of the reach codes that appeared in RF2 were retained in RF3. A rapid transition to higher resolution data is anticipated for RF3 in the near future. To accommodate the dramatic increase in numbers of features for both the near term and the long term, an expansion of the reach code is necessary.

    During RF3 compilation, it was discovered that the current RF3-Alpha reach code structure could not accommodate the level of detail present within the 1:100,000-scale hydrography. As a consequence, there are CUs in RF3-Alpha where reach code assignment rules were altered to circumvent the problem.

    After considerable review of RF3-Alpha, a variety of re-design proposals have surfaced that will affect the structure of the reach code, the definition of what constitutes a reach, and the content of the navigation attributes. It is expected that these combined considerations will require the re-numbering of many, if not all, RF3-Alpha reaches during validation processing. To support the translation of RF3-Alpha reach codes to RF3-Final reach codes, a reach code translation and update transaction table will be constructed and provided for existing RF3-Alpha users.

  9. Remove DLG "Neat" Lines

    During RF3 compilation, some of the quad map "neat" lines (map boundaries) from the original DLG data were inadvertently included. They will be removed as part of RF3 validation processing.

  10. Enhance Both Stream and Open Water Names

    A special GNIS I extract was used during RF3 compilation. During RF3 validation, the more complete and current GNIS II will be used to add names to both linear and open water features in RF3-Final. In addition, a variety of state reported name corrections and enhancements are being considered for incorporation into RF3-Final.

  11. Remove Zero Length Reaches

    During RF3 compilation, zero length reaches were introduced in order to maintain the hydrologic connectivity structure of "upright-upleft", i.e., a binary tree representation. RF3-Alpha fully supports only class 3 nodes, i.e., junctions of 3 reaches. When a class 4 or greater node was encountered during compilation, an artificial zero length reach was added to capture the connectivity attributes. It was later discovered that zero length reaches present a problem for ARC/INFO users. Within ARC/INFO, the CLEAN and/or BUILD commands delete zero length arcs, resulting in the loss of the associated connectivity attributes. The revised RF3 design is intended to accommodate nodes of any class. Therefore, the need for zero length reaches will be eliminated and so will the zero length reaches themselves.

  12. Alter Open Water Navigation Attributes

    The current open water navigation methods use the shorelines to "walk" up or down the shores, in hydrologic order. Depending on the ordering of tributaries entering the open water, this "walk" may switch back and forth between the left and right banks. By collecting the shoreline reaches in a table, it is also possible to circumnavigate the shorelines in a clockwise or counter-clockwise direction. The revised RF3 design is intended to accommodate a polygonal reach for each 2-dimensional hydrographic feature thus allowing navigation to more intuitively proceed from the input tributaries to the open water reach to the output tributaries. Shoreline reaches will still exist with connectivity attributes that are ordered solely for the purpose of circumnavigation.

  13. Correct Problems with Original DLG Data - Quad Overlays and Shifted Quads

    Some errors in the original DLG have been discovered by RF3-Alpha users. For example, one quad map in Ohio contains the hydrography from two quads. In addition, there is a shifted quad in Missouri. These DLG problems will be identified and fixed during the RF3 validation processing.

  14. Improvement of the Usability in a GIS

    The design of the Reach File Version 1 (RF1), in the late 1970's, did not anticipate the use of the file in Geographic Information Systems (GIS). The primary applications of the version 1 data involved linking diverse databases through feature identifiers supplied by the uniquely-assigned reach codes and "looking" upstream and downstream using the navigation attributes. The spatial data portion of the reach file was used primarily to create output maps which visually illustrated the geographic locations of the various tabular attributes that were tied to the reach file. RF2 and RF3-Alpha use the original version 1 design.

    Over the past several years, GIS technology has had rapid growth and vast improvements in functionality and reliability. The use of GIS for analysis of natural resources and for environmental management continues to expand. As one of the first national coverages of spatial data, the Reach File has been a natural candidate for use in GIS. The use of RF3-Alpha with GIS software has identified many design changes that would improve RF3's usability in GIS.

  15. Update Version and Source Control

    To prepare the Reach File for an environment where it will be updated feature-by-feature rather than replaced in its entirety, it is necessary to provide for update version and source control. This design change will include compliance with the FGDC Metadata Standard.

    *** Density Variations

    Quite often there are significant disparities in the density of stream coverage from one USGS DLG map to the next. These disparities were inherited by RF3-Alpha, which relied upon the DLG as its source for linework. These density variations have been reported to the USGS and are to be addressed as part of the normal USGS DLG revision process. During the interim period, a number of density aids are being considered to assist RF3 users in dealing with these variations in their RF3 applications.

Technical Description of RF3-Alpha

Summary Description

The basic building block of RF3-Alpha is the "reach", which is a surface water feature graphically represented by a line. A reach is defined as one of the following:

The reaches falling under the first two definitions are distinguished by being hydrologically connected with each other. They are often referred to as "transport" reaches and collectively as the "transport path." Transport reaches may be part of the major drainage networks within the US or of small isolated networks. By definition, the reaches in the third category are not hydrologically connected and represent miscellaneous hydrographic features.

Data Entities

RF3-Alpha is comprised of two entities -- a reach and a coordinate. A reach is a surface water feature as defined in the previous section. Belonging to each reach are one or more coordinates. If the reach is a point feature (i.e. a zero-length reach) then it has one and only one coordinate pair, otherwise a reach has a set of two or more coordinate pairs that define a line. If the reach is an isolated open body of water such as a lake with no inlets or outlets, the reach represents the entire shoreline of the waterbody and the first and last coordinates of its line are identical forming a closed polygon.

The master copy of RF3-Alpha is stored on EPA's IBM mainframe computer. On this host machine, the reach entities and their attributes are stored in a single table called the Reach Structure File (commonly referred to as the Structure File). The coordinate entities and their attributes are stored in a single table called the Latitude/Longitude File (commonly referred to as the LL File).


Reach Entity Attributes

The most important attributes of a reach are the Reach Code and the navigation attributes. The Reach Code provides each reach with a unique identifier which supports the linking of significant hydrologic data files to the Reach File and thus to each other. The reach navigation attributes provide the basis for hydrologic ordering and modeling by specifying the connectivity between reaches and the flow direction. Using the navigation attributes, it is possible to traverse the surface water network from upstream to downstream or downstream to upstream.

The Reach Code consists of three parts as follows:

Catalog Unit - an eight digit code uniquely assigned to a watershed and defined as a Federal Information Processing Standard (FIPS) maintained by the USGS. There are 2123 Catalog Units in the continental US plus Hawaii, Puerto Rico, the Panama Canal, and the US Virgin Islands. The data field name for the catalog unit is CU.

Segment - The segment number is a unique four digit number assigned to each new surface water feature within a given catalog unit. Segment numbers are assigned serially, starting at 0001, without regard for the hydrologic order of the segments. The data field name for the segment is SEG.

Marker Index - When a segment that exists in the Reach File is subsequently divided by a new tributary, the two pieces of the segment are each assigned a marker index. Their segment numbers remain the same, thus identifying them as once being a single reach. The new downstream piece receives a marker index of zero. The new upstream piece receives a marker index which is defined as the proration, or ratio, of the distance from the base of the reach segment to the point of sub-division, to the total length of the reach segment. Note that some Marker Indexes were assigned in RF2 when the spatial length of the reach was based on the sparse geometry from GNIS. DO NOT USE THE MARKER INDEX AS AN INDICATOR OF ACTUAL REACH LENGTH. The only valid use of the Marker Index is as a coding approach that enables the pieces within one Segment to be hydrologically ordered. The data field name for the marker index is MI.
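As an illustration of the three-part structure described above, the following Python sketch splits a 17-character reach code into its CU, SEG, and MI parts. The function name, the class name, and the sample value are hypothetical. The marker index is kept as raw text because its exact encoding is not specified in this section.

```python
from typing import NamedTuple


class ReachCode(NamedTuple):
    cu: str   # eight-digit catalog unit
    seg: str  # four-digit segment number
    mi: str   # five-character marker index, kept as raw text


def split_reach_code(code: str) -> ReachCode:
    """Split a 17-character reach code into its CU, SEG, and MI parts."""
    if len(code) != 17:
        raise ValueError(f"expected 17 characters, got {len(code)}")
    return ReachCode(cu=code[0:8], seg=code[8:12], mi=code[12:17])


# Hypothetical reach code, for illustration only.
parts = split_reach_code("01020304" + "0012" + "00000")
print(parts.cu, parts.seg, parts.mi)
```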

Reaches are "connected" to each other via the navigation attributes that are attached to each reach. If a given reach is designated as the "instant" reach, its navigation attributes (with their data field names in parentheses) are as follows:

Upstream Left Reach (ULCU, ULSEG, ULMI)
Upstream Right Reach (URCU, URSEG, URMI)
Downstream Reach (DSCU, DSSEG, DSMI)
Divergent Reach (DIVCU, DIVSEG, DIVMI)
Complement Reach (CCU, CSEG, CMI)

Figure 3 illustrates the navigation attributes for the instant reach. The arrows in Figure 3 indicate direction of flow.

Figure 3. Navigation Attributes for the instant reach

Figure 3
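The way these attributes support traversal can be sketched in Python. This is a minimal illustration, not the EPA navigation software: it assumes the reaches have already been loaded into a dictionary keyed by reach code, and the keys "ds", "ul", and "ur" are hypothetical stand-ins for the DSCU/DSSEG/DSMI, ULCU/ULSEG/ULMI, and URCU/URSEG/URMI attribute groups. Divergent and complement reaches are ignored for brevity.

```python
from typing import Dict, List, Optional

# Each reach record holds the reach codes of its neighbors (None when absent).
Reach = Dict[str, Optional[str]]


def walk_downstream(reaches: Dict[str, Reach], start: str) -> List[str]:
    """Follow the downstream pointer from `start` until the network ends."""
    path: List[str] = []
    seen = set()
    current: Optional[str] = start
    while current is not None and current in reaches and current not in seen:
        path.append(current)
        seen.add(current)
        current = reaches[current]["ds"]
    return path


def collect_upstream(reaches: Dict[str, Reach], start: str) -> List[str]:
    """Gather every reach upstream of `start` via the left/right pointers."""
    found: List[str] = []
    stack = [start]
    seen = {start}
    while stack:
        code = stack.pop()
        for key in ("ul", "ur"):
            nxt = reaches[code][key]
            if nxt is not None and nxt in reaches and nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                stack.append(nxt)
    return found


# Tiny made-up network: A and B join to form C, which flows into D.
network = {
    "A": {"ds": "C", "ul": None, "ur": None},
    "B": {"ds": "C", "ul": None, "ur": None},
    "C": {"ds": "D", "ul": "A", "ur": "B"},
    "D": {"ds": None, "ul": "C", "ur": None},
}
print(walk_downstream(network, "A"))
print(sorted(collect_upstream(network, "D")))
```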

The navigation attributes permit only two reaches to converge upstream of a given reach. At certain scales (1:100,000 as an example), there are often three or more reaches converging at the same apparent point. When this happens, one or more zero-length reaches will appear in the reach file at the point of confluence to accommodate the binary upstream structure of the navigation attributes. Specifically, one zero-length reach is added for each reach, beyond the first two, which enters a confluence. For example, if, as in Figure 4, four reaches A, B, C, and D form a confluence and discharge into reach E, two zero-length reaches Z1 and Z2 will be added. A and B will discharge into Z1. C and Z1 will discharge into Z2. D and Z2 will discharge into E.

Figure 4: Zero Length Reaches Z1 and Z2

Figure 4
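The rule of one zero-length reach for each inflow beyond the first two can be sketched as follows. This is an illustrative Python reconstruction of the rule as stated above, not the RF3 compilation software. The function name is hypothetical, the labels Z1, Z2, and so on simply follow the naming used in the example, and the order of the two inflows in each result does not indicate left or right.

```python
from typing import List, Tuple


def binary_confluence(inflows: List[str], outlet: str) -> List[Tuple[str, str, str]]:
    """Return (inflow, inflow, receiving reach) triples for one confluence.

    One zero-length reach (Z1, Z2, ...) is introduced for each inflow beyond
    the first two, so that no reach ever receives more than two inflows.
    """
    if len(inflows) < 2:
        raise ValueError("a confluence needs at least two inflowing reaches")
    extra = len(inflows) - 2  # number of zero-length reaches required
    triples: List[Tuple[str, str, str]] = []
    carried = inflows[0]
    for i, reach in enumerate(inflows[1:], start=1):
        receiver = f"Z{i}" if i <= extra else outlet
        # The first junction pairs the two original inflows; each later
        # junction pairs the next inflow with the previous zero-length reach.
        pair = (carried, reach) if i == 1 else (reach, carried)
        triples.append((pair[0], pair[1], receiver))
        carried = receiver
    return triples


for left, right, receiver in binary_confluence(["A", "B", "C", "D"], "E"):
    print(f"{left} and {right} discharge into {receiver}")
```

For the four-reach example in the text, the sketch yields the same three junctions: A and B into Z1, C and Z1 into Z2, and D and Z2 into E.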

While the Reach Code and the navigation attributes are the most important attributes in RF3-Alpha, the file has many other interesting and useful attributes. Appendix A presents the attributes associated with each reach. The attributes are stored in one physical table, the Structure File. Appendix B shows the position of each attribute within a record (table entry) in the Structure File.

Coordinate Entity Attributes

There are 93,000,000 coordinate entities stored in the RF3-Alpha LL File. Each coordinate entity has three attributes: a unique identifier, a latitude value and a longitude value. In RF3-Alpha, latitude and longitude are stored in decimal degrees to four decimal places. The unique identifier serves as a random access key for the table. The identifier is constructed from a three digit source code and a sequential number beginning at one for each coordinate entity that was obtained from the particular data source. For example, the DLG3 data, used to create much of RF3-Alpha, was obtained from USGS on 241 tapes. The coordinate entities from each tape were assigned identifiers with source codes equal to their original tape numbers (001 through 241). Additional data was subsequently obtained from California, Arizona and other sources. Each of these sources was assigned one or more unique source codes which were used to build their coordinate entity identifiers for the LL File.

In general, all the entries that represent one DLG3 line appear consecutively in the LL File. Prior to creation of the LL File, some DLG3 lines were concatenated. This occurred where two, and only two, DLG3 lines had end points within .0001 miles of each other and the lines contained the same DLG attributes. In general, the effect of this concatenation was to join surface water features that crossed quad map boundaries. Coordinate entities for lines that were concatenated appear together in the LL File. Appendix C contains a layout for the LL File.

Reach Entity - Coordinate Entity Relate Attributes

In the Structure File, there are ten pairs of keys that point to groups of coordinates in the LL File. These keys contain the coordinate entity identifiers (i.e. the unique keys) for the LL File table entries. The data field names of the ten keys in the Structure File are LL1KEY1/LL2KEY1, LL1KEY2/LL2KEY2, ..., and LL1KEY10/LL2KEY10, respectively. The pairs of keys point to coordinates that are ordered from downstream to upstream.

In general, each pair of keys represents one original DLG3 line. The only exception to this is where the original lines were concatenated, as described above, in which case a pair of keys will represent the resulting concatenated line.

Occasionally, the DLG3 lines that represented connected surface water features do not physically touch each other. These discontinuities were classified into two different cases during RF3 compilation. The first case is where the gap is less than or equal to 0.0003 miles and the second case is where the gap is greater than 0.0003 miles.

Gaps less than or equal to 0.0003 miles: The RF3 compilation software assumed a connection. These assumed connections may exist in the Reach File in two places: (1) within a reach and (2) between reaches. When a gap appears within a reach, the assumed connection will occur between two coordinate groups defined by two of the ten pairs of keys in the Structure File. These assumed-connection gaps can be detected by observing that the last coordinate for one pair of keys is different from the first coordinate of the next pair of keys. (When the lines physically touch in DLG3, the last coordinate of one pair of keys is equal to the first coordinate of the next pair.) The gap can be eliminated by simply drawing a connecting line between the coordinate groups that make up the trace for the reach. When the assumed-connection gap occurs between reaches, the gap can be eliminated by drawing a connecting line between the reaches, using the navigation attributes to identify the reaches that should touch each other.
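The within-reach test described above can be sketched in Python. The sketch assumes the coordinates for each pair of LL keys have already been retrieved into a list, ordered downstream to upstream. The function name and the sample coordinates are made up for illustration.

```python
from typing import List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def find_assumed_connections(groups: List[List[Coord]]) -> List[int]:
    """Return each index i where group i does not end where group i+1 begins."""
    gaps: List[int] = []
    for i in range(len(groups) - 1):
        here, following = groups[i], groups[i + 1]
        if here and following and here[-1] != following[0]:
            gaps.append(i)
    return gaps


# Made-up coordinate groups: the first two touch, the last two do not.
groups = [
    [(40.1000, 75.2000), (40.1010, 75.2010)],
    [(40.1010, 75.2010), (40.1020, 75.2020)],
    [(40.1021, 75.2020), (40.1030, 75.2030)],
]
print(find_assumed_connections(groups))
```

Exact comparison is used here on the assumption that both coordinates come from the same four-decimal stored values. Data that has been reprojected or recomputed would need a tolerance instead.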

Gaps Greater than 0.0003 miles: When these larger gaps occurred in the DLG3 data, the technician, based on judgment, had the option to manually indicate to the compilation software that a connection should exist. If the connection was within a reach, it appears in RF3-Alpha as an assumed connection and can be closed as described above. If the connection is between reaches, a new end point is added to one of the reaches to make it physically touch its connecting reach. If the end point is added to the upstream end of the reach, it is stored in the ULAT/ULON reach attributes in the Structure File. If the end point is added to the downstream end of the reach, it is stored in the DLAT/DLON reach attributes in the Structure File. The fact that a reach was extended during RF3 compilation can be detected by comparing the DLAT/DLON with the first coordinate of the first pair of LL keys and by comparing the ULAT/ULON to the last coordinate of the last pair of LL keys. If DLAT/DLON differ from the first coordinate of the first pair of LL keys, then the reach was extended at the downstream end. If ULAT/ULON differ from the last coordinate of the last pair of LL keys, then the reach was extended at the upstream end.
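The comparison described above reduces to two inequality tests. In this minimal Python sketch, the function and parameter names are hypothetical, and the coordinates are assumed to be already available as (latitude, longitude) tuples.

```python
from typing import Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def extended_ends(dlat_dlon: Coord, ulat_ulon: Coord,
                  first_ll: Coord, last_ll: Coord) -> Tuple[bool, bool]:
    """Return (downstream_extended, upstream_extended) for one reach.

    first_ll is the first coordinate of the first pair of LL keys and
    last_ll is the last coordinate of the last pair of LL keys.
    """
    return (dlat_dlon != first_ll, ulat_ulon != last_ll)


# Made-up values: only the upstream end differs from the LL File coordinates.
flags = extended_ends((40.1000, 75.2000), (40.2005, 75.3000),
                      (40.1000, 75.2000), (40.2000, 75.3000))
print(flags)
```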

To construct a coordinate trace for a given reach, the coordinates must be retrieved from the DLAT/DLON and ULAT/ULON attributes and from the LL File using the LL keys. The following process will provide a stream of coordinates ordered from downstream to upstream:

As described above, the resulting list of coordinates will usually contain duplicate coordinates. The duplicates will always be next to each other in the list. They may be easily removed as the list is being built or afterwards.
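A coordinate trace of this kind might be assembled as in the Python sketch below. The ordering used (DLAT/DLON first, then the coordinate groups for each pair of LL keys in sequence, then ULAT/ULON) is inferred from the preceding paragraphs and is an assumption, as are the function name and the sample data. Adjacent duplicates are dropped as the list is built.

```python
from typing import Iterable, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def build_trace(dlat_dlon: Coord, groups: Iterable[List[Coord]],
                ulat_ulon: Coord) -> List[Coord]:
    """Build a downstream-to-upstream trace with adjacent duplicates removed."""
    trace: List[Coord] = []

    def push(coord: Coord) -> None:
        # Duplicates are always adjacent, so checking the last entry suffices.
        if not trace or trace[-1] != coord:
            trace.append(coord)

    push(dlat_dlon)
    for group in groups:
        for coord in group:
            push(coord)
    push(ulat_ulon)
    return trace


# Made-up reach: two touching coordinate groups and an upstream extension.
trace = build_trace(
    (1.0, 1.0),
    [[(1.0, 1.0), (2.0, 2.0)], [(2.0, 2.0), (3.0, 3.0)]],
    (4.0, 4.0),
)
print(trace)
```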

RF3-Alpha Export Files

There are two ASCII export formats available for RF3-Alpha data: an ARC/INFO format and a custom format known as PCRF3. Each of these is discussed in the following sections.

ARC/INFO Export Files

The ARC/INFO format of the RF3-Alpha data is in a standard ARC/INFO ASCII export format and may be used to load RF3-Alpha data into ARC/INFO using the IMPORT command. Detailed information about processing RF3-Alpha data in ARC/INFO is provided in section 7.

In addition to the standard INFO files which accompany an ARC/INFO coverage, the following additional INFO files are present in an RF3-Alpha coverage:

File Name: <cover>.DS3

Item Column Name Width Type Decimals
1. 1 CU 8 Integer
2. 9 SEG 4 Integer
3. 13 MI 5 Character
4. 18 UPMI 5 Character
5. 23 RFLAG 1 Character
6. 24 OWFLAG 1 Character
7. 25 TFLAG 1 Character
8. 26 SFLAG 1 Character
9. 27 REACHTYPE 1 Character
10. 28 LEVEL 2 Integer
11. 30 JUNC 2 Integer
12. 32 DIVERGENCE 1 Integer
13. 33 USDIR 1 Character
14. 34 TERMID 5 Integer
15. 39 TRMBLV 1 Integer
16. 40 PNAME 30 Character
17. 70 PNMCD 11 Character
18. 81 CNAME 30 Character
19. 111 CNMCD 11 Character
20. 122 OWNAME 30 Character
21. 152 OWNMCD 11 Character
22. 163 DSCU 8 Integer
23. 171 DSSEG 4 Integer
24. 175 DSMI 5 Character
25. 180 CCU 8 Integer
26. 188 CSEG 4 Integer
27. 192 CMI 5 Character
28. 197 CDIR 1 Character
29. 198 ULCU 8 Integer
30. 206 ULSEG 4 Integer
31. 210 ULMI 5 Character
32. 215 URCU 8 Integer
33. 223 URSEG 4 Integer
34. 227 URMI 5 Character
35. 232 SEGL 6 Numeric 2
36. 238 RFORGFLAG 1 Integer
37. 239 ALTPNMCD 8 Integer
38. 247 ALTOWNMC 8 Integer
39. 255 DLAT 8 Numeric 4
40. 263 DLONG 8 Numeric 4
41. 271 ULAT 8 Numeric 4
42. 279 ULONG 8 Numeric 4
43. 287 MINLAT 8 Numeric 4
44. 295 MINLONG 8 Numeric 4
45. 303 MAXLAT 8 Numeric 4
46. 311 MAXLONG 8 Numeric 4
47. 323 LN1AT2 4 Integer
48. 327 LN2AT2 4 Integer
49. 331 AR1AT2 4 Integer
50. 335 AR1AT4 4 Integer
51. 339 AR2AT2 4 Integer
52. 343 AR2AT4 4 Integer
53. 347 UPDATE1 6 Character
54. 353 UPDTCD1 8 Character
55. 361 UPDTSRC1 8 Character
56. 369 UPDATE2 6 Character
57. 375 UPDTCD2 8 Character
58. 383 UPDTSRC2 8 Character
59. 391 UPDATE3 6 Character
60. 397 UPDTCD3 8 Character
61. 405 UPDTSRC3 8 Character
62. 413 DIVCU 8 Integer
63. 421 DIVSEG 4 Integer
64. 425 DIVMI 5 Character
65. 430 DLGID 6 Integer
66. 436 filler 7 Character
** Redefined Items **
67. 1 RF3RCHID 17 Character
68. 163 DSRF3RCHID 17 Character
69. 180 CURF3RCHID 17 Character
70. 198 ULRF3RCHID 17 Character
71. 215 URRF3RCHID 17 Character
72. 413 DIVRF3RCHID 17 Character

File Name: <cover>.TID

Item Column Name Width Type Decimals
1. 1 IDTIC 4 Binary
2. 5 XTIC 4 Float 3
3. 9 YTIC 4 Float 3
4. 13 NE_QD_ID 8 Character
5. 21 NE_QD_NAME 41 Character
6. 62 NW_QD_ID 8 Character
7. 70 NW_QD_NAME 41 Character
8. 111 SW_QD_ID 8 Character
9. 119 SW_QD_NAME 41 Character
10. 160 SE_QD_ID 8 Character
11. 168 SE_QD_NAME 41 Character

Generally, the .DS3 file contains one entry for each point, node, or line feature in the coverage. The order and names of the items in the .DS3 file are similar to the standard RF3-Alpha Structure File (see Appendix B). The .TID contains one entry for each corner of each 7.5 minute quad which touches the catalog unit. The 7.5 minute map names and IDs are provided in the fields of the table entry.
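The starting column and width given for each .DS3 item can be used to slice individual values out of a record. The Python sketch below assumes a record is available as a fixed-width text line laid out with the column positions listed above. Whether a given export provides the table in that form is an assumption, and the function name and sample record are hypothetical. Only a few items are included. The redefined items are handled the same way as ordinary items, since they simply overlay existing columns.

```python
from typing import Dict, Tuple

# (starting column, width) for a subset of .DS3 items; columns are 1-based.
ITEMS: Dict[str, Tuple[int, int]] = {
    "CU": (1, 8),
    "SEG": (9, 4),
    "MI": (13, 5),
    "DSCU": (163, 8),
    "DSSEG": (171, 4),
    "DSMI": (175, 5),
    "RF3RCHID": (1, 17),     # redefined item spanning CU + SEG + MI
    "DSRF3RCHID": (163, 17), # redefined item spanning DSCU + DSSEG + DSMI
}


def get_item(record: str, name: str) -> str:
    """Return the raw text of one item from a fixed-width record."""
    column, width = ITEMS[name]
    start = column - 1  # convert the 1-based column to a 0-based index
    return record[start:start + width]


# Made-up record: an instant reach, blank padding, then its downstream reach.
record = "01020304" + "0012" + "00000" + " " * 145 + "01020304" + "0011" + "00000"
print(get_item(record, "RF3RCHID"), "flows into", get_item(record, "DSRF3RCHID"))
```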

PCRF3 Export Files

PCRF3 is a software package used by the EPA 305(b) program to index 305(b) waterbodies to RF3-Alpha. PCRF3 requires a special format of the reach file that consists of a set of 6 files for each catalog unit. The files are named:

1. cccccccc.TRC
2. cccccccc.RF3
3. cccccccc.CUB
4. cccccccc.NUM
5. cccccccc.ALP
6. cccccccc.RFI

where cccccccc represents a valid catalog unit number.

The .TRC file contains the linework (i.e. latitude/longitude points for the reaches). For each reach, the file contains a header record followed by one or more coordinate records. The header record is in the following format:

Field Name Position
1. Reach code 1-17
2. Number of coordinate points 18-21

Each coordinate record contains a latitude/longitude pair (in decimal degrees) as follows:

Field Name Position
1. Latitude 1-8
2. Longitude 9-16
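A reader for this layout might look like the following Python sketch. It relies only on the field positions given above. The function name and the sample lines are made up, and no assumption is made about the sign convention of the longitude values.

```python
from typing import Dict, Iterable, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def parse_trc(lines: Iterable[str]) -> Dict[str, List[Coord]]:
    """Read .TRC records into a mapping of reach code to coordinate list."""
    reaches: Dict[str, List[Coord]] = {}
    records = iter(lines)
    for header in records:
        if not header.strip():
            continue  # tolerate blank lines
        code = header[0:17]        # positions 1-17: reach code
        count = int(header[17:21]) # positions 18-21: number of points
        coords: List[Coord] = []
        for _ in range(count):
            rec = next(records)
            # positions 1-8: latitude, positions 9-16: longitude
            coords.append((float(rec[0:8]), float(rec[8:16])))
        reaches[code] = coords
    return reaches


# Made-up sample: one reach with two coordinate records.
sample = [
    "01020304001200000   2",
    " 40.1234 75.4321",
    " 40.1300 75.4400",
]
print(parse_trc(sample))
```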

The .RF3 file contains the standard RF3-Alpha structure record (see Appendix B) with the following changes:

LL1KEY1 - contains the record number of the first coordinate record in the .TRC file for the reach.
LL2KEY1 - contains the record number of the last coordinate record in the .TRC file for the reach.
LL1KEY2 through 10 and LL2KEY2 through 10 - contain information that is only pertinent to the mainframe version of RF3-Alpha.

The .CUB file contains the 1:2,000,000 scale polygons that define the catalog unit boundary. The file consists of a header record containing the catalog unit number in positions 1-8. The remaining records contain latitude/longitude coordinate pairs, up to 4 pairs per record, in the following format:

l=ddmmss,l=dddmmss, l=ddmmss,l=dddmmss, l=ddmmss,l=dddmmss, l=ddmmss,l=dddmmss,
where l is alternately the latitude then longitude, dd is degrees, mm is minutes, ss is seconds
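Converting one of these degree-minute-second values to decimal degrees is a matter of simple arithmetic: degrees plus minutes/60 plus seconds/3600. The Python sketch below handles only the digit strings (ddmmss or dddmmss). It does not attempt to parse the full record layout, and the function name and sample values are hypothetical.

```python
def dms_to_decimal(value: str) -> float:
    """Convert a 'ddmmss' or 'dddmmss' digit string to decimal degrees."""
    digits = value.strip()
    if len(digits) not in (6, 7) or not digits.isdigit():
        raise ValueError(f"expected 6 or 7 digits, got {value!r}")
    seconds = int(digits[-2:])
    minutes = int(digits[-4:-2])
    degrees = int(digits[:-4])
    return degrees + minutes / 60 + seconds / 3600


# Made-up values: 40 deg 30 min 00 sec and 75 deg 15 min 00 sec.
print(dms_to_decimal("403000"), dms_to_decimal("0751500"))
```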

The .ALP, .NUM, and .RFI files are binary index files designed specifically for use with PCRF3 and only need to be retrieved from the mainframe if PCRF3 usage is planned.

Processing RF3-Alpha Data Using ARC/INFO

This discussion provides prospective users of RF3-Alpha data, derived from 1:100,000 Scale USGS DLG data, with guidelines about its use with ARC/INFO GIS software. Capitalized words in bold typeface refer to ARC/INFO commands. This discussion assumes that RF3-Alpha data are obtained using the ARC/INFO export program on EPA's IBM mainframe located in Research Triangle Park (RTP), NC. For more information about accessing the IBM mainframe and running this export program, call STORET User Assistance at 800-424-9067.

Creation of an ARC/INFO Coverage

RF3-Alpha data are output from the mainframe ARC/INFO export program as an uncompressed ARC/INFO export file in ASCII format. Each file contains data for one USGS hydrologic catalog unit (CU). Files are named using the eight digit USGS CU number. However, since filenames on the IBM must begin with an alphabetic character, the first digit is replaced by a letter: an "a" for CUs beginning with a zero ("0"), a "b" for those beginning with a one ("1"), and a "c" for those beginning with a two ("2"). For example, the file for CU# 01020304 would be named a1020304, and the file for CU# 11030004 would be named b1030004. All files end with an ".e00" extension.
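The naming rule can be expressed as a short Python helper. The function name is hypothetical. The mapping and the two checked examples come directly from the rule above.

```python
# The leading digit of the CU number is replaced by a letter.
PREFIX = {"0": "a", "1": "b", "2": "c"}


def export_filename(cu: str) -> str:
    """Return the export file name for an eight-digit catalog unit number."""
    if len(cu) != 8 or not cu.isdigit():
        raise ValueError(f"expected an eight-digit CU number, got {cu!r}")
    if cu[0] not in PREFIX:
        raise ValueError(f"no prefix letter is defined for leading digit {cu[0]!r}")
    return PREFIX[cu[0]] + cu[1:] + ".e00"


print(export_filename("01020304"))
print(export_filename("11030004"))
```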

Once these data have been transferred to a user's local workstation, the data need to be converted to an ARC/INFO coverage using the IMPORT command. After importing the RF3-Alpha data, each coverage will contain a <cover>.DS3 and <cover>.AAT file, among others, in the INFO directory. The .DS3 file can be related to the .AAT using the redefined item "RF3RCHID" as the relate item. Since the .DS3 file comes sorted on this item, the ORDERED option of the RELATE command will work best. If the .DS3 file is sorted on any item other than RF3RCHID, it should be sorted back on RF3RCHID before leaving INFO or TABLES.

The projection of the imported RF3-Alpha coverage will vary depending on the options used to generate the export file. The ARC/INFO export program on the IBM mainframe allows the user to pick from the original decimal seconds, standard Albers or UTM projections. Each time the projection is changed the RF3-Alpha coverage should be rebuilt with BUILD <cover> LINE to populate the LENGTH field in the .AAT with the correct value.

Once the RF3-Alpha coverage has been imported, running BUILD on the coverage with the NODE option will create a <cover>.NAT file, which will be needed to run ARC network tools (see below).

Another INFO data file created during the import process is <cover>.TID. This file can be related to the <cover>.TIC file using IDTIC as the relate item. The <cover>.TID file contains information about the 7.5 minute quadrangles that surround each tic (e.g. quad names and USGS quad id). Each tic in the RF3-Alpha data represents a 7.5 minute quad corner allowing users to register the RF3-Alpha data to USGS topographic maps.

Polygon Topology

There are a couple of important things to remember when working with RF3-Alpha data in ARC/INFO. Number one is DON'T CLEAN THE COVERAGE! The RF3-Alpha data model uses zero-length arcs to maintain attribute connectivity of complex confluences and divergences (refer to Section 5.3.1 - Reach Entity Attributes). Running CLEAN on these data will delete these features, thereby breaking attribute connectivity. If you must have polygon topology, make a separate COPY of the coverage and run CLEAN on it with the POLY option. Be careful to maintain this polygon coverage separately.

Network Traversal Through RF3-Alpha

There are two INFO items called UP (value = 1) and DOWN (value = 0) in the <cover>.AAT. These items can be used in the ARC IMPEDANCE command to set the impedance to restrict "flow" in a particular direction when using ARC network commands such as PATH, ALLOCATE, and TOUR. To restrict the network traversal to upstream only, use IMPEDANCE DOWN UP. To restrict to downstream traversal, use IMPEDANCE UP DOWN.

Another useful ARC network tool for traversing through the RF3-Alpha is the ARC TRACE command. Unlike the other ARC network commands, this one does not use the dynamic segmentation model to create routes, but instead simply creates selected sets of reaches. TRACE "expects" that the user is working with a directed network with arcs digitized in the direction of flow. However, RF3-Alpha data come in a downstream-to-upstream direction. Therefore, the syntax for an upstream trace using unaltered RF3-Alpha data is TRACE DOWNSTREAM, and for a downstream trace it is TRACE UPSTREAM. To avoid this confusion, it is recommended that users issue the FLIP command in ARCEDIT to have all arcs point downstream. This requires some special processing to correctly FLIP open water bodies, as described below.

Currently, all RF3-Alpha open water bodies, such as lakes and reservoirs (OWFLAG = "O"), are digitized in a counterclockwise direction. (See Appendix A for valid values for REACHTYPE.) Consequently, looking in the downstream direction, the left bank originally points downstream while the right bank points up. In order to accommodate this situation, the following ARC Macro Language (AML) routine has been written to flip all reaches to point downstream. The AML works by first selecting all reaches. From this selected set, the AML deselects left bank reaches and flips all remaining reaches. This leaves all reaches pointing downstream.

&args rf3cov
ec %rf3cov%
ef arc
relate add
[unquote ' ']
sel all
unsel ds3//owflag cn '1' and ds3//usdir = 'R'

After flipping reaches, the user should run BUILD on the RF3-Alpha coverage with the LINE option. This will rebuild the arc/node topology.

As a carryover from the un-paneled DLG3 data which was used to build RF3-Alpha, the RF3-Alpha data contain some nodes that should connect but do not. The RF3-Alpha attributes dealing with upstream-downstream connectivity show the reaches to be hydrologically connected, but the arc-node topology contains a gap. ARC network analysis will not jump these gaps. Running TRACE UPSTREAM from the downstream-most point in the network is a good way to locate these gaps. To correct them, the nodes can be snapped (or moved) in ARCEDIT.

Navigation Via RF3-Alpha Attributes

The related .DS3 file contains information on upstream/downstream connectivity of reaches. An alternative to the ARC network technique for traversing reaches would be to use these attributes. In ARC, CURSORS can be used to "fetch" the up- or downstream reach allowing traversal through the network without regard to arc-node topology. However, CURSORS processing can be much slower than ARC network processing.

ARC/INFO Limitations To Know

RF3-Alpha on the IBM mainframe is not limited to 500 vertices for each reach as is ARC/INFO. Therefore, reaches that contain more than 500 vertices will be split into multiple arcs when imported. Each of these arcs will be given the same attributes (reach code) in the .AAT file, but only one record will be created in the .DS3 file. If the RF3-Alpha attributes are being used to traverse a reach network and "split" reaches are encountered, the traversal will continue, but arcs will be skipped leaving "gaps" in the final traversed set of arcs.

RF3-Alpha Update ARC Macro Language (AML) Routines

A suite of RF3-Alpha update AMLs has been produced to allow the updating of reach names as well as simple topologic changes to the interim database such as splitting, adding, and deleting reaches. The AMLs use menus to guide the user through an ARCEDIT session with RF3-Alpha. They are designed to query for necessary variables and to code particular fields with appropriate information. They help to guide the updating process and to assure the updating is performed in a standardized and well documented manner. Because the RF3-Alpha data will undergo substantial structural changes before its final release, the present version of the Update AMLs will also require substantial revision to deal with these new data structures. Consequently, updates of RF3-Alpha data will not be accepted by EPA (see Section 3.2.3 for more information on RF3-Alpha updates). The user will instead have to reconcile any updates made on RF3-Alpha data with the RF3-Final dataset. However, these AMLs do provide a simple-to-use mechanism for making changes to RF3-Alpha data and allow users to familiarize themselves with the update process. They operate on older RF3-Alpha data containing a .DS2 file as well as the more recent data with .DS3 files. The AMLs run on SUN and Data General UNIX workstations and were developed under ARC/INFO version 6.1.

To obtain a copy of the update AMLs, contact STORET User Assistance as described on the inside front cover.

Appendix A

RF3-Alpha Attribute Descriptions

Reach codes - The Unique Feature Identifier

Reach codes are the unique identifiers of all reaches; they consist of 17 digits. As shown below, each number is constructed to include the USGS eight-digit catalog unit code, a four-digit segment number, and a five-place fixed decimal number referred to as the marker index.

Catalog Unit - an eight digit code uniquely assigned to a watershed and defined as a Federal Information Processing Standard (FIPS) maintained by the USGS. There are 2123 Catalog Units in the continental US plus Hawaii, Puerto Rico, the Panama Canal, and the US Virgin Islands. The data field name for the catalog unit is CU.

Segment - The segment number is a unique four digit number assigned to each new surface water feature within a given catalog unit. Segment numbers are assigned serially, starting at 0001, without regard for the hydrologic order of the segments. The data field name for the segment is SEG.

Marker Index - When a segment that exists in the Reach File is subsequently divided by a new tributary, the two pieces of the segment are each assigned a marker index. Their segment numbers remain the same, thus identifying them as once being a single reach. The new downstream piece receives a marker index of zero. The new upstream piece receives a marker index which is defined as the proration, or ratio, of the distance from the base of the reach segment to the point of sub-division, to the total length of the reach segment. Note that some Marker Indexes were assigned in RF2 when the spatial length of the reach was based on the sparse geometry from GNIS. DO NOT USE THE MARKER INDEX AS AN INDICATOR OF ACTUAL REACH LENGTH. The only valid use of the Marker Index is as a coding approach that enables the pieces within one Segment to be hydrologically ordered. The data field name for the marker index is MI.

In order to facilitate easy handling of updates, a fourth variable is used to identify the upstream point of the reach.

Upstream Marker Index - This number is the marker index associated with the most upstream end of the reach. The basis of this number is (in general) the distance measured from the start of the segment to the upstream end of the instant reach. For reaches derived from RF1 and RF2, this marker index is proportionally related to the reach lengths in the RF1 and RF2 spatial representations, respectively. The data field name for the upstream marker index is UPMI.

Data Source

RFORGFLAG records the version of the Reach File in which a given reach first occurred. A "1" denotes that the reach was first included in the RF1 version of the file. A "2" identifies those reaches added when the RF2 version was created and a "3" indicates this reach was added during the RF3 compilation.

Navigation Between Reaches

The navigation attributes in RF3-Alpha support two reaches converging and two reaches diverging, as illustrated in Figure A-1. The navigation attributes are associated with the "instant" reach. Points of convergence connect two input reaches with one output reach, points of divergence connect one or two input reaches with two output reaches, and points of simple connection connect one input reach to one output reach.

Figure A-1. Navigation attributes in RF3-Alpha support two reaches converging and two reaches diverging

Figure A-2 contains a list of the navigation attribute data elements.

Figure A-2. List of the navigation attribute data elements

DSCU, DSSEG, DSMI: The CU, SEG, and MI, respectively, of the reach downstream of the instant reach. The downstream reach is the reach into which the instant reach flows. If the downstream end of the instant reach is a divergent junction, the reach identified as the downstream reach is the major recipient of the outflow from the instant reach, and the reach identified as the divergent reach is the recipient of the minor portion of the outflow.
CCU, CSEG, CMILE: The CU, SEG, and MI, respectively, of the complement reach. The complement is the other reach that flows into the confluence at the downstream end of the instant reach.
CDIR: The complement direction. Looking downstream, this is the left or right (L or R) side of the instant reach to which the complement reach is connected.
ULCU, ULSEG, ULMI: The CU, SEG, and MI, respectively, of the reach branching to the left from the upstream end of the instant reach.
URCU, URSEG, URMI: The CU, SEG, and MI, respectively, of the reach branching to the right from the upstream end of the instant reach.
USDIR: Looking upstream, this attribute specifies which reach (up left or up right) continues the main network path.
DIVCU, DIVSEG, DIVMILE: The CU, SEG, and MI, respectively, of the divergent reach, which is the minor recipient of outflow from the instant reach.
DIVERGENCE: A one-digit divergence code which is assigned a value when there is a divergent junction at either end of a reach. A reach flowing into a divergent junction is referred to as the "input reach"; the reach which receives the majority of the flow from a divergent junction is the "major reach"; and the reach which receives the minority of the flow from a divergent junction is the "minor reach".

Value Definition
0 Not part of a divergent junction
1 Input reach where minor reach is downstream left
2 Input reach where minor reach is downstream right
3 Minor reach of a divergence
4 Combination of divergence values 1 and 3
5 Combination of divergence values 2 and 3
6 Major reach of a divergence
7 Combination of divergence values 1 and 6
8 Combination of divergence values 2 and 6
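The divergence code table above packs two facts into one digit: whether the instant reach is an input reach (and on which side its minor reach lies), and whether it is itself a major or minor reach of a divergence. A small lookup can separate the two. This is an illustrative sketch; the function name decode_divergence and the tuple layout are assumptions, not part of RF3-Alpha.

```python
# Maps each divergence code to (minor_side, branch_role):
#   minor_side  - "L"/"R" if the instant reach is an input reach whose
#                 minor reach is downstream left/right, else None
#   branch_role - "minor"/"major" if the instant reach is itself a minor
#                 or major reach of a divergence, else None
DIVERGENCE_TABLE = {
    0: (None, None),
    1: ("L", None),
    2: ("R", None),
    3: (None, "minor"),
    4: ("L", "minor"),   # combination of 1 and 3
    5: ("R", "minor"),   # combination of 2 and 3
    6: (None, "major"),
    7: ("L", "major"),   # combination of 1 and 6
    8: ("R", "major"),   # combination of 2 and 6
}

def decode_divergence(code):
    """Split a one-digit divergence code into its two components."""
    return DIVERGENCE_TABLE[int(code)]
```

For example, code 4 decodes to an input reach whose minor reach is downstream left and which is also the minor reach of another divergence.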

Stream Levels

Each reach is assigned a stream level which defines the hierarchical relationship between streams and tributaries in a given drainage network. A tributary to a given stream is always one level higher than the stream into which it flows. For instance, the Mississippi River is a level-one stream, the Ohio River is a level-two stream, and the Tennessee River is a level-three stream. Stream levels are useful in retrieval algorithms. A "level path" can be followed to identify all mainstem reaches for a given river. For instance, the mainstem of the Mississippi River can be readily identified by retrieving all level-one reaches upstream of the Mississippi River terminus.

Where RF1 and RF2 were successfully overlaid on the 1:100,000-scale DLG hydrography during RF3 compilation, the stream levels were inherited from RF1 and RF2. The RF1 stream levels were manually determined from source data during the creation of RF1. RF2 stream levels, by definition, were one level greater than those in RF1. For 1:100,000-scale DLG streams not successfully overlaid by RF1 or RF2 during RF3 compilation, stream levels were assigned based upon the assumption that the "straightest" path was the continuation of the current level. Where this approach did not adequately discriminate between the two upstream reaches, the rightmost reach was assumed to be the continuation of the current level.

LEV: The stream level of the instant reach.
JUNC: The junction code pertaining to the junction at the downstream end of the instant reach. In the case of a convergent junction, this is equal to the level of the reach downstream of the instant reach. For simple junctions (junctions with one stream flowing into the junction and one stream flowing out of the junction) and divergent junctions, this variable is assigned the value of zero.
TRMBLV: The terminal base level, that is, the level of the terminal reach of the stream system to which the instant reach belongs. Terminal reaches are assigned levels according to the manner in which the streams terminate, as follows:
TRMBLV Description
1 Stream outlet is Atlantic, Pacific, or Gulf of Mexico;
2 Stream outlet is one of the Great Lakes, or the Great Salt Lake;
3 Stream exits from US into Canada or Mexico;
4 Isolated drainage (flows into the ground).
Currently, TRMBLV is valued only for terminal reaches.
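The "level path" retrieval described in this section can be sketched as an upstream walk that, at each junction, continues along whichever upstream reach has the same stream level as the current reach. The in-memory layout below is hypothetical: reaches are held in a dict keyed by (CU, SEG, MI), the keys "UL" and "UR" stand in for the upstream-left and upstream-right pointers, and a missing upstream reach is represented as None, which may not match how the file encodes it.

```python
def level_path_upstream(reaches, start_key):
    """Walk upstream from start_key, following the upstream reach whose
    LEV equals the current reach's LEV. Returns the keys visited."""
    path = []
    seen = set()
    key = start_key
    while key in reaches and key not in seen:
        seen.add(key)          # guard against malformed (cyclic) pointers
        path.append(key)
        level = reaches[key]["LEV"]
        next_key = None
        for up in (reaches[key].get("UL"), reaches[key].get("UR")):
            # A tributary is one level higher; the mainstem keeps its level.
            if up in reaches and reaches[up]["LEV"] == level:
                next_key = up
                break
        key = next_key
    return path

# Invented three-reach network: a level-1 mainstem with a level-2 tributary.
A, B, C = ("01", 1, 0.0), ("01", 2, 0.0), ("01", 1, 5.0)
network = {
    A: {"LEV": 1, "UL": B, "UR": C},
    B: {"LEV": 2, "UL": None, "UR": None},
    C: {"LEV": 1, "UL": None, "UR": None},
}
mainstem = level_path_upstream(network, A)
```

In this toy network the walk starting at the downstream reach A skips the tributary B and returns A followed by C.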

Sequence Numbers

The Reach File is designed with the capability to perform routing of flow and other constituents. While there are many attributes in the Reach File that assist in routing, using the sequence numbers, stored in the attribute SEQNO, is the fastest and easiest method. When the Reach File is ordered by the sequence numbers, it is in upstream to downstream order. Sequence numbers were originally used in RF1.

SEQNO: Not currently valued.

Navigation-Related Attributes

Each reach has attributes which define the start and stop of its level path. For example, if the instant reach is a level 3, then these attributes provide the terminal CU-SEG and start CU-SEG of this level 3 path.

STARTCU: The catalog unit number of the start reach on the same level path as the instant reach.
STARTSEG: The segment number of the start reach on the same level path as the instant reach (listed as STRTSG in Appendix B).
STOPCU: The catalog unit number of the most downstream reach on the same level path as the instant reach.
STOPSEG: The segment number of the most downstream reach on the same level path as the instant reach (listed as STOPSG in Appendix B).

Using the STARTCU/STARTSEG values, it is possible to "jump" through several CUs, finding the next STARTCU/STARTSEG in each CU.

Reach Types

The term "reach type" refers to a one-character code which has been assigned to each reach. These type codes were generated, in part, from the DLG3 area and line attribute codes. Where DLG attribute codes appeared to be incorrectly assigned, temporary code changes were used to allow correct reach typing and to permit networking to be completed. For example, some lake shorelines had the DLG3 attribute code indicating that they were single line streams and, during compilation, the code would be temporarily changed to correctly indicate a lake shoreline. The original codes are stored in the DLG-3 code attributes.

The valid reach type codes are as follows:

A Artificial Lake (RF1/RF2) Reach
C Continental Coastline Reach
Refers to a reach which represents a coastline on the Atlantic, Pacific or Gulf coasts.
F Falls Reach
A reach which is either a waterfall, drop spillway, or a reach of rapids.
G Great Lakes Shoreline Reach
Refers to a reach which represents a coastline in the Great Lakes.
H Headwater Lake Reach
A headwater reach, identified as a lake, which has no reaches above it in the reach file. This type of reach has either one or two reaches connected to its downstream end.
I Island Shoreline Reach
Identifies a reach whose DLG3 attributes identified it as an island shoreline.
J Braided Stream Envelope
Stream reaches which are around the perimeter of an unnetworked braided stream system.
L Lake Shoreline Reach
A reach which follows the shoreline of a lake other than the Great Lakes.
N Isolated Stream Reach
A stream reach not having navigation links to other reaches.
O Apparent Limit Reach
A non-transport reach, usually designated by the DLG attributes as a marsh or wetlands.
P Indefinite or Intermittent Shoreline Reach
A non-transport reach, usually designated by the DLG attributes as a shoreline without definite boundaries.
Q Questionable Shoreline Reach
A reach which could be either an island or another closed area.
R Regular Reach
A reach which has upstream and downstream reaches connected to it and which is not classified as another type of reach.
S Start Reach
A headwater reach which has no reaches above it in the reach file. This type of reach has either one or two reaches connected to its downstream end.
T Terminal Reach
A reach downstream of which there is no other reach (for example, a reach which terminates into an ocean, a land-locked lake, or the ground). This type of reach has either one or two reaches connected to its upstream end.
U Unknown Reach
Reach cannot be classified.
V Open Water Terminal Reach
A reach which is both a terminal reach and an artificial open water reach.
W Wide-River Shoreline Reach
A reach which identifies either the Right or Left bank of a wide river.
X Terminal Start Reach
A reach which is both a terminal reach and a start reach.
Z Terminal Entry Reach
A reach which is both a terminal reach and an entry reach.

Routing Flags

Each reach has four (4) one-character flag attributes which can be used in various routing schemes. The flags are given values of 0 or 1, corresponding to "no" or "yes", respectively. Only connected reaches can have routing flag values of 1. The four flags are defined below:

RFLAG - Routing Flag for transport reaches:
0 = Reach is not a connected reach.
1 = Reach is a connected reach.
OWFLAG - Open Water Flag (valid only if RFLAG = 1, i.e., not set for isolated lakes):
0 = Reach is not in a lake, reservoir, wide river, bay, or other open water.
1 = Reach is part of an open water (types A, H, L, M, V, or W).
TFLAG - Terminal Flag:
0 = Reach is not a terminal reach.
1 = Reach is terminal (types T, V, X, or Z).
SFLAG - Start Flag:
0 = Reach is not a start reach.
1 = Reach is a start reach (types E, H, S, X, or Z).
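The flag definitions above tie each flag to a set of reach type codes, which suggests a simple consistency check: given a reach type and RFLAG, compute the OWFLAG, TFLAG, and SFLAG values one would expect. The sketch below assumes the type lists in the definitions are exhaustive and that all three flags are 0 whenever RFLAG is 0; the function name expected_flags is hypothetical.

```python
# Reach type sets copied from the flag definitions in this section.
OPEN_WATER_TYPES = set("AHLMVW")
TERMINAL_TYPES = set("TVXZ")
START_TYPES = set("EHSXZ")

def expected_flags(reach_type, rflag):
    """Return the expected (OWFLAG, TFLAG, SFLAG) as '0'/'1' characters.

    Only connected reaches (RFLAG == '1') can carry a flag value of '1'.
    """
    if rflag != "1":
        return ("0", "0", "0")

    def flag(types):
        return "1" if reach_type in types else "0"

    return (flag(OPEN_WATER_TYPES), flag(TERMINAL_TYPES), flag(START_TYPES))
```

Comparing these expected values against the stored flags is one way to spot records worth a closer look; a mismatch is not necessarily an error in the data.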

Reach Length

The length of the instant reach measured to the nearest one-hundredth (0.01) of a mile is stored in the attribute SEGL. This length was calculated from all of the latitude/longitude coordinate values for the entire length of the reach.

NOTE: The difference between MI and UPMI will NOT always equal SEGL. MI/UPMI are proportionally related indexes based upon mileage calculations from RF1/RF2 and the DLG. For all mileage calculations, SEGL should be used.
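The source does not state which distance formula was used to compute SEGL. As an illustration only, the sketch below sums great-circle (haversine) distances between consecutive latitude/longitude points of a trace and rounds to 0.01 mile. The Earth radius constant and both function names are assumptions, and results should not be expected to reproduce stored SEGL values exactly.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius, not from the source

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def trace_length_miles(points):
    """Sum the distances between consecutive (lat, lon) points of a trace,
    rounded to the nearest one-hundredth of a mile."""
    total = sum(haversine_miles(*p, *q) for p, q in zip(points, points[1:]))
    return round(total, 2)
```

A trace of one degree of longitude along the equator, for instance, comes out at roughly 69.09 miles with this radius.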

Names and Name Codes

The name, or names, associated with a given reach may have a maximum length of thirty (30) characters. Pseudo-names are included where reach names could not be identified when data was being compiled for RF1 and RF2. Pseudo-names in all cases consist of an asterisk followed by a single letter, e.g. "*A". Each reach name has an eleven (11) digit name code associated with it in order to uniquely identify the surface water represented by the reach.

Names that were assigned to the original RF1 reaches are stored in upper case letters. These names were developed manually from the source maps used to compile RF1. Names that were assigned to reaches that originated in RF2 and RF3-Alpha are stored in upper and lower case letters. These names came from a blind conflation of the 1988 version of the Geographic Names Information System (GNIS) data onto the RF3-Alpha data.

Up to three different names and name codes may be associated with a given reach:

PNAME, PNMCD: These attributes are the primary name and name code, respectively, associated with the reach. For connected reaches, including open water reaches, this is a stream name. Each connected shoreline reach around an open waterbody will bear the name of the stream which feeds that open waterbody. All reaches of a given stream have been assigned the same name and the same primary name code. Other streams having that same name will have different primary name codes. For example, a Back Creek in Virginia would have a different PNMCD from any other Back Creek in Virginia or any other state.
CNAME, CNMCD: These attributes are the common name and name code, respectively, which are reserved for the storage of alternate names.
OWNAME, OWNMCD: These attributes are the open water name and name code, respectively. For open water reaches (OWFLAG = 1), this is the name of the lake or wide river in which the reach resides.

TERMID: Terminal Stream System Identifier

Each reach within a given terminal stream system is assigned a 5-digit code, unique to that terminal system. This code can be used to readily identify all reaches in a given stream system.

Currently, TERMID is not valued.

Latitude/Longitude Coordinates

There are numerous attributes in the Structure File that contain latitude/longitude coordinates. These attributes can be used for certain types of geographically-based retrieval and analysis, without the need to access the Coordinate ("LL") File. All latitude/longitude data are given in decimal degrees to the nearest 0.0001 degree.

Large-scale areal retrievals can be performed using the minimum and maximum latitude/longitude pairs. These coordinates define the smallest north-south/east-west rectangle containing the reach.

MINLAT, MINLONG, MAXLAT, MAXLONG: North-south box window coordinates. The smallest and largest latitudes and longitudes found in the digitized trace of the instant reach.
DLAT, DLONG, ULAT, ULONG: Coordinate points for the downstream and upstream end points of the reach. A "skeleton" trace can be generated using these coordinates.
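An areal retrieval of the kind described above reduces to a rectangle-overlap test between a query window and each reach's bounding box, using the MINLAT, MINLONG, MAXLAT, and MAXLONG fields listed in Appendix B. The sketch below assumes each reach is a dict holding those four fields as numbers and that the query window uses the same longitude sign convention as the file; the function name and sample values are hypothetical.

```python
def box_overlaps(reach, south, west, north, east):
    """True if the reach's bounding box intersects the query window.

    south/north are the window's min/max latitude and west/east its
    min/max longitude, in the same convention as the reach fields.
    """
    return (
        reach["MINLAT"] <= north
        and reach["MAXLAT"] >= south
        and reach["MINLONG"] <= east
        and reach["MAXLONG"] >= west
    )

# Invented sample reach bounding box.
sample = {"MINLAT": 38.10, "MAXLAT": 38.25, "MINLONG": 77.40, "MAXLONG": 77.55}
inside = box_overlaps(sample, 38.0, 77.0, 39.0, 78.0)
outside = box_overlaps(sample, 40.0, 77.0, 41.0, 78.0)
```

Because the test uses only the bounding box, a reach can pass it without any of its coordinates falling inside the window; an exact answer requires the Coordinate File.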

Update Codes

Nine attributes are available to track updates to RF3-Alpha. Three sets of three variables allow for the historical tracking of the updates:

UPDATE: Date of the update.
UPDTCD: Unique code for a set of updates.
UPDTSRC: Code identifying the party performing the updates.

"1" series:
Set during the initial production. UPDTSRC equals "DLG100K" except for California where the first two digits are equal to the Teale Data Center dataset block number ("01" through "33") plus the six digit California-assigned line identifier (different from the DLG line id in their datasets). UPDATE equals the date when PCRF3 compilation was completed. UPDTCD equals "PCRF8RF3."
"2" and "3" series:
Set during future update procedures. Refer to "River Reach File (RF3-Alpha) Update and Quality Control Standards, Procedures and Management," internal EPA draft document, for more detailed instructions and descriptions on updates.

USGS DLG Attributes

Many of the attribute codes associated with the original DLG data were included in the Structure File. DLG3 major codes were not included, since only hydrographic features (major code 050) were included in RF3-Alpha. Two pairs of area attributes and two line attributes were kept.

LN1AT2, LN2AT2: Line attribute 1 and 2.
AREA1, AREA2: Area identifier 1 and 2.
AR1AT2, AR1AT4: First pair of area attributes.
AR2AT2, AR2AT4: Second pair of area attributes.
DLGID: This field is blank except for Arizona, California and Hawaii where it is equal to the DLG line id from their enhanced DLG files.

Appendix B

Structure Record Layout

Var Name Data Type Length Positions Description
CU numeric 8 1 - 8 Cataloging Unit
SEG numeric 4 9 - 12 Segment Number
MI numeric 5.2 13 - 17 Marker Index
UPMI numeric 5.2 18 - 22 Upstream Marker Index
SEQNO numeric 11.6 23 - 33 Hydro Sequence No.[future use]
RFLAG character 1 34 - 34 Reach Flag (0,1)
OWFLAG character 1 35 - 35 Open Water Flag(0,1)
TFLAG character 1 36 - 36 Terminal Flag (0,1)
SFLAG character 1 37 - 37 Start Flag (0,1)
REACHTYPE character 1 38 - 38 Reach Type Code
LEV numeric 2 39 - 40 Stream Level
JUNC numeric 2 41 - 42 Level of Downstream Reach
DIVERGENCE numeric 1 43 - 43 Divergence Code
STARTCU numeric 8 44 - 51 Start CU [future use]
STRTSG numeric 4 52 - 55 Start SEG [future use]
STOPCU numeric 8 56 - 63 Stop CU [future use]
STOPSG numeric 4 64 - 67 Stop SEG [future use]
USDIR character 1 68 - 68 Upstream Direction of main path
TERMID numeric 5 69 - 73 Terminal Stream ID [future use]
TRMBLV numeric 1 74 - 74 Terminal Base Level [future use]
PNAME character 30 75 - 104 Primary Name
PNMCD numeric 11 105 - 115 Primary Name Code
CNAME character 30 116 - 145 Common Name
CNMCD numeric 11 146 - 156 Common Name Code
OWNAME character 30 157 - 186 Open Water Name
OWNMCD numeric 11 187 - 197 Open Water Name Code
DSCU numeric 8 198 - 205 Downstream CU
DSSEG numeric 4 206 - 209 Downstream SEG
DSMI numeric 5.2 210 - 214 Downstream MI
CCU numeric 8 215 - 222 Complement CU
CSEG numeric 4 223 - 226 Complement SEG
CMILE numeric 5.2 227 - 231 Complement MI
CDIR character 1 232 - 232 Complement Direction
ULCU numeric 8 233 - 240 Upstream Left CU
ULSEG numeric 4 241 - 244 Upstream Left SEG
ULMI numeric 5.2 245 - 249 Upstream Left MI
URCU numeric 8 250 - 257 Upstream Right CU
URSEG numeric 4 258 - 261 Upstream Right SEG
URMI numeric 5.2 262 - 266 Upstream Right MI
SEGL numeric 6.2 267 - 272 Reach Length (Miles)
RFORGFLAG character 1 273 - 273 RF Origin flag(1,2,3)
ALTPNMCD numeric 8 274 - 281 Alt. Primary Name Code [future use]
ALTOWNMC numeric 8 282 - 289 Alt. OW Name Code [future use]
DLAT numeric 8.4 290 - 297 Downstream Latitude
DLONG numeric 8.4 298 - 305 Downstream Longitude
ULAT numeric 8.4 306 - 313 Upstream Latitude
ULONG numeric 8.4 314 - 321 Upstream Longitude
MINLAT numeric 8.4 322 - 329 Minimum Latitude
MINLONG numeric 8.4 330 - 337 Minimum Longitude
MAXLAT numeric 8.4 338 - 345 Maximum Latitude
MAXLONG numeric 8.4 346 - 353 Maximum Longitude
NDLGREC numeric 4 354 - 357 No. of DLG Records
LL1KEY1 numeric 10 358 - 367 Starting DLG LL Key 1
LL2KEY1 numeric 10 368 - 377 Ending DLG LL Key 1
LL1KEY2 numeric 10 378 - 387 Starting DLG LL Key 2
LL2KEY2 numeric 10 388 - 397 Ending DLG LL Key 2
LL1KEY3 numeric 10 398 - 407 Starting DLG LL Key 3
LL2KEY3 numeric 10 408 - 417 Ending DLG LL Key 3
LL1KEY4 numeric 10 418 - 427 Starting DLG LL Key 4
LL2KEY4 numeric 10 428 - 437 Ending DLG LL Key 4
LL1KEY5 numeric 10 438 - 447 Starting DLG LL Key 5
LL2KEY5 numeric 10 448 - 457 Ending DLG LL Key 5
LL1KEY6 numeric 10 458 - 467 Starting DLG LL Key 6
LL2KEY6 numeric 10 468 - 477 Ending DLG LL Key 6
LL1KEY7 numeric 10 478 - 487 Starting DLG LL Key 7
LL2KEY7 numeric 10 488 - 497 Ending DLG LL Key 7
LL1KEY8 numeric 10 498 - 507 Starting DLG LL Key 8
LL2KEY8 numeric 10 508 - 517 Ending DLG LL Key 8
LL1KEY9 numeric 10 518 - 527 Starting DLG LL Key 9
LL2KEY9 numeric 10 528 - 537 Ending DLG LL Key 9
LL1KEY10 numeric 10 538 - 547 Starting DLG LL Key 10
LL2KEY10 numeric 10 548 - 557 Ending DLG LL Key 10
LN1AT2 numeric 4 558 - 561 DLG Line Attribute 1
LN2AT2 numeric 4 562 - 565 DLG Line Attribute 2
AREA1 numeric 4 566 - 569 DLG Area ID 1
AREA2 numeric 4 570 - 573 DLG Area ID 2
AR1AT2 numeric 4 574 - 577 DLG Area attribute
AR1AT4 numeric 4 578 - 581 DLG Area attribute
AR2AT2 numeric 4 582 - 585 DLG Area attribute
AR2AT4 numeric 4 586 - 589 DLG Area attribute
UPDATE1 character 6 590 - 595 Update Date #1(mmddyy)
UPDTCD1 character 8 596 - 603 Update type Code #1
UPDTSRC1 character 8 604 - 611 Update Source #1
UPDATE2 character 6 612 - 617 Update Date #2(mmddyy) [future use]
UPDTCD2 character 8 618 - 625 Update Type Code #2 [future use]
UPDTSRC2 character 8 626 - 633 Update Source #2 [future use]
UPDATE3 character 6 634 - 639 Update Date #3(mmddyy) [future use]
UPDTCD3 character 8 640 - 647 Update Type Code #3 [future use]
UPDTSRC3 character 8 648 - 655 Update Source #3 [future use]
DIVCU numeric 8 656 - 663 Divergent CU
DIVSEG numeric 4 664 - 667 Divergent SEG
DIVMILE numeric 5.2 668 - 672 Divergent MI
DLGID numeric 6 673 - 678 DLG number (special use)
filler character 7 679 - 685 Filler [future use]
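Because the structure record is fixed-width, a record can be split by slicing on the positions in the layout above. The sketch below handles only a subset of fields and makes several assumptions: records are 685-character text lines, positions are 1-based and inclusive, and values are returned as stripped strings without numeric conversion. The function name and the sample values are hypothetical.

```python
# (name, start, stop): 1-based inclusive positions from the layout table.
FIELDS = [
    ("CU", 1, 8), ("SEG", 9, 12), ("MI", 13, 17), ("UPMI", 18, 22),
    ("RFLAG", 34, 34), ("REACHTYPE", 38, 38), ("LEV", 39, 40),
    ("PNAME", 75, 104), ("SEGL", 267, 272),
]

def parse_structure_record(line):
    """Slice one fixed-width structure record into a dict of strings."""
    return {name: line[start - 1:stop].strip() for name, start, stop in FIELDS}

# Build an invented 685-character record for demonstration.
record = [" "] * 685
for start, text in [(1, "02070008"), (9, "0012"), (13, " 2.25"),
                    (38, "R"), (75, "Back Creek"), (267, "  3.41")]:
    record[start - 1:start - 1 + len(text)] = text
parsed = parse_structure_record("".join(record))
```

Extending FIELDS with the remaining rows of the table follows the same pattern.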

Appendix C

Coordinate Record Layout

Var Name Data Type Length Start/Stop Description
1. LLKEY character 4 1-4 Unique coordinate key. Single-precision binary representation of a 10-digit integer, stored in 4 bytes.
2. LATITUDE character 3 5-7 Latitude in decimal degrees to the nearest 1/10,000th of a degree. The value is converted to a 6-digit integer in a single-precision binary representation of 4 bytes. Byte 1 is discarded and bytes 2 through 4 are stored in the 3-byte character field.
3. LONGITUDE character 3 8-10 Longitude in decimal degrees to the nearest 1/10,000th of a degree. The value is converted to a 7-digit integer in a single-precision binary representation of 4 bytes. Byte 1 is discarded and bytes 2 through 4 are stored in the 3-byte character field.
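The packed coordinate record above can be unpacked by restoring the discarded leading byte and dividing by 10,000. The layout does not specify byte order or sign handling, so the sketch below assumes big-endian unsigned integers, with the discarded "byte 1" being the most significant byte; both assumptions would need to be checked against real data. The function name is hypothetical.

```python
def decode_coordinate_record(rec):
    """Decode one 10-byte coordinate record into (llkey, latitude, longitude).

    Assumes big-endian, unsigned integers (not confirmed by the layout).
    """
    if len(rec) != 10:
        raise ValueError("coordinate record must be exactly 10 bytes")
    llkey = int.from_bytes(rec[0:4], "big")
    # The 3-byte fields hold the low-order bytes of a 4-byte integer whose
    # leading byte was dropped, so they decode directly as 24-bit integers.
    latitude = int.from_bytes(rec[4:7], "big") / 10000
    longitude = int.from_bytes(rec[7:10], "big") / 10000
    return llkey, latitude, longitude

# Invented sample record built under the same assumptions.
sample_record = (
    (1234567890).to_bytes(4, "big")
    + (381234).to_bytes(3, "big")
    + (1221234).to_bytes(3, "big")
)
decoded = decode_coordinate_record(sample_record)
```

Three bytes can hold values up to 16,777,215, which is enough for a 7-digit longitude in ten-thousandths of a degree.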
