Note: EPA no longer updates this information, but it may be useful as a reference or resource.
Summary of National Dialogue Session With EPA's GIS Workgroup
New York, New York
May 13, 2008
One component of EPA’s National Dialogue on Access to Environmental Information is a series of stakeholder listening sessions to elicit input on the types of environmental information that EPA’s stakeholders use, how they use it, and their preferred formats, channels, and venues for receiving this information. A summary of the findings from a dialogue listening session with the EPA Geographic Information Systems (GIS) Workgroup is presented below. This session was part of a broader discussion on the potential future of Federated GIS at the Agency, and the discussion is relevant both to that topic specifically and to the National Dialogue more generally.
Types of Geospatial Information that Participants Use
The EPA GIS Workgroup includes members from all program and regional offices across the Agency, as well as a number of external partners and contractors that work as a group to advance our shared ability to incorporate place-based approaches into a wide range of business processes. One of the main concepts discussed during this session was the role of EPA as a significant consumer of geospatial data and services that are offered by other organizations, and a relatively minor producer of geospatial information that is made available for use by our partners. Much of the data required by this group is defined in OMB Circular A-16 as “framework” geospatial data. The framework layers that are most widely used throughout EPA include:
Biological Resources: This dataset includes data pertaining to or descriptive of (nonhuman) biological resources and their distributions and habitats, including data at the suborganismal (genetics, physiology, anatomy, etc.), organismal (subspecies, species, systematics), and ecological (populations, communities, ecosystems, biomes, etc.) levels.
Cadastral: Cadastral data describe the geographic extent of past, current, and future right, title, and interest in real property, and the framework to support the description of that geographic extent. The geographic extent includes survey and description frameworks such as the Public Land Survey System, as well as parcel-by-parcel surveys and descriptions.
Climate: Climate data describe the spatial and temporal characteristics of the Earth's atmosphere/hydrosphere/land surface system. These data represent both model-generated and observed (either in situ or remotely sensed) environmental information, which can be summarized to describe surface, near surface and atmospheric conditions over a range of scales.
Cultural and Demographic Statistics: These geospatially referenced data describe the characteristics of people, the nature of the structures in which they live and work, the economic and other activities they pursue, the facilities they use to support their health, recreational and other needs, the environmental consequences of their presence, and the boundaries, names and numeric codes of geographic entities used to report the information collected.
Digital Ortho Imagery: This dataset contains georeferenced images of the Earth's surface, collected by a sensor in which image object displacement has been removed for sensor distortions and orientation, and terrain relief. For very large surface areas, an Earth curvature correction may be applied. Digital orthoimages encode the optical electromagnetic spectrum as discrete values modeled in an array of georeferenced pixels. Digital orthoimages have the geometric characteristics of a map, and image qualities of a photograph.
Earth Cover: The Earth Cover theme uses a hierarchical classification system based on observable form and structure, as opposed to function or use. This system transitions from generalized to more specific and detailed class divisions, and provides a framework within which multiple land cover and land use classification systems can be cross-referenced. This system is applicable everywhere on the surface of the Earth. This theme differs from the Vegetation and Wetlands themes, which provide additional detail.
Elevation Terrestrial: These data contain georeferenced digital representations of terrestrial surfaces, natural or manmade, which describe vertical position above or below a datum surface. Data may be encapsulated in an evenly spaced grid (raster form) or randomly spaced (triangular irregular network, hypsography, single points). The elevation points can have varying horizontal and vertical resolution and accuracy.
Federal Land Ownership Status: Federal land ownership status includes the establishment and maintenance of a system for the storage and dissemination of information describing all title, estate or interest of the federal government in a parcel of real and mineral property. The ownership status system is the portrayal of title for all such federal estates or interests in land.
Flood Hazards: The National Flood Insurance Program has prepared flood hazard data for approximately 18,000 communities. The primary information prepared for these communities is for the 1 percent annual chance (100-year) flood, and includes documentation of the boundaries and elevations of that flood.
Geodetic Control: Geodetic control provides a common reference system for establishing coordinates for all geographic data. All NSDI framework data and users' applications data require geodetic control to accurately register spatial data. The National Spatial Reference System is the fundamental geodetic control for the United States.
Geographic Names: This dataset contains data or information on geographic place names deemed official for federal use by the U.S. Board on Geographic Names pursuant to Public Law 80-242. Geographic Names information includes both the official place name (current, historical, and aliases) and locative geospatial identifiers, both direct (i.e., geographic coordinates) and indirect (i.e., State and County where the place is located). Names are categorized by feature type, such as populated places, schools, reservoirs, parks, streams, valleys, and ridges.
Geologic: The geologic spatial data theme includes all geologic mapping information and related geoscience spatial data (including associated geophysical, geochemical, geochronologic, and paleontologic data) that can contribute to the National Geologic Map Database pursuant to Public Law 106-148.
Governmental Units: These data describe, by a consistent set of rules and semantic definitions, the official boundary of federal, state, local, and tribal governments as reported/certified to the U.S. Census Bureau by responsible officials of each government for purposes of reporting the Nation's official statistics.
Hydrography: This data theme includes surface water features such as lakes, ponds, streams and rivers, canals, oceans, and coastlines. Each hydrography feature is assigned a permanent feature identification code (Environmental Protection Agency Reach Code) and may also be identified by a feature name. Spatial positions of features are encoded as centerlines and polygons. Network connectivity and direction of flow are also encoded.
Shoreline: Shorelines represent the intersection of the land with the water surface. The shoreline shown on NOAA Charts represents the line of contact between the land and a selected water elevation. In areas affected by tidal fluctuations, this line of contact is the mean high water line.
Soils: Soil data consist of georeferenced digital map data and associated tabular attribute data. The map data describe the spatial distribution of the various soils that cover the Earth's surface. The attribute data describe the proportionate extent of the various soils as well as the physical and chemical characteristics of those soils. The physical and chemical properties are based on observed and measured values, as well as model-generated values. Also included are model-generated assessments of the suitability or limitations of the soils to various land uses.
Transportation: Transportation data are used to model the geographic locations, interconnectedness, and characteristics of the transportation system within the United States. The transportation system includes both physical and non-physical components representing all modes of travel that allow the movement of goods and people between locations.
Vegetation: Vegetation data describe a collection of plants or plant communities with distinguishable characteristics that occupy an area of interest. Existing vegetation covers or is visible at or above the land or water surface and does not include abiotic factors that tend to describe potential vegetation.
Watershed Boundaries: This data theme encodes hydrologic watershed boundaries into topographically defined sets of drainage areas, organized in a nested hierarchy by size, and based on a standard hydrologic unit coding system.
Wetlands: The wetlands data layer provides the classification, location, and extent of wetlands and deepwater habitats. There is no attempt to define the proprietary limits or jurisdictional wetland boundaries of any federal, state, or local agencies.
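The nested hierarchy mentioned under Watershed Boundaries can be illustrated with a minimal sketch. It assumes the common convention in which each level of a hydrologic unit code (HUC) adds two digits to its parent, so every shorter prefix identifies an enclosing drainage area. The level names, the function name huc_ancestry, and the sample code are illustrative assumptions and are not taken from the session.

```python
# Assumed level names for 2- through 12-digit hydrologic unit codes.
HUC_LEVELS = [
    (2, "region"),
    (4, "subregion"),
    (6, "basin"),
    (8, "subbasin"),
    (10, "watershed"),
    (12, "subwatershed"),
]


def huc_ancestry(code):
    """Return (level_name, prefix) pairs from the largest unit down to `code`."""
    valid_lengths = {length for length, _ in HUC_LEVELS}
    if not (code.isascii() and code.isdigit() and len(code) in valid_lengths):
        raise ValueError(f"not a 2- to 12-digit hydrologic unit code: {code!r}")
    # Each enclosing unit is simply a shorter prefix of the same code.
    return [(name, code[:length]) for length, name in HUC_LEVELS if length <= len(code)]


if __name__ == "__main__":
    # Hypothetical 8-digit code used only to show the nesting.
    for level, prefix in huc_ancestry("02030101"):
        print(f"{level:>12}: {prefix}")
```

Because the hierarchy is carried by the code itself, data keyed by a detailed unit can be rolled up to any larger unit by truncating the code, without a separate lookup table.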
In addition to Federal framework layers, the Workgroup members also rely heavily on commercial basemap and imagery data and services offered by Microsoft, Google, ESRI, TeleAtlas and GlobeXplorer. These commercial data sources are made available to Agency GIS users through a number of enterprise license agreements, and their use is monitored by the Office of Environmental Information (OEI) and the GIS Workgroup leadership to determine the ongoing need for and value of these subscriptions.
Several organizations at the Agency also produce geospatial data and services that are needed for their own missions, and a few produce data that are widely disseminated in the environmental community and heavily used by others. The best-known and most established geospatial data published by EPA include the suite of AirNow products, the Facility Registry System, and the enhanced National Hydrography Dataset, “NHDPlus.”
GIS Workgroup information requirements are captured in the Agency’s newly developed Geospatial Services Segment Architecture. At the dialogue session, EPA’s Geospatial Information Officer (GIO) provided a brief overview of the contents of this document, and announced that an Agency-wide review period will be opening this summer for comments. Through the capture of these comments and the Enterprise Architecture process, the group expects to have a vetted data architecture available for release in the early Fall at the latest, and work will then begin on building solutions against this target.
Formats for Geospatial Information
The development of these solutions was the main focus of the dialogue session, and participants discussed the current and anticipated future requirements for data formats and access methods to the store of “common need” national geospatial data for the Agency. There is general agreement in the group that geospatial data are required in a number of different formats to meet different analytical and mapping requirements by Agency GIS users. While not a completely inclusive list, the primary formats of data required by the GIS Workgroup include:
- ESRI Shapefiles
- ESRI File Geodatabase
- ESRI Personal Geodatabase
- KML (Keyhole Markup Language)
- GeoRSS (Geospatially enabled RSS feeds)
- Direct Database Connections
- WMS (Open Geospatial Consortium Web Map Services)
- WFS (Open Geospatial Consortium Web Feature Services)
- CS-W (Open Geospatial Consortium Catalogue Service for the Web)
- WPS (Open Geospatial Consortium Web Processing Services)
- Other geospatial analytical services available over SOAP and REST interfaces
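As an illustration of how one of the service formats listed above is typically consumed, the following is a minimal sketch of building an OGC WMS 1.1.1 GetMap request with the Python standard library. The endpoint URL and layer names are hypothetical placeholders and do not refer to any EPA service; the query parameters follow the WMS 1.1.1 key-value convention, in which the bounding box is given as minx,miny,maxx,maxy.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_wms_getmap_url(base_url, layers, bbox, width, height,
                         srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL for the given layers and bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests each layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer names, used only for illustration.
    url = build_wms_getmap_url(
        "https://example.invalid/wms",
        ["hydrography", "watershed_boundaries"],
        (-74.3, 40.5, -73.7, 40.9),
        800,
        600,
    )
    print(url)
    print(parse_qs(urlsplit(url).query, keep_blank_values=True)["LAYERS"])
```

A client that understands WMS needs only the service address and layer names to retrieve a rendered map image, which is one reason standards-based services can reduce the need for locally maintained copies of the data.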
The GIS community at the Agency has benefited greatly from the data and services standards development work that has been undertaken by the Open Geospatial Consortium (OGC) and its many partners. Through the OGC interoperability standards development process, EPA is in a position to influence the development and adoption of internationally accepted standards for delivering geospatial information. As a result, the data format requirements from this dialogue group are likely to be much clearer and more refined than they may be for other groups.
Channels and Venues for Delivery of Geospatial Information
For the EPA GIS community, this is the area that requires the most attention as we move forward with implementing components of the Access Strategy. During the dialogue session, the group discussed the current state of geospatial data access in their respective organizations, and found that in many cases, there is room for improvement in how we deliver data and services to these users.
The group agreed that there is a large amount of redundant work taking place in the community, as multiple organizations have traditionally worked independently to build their own local datastores containing framework geospatial data that are made available by other organizations. While the exact amount of effort spent in this area is difficult to quantify, the group agreed that eliminating this redundant effort would be of great benefit to everyone. EPA’s GIO asked the group to work with him in the coming weeks to better estimate how much time and money organizations generally spend on compiling these stovepiped, redundant datastores.
The proposed solution to this problem as discussed during the dialogue session is a centralized archive of national geospatial data, maintained by OEI, and made available to GIS users throughout the Agency and our network of partners who have access to intranet resources. The characteristics of the proposed system would include:
- The ability for regions and programs to replicate all or parts of the central database into their own systems.
- A published update and maintenance schedule to better communicate to the user community when changes are being made, with well-defined change control governance.
- Complete, FGDC-compliant metadata registered for all data in the EPA GeoData Gateway.
- Multiple access mechanisms to discover and access data holdings (see Formats section above for the list of key data formats that should be delivered).
The group also recognized that in the long term, the need for this system should diminish, and that the solution should be relatively temporary in nature (available for a period of two to five years, for example). This is because the vast majority of data required in the archive are maintained and made available by other organizations. Ultimately, EPA staff and our partners need to be able to access these types of information as services directly from the responsible data stewards (i.e., other Federal agencies as well as state and local governments). For the time being, we are unable to access much of the data needed for Agency business in our required formats and through our required distribution mechanisms. Thus there is a need to take on this responsibility in our enterprise in the short term, while working with our partners in collaborative organizations like the Federal Geographic Data Committee, Geospatial Line of Business, and National States Geographic Information Council to fulfill our longer-term requirements.
Session participants included representatives from EPA Regions 1, 2, 3, 5, 6, 8, 9 and 10, Office of Environmental Information, EPA Chesapeake Bay Program, Office of Pollution Prevention, Office of Water, Office of Research and Development, DigitalGlobe, NYSDEC, NYC DOHMH, Horizon Systems, Synergist, General Dynamics/ERT, Lockheed Martin, ESRI, NYC Parks, RTI, HRWA, Geodecisions, NJDEP, CSC, and Indus.