Envirofacts Data Service API
Envirofacts provides a single point of access to U.S. EPA environmental data contained in U.S. EPA databases. Interested parties from State and local governments, EPA or other Federal agencies, or individuals can search for information about environmental activities that may affect air, water, and land anywhere in the United States. Envirofacts makes it easy to find information using an address, ZIP Code, city, county, water body, or other geographic designation. Envirofacts make it easy to find information from all sources or within specific environmental subject areas, such as Waste, Water, Toxics, Air, Radiation, and Land. Experienced users can use more sophisticated capabilities such as maps or customized reporting.
Envirofacts gathers and aggregates data on activities that impact the environment from a wide variety of sources, including: ICIS-AIR, Biennial Reporting (BR), Superfund Enterprise Management System (SEMS), Enforcement and Compliance History Online (ECHO/IDEA), Cleanups in My Community, Grants Information and Control System (GICS), Information Collection Rule Database (ICR), Permit Compliance System (PCS), RADNet database, Resource Conservation and Recovery Act Information (RCRAInfo), Safe Drinking Water Information System (SDWIS), Toxics Release Inventory (TRI), Toxic Substances Control Act (TSCA), and UV Index. Envirofacts also provides access to data registries, including the Facility Registry Service (FRS), the Locational Reference Tables database (LRT), and the Substance Registry Services (SRS). To help users visualize this information and link it to geographical features and landmarks, Envirofacts uses geospatial datasets from a variety of sources, including EPA, the U.S. Census Bureau, the U.S. Geological Survey (USGS), and Bing Maps.
With access to nearly 20 EPA data sources, a wide range of search and reporting tools, web widgets, geographic integration and analysis tools, and extensive online documentation of the source data sets, Envirofacts is a major factor in helping EPA meet its Open Government Initiative objectives. Building on this theme of providing as much access as possible to EPA data systems, Envirofacts now provides a data service API for custom searches of many of the these databases.
EF Data Service API
Envirofacts has developed a RESTful data service API to all of its internal data holdings. This functionality provides a user of Envirofacts with the ability to query any table through the use of a URL. The default output is in XML, which can be utilized in other applications, as well as tools such as Microsoft Excel or Access. Output options of CSV and Excel can be requested in the URL. The entire Envirofacts database metadata is available online, so that all tables and columns within Envirofacts are documented. Having the table structures available in this manner makes most of Envirofacts data easily available through the service.
Strengths and Limitations of the Envirofacts RESTful Service
This is a simple to use, well-documented way to access multiple data sources without the need of user database connections. The resulting output can be easily used with desktop applications such as Excel or Access, or used in creating Web mash-ups.
The current release of the service, while providing easy access to a tremendous amount and range of data, is limited to querying up to three tables. Also, the output is limited to 10000 rows of data at a time, but a user can pick which 10000 rows of data and then return to retrieve the next 10000.
Constructing a Search
To build a search, users create a URL with a specific set of parameters. This is done by creating a string using the following format:
- Table Name – At least one table name is required. Up to three table names can be entered. When inserting multiple tables into the URL, they must share an ID or common column, so that the tables can be joined or linked. To retrieve an accurate output it is best to use tables that share an ID column. For example, within the tables that make up the TRI Facility Information, they each share an ID column known as TRI_FACILITY_ID. This can be visually seen within the Envirofacts model pages for various subject areas like TRI Facility Information. Please refer to the Envirofacts database metadata to locate tables that can be joined via ID columns within the RESTful data service.
- Column Name – This is an optional entry. Enter a column name and value to limit the results. Multiple columns may be used within the URL to limit the data from a table or tables. The column name is not case sensitive.
- Operator – This is an optional entry. This parameter allows users to pass in an operator with the query. Default output is "=" and does not require an operator, but users can enter "<", " >", "!=", "BEGINNING", "CONTAINING", operators as well via the URL. The "BEGINNING" operator will return rows where the start of a column value is equal to the search value. While the "CONTAINING" will return rows where the search value is contained within the column value.
OPERATOR USAGE = The database will only return rows where the column value is equal to the search value. != The database will only return rows where the column value is NOT equal to the search value. < The database will only return rows where the column value is less than the search value. > The database will only return rows where the column value is greater than the search value. BEGINNING The database will only return rows where the start of column value is equal to the search value. A comparison is done, character by character, up to the last character entered for the search value. CONTAINING For Character fields only. The database will only return rows where the search value is contained within the column value. As an example if the search value entered is "ABC" and the column value is "CCABCDD" then the row will be accepted. Using the same search value of "ABC" if the column value was "AABBCC" then the row will be rejected.
- Column Value – This is an optional entry (except when using #2 - Column Name). The column value is queried against the database without modification, so this value is case-sensitive. Use the program system model and queries within Envirofacts to double check the case for the value.
- Rows – This is an optional entry. Specify the rows to display by entering 'rows/ <first_row> : <last_row>' Results numbering starts at 0. So to get the first five hundred rows enter rows/0:499 If rows is not specified, the default is the first 10000 rows.
- Output Format – This is an optional entry. The default output is in XML; however, output options of JSON, CSV or Excel can be requested in the URL. The output format is not case sensitive.
7. Count - This is an optional entry and is shown as #7 in the above URL image. Count shows the total number of records that will be returned for this search once the Count option is removed. When Count is used, Excel, CSV, or XML cannot be specified. The column name is not case sensitive.
returns result set records 500 to 505 from the TRI_FACILITY table where STATE_ABBR = "VA".
returns the first 10000 records from PCS_PERMIT_FACILITY table where the MAJOR_DISCHARGE_INDICATOR = "M".
returns the first 20 records from T_DESIGN_FOR_ENVIRONMENT table as a JSON output.
returns results from the TRI_FACILITY and TRI_REPORTING_FORM tables where STATE_ABBR = "HI" in the TRI_FACILITY table.
returns results from the TRI_FACILITY, TRI_REPORTING_FORM and TRI_CHEM_INFO tables where STATE_ABBR = "VA" in the TRI_FACILITY table in Excel format .
returns results from the TRI_FACILITY table where ZIP CODE "Begins With" 60085 in a JSON format.
returns results from the TRI_FACILITY table where STATE_ABBR = "VA" in a CSV format.
returns results from the TRI_FACILITY table where STATE_ABBR = "HI" and COUNTY_NAME = "HONOLULU".
returns results from the TRI_FACILITY table where STATE_ABBR = "VA" and results from the TRI_REPORTING_FORM table where REPORTING_YEAR = 2010.
returns results showing the total number of records in the TRI_FACILITY table.
returns results showing the total number of records in the TRI_FACILITY table where STATE_ABBR = "HI".
Using Envirofacts Metadata to Construct a Search
Envirofacts employs a set of graphical database models to convey information about the database relationships and individual table structures. Searches can be constructed for any data system that has an Envirofacts data model and table metadata. Users can follow these links to the models to ascertain the necessary table and column names to build their custom searches:
- Envirofacts Model
- Biennial Reporting (BR)
- Superfund Enterprise Management System (SEMS)
- Facility Registry Service (FRS)
- Greenhouse Gas (GHG)
- Next Generation Grants System (NGGS)
- Locational Information (LRT)
- Permit Compliance System (PCS)/Integrated Compliance Information System (ICIS)
- Resource Conservation and Recovery Act Information (RCRAInfo)
- Safe Drinking Water Information System (SDWIS)
- Toxics Release Inventory (TRI)
The starting points are contained in the links above, but information about system databases and their tables are also available through any of the Data System “Overview” pages, which can be found by navigating through the Envirofacts site to one of the system data information links. For example, from the “About the Data” tab there are “More information on” links to all “Overview” pages. The “Overview” pages have links to a graphic model page showing the database structure and to a table and column information page showing the list of tables and individual table structures. A model link is also available on the sidebar of the data system pages. The following set of screens provide an example of working through the metadata to find the desired table and column names.
Starting from an overview page:
Clicking on the "Model" link in the right side menu brings the user to a page showing the subject areas:
Clicking on a subject area (for example SITES) brings the user to a diagram showing all tables in the subject area and their relationships:
Clicking on a table will take the user to that table’s metadata page:
Clicking on a column name will then take the user to a column description page:
Working with the Output
The result set is XML that can be ported into an application or used as part of a Web mash-up. This is an example of what the output looks like with a return of two records from a search of the table PCS_PERMIT_FACILITY:
<pcs_permit_facilityList Count="2" Rows="1-2" >
<NAME_2>GRANITE POINT TANK FARM</NAME_2>
<CITY_NAME>GULF OF ALASKA</CITY_NAME>
<NAME_2>TRADING BAY TREATMENT FACILITY</NAME_2>
<CITY_NAME>GULF OF ALASKA</CITY_NAME>
After saving this as an XML file, it can be opened in Excel as an XML List, or in MS Access using the “Get External Data/Import” tool to create a table.
Envirofacts will consider additional search types such as beginning with or between. Pending the modification of the Envirofacts database architecture, the output may be updated to include additional rows of data (i.e., 1000 at a time).
User Comments/Feedback Requests
For Envirofacts Data Service API usage questions or suggestions, please contact the Envirofacts team.