STORET/WQX: Concepts and Definitions
The WQX schema and the STORET Warehouse are organized around five main information categories that represent the who (organizations), why (projects), where (monitoring locations), what and how (activities and results) of water quality monitoring:
Organizations: The group or entity responsible for the data set, either for collecting or otherwise generating the data, or for sponsoring the activity for which the data set was created;
Projects: The activity during and for which the data set was created;
Monitoring Locations: Also referred to as stations, these carry the identification and description of the physical location at which monitoring occurs;
Activities: Water quality sampling, observation, and measurement activities that occur at monitoring locations; comprehensive descriptors of the event during which samples were collected or measurements performed;
Results: The findings of the sampling events, measurements, and field activities.
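The five categories can be pictured as linked records. The following is a minimal Python sketch of that idea only. The class names, field names, and sample identifiers are illustrative assumptions and are not the element names used by the WQX schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Result:
    """What was found: one measured or observed value."""
    characteristic: str  # the thing measured, e.g. "pH"
    value: str           # recorded value, kept as text in this sketch


@dataclass
class Activity:
    """What and how: one sampling, observation, or measurement event."""
    activity_id: str
    project_ids: List[str]             # why the activity was performed
    location_id: Optional[str] = None  # where it took place, if anywhere
    results: List[Result] = field(default_factory=list)


@dataclass
class Organization:
    """Who: the data owner that ties everything else together."""
    org_id: str
    name: str
    project_ids: List[str] = field(default_factory=list)
    location_ids: List[str] = field(default_factory=list)
    activities: List[Activity] = field(default_factory=list)


# Hypothetical data, for illustration only.
org = Organization(
    org_id="DEMO_ORG",
    name="Demonstration Monitoring Group",
    project_ids=["PROJ_A"],
    location_ids=["ST-001"],
)
sample = Activity("ACT-001", project_ids=["PROJ_A"], location_id="ST-001")
sample.results.append(Result(characteristic="pH", value="7.6"))
org.activities.append(sample)
```

In this sketch a result is reachable only through its activity, and an activity only through its organization, which mirrors the ownership chain described in the sections that follow.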
In the STORET Warehouse and WQX, organizations are the primary owners of data. Everyone who submits data to the STORET Warehouse is identified by an Organization ID and name. Data owners use the Organization ID to identify themselves when submitting data through WQX, and to identify themselves or others when retrieving data from the STORET Warehouse. For example, data that EPA has collected under the Environmental Monitoring and Assessment Program (EMAP) for coastal areas is identified under the Organization ID "EMAP_CS" and name "Environmental Monitoring and Assessment Program".
The Organization ID ties together all of an organization's data. For example, when submitting data via WQX, organizations set up their own project descriptions and maintain their own lists of monitoring locations at which they sample. All of this information, along with monitoring activities and results, is linked together by the Organization ID. Organizations can also set up their own organization-specific preferences or usual practices for monitoring activities. These may include methods used to collect samples, methods used to prepare samples, methods used in an organization's lab, bibliographic references, and many others. For example, organizations always identify their sample collection methods by their own codes and provide their Organization ID as the owner of that sample collection method, so data users can identify the source.
Many data owners who previously submitted data using the distributed STORET database now submit data via the Water Quality Exchange (WQX). These data owners are often identified with two Organization IDs. For example, data belonging to the EMAP data owners for coastal data can be found under both "EMAP_CS" and "EMAP_CS_WQX". For further discussion of why data owners sometimes have two Organization IDs, please see the paper on transitioning to WQX.
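When working with retrieved data, one way to handle a data owner that appears under two Organization IDs is to keep an explicit alias table and filter on every ID in it. The sketch below assumes such a table is maintained by the data user. Only the "EMAP_CS" and "EMAP_CS_WQX" pair comes from the text above; the function name, record layout, and other identifiers are hypothetical.

```python
# Assumed alias table: one data owner mapped to every Organization ID it has used.
# Only the EMAP pair comes from the text; the table itself is hypothetical.
ORG_ID_ALIASES = {
    "EMAP_CS": {"EMAP_CS", "EMAP_CS_WQX"},
}


def records_for_owner(records, owner_id):
    """Return the records filed under any Organization ID known for owner_id."""
    ids = ORG_ID_ALIASES.get(owner_id, {owner_id})
    return [r for r in records if r["org_id"] in ids]


# Hypothetical records; the activity IDs are made up.
records = [
    {"org_id": "EMAP_CS", "activity_id": "A1"},
    {"org_id": "EMAP_CS_WQX", "activity_id": "A2"},
    {"org_id": "OTHER_ORG", "activity_id": "A3"},
]
emap = records_for_owner(records, "EMAP_CS")
```

An explicit table is used here rather than guessing a suffix, because the text shows the "_WQX" form for one owner only and does not state that every owner follows it.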
The WQX schema allows an organization to provide summary descriptions of the projects it conducts. The descriptions contain essential information concerning purpose, procedures, standards and methods, and quality goals. They may also include information on the individuals who manage and participate in the projects. Project descriptions permit linking data quality objectives and other quality control plan items to a broad spectrum of data. Data owners may attach objects such as PDF files or other documents to their submissions so that data users can access these files when using the data.
Each project in the STORET Warehouse and WQX can involve one or more monitoring locations ("stations"), and a station can participate in multiple projects. Other information, such as the sampling design for a project and project-specific attributes for monitoring locations (such as weighting factors or target populations), can also be provided. Field or sampling activities and their analytical results are linked to all the projects they support.
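The many-to-many relationship between projects and monitoring locations can be sketched as a table keyed by (project, location) pairs, with project-specific attributes stored on each pair. This is an illustration under assumed names and values, not the structure WQX itself uses.

```python
# Each key pairs a project with a monitoring location; the value holds
# attributes that apply to that location only within that project.
# All identifiers and values are hypothetical.
project_locations = {
    ("PROJ_A", "ST-001"): {"weighting_factor": 1.5},
    ("PROJ_A", "ST-002"): {"weighting_factor": 0.5},
    ("PROJ_B", "ST-001"): {},
}


def locations_for(project_id):
    """Monitoring locations that take part in the given project."""
    return sorted(loc for (proj, loc) in project_locations if proj == project_id)


def projects_for(location_id):
    """Projects in which the given monitoring location takes part."""
    return sorted(proj for (proj, loc) in project_locations if loc == location_id)
```

Storing the weighting factor on the pair rather than on the location lets the same station carry a different weight, or none, in each project.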
Monitoring locations are where monitoring occurs. All data concerning field work or sample collection is tied to the specific location at which these activities are conducted, linking water quality measurements to the place they represent. Each monitoring location within the STORET Warehouse has a fully defined latitude and longitude, including the method by which the coordinates were obtained. Other locational information includes the Hydrologic Unit Code (HUC), state, and county within which the station lies.
Each monitoring location has a unique identifier, assigned at the discretion of the organization that "owns" the data. Monitoring locations may also have alternate identifiers. For example, if two or more organizations sample at the same place, each may identify the monitoring location differently. Finally, the monitoring location record identifies the type of water body on which the location lies, such as a river, stream, lake, or well.
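The attributes of a monitoring location described above can be gathered into a single record. The sketch below is one possible shape, assuming decimal-degree coordinates. The class, its field names, and the sample values are hypothetical and do not reproduce the WQX element names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MonitoringLocation:
    location_id: str        # unique identifier chosen by the owning organization
    latitude: float         # decimal degrees (an assumption of this sketch)
    longitude: float        # decimal degrees (an assumption of this sketch)
    coordinate_method: str  # how the latitude and longitude were obtained
    water_body_type: str    # e.g. "river", "stream", "lake", "well"
    huc: str = ""           # Hydrologic Unit Code, if known
    state: str = ""
    county: str = ""
    alternate_ids: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject coordinates that cannot be a point on the globe.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be between -90 and 90")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be between -180 and 180")

    def matches(self, identifier):
        """True if identifier is this location's primary or alternate ID."""
        return identifier == self.location_id or identifier in self.alternate_ids


# Hypothetical location with one alternate identifier.
station = MonitoringLocation(
    "ST-001", 38.9, -77.0, "GPS", "river", alternate_ids=["ALT-7"]
)
```

The `matches` helper shows how alternate identifiers let two organizations refer to the same place under different names.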
The WQX schema and STORET Warehouse use the term "Activity" to refer to any individual action performed at a monitoring location to collect data. A series of field measurements or a set of habitat observations is therefore considered an activity, as is a collected water sample. Note that, by WQX convention, each of these separate actions is a separate activity with its own identifier. For a given activity, a data owner can provide additional information about how a sample was collected or prepared, or how biological monitoring was performed (e.g., how long electrofishing was performed, or whether the observation took place on a transect).
Activities have a range of possible types, from Routine Sample to Quality Control Sample-Lab Blank. Depending on its type, an activity may or may not require an associated monitoring location; however, all activities must be linked to one or more projects. For example, a QC activity to calibrate a lab instrument does not require a monitoring location, whereas a field observation does. Activities also occur in different media, for example, "biological" if biological monitoring is occurring, or "water" if a sample is collected for water chemistry analysis.
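The two linkage rules just described (every activity needs at least one project, and only some activity types may omit a monitoring location) can be expressed as a small check. This is a sketch under assumptions: the set of location-optional types below contains only the lab QC type named in the text, and the authoritative list is defined by WQX, not by this example. The function name is hypothetical.

```python
# Assumption: activity types that do not need a monitoring location.
# The text names this type and says lab QC work needs no location;
# the real list of exempt types is defined by WQX, not by this sketch.
LOCATION_OPTIONAL_TYPES = {"Quality Control Sample-Lab Blank"}


def activity_problems(activity_type, project_ids, location_id=None):
    """Return a list of rule violations; an empty list means the activity passes."""
    problems = []
    if not project_ids:
        problems.append("activity must be linked to at least one project")
    if location_id is None and activity_type not in LOCATION_OPTIONAL_TYPES:
        problems.append("activity type requires a monitoring location")
    return problems
```

For instance, `activity_problems("Routine Sample", ["PROJ_A"])` reports the missing location, while the same call with the lab blank type returns an empty list.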
All results belong to an activity. Sample-wide information, such as collection depth, collection methods, or gear used, is associated with the activity, whereas each result carries information specific to it, such as the analysis method or detection limit.
Results are the actual water quality measurement values, observations, or analysis outcomes. Within the STORET Warehouse and WQX, results obtained through analysis of samples and "in situ" measurements are linked to the activity to which they relate.
Results within the WQX schema and the STORET Warehouse include the recorded value of a parameter or characteristic (e.g., a pH value of 7.6). The analytical method used to obtain a result is included with the result information, along with detection limits, if applicable. Other information, such as the depth at which a particular result was taken or additional biological attribute information for a certain organism, is also captured.
WQX and the STORET Warehouse use the term "Characteristic" to identify the parameter or thing being measured. In other words, Characteristics define what is being measured or analyzed for (e.g., atrazine, pH, or a habitat observation such as vegetative cover). Most users searching the STORET Warehouse are interested in finding data for particular parameters, so if you want data on dissolved oxygen, you'll need to search for dissolved oxygen as a "Characteristic".
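Once results have been retrieved, selecting them by Characteristic is a simple filter. The sketch below works on an in-memory list and is not the Warehouse's own query interface. The record layout, the function name, and the case-insensitive matching are assumptions of this example, and the sample values are made up apart from the pH value quoted earlier.

```python
def results_for_characteristic(results, name):
    """Return the results whose characteristic matches name, ignoring case."""
    wanted = name.strip().lower()
    return [r for r in results if r["characteristic"].strip().lower() == wanted]


# Hypothetical results; units are omitted to keep the sketch short.
results = [
    {"characteristic": "Dissolved oxygen", "value": "8.1"},
    {"characteristic": "pH", "value": "7.6"},
    {"characteristic": "Atrazine", "value": "0.2"},
]
hits = results_for_characteristic(results, "dissolved oxygen")
```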