

Data Validation

Note: EPA no longer updates this information, but it may be useful as a reference or resource.


Introduction: Why is data validation important?
Objectives
Definitions
Example Quality Control Flags
AIRS Null Data Reason Codes
Data Validation Procedures and Results
Examples of Problems Encountered in Databases (and Validation Actions)
Summary
References


Definition

Why is Data Validation Important?


[Workbook Table of Contents] [Top of Data Validation] [Previous Section] [Next Section]


DATA VALIDATION OBJECTIVES 

  • Produce a database with values that are validated and of a known quality.
  • Evaluate the internal, spatial, temporal, and physical consistency of the data.
  • Intercompare data to identify errors, biases, or outliers. 


    Figure 1

    Data Validation Routine




    Outliers

    Level 0 Data Validation

     Level 1 Data Validation

    Level 2 Data Validation

    Level 3 Data Validation




     
    Flag   Description   Explanation
    0      Valid         Observations judged accurate within the performance limits of the instruments.
    1      Estimated     Observations required additional processing because original values were suspect, invalid, or missing.
    7      Suspect       Values judged to be in error because they violate reasonable physical criteria or do not exhibit reasonable consistency, but a specific cause of the problem is not identified.
    8      Invalid       Values judged to be inaccurate or in error, with a known cause of the inaccuracy or error.
    9      Missing       Observations not collected. Values assigned -999.
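As an illustration of how the flags above might be carried alongside the data, here is a minimal Python sketch. The flag numbers and the -999 missing value come from the table; the names `QCFlag` and `usable_values`, the `(value, flag)` record layout, and the choice to treat both Valid and Estimated observations as usable are assumptions made for this example.

```python
from enum import IntEnum

MISSING_VALUE = -999  # value the flag table assigns to missing observations


class QCFlag(IntEnum):
    """Quality control flags, numbered as in the table above."""
    VALID = 0
    ESTIMATED = 1
    SUSPECT = 7
    INVALID = 8
    MISSING = 9


def usable_values(records):
    """Return the values flagged VALID or ESTIMATED.

    `records` is a hypothetical list of (value, flag) pairs. Keeping
    ESTIMATED values is an assumption; an analysis may prefer to drop them.
    """
    keep = {QCFlag.VALID, QCFlag.ESTIMATED}
    return [value for value, flag in records if QCFlag(flag) in keep]


# Hypothetical records: (concentration, flag)
records = [(42.0, 0), (57.5, 1), (139.0, 7), (MISSING_VALUE, 9)]
print(usable_values(records))  # [42.0, 57.5]
```

Keeping the flag next to each value, rather than deleting flagged values, preserves the original record so that later validation levels can revisit the decision.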




    AIRS NULL DATA REASON CODES

    CODE   DESCRIPTION
    9973   SAMPLE TIME OUT OF LIMITS
    9974   SAMPLE FLOW RATE OUT OF LIMITS
    9975   INSUFFICIENT DATA (CAN'T CALCULATE)
    9976   FILTER DAMAGE
    9977   FILTER LEAK
    9978   VOIDED BY OPERATOR
    9979   MISCELLANEOUS VOID
    9980   MACHINE MALFUNCTION
    9981   BAD WEATHER
    9982   VANDALISM
    9983   COLLECTION ERROR
    9984   LAB ERROR
    9985   POOR QUALITY ASSURANCE RESULTS
    9986   CALIBRATION
    9987   MONITORING WAIVED
    9988   POWER FAILURE (POWR)
    9989   WILDLIFE DAMAGE
    9990   PRECISION CHECK (PREC)
    9991   Q C CONTROL POINTS (ZERO/SPAN)
    9992   Q C AUDIT (AUDT)
    9993   MAINTENANCE/ROUTINE REPAIRS
    9994   UNABLE TO REACH SITE
    9995   MULTI-POINT CALIBRATION
    9996   AUTO CALIBRATION

    Source: AIRS User's Guide, Volume III: AIRS Codes and Values, 1989.
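The following Python sketch shows one way the null data reason codes could be separated from real measurements before any statistics are computed. The code range 9973 through 9996 and the descriptions are taken from the table; the function name `split_nulls`, the excerpted lookup `NULL_REASONS`, and the assumption that null codes appear in the same column as the measurements are illustrative and should be checked against the actual file layout.

```python
# Excerpt of the table above; the remaining codes follow the same pattern.
NULL_REASONS = {
    9980: "MACHINE MALFUNCTION",
    9981: "BAD WEATHER",
    9988: "POWER FAILURE (POWR)",
    9996: "AUTO CALIBRATION",
}

FIRST_NULL_CODE, LAST_NULL_CODE = 9973, 9996  # range listed in the table


def split_nulls(values):
    """Separate measurements from null-code entries.

    Returns (measurements, null_codes). Assumes, for illustration, that
    null codes are stored in the same column as the measurements.
    """
    measurements, null_codes = [], []
    for v in values:
        is_code = FIRST_NULL_CODE <= v <= LAST_NULL_CODE and float(v).is_integer()
        if is_code:
            null_codes.append(int(v))
        else:
            measurements.append(v)
    return measurements, null_codes


measurements, codes = split_nulls([35, 9988, 41.5, 9973])
print(measurements)  # [35, 41.5]
print(codes)         # [9988, 9973]
print([NULL_REASONS.get(c, "see table") for c in codes])
```

Left in place, a single null code would dominate an hourly average, so this separation is worth doing before any other check.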




    Figure 2

    Reported Regional Ozone Conditions

    Example of identification of suspect data values from the Northeast (NESCAUM, 1993). The ozone concentration of 139 ppb reported at Cape Elizabeth on May 26, 1992 at 4:00 a.m. appears erroneous when viewed in a spatial and temporal context.
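A spatial consistency check of this kind can be sketched in Python as below. Only the 139 ppb value at Cape Elizabeth comes from the caption; the other site names and values, the function name `spatial_outliers`, and the 60 ppb threshold are hypothetical choices made for the example.

```python
from statistics import median


def spatial_outliers(site_values, max_diff_ppb=60.0):
    """Return sites whose value differs from the median of the other
    sites at the same hour by more than `max_diff_ppb`.

    The threshold is an illustrative assumption, not a recommended value.
    """
    flagged = []
    for site, value in site_values.items():
        others = [v for s, v in site_values.items() if s != site]
        if others and abs(value - median(others)) > max_diff_ppb:
            flagged.append(site)
    return flagged


# One hour of ozone data (ppb). Sites A-C are hypothetical neighbors.
hour = {"Cape Elizabeth": 139.0, "Site A": 31.0, "Site B": 28.0, "Site C": 35.0}
print(spatial_outliers(hour))  # ['Cape Elizabeth']
```

A flagged value is a candidate for the Suspect flag, not proof of an error; the temporal context at the same site should be reviewed before a decision is made.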



    Figure 3

    Examples of identification of suspect data values


    Examples of identification of suspect data values from the Northeast (NESCAUM, 1993). At the top, two values are anomalously high when inspected both temporally and spatially. At the bottom, reported isolated low values were probably the result of misplaced decimal points.
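The misplaced-decimal case lends itself to a simple screening rule: an isolated low value that agrees with its neighbors once multiplied by ten. The Python sketch below illustrates the idea; the function name `decimal_shift_suspects`, the tolerance, and the sample series are assumptions, not values from the NESCAUM report.

```python
def decimal_shift_suspects(series, ratio=10.0, tolerance=0.5):
    """Return indices of isolated low values that match the mean of their
    two neighbors once multiplied by `ratio`.

    `tolerance` is the allowed relative difference after rescaling; both
    parameters are illustrative assumptions.
    """
    suspects = []
    for i in range(1, len(series) - 1):
        value = series[i]
        neighbor_mean = (series[i - 1] + series[i + 1]) / 2
        if value <= 0 or neighbor_mean <= 0:
            continue  # zeros and negatives need a different check
        is_low = value < neighbor_mean / 2
        rescaled_gap = abs(value * ratio - neighbor_mean) / neighbor_mean
        if is_low and rescaled_gap <= tolerance:
            suspects.append(i)
    return suspects


# Hypothetical hourly ozone values (ppb); index 2 looks like 68 typed as 6.8.
print(decimal_shift_suspects([62.0, 65.0, 6.8, 70.0, 66.0]))  # [2]
```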



    Figure 4

    Example of problems encountered with AIRS Data



    Figure 5

    Frequency of 0 ppb Ozone in AIRS Data at a site.

    Example of problems encountered with AIRS data. The figure shows an abnormally large number of zero ozone concentrations at a site during a few years, possibly indicating monitor or reporting problems. (Level 0, AIRS data)
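A count of zero values per year, compared across years, is enough to reproduce this kind of check. The Python sketch below is one possible approach; the function names, the `(year, value)` record layout, the factor of five, and the sample counts are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def zero_counts_by_year(observations):
    """Count 0 ppb values per year.

    `observations` is a hypothetical iterable of (year, value_ppb) pairs.
    """
    counts = Counter()
    for year, value in observations:
        counts[year] += 1 if value == 0 else 0
    return dict(counts)


def unusual_years(counts, factor=5.0):
    """Return years whose zero count exceeds `factor` times the median
    count across years (the factor is an illustrative assumption)."""
    threshold = factor * max(median(list(counts.values())), 1)
    return sorted(year for year, n in counts.items() if n > threshold)


print(zero_counts_by_year([(1990, 0), (1990, 0), (1990, 12), (1991, 30)]))
# {1990: 2, 1991: 0}

# Hypothetical yearly totals of zero-valued hours at one site.
print(unusual_years({1988: 10, 1989: 12, 1990: 240, 1991: 9}))  # [1990]
```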



    Figure 6

    Plot of surface winds on June 27, 1991

    Plot of surface winds on June 27, 1991 at 1900 CDT. The calm wind at Bloomington, Illinois was identified as suspect (SUS) during the data validation process. (Roberts et al., 1994)



    Figure 7

    Example of questionable data identified during the data validation.

    Example of questionable data identified during the data validation: (a) constant wind directions measured at Cocodrie, Louisiana from July 31 - August 2, 1993 and (b) high surface winds at a surface station in Grand Isle, Louisiana on August 29, 1993 at 0800 CST (SAI et al., 1995). 
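Constant readings such as the wind directions in case (a) can be found by looking for long runs of identical consecutive values. The Python sketch below shows one way to do this; the function name `constant_runs`, the minimum run length of six, and the sample data are hypothetical.

```python
from itertools import groupby


def constant_runs(values, min_length=6):
    """Return (start_index, length, value) for each run of identical
    consecutive values at least `min_length` long.

    The minimum length is an illustrative assumption; a suitable value
    depends on the instrument and the reporting interval.
    """
    runs = []
    index = 0
    for value, group in groupby(values):
        length = len(list(group))
        if length >= min_length:
            runs.append((index, length, value))
        index += length
    return runs


# Hypothetical hourly wind directions (degrees).
directions = [180, 200, 270, 270, 270, 270, 270, 270, 190]
print(constant_runs(directions))  # [(2, 6, 270)]
```

Long runs are not always errors (calm periods can produce them), so flagged runs are best reviewed against wind speed and nearby stations.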



    SUMMARY

    Analysis/Procedure        Objectives
    Level 0 Data Validation   Convert instrument output to scaled scientific units.
    Level 1 Data Validation   Review for accuracy, completeness, and internal consistency.
    Level 2 Data Validation   Review/compare for external consistency against other independent data sets.
    Level 3 Data Validation   Ongoing evaluation as part of data interpretation process.

    Tools and methods include: spreadsheets, statistical software, Voyager, VOCDat, LapG; time series, scatter, and spatial plots; correlations among species.




     

    DATA VALIDATION REFERENCES

    LADCO (1995) Lake Michigan Ozone Study. 1994 data analysis report, version 1.1. Report prepared by Lake Michigan Air Directors Consortium, Des Plaines, IL, May.

    NESCAUM (1993) 1992 regional ozone concentrations in the northeastern United States. Report prepared by the Ambient Monitoring and Assessment Committee and the Data Management Committee of the Northeast States for Coordinated Air Use Management, Boston, MA.

    NESCAUM (1995) Preview of the 1994 ozone precursor concentrations in the northeastern U.S. 5/1/94 draft report prepared by the Ambient Monitoring and Assessment Committee of the Northeast States for Coordinated Air Use Management, Boston, MA.

    Roberts P.T., Dye T.S., Korc M.E., and Main H.H. (1994) Air quality data analysis for the 1991 Lake Michigan Ozone Study. Final report prepared for Lake Michigan Air Directors Consortium, Des Plaines, IL by Sonoma Technology, Inc., Santa Rosa, CA, STI-92022-1410-FR, September.

    Stoeckenius T.E., Ligocki M.P., Shepard S.B., and Iwamiya R.K. (1994a) Analysis of PAMS data: application to summer 1993 Houston and Baton Rouge data. Draft report prepared by Systems Applications International, San Rafael, CA, SYSAPP94-94/115d, November.

    Stoeckenius T.E., Ligocki M.P., Cohen B.L., Rosenbaum A.S., and Douglas S.G. (1994b) Recommendations for analysis of PAMS data. Final report prepared by Systems Applications International, San Rafael, CA, SYSAPP94-94/011r1, February.

    Systems Applications International, Sonoma Technology Inc., Earth Tech, and Alpine Geophysics (1995) Gulf of Mexico Air Quality Study. Vol 1: Summary of data analysis and modeling. Final report prepared for U.S. Department of the Interior, Minerals Management Service, Gulf of Mexico OCS Region, New Orleans, LA, OCS Study, MMS 95-0038.

    U.S. Environmental Protection Agency (1984) Quality assurance handbook for air pollution measurement systems, volume II: ambient air specific methods (interim edition), EPA/600/R-94/0386, April.

    U.S. Environmental Protection Agency (1989) AIRS user's guide volume III: AIRS codes and values. Office of Air Quality Planning & Standards Technical Support Division, Research Triangle Park, NC, June.

