- Can data be changed and corrected in AQS once it has been loaded? Are there any negative implications if data are corrected (does it look "bad")? Do states and local agencies change data?
- Is it better to report no data or null value coded data?
- Have questions about PM 2.5 Field Blank reporting? (Links to a list of questions and answers prepared by Lewis Weinstock, 1/9/07)
- What are the most common load errors?
- What is the easiest way to create all of the toxics monitors at a site?
- My Load job failed. How do I find out what went wrong?
- Can I delete all of my unposted records and start over?
- Can I delete some of the raw data records that are in pre-production status?
- Can I input data in the "old" AQS format?
- Why are there differences in some numeric field lengths between the data coding manual and the Input Transaction Format document?
- Why are there so many more records counted on the Edit Load Summary report than in my input transaction file?
- Why doesn't my SCAN report show any values?
- I have data for the Census tract, Block Group, Block Number that I can't add.
- Why is AQS complaining about the format of my sample values?
- My Zip code is missing!
- How does the Load function differ with Precision & Accuracy data?
- If I update a site, do I have to include the new LDP data?
- "Not enough space"? My LOAD job failed.
Can data be changed and corrected in AQS once it has been loaded? Are there any negative implications if data are corrected (does it look "bad")? Do states and local agencies change data?
Yes. The agency that submitted data to AQS can correct it at any time, and there are no negative implications for doing so. EPA’s objective is to obtain the most accurate data possible, so agencies are encouraged to correct their data if they find that the initial submission was incorrect for any reason (e.g., incorrect method code, data reported to the wrong monitor, additional data validation information). About 4% of the data in AQS is routinely changed by the submitting agency for various reasons.
One thing to consider is that correcting data becomes more complicated once the data have been “certified” by an agency as accurate and complete. If any raw data for a given year are changed after certification, the certified annual summary records for that year will lose their “certification flags” in AQS, and the agency will have to resubmit its data certification request. (States are required to certify their data every year by July 1. Tribes are not required to do so but are encouraged to certify their data voluntarily.)
Is it better to report no data or null value coded data?
It is better to report null data codes (i.e., codes that explain common reasons why data values are missing). They document why specific values are missing, which can help the agency, EPA, and others evaluate and interpret the data. Null data codes might also help agencies identify changes that would reduce lost data. (Some Regional Offices may also require that null data codes be reported.)
In the event that a monitor will be out of service for an extended period (e.g., several months), it is probably better for the agency to enter sampling end dates in AQS and then reopen the monitor when the monitor begins to operate again (though it is up to the agency to decide what to do in this situation).
Have questions about PM 2.5 Field Blank reporting?
(Links to a list of questions and answers prepared by Lewis Weinstock, 1/9/07)
What are the most common load errors?
Below is a link to a spreadsheet containing load error counts for all files submitted by states and tribes from January through September 2006. The file contains the error message, the type of AQS data the error applies to, and the total number of times that error was counted. The data are sorted with the most frequent errors first. Because the errors are separated by data type, the same error message may appear multiple times (for example, once for raw data and once for precision data).
If you have trouble understanding what an error message really means you may consult the AQS Data Coding Manual for the data type and field in question, call the AQS Help Desk/EPA Call Center (1-866-411-4372), or contact Nick Mangus at email@example.com.
In general terms, about 18,000 files are submitted to AQS each year. About 34% of these files have at least one load error. (Once those errors are corrected, about 25% have at least one statistical problem or outlier identified in the Statistical/Critical Review process.) Ultimately, 80-100 million values are loaded each year.
This error count data is posted in spreadsheet format for you to further manipulate the data to meet your needs. Note that the single most common error accounts for 43% of all errors and the top 15 errors account for 99% of all errors.
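As one way to manipulate the error counts, the following minimal Python sketch computes what share of all errors the top N messages account for. The function name and the sample counts are hypothetical; the counts are made up for illustration and are not taken from the AQS spreadsheet.

```python
def cumulative_share(counts):
    """Return cumulative percentages of the total, most frequent error first.

    Element N-1 of the result is the share of all errors covered by the
    top N error messages.
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return []
    shares = []
    running = 0
    for count in ordered:
        running += count
        shares.append(round(100.0 * running / total, 1))
    return shares


# Made-up counts for illustration only (not from the AQS spreadsheet).
example_counts = [5, 50, 15, 30]
print(cumulative_share(example_counts))  # [50.0, 80.0, 95.0, 100.0]
```

Reading the counts out of the spreadsheet is left out here because the column layout of that file is not described in this FAQ.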
What is the easiest way to create all of the toxics monitors at a site?
The easiest way is to extract data for an existing site, modify the data so that it correctly describes your site, and then load it. Here’s a step-by-step description:
First, find a site that has the same or similar monitors (pollutants and methods) as yours. You may do this by asking someone familiar with the monitors in your agency, or someone in another agency that you know already has toxics sites established. If you don’t have any leads, you can query AQS to find a site set up like yours, either in the "Maintain" forms or with the Monitor Description report (AMP 390), choosing CORE_HAPS or the appropriate SPECIATION for the "pollutant type" selection.
Note that the Monitor Description report will be very long for toxics sites, since it produces a page for each parameter, but it runs fast. Consider viewing it online rather than printing it. Also note that newer sites are better candidates: the later a site was created, the more QA checks it has passed, which reduces the likelihood of errors during your load. The date selection for the Monitor Description report applies to when the monitor was active, not when it was created, so you will have to look at the generated report for the "Begin date". Anything after 2001 should pass the QA checks with no problem.
Second, run the Extract Site/Monitor Data report (AMP 500) for this site. This report generates AQS input transactions. Now you have to use a text editor to go through and change all of the state, county, site, etc. IDs to make sure they are correct. Also be sure that agency roles, dates, and all of the other fields are correct. You are not just reloading the data - it has to be changed to correctly reflect your situation. Please note that this file is not always sorted when it is created, so you may want to sort it and you will have to delete the last line that indicates how many records are in the file.
Finally, you can load the data via the AQS batch load process and the site and/or monitors will be created.
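The mechanical part of the second step (changing the IDs, dropping the record-count line, and sorting) could be scripted instead of done by hand in a text editor. The Python sketch below is only an illustration under stated assumptions: it assumes the extracted transactions are pipe-delimited, that fields 3 through 5 hold the state, county, and site codes, that the last non-blank line is the record count, and that sorting by transaction type is the order you want. The function name and the sample records are hypothetical. Check these assumptions against the Input Transaction Format document before relying on it.

```python
def retarget_transactions(lines, state, county, site, delimiter="|"):
    """Copy extracted transactions, swapping in new state/county/site codes.

    Assumptions (not confirmed by this FAQ): fields are separated by
    `delimiter`, fields 3-5 hold the state, county and site codes, and
    the final non-blank line is the record-count trailer.
    """
    records = [line.rstrip("\n") for line in lines if line.strip()]
    records = records[:-1]  # drop the trailing record-count line
    updated = []
    for record in records:
        fields = record.split(delimiter)
        if len(fields) < 5:
            raise ValueError("unexpected record layout: " + record)
        fields[2:5] = [state, county, site]
        updated.append(delimiter.join(fields))
    # Assumed ordering: sort by transaction type (the first field).
    # The sort is stable, so records of one type keep their original order.
    return sorted(updated, key=lambda rec: rec.split(delimiter)[0])


# Hypothetical extract, for illustration only.
extract = [
    "MA|I|01|001|0001|88101|1\n",
    "AA|I|01|001|0001|x\n",
    "2\n",
]
for row in retarget_transactions(extract, "37", "119", "0041"):
    print(row)
```

A script like this only covers the ID fields. Agency roles, dates, and the other fields still have to be reviewed and corrected by hand so that the file reflects your own site.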
My Load job failed. How do I find out what went wrong?
If you don't get an immediate, on-screen message, wait for an email from AQS. Each batch job you submit in the process of loading data should trigger an email to you. After the first 8-12 lines of the email, you should see lines that describe the problem. Be sure to look through the entire email. If you see something like "login incorrect", then you probably need to synchronize your passwords.
Can I delete all of my unposted records and start over?
Yes, but be aware that you may have data in the staging tables as well as pre-production data in the production tables. The Correct menu has an option, Delete by Screening Group, which will delete ALL records in your screening group -- yours as well as those entered by any other users within your screening group. If this option is not active, you have not been granted permission to use it. Check with your agency's designated AQS contact to find out who is authorized within your agency. Contact your EPA regional AQS person if your agency decides that you are the person to perform this function. The EPA regional contact will need to notify the AQS staff at EPA headquarters to have this function added for any user.
Pre-production data may be deleted from the Maintenance option. Query to select only the pre-production data to be deleted (e.g., all data in status "R" or "S" from a particular session). Once only that data is visible on your screen, select Delete all selected from the Maintenance menu.
Can I delete some of the raw data records that are in pre-production status?
Raw data in pre-production status (i.e., a status indicator of "R" or "S") are physically located in the same tables as the data in production status. (This allows the calculations needed for the Scan Report and Statistical Evaluation Report to be done. These reports show what the results would be if your new data were posted to production.)
To delete this pre-production data, query for it and then delete the records one by one or use Delete selected records. Do not create delete transactions for pre-production data. To remove a single record, highlight it and click the Remove Record button. Be sure to Save your changes.
Can I input data in the "old" AQS format?
Yes, the new system recognizes and automatically converts data from the old format to the new format -- EXCEPT for site and monitor transactions and the old transactions Z (Minimum Detectable Value) and 4 (Missing Data Reason, a.k.a. Null Data).
The conversion from the old to the new format takes place at the beginning of the Load File job.
Why are there differences in some numeric field lengths between the data coding manual and the Input Transaction Format document?
There's a note and a chart explaining the reason for these differences. In general, use the Input Transaction Format's description for preparing input transactions.
Why are there so many more records counted on the Edit Load Summary report than in my input transaction file?
Any data in the old format is converted to the new format at the beginning of the Load File job. The new format for raw data has only one sample value per record, whereas the old format had up to eight sample values per record. So an input file of 30 days of hourly values in the old format (90 transactions) would be 720 records in the new format.
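The arithmetic behind that example can be checked with a short Python sketch. It assumes every old-format record is fully packed with eight values, which is the best case; the function name is hypothetical.

```python
import math

HOURS_PER_DAY = 24
OLD_FORMAT_VALUES_PER_RECORD = 8  # "up to eight sample values per record"


def record_counts(days, values_per_day=HOURS_PER_DAY):
    """Return (old-format records, new-format records) for one monitor.

    Assumes each old-format record carries the full eight values; the
    new format has exactly one sample value per record.
    """
    values = days * values_per_day
    old_records = math.ceil(values / OLD_FORMAT_VALUES_PER_RECORD)
    new_records = values
    return old_records, new_records


print(record_counts(30))  # (90, 720)
```

Thirty days of hourly data is 30 x 24 = 720 values, which is 720 / 8 = 90 old-format transactions but 720 new-format records, so the Edit Load Summary count is eight times the input count in this case.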
Why doesn't my SCAN report show any values?
The Scan report is a list of actual values compared to historic data. If there are no actual values, there is nothing to report. This can occur when the only data for a monitor are "null values".
I have data for the Census tract, Block Group, Block Number that I can't add.
Data from the U.S. Census Bureau has not been loaded yet, so the lists of values (LOVs) for these fields are not populated and you cannot enter values for them. This is a known issue and will be fixed.
Why is AQS complaining about the format of my sample values?
"Validation error RAW DATA, Sample Value, Invalid Number or number format, -20205"
Here's a sample RD transaction that caused this error:
There is a leading blank in the sample value field. The old system handled this but the current version (2.0) does not. This should be fixed in the next release. For now, delete any spaces in the sample value field and resubmit that record.
If you have lots of records like this and your original data file is in the "old" format, you could run it through the edits on the mainframe system and then pull off the screening file and submit it to the new AQS. (In other words, let the mainframe system do the edit level 1 and 2 checks for you, strip off leading blanks, and then load the data into the new system.) Alternatively, you could wait for the next version.
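If the file is already in the new format, removing the blanks can also be scripted. The Python sketch below is a rough illustration, not an official tool: it assumes pipe-delimited RD transactions with the sample value in the 13th field (index 12), which should be verified against the Input Transaction Format document. The function name and the sample record are made up and are not the transaction that originally caused the error.

```python
def strip_sample_value_blanks(record, value_index=12, delimiter="|"):
    """Remove spaces from the sample value field of an RD transaction.

    `value_index` is an assumption about where the sample value sits;
    records of other transaction types are returned unchanged.
    """
    fields = record.split(delimiter)
    if fields[0] == "RD" and len(fields) > value_index:
        fields[value_index] = fields[value_index].replace(" ", "")
    return delimiter.join(fields)


# Hypothetical record with a leading blank in the sample value.
bad = "RD|I|01|001|0001|44201|1|1|007|047|20060101|00:00| 0.025|"
print(strip_sample_value_blanks(bad))
```

Only the one field is touched, so blanks that are meaningful elsewhere in the record are left alone.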
My Zip code is missing!
Some zip codes were not in the file used to create the zip code table. So far, only zip codes assigned to individual buildings seem to be missing. If you notice a missing zip code, please let us know. Email or call Jake Summers with the details.
How does the Load function differ with Precision & Accuracy data?
Precision & Accuracy (P & A) data goes straight into production in the database unless it contains an error. You do not have to run the POST job for this data.
If I update a site, do I have to include the new LDP data?
Yes. When you update a site (or monitor), all required fields must be completed. If a field didn't exist when the data was converted from the mainframe system to the new system, it was not required for the conversion. However, any new or updated site must include the required LDP fields, as well as values for any other missing required fields.
"Not enough space"? My LOAD job failed.
Your emailed job log includes the following:
An error occurred while allocating g_common_buf.
Not enough space
Bulk load failed.
This message indicates a lack of space on the EPA server at the time your job was running. You can try resubmitting your job during non-peak hours. You may want to call the Help Desk to verify that there are no other problems with EPA's server.
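If you receive many job emails, a small script can flag the failure messages mentioned in this FAQ. The Python sketch below is a hypothetical helper, not part of AQS: it only looks for the two messages quoted on this page ("login incorrect" and "Not enough space"), and the hint text is paraphrased from the answers here.

```python
# Messages quoted in this FAQ, mapped to the advice given for each.
KNOWN_MESSAGES = {
    "login incorrect": "You probably need to synchronize your passwords.",
    "Not enough space": "The EPA server was short of space; "
                        "try resubmitting during non-peak hours.",
}


def diagnose_job_log(log_text):
    """Return the hints for every known message found in the job log."""
    lowered = log_text.lower()
    hints = []
    for message, hint in KNOWN_MESSAGES.items():
        if message.lower() in lowered:
            hints.append(hint)
    return hints


job_log = (
    "An error occurred while allocating g_common_buf.\n"
    "Not enough space\n"
    "Bulk load failed.\n"
)
for hint in diagnose_job_log(job_log):
    print(hint)
```

An empty result does not mean the job succeeded. It only means that neither of these two messages appeared, so the full email should still be read.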