Frequent Questions about PDF/A
- What is PDF/A?
- Why use PDF/A?
- What is the difference between PDF and PDF/A?
- What are the PDF/A-1 restrictions?
- Are PowerPoint files good candidates for converting to PDF/A?
- Can PDF/A files contain copyright information, like TIFF can?
- How can you best make PDF/A files text searchable?
- What about "mixed" objects in PDF, like audio and video? Can these be used in PDF/A?
- Which text recognition software works well together with Acrobat 8/9/X?
- Can PDF files be converted to PDF/A?
- Can I create PDF/A with Acrobat 6?
- What are the different ways to create a PDF/A file?
- How can I find out if a font is embedded?
- What are special font considerations?
- Are there programs that check and confirm the validity of a PDF/A file?
- Can PDF/A files contain an electronic signature?
- What does PDF/A mean by "long term"?
- Does PDF/A-1 replace other archival file formats?
- What long-term preservation needs does PDF/A-1 address?
- When should PDF/A be used?
- Does NARA accept PDF/A files?
- How did PDF/A get started and who is involved?
- Where can I get more information?
Portable Document Format/Archive (PDF/A) is a version of PDF, standardized by the International Organization for Standardization (ISO), that is specialized for the digital preservation of electronic documents. PDF/A ensures that documents can be reproduced exactly the same way in years to come. The format forbids dynamic content and restricts the use of certain PDF objects; everything required to render the document (fonts, color profiles, images, etc.) is fully self-contained in the PDF/A file.
Currently, there are two variations of PDF/A: PDF/A-1a and PDF/A-1b. They correspond to the two levels of compliance that the standard specifies:
- PDF/A-1a: Level "A" compliance: exact visual reproduction, plus mapping of text to Unicode and preservation of the document's logical structure and content text stream in natural reading order. This is especially important when the document must be displayed on a mobile device
- PDF/A-1b: Level "B" compliance: exact visual reproduction
PDF/A stores objects (e.g. text, graphics), allowing for an efficient full-text search across an entire archive. Files stored as Tagged Image File Format (TIFF) cannot be searched directly: TIFF is a raster format, so the images must first be processed with an OCR (optical character recognition) engine.
PDF/A files typically require only a fraction of the storage space of the original or TIFF files, without loss of quality. The smaller file size is especially advantageous for electronic file transfers (FTP, e-mail attachment, etc.).
The PDF/A format can be optimized. The optimization can be focused on images (e.g. scanned checks) or on extracting structured data (e.g. voucher information). TIFF treats all file information the same.
Metadata like title, author, creation date, modification date, subject, keywords, etc., can be stored in a PDF/A file. PDF/A files can be automatically classified based on the metadata, without requiring human intervention.
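As a rough illustration of metadata-based classification, the sketch below reads a document title and the PDF/A identification fields from an XMP packet using only the Python standard library. It assumes the XMP packet has already been extracted from the PDF as text (that step is not shown). The function names `classify_xmp` and `_field` and the sample packet are hypothetical; the `pdfaid` namespace with its `part` and `conformance` properties is the one PDF/A uses to identify itself.

```python
import xml.etree.ElementTree as ET

# XML namespaces used in the XMP packet.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample packet, for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <pdfaid:part>1</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Annual Report</rdf:li></rdf:Alt></dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def _field(root, prefix, name):
    """Return a property stored either as an attribute or as a child element."""
    qname = "{%s}%s" % (NS[prefix], name)
    for desc in root.iter("{%s}Description" % NS["rdf"]):
        if qname in desc.attrib:
            return desc.attrib[qname]
        child = desc.find(qname)
        if child is not None:
            text = "".join(child.itertext()).strip()
            if text:
                return text
    return None


def classify_xmp(xmp_text):
    """Summarize the title and the claimed PDF/A level of an XMP packet."""
    root = ET.fromstring(xmp_text)
    part = _field(root, "pdfaid", "part")
    conformance = _field(root, "pdfaid", "conformance")
    level = f"PDF/A-{part}{conformance.lower()}" if part and conformance else None
    return {"title": _field(root, "dc", "title"), "pdfa_level": level}


print(classify_xmp(SAMPLE_XMP))
```

Note that these fields only record what the file claims about itself; confirming conformance still requires a validation tool.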
PDF files may include videos, GIS data and 3-dimensional images. PDF/A, however, is highly restrictive and does not support dynamic PDF content.
PDF/A-1 allows various kinds of metadata, including bookmarks, to be saved in the document. Extensible Metadata Platform (XMP), a technology that unifies different metadata methods, is used for the metadata in a PDF/A-1 file. PDF files with dynamic objects like audio and video, on the other hand, cannot be converted to PDF/A because they rely on an external player that may not be available in the future.
Generally speaking, NARA-acceptable PDF records must comply with PDF versions 1.0 through 1.4, have no encryption or security settings, have the fonts embedded, follow the NARA transfer guidance for scanned images, and have multimedia/special features negotiated beforehand in a "notification process".
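Two of the criteria above (PDF version 1.0 through 1.4 and no encryption) can be pre-screened mechanically. The following is a minimal sketch, not a NARA tool: the function name `nara_precheck` is hypothetical, it works on the raw bytes of a file, and it relies on two simplifying assumptions, namely that the version in the `%PDF-` header is authoritative (a document catalog can override it) and that an encrypted file shows an `/Encrypt` key in plain view. Embedded fonts, scanned-image guidance and multimedia features are not checked.

```python
import re


def nara_precheck(data: bytes) -> dict:
    """Heuristic pre-check of the PDF header version and encryption marker."""
    # The header comment normally sits at the very start of the file.
    header = re.search(rb"%PDF-(\d+)\.(\d+)", data[:1024])
    version = (int(header.group(1)), int(header.group(2))) if header else None
    return {
        "version": version,
        "version_ok": version is not None and (1, 0) <= version <= (1, 4),
        # Assumption: the /Encrypt key appears literally in the trailer.
        "encrypted": b"/Encrypt" in data,
    }


# Hypothetical file contents, for illustration only.
sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%EOF"
print(nara_precheck(sample))
```

For a real file, the bytes could be read with `open(path, "rb").read()`; a passing pre-check only means the file is worth handing to a proper validator.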
One of the key differences between PDF and PDF/A is the set of restrictions that PDF/A places on PDF.
PDF/A-1 files must include:
- Embedded fonts
- Device-independent color
- Extensible Metadata Platform (XMP) metadata
PDF/A-1 files may not include:
- LZW Compression
- Embedded files
- External content references
- PDF Transparency
Yes. You might have to take preparatory steps such as ensuring that annotations are also carried over into the PDF/A file before saving.
Yes. PDF/A lets you save various kinds of metadata (for example, copyright information) in the document. XMP, a technology that unifies different metadata methods, is used for the metadata in a PDF/A file.
If a PDF/A file is created from a digital text document, the text is carried over and is automatically searchable. For a scanned paper document or image, OCR can be used to make it searchable. In this case, only PDF/A-1b is possible, not the more stringent PDF/A-1a.
PDF files with dynamic objects like audio and video cannot be converted to PDF/A. PDF/A must guarantee an exact reproducibility, which is not possible with embedded objects like sound or movies. These types of objects require an external player (and quite often in a specific version). There is no guarantee that the player application will be available in the future.
Acrobat 8/9/X Professional comes with its own OCR software that can be used to convert scanned pages into searchable text. Note: most EPA staff do not have this version of Adobe Acrobat.
Yes, but not all PDF features may be transferable to PDF/A. PDF/A-1 is based on PDF 1.4. Certain features in newer PDF versions (like transparency and layers) were not (fully) introduced with PDF 1.4 and are therefore not supported by PDF/A-1. In this case, the transparency has to be removed and the layers flattened in order to create a PDF/A-1 document. The next version of PDF/A, PDF/A-2, is based on the PDF specification 1.7 and will allow many of the newer features.
No. The Adobe Acrobat 8 Professional version is the first version of Acrobat that fully supports PDF/A.
- Print to PDF/A on a client computer
- Print to PDF/A using a print stream on a server
- Scan to PDF/A (paper to PDF/A)
- Convert existing image files to PDF/A
- Convert existing PDF files to PDF/A
- Export a document to PDF/A format
- Create PDF/A "on-the-fly" from data or a database
When a PDF/A file is created, the program ensures that the fonts are embedded. If the fonts are not embedded, you don't have a valid PDF/A file. You can verify whether fonts are embedded (and which ones) in any PDF file by checking under "Properties" in Acrobat or Adobe Reader. In addition, PDF/A validation tools will inform you whether fonts have been embedded, and whether the files conform to PDF/A.
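For a quick scripted look outside Acrobat, the sketch below scans the raw bytes of a PDF for font descriptor objects and reports whether each one references an embedded font program (a `/FontFile`, `/FontFile2` or `/FontFile3` entry). This is a heuristic under stated assumptions, not a validator: the function name `font_embedding_report` is hypothetical, the scan only sees descriptors that are stored uncompressed (as is usual in PDF 1.4 and earlier), and fonts that have no descriptor object at all are not listed.

```python
import re

# Each indirect object looks like "4 0 obj ... endobj".
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj\b(.*?)\bendobj", re.DOTALL)
DESCRIPTOR_RE = re.compile(rb"/Type\s*/FontDescriptor\b")
NAME_RE = re.compile(rb"/FontName\s*/([^\s/<>\[\]()]+)")
EMBEDDED_RE = re.compile(rb"/FontFile[23]?\b")


def font_embedding_report(data: bytes) -> dict:
    """Map each font descriptor's name to True if a font program is referenced."""
    report = {}
    for match in OBJ_RE.finditer(data):
        body = match.group(1)
        if not DESCRIPTOR_RE.search(body):
            continue  # not a font descriptor object
        name = NAME_RE.search(body)
        label = name.group(1).decode("latin-1") if name else "(unnamed)"
        report[label] = bool(EMBEDDED_RE.search(body))
    return report


# Hypothetical file fragment, for illustration only.
SAMPLE_PDF = (
    b"%PDF-1.4\n"
    b"4 0 obj\n<< /Type /FontDescriptor /FontName /ABCDEF+Georgia"
    b" /FontFile2 9 0 R >>\nendobj\n"
    b"5 0 obj\n<< /Type /FontDescriptor /FontName /Arial >>\nendobj\n"
)
print(font_embedding_report(SAMPLE_PDF))
```

Any font reported as `False` would need attention before the file could pass as PDF/A; the result should still be confirmed with a PDF/A validation tool.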
Many fonts have restrictions on use, embedding and exchange. PDF/A requires fonts to be embedded. Therefore, organizations using PDF/A-1 must take extra precautions to be sure that the fonts they use are properly licensed to allow embedding.
There are a few tools on the market that will do this. In addition to Adobe Acrobat Professional 8, there are the pdfaPilot and pdfInspektor4 from Callas Software, the 3-Heights PDF Validator from PDF Tools, the LuraDocument PDF Validator from LuraTech, PDF/A Live! from Intarsys, the PDF/A Longlife Suite from Seal Systems and the PDF Appraiser from Apago.
Yes. There are a number of tools, strategies and software solutions available for this. Even Acrobat Professional can be used to digitally sign PDF/A files.
PDF/A defines long-term as: "the period of time long enough for there to be concern about the impacts of changing technologies, including support for new media and data formats, and of a changing user community, on the information being held in a repository, which may extend into the indefinite future."
No. PDF/A-1 was developed to allow PDF to be used as an archival format in a well-defined and robust manner.
Characteristics identified as objectives for PDF/A:
- Device Independent - Can be reliably and consistently rendered without regard to the hardware or software platform
- Self-contained - Contains all resources necessary for rendering
- Self-documenting - Contains its own description
- Unfettered - Absence of technical file protection mechanisms
- Available - Authoritative specification publicly available
- Adoption - Widespread use may be the best deterrent against preservation risk
PDF/A should be used as a way to standardize the use of PDF for electronic document storage and ensure that these documents will be available well into the future. This is important to support business needs that require reliable rendering of electronic documents over the long term.
Because PDF/A is only a file format specification, users will need to establish their own capture methodology that meets domain-specific policies and procedures (e.g., for reliability, integrity, compliance, comprehensiveness).
For permanent records in PDF, federal agencies will need to implement PDF/A-1 in conjunction with additional requirements identified in guidance from the National Archives and Records Administration (NARA) for transferring permanent PDF records.
It is important to be aware that:
- PDF/A-1 alone does not guarantee preservation
- PDF/A-1 alone does not guarantee exact replication of source material
NARA may accept transfers of permanent records in PDF/A format that additionally meet the current transfer requirements for electronic records in PDF.
The PDF/A activity was initiated through the joint sponsorship of AIIM (the Association for Information and Image Management) and NPES (The Association for Suppliers of Printing, Publishing, and Converting Technologies). Under the auspices of TC-171 Document Management Application, Subcommittee 2 Application Issues, a Joint Working Group (WG5) was formed with representatives from ISO Technical Committees 42, 46, 130, and 171. Librarians, archivists, PDF software developers, government agencies, imaging experts, graphics experts and others collaborated to develop PDF/A. Initial meetings were held in mid-2002 and the standard was approved in June 2005. Technical experts from 15 national standards bodies provided input throughout the development process.
Organizations currently actively involved include: AIIM, NPES, Adobe, Appligent, Callas, EMC/Documentum, Global Graphics, Harvard University, Merck & Co., Inc., PDF Sages, the Administrative Office of the US Courts, the Internal Revenue Service and NARA.
There is a considerable amount of information about PDF/A (copies of initial requirements, meeting minutes, meeting notes and presentation materials) listed on the AIIM website at http://www.aiim.org/standards.
In addition, see:
Frequently Asked Questions (FAQs) about Transferring Permanent Records in PDF/A-1 to NARA