Lab-6808: Working with PDF and Java

Expected Duration: 120 minutes
Contacts: Duane Nickull

Exercise 0: Background Material

 

Adoption of the PDF ISO standard by governments and enterprises has grown substantially.  Many of these organizations need to round-trip information between a J2EE environment and PDF forms or static documents.  This hands-on lab will be about 25% presentation and 75% coding and working with the PDF libraries.
The core Java PDF libraries will be explored, including how to create PDF documents, how to read from and write to file systems, how to get PDF attachments, how to access metadata libraries, and more.
The lab environment will be set up with JDK, JBoss and Adobe LiveCycle ES.  Developers wishing to continue with the development will be able to take the environment home.
Proposer Notes:  (The syntax may change to reflect refactoring of the core libraries.)

Before any code is written, approximately 20 minutes of background information will be given about the PDF file format, its rendering process, and how PDF is persisted on a file system.  We will briefly look at advanced topics like signing PDF with digital signatures and document certification.


Teachers:  Find the accompanying slide deck in the folder <eclipse_root>/HandsOnLabCourseware/*_TEACHERS_GUIDE.pdf
Open up the slide deck as well and walk through these slides:

  Teacher:

  • Discuss COS-based PDF vs. XDP.
  • Discuss the ZIP format.
  • Discuss XMP – show XMP.

First thing to do:  Thank Jim King for the slides!!!!


We will see a lot of pages like this one, so let me explain carefully what you are looking at. I opened the example 01 PDF file in Microsoft Word as a "text" file. The characters from the PDF file have been formatted into three columns, and I have also inserted some line breaks and indentation to make the text more readable.  I also added headers and footers, including page numbers.


Since the page displays "Hello World", you might expect the PDF file to have that character string somewhere within it.  You would be right, and I have highlighted it in red. We will work our way outward from this string and see what supporting material is required to turn that text string into a complete PDF document.  Notice that "Hello World" is enclosed in parentheses. This is how strings are represented in both PDF and PostScript.
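
To make the parenthesis convention concrete, here is a minimal sketch in plain Java that pulls literal strings out of a fragment of page content. It does not use the lab's PDF libraries. The class name PdfLiteralStrings and the sample content line are assumptions made for illustration, not taken from the example 01 file. The sketch handles nested parentheses and backslash escapes only in a simplified way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the lab's PDF libraries.
final class PdfLiteralStrings {

    /** Returns the contents of every (...) literal string found in the text. */
    static List<String> extract(String content) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // non-null while we are inside a string
        int depth = 0;                // nesting depth of unescaped parentheses
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            if (current == null) {
                if (c == '(') {
                    current = new StringBuilder();
                    depth = 1;
                }
                continue;
            }
            if (c == '\\' && i + 1 < content.length()) {
                // Simplification: keep the escaped character as-is
                // (a full parser would also decode \n, \t, octal codes, ...).
                current.append(content.charAt(++i));
            } else if (c == '(') {
                depth++;
                current.append(c);
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    result.add(current.toString());
                    current = null;
                } else {
                    current.append(c);
                }
            } else {
                current.append(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample content, similar in spirit to a "Hello World" page.
        String stream = "BT /F1 24 Tf 100 700 Td (Hello World) Tj ET";
        System.out.println(extract(stream)); // prints [Hello World]
    }
}
```

Tracking the nesting depth matters because PDF allows balanced parentheses inside a string without escaping them.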

- Note that the material in the file is organized into six objects, a "%PDF-1.2" header, and a trailer.
- Each object starts with its object number followed by a zero (the generation number) and the keyword "obj", and ends with "endobj".
- Jim King refers to PDF as "object oriented PostScript" because this object structuring is something that PDF has but PostScript does not.
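
The object structure described above can be sketched in a few lines of plain Java. This is a rough illustration rather than a real parser: the class name PdfObjectScanner and the trimmed two-object sample are assumptions, and a regular expression like this would be fooled by binary stream data that happens to contain "endobj".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the lab's PDF libraries.
final class PdfObjectScanner {

    // Matches "<object number> <generation number> obj ... endobj".
    private static final Pattern OBJECT =
        Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj\\b(.*?)endobj", Pattern.DOTALL);

    /** Returns "objectNumber generationNumber" for each object found. */
    static List<String> listObjects(String pdfText) {
        List<String> ids = new ArrayList<>();
        Matcher m = OBJECT.matcher(pdfText);
        while (m.find()) {
            ids.add(m.group(1) + " " + m.group(2));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Assumed, heavily trimmed sample; the lab's example file has six objects.
        String sample = "%PDF-1.2\n"
            + "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
            + "2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
            + "trailer\n<< /Root 1 0 R >>\n";
        System.out.println(listObjects(sample)); // prints [1 0, 2 0]
    }
}
```

Note that references such as "2 0 R" are not reported, because only the "obj" keyword introduces an object definition.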

  • Note also that the file starts with "%PDF-1.2" which indicates that this is a PDF file following the 1.2 version of the PDF specification. 
  • Discuss backwards and forwards compatibility.
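
As a starting point for the compatibility discussion, the version can be read straight from the header line. The following is a minimal sketch in plain Java; the class name PdfHeader is an assumption for illustration and is not part of the lab's PDF libraries.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the lab's PDF libraries.
final class PdfHeader {

    // The header looks like "%PDF-<major>.<minor>", for example "%PDF-1.2".
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d+)\\.(\\d+)");

    /** Returns the version from the header, or empty if the text does not start with one. */
    static Optional<String> version(String pdfText) {
        Matcher m = HEADER.matcher(pdfText);
        // lookingAt() only matches at the very beginning of the input.
        return m.lookingAt()
            ? Optional.of(m.group(1) + "." + m.group(2))
            : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(version("%PDF-1.2\n1 0 obj ...")); // prints Optional[1.2]
        System.out.println(version("Hello World"));            // prints Optional.empty
    }
}
```

To try this on a real file, one option is to read the bytes with java.nio.file.Files.readAllBytes and decode them as ISO-8859-1 so that every byte maps to exactly one character.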

 
