Exercise 3: XMP Metadata
This exercise explains what XMP metadata is and shows how to extract it from a PDF document.

You can see XMP metadata by opening the PDF document "DuanesWorld.pdf" in a text editor (or vi, emacs, pico…) and scrolling to the XMP near the end of the file.

About XMP (this text is from the marketing department; if you have real questions about the technology, ask the instructors):
Adobe's Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. With XMP, desktop applications and back-end publishing systems gain a common method for capturing, sharing, and leveraging this valuable metadata — opening the door for more efficient job processing, workflow automation, and rights management, among many other possibilities. With XMP, Adobe has taken the "heavy lifting" out of metadata integration, offering content creators an easy way to embed meaningful information about their projects and providing industry partners with standards-based building blocks to develop optimized workflow solutions.
By providing a standard, W3C-compliant way of tagging files with metadata across products from Adobe and other vendors, XMP is a powerful solution enabler. As an open-source technology, it is freely available to developers, which means that the user community benefits from the innovations contributed by developers worldwide. Furthermore, XMP is extensible, meaning that it can accommodate existing metadata schemas; therefore, systems don't need to be rebuilt from scratch. A growing number of third-party applications already support XMP.
Make sure you open the file <lab_root>/PutUnderWorkspace/JavaOne2009_docs/DuanesWorld.PDF in a text editor (not Acrobat or Reader) and find the XMP. To find the XMP metadata in the document, simply search for the string "xmp". The XMP will look similar to this:
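As a rough illustration of what to look for (a hypothetical sample, not the actual contents of DuanesWorld.pdf): an XMP packet is normally wrapped in a pair of xpacket processing instructions around an x:xmpmeta element that contains RDF. The Java sketch below does the same search a text editor would do. The class name XmpPacketFinder, the method extractPacket, and the SAMPLE string are made up for this illustration, and the sketch assumes the metadata stream is stored uncompressed in the PDF.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class XmpPacketFinder {

    // Hypothetical sample for illustration only; NOT the contents of DuanesWorld.pdf.
    static final String SAMPLE =
        "%PDF-1.7 ... other PDF objects ...\n"
      + "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>\n"
      + "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n"
      + " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n"
      + "  <rdf:Description rdf:about=\"\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n"
      + "   <dc:format>application/pdf</dc:format>\n"
      + "  </rdf:Description>\n"
      + " </rdf:RDF>\n"
      + "</x:xmpmeta>\n"
      + "<?xpacket end=\"w\"?>\n"
      + "endstream ... rest of the PDF ...";

    // Returns the text from the opening xpacket marker through the closing one,
    // or null if no complete packet is found.
    static String extractPacket(String fileText) {
        int start = fileText.indexOf("<?xpacket begin");
        if (start < 0) return null;
        int endMarker = fileText.indexOf("<?xpacket end", start);
        if (endMarker < 0) return null;
        int close = fileText.indexOf("?>", endMarker);
        if (close < 0) return null;
        return fileText.substring(start, close + 2);
    }

    public static void main(String[] args) throws Exception {
        String text = SAMPLE;
        if (args.length == 1) {
            // ISO-8859-1 maps every byte to one char, so binary PDF content
            // cannot break decoding; the ASCII markers are still findable.
            byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
            text = new String(bytes, StandardCharsets.ISO_8859_1);
        }
        String packet = extractPacket(text);
        System.out.println(packet == null ? "No XMP packet found." : packet);
    }
}
```

Run without arguments, it prints the packet from the built-in sample; run with the path to a PDF, it prints the first packet it finds. Because the file is decoded as ISO-8859-1, any non-ASCII characters inside a real packet will not display correctly; the sketch is only meant for locating the packet.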

Extracting XMP metadata. XMP is based on W3C RDF. We are going to extract the XMP shown above and save it to an XML file.
- Objective: Learn how to grab the metadata out of a PDF package.
- Expected duration: Approximately 15 minutes
Step 1 – Close all other projects. In Eclipse, navigate to and open the Java file PDFExtractXMP.java.
Step 2 – Note that it is similar to the previous files, except that we have added some base functionality for writing files to the local hard drive. This code, added at the end of the file, is shown below.
/**
* method to save InputStream to a file.
*/
public static boolean saveFile(InputStream is, String filePath)
throws Exception
{
boolean retVal=false;
byte[] buffer = new byte[10240];
FileOutputStream outStream = null;
try
{
outStream = new FileOutputStream(filePath);
int len=0;
while (true)
{
len = is.read(buffer);
if (len == -1)
break;
outStream.write(buffer, 0, len);
}
outStream.close();
retVal = true;
}catch (IOException io) {
System.out.println("Writing the array of bytes into the file "
+ filePath + " failed.");
throw new Exception(
"Writing the array of bytes into the file "+ filePath +
" failed in saveFile");
}
return retVal;
}
// save text string to a file.
public static boolean saveStringToFile(String text, String filePath)
{
boolean b = false;
try {
BufferedWriter outTxtFile = new BufferedWriter(new FileWriter(filePath));
outTxtFile.write(text, 0, text.length());
outTxtFile.close();
b = true;
} catch (IOException e) {
System.out.println("Error saving text file.");
}
return b;
}
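The saveFile() method above closes the output stream only when the copy succeeds. As a hedged alternative, assuming Java 7 or later is available (the lab code itself does not require it), the same job can be sketched with try-with-resources and java.nio.file.Files.copy. The class name StreamSaver and the method saveStream are invented for this illustration and are not part of the lab project.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class StreamSaver {

    // Copies everything from the stream into filePath, replacing any existing
    // file, and returns the number of bytes written. The input stream is
    // closed even if the copy fails.
    static long saveStream(InputStream is, String filePath) throws IOException {
        try (InputStream in = is) {
            return Files.copy(in, Paths.get(filePath), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demonstration with an in-memory stream and a temporary file.
        Path tmp = Files.createTempFile("xmp-demo", ".xml");
        byte[] data = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\"/>".getBytes(StandardCharsets.UTF_8);
        long written = saveStream(new ByteArrayInputStream(data), tmp.toString());
        System.out.println("Wrote " + written + " bytes to " + tmp);
        Files.delete(tmp);
    }
}
```

Unlike saveFile(), this version reports failure by letting the IOException propagate instead of returning a boolean, so a caller would wrap it in its own try/catch.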
Step 3 – Add code to grab the XMP and pass it to the saveFile() method.

Look for the TODO comment, then add the following code.
/**
* TODO: Add Method PDFExtract to open and extract data from a PDF file.
* The solution code is commented out right below. When writing the solution,
* you may use this as reference material.
*/
System.out.println("\nExtracting document metadata ...");
String DocMetadataFile = "DocMetadata.xml";
boolean b = false;
InputStream inputStream;
inputStream = doc.exportXMP();
if(inputStream == null)
System.out.println("No document level metadata was exported.");
else {
System.out.println("Document metadata was exported.");
try {
b = saveFile(inputStream, DocMetadataFile);
} catch (Exception e) {
System.out.println("Error saving metadata file.");
System.out.println(e);
}
if(b == true)
System.out.println ("Document metadata was saved to file : " + DocMetadataFile);
else
System.out.println("Document metadata was not saved.");
}
Now try to run it again. Your completed code should look like this:
package org.duanesworldtv.livecycle.samples;
import java.io.InputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.FileNotFoundException;
import java.io.IOException;
// this is from xpaaj.jar - check licenses before using.
// LiveCycle ES has newer JAVA libraries for manipulating PDF
import com.adobe.pdf.*;
public class PDFExtractXMP {
public static void main(String[] args)
throws FileNotFoundException, IOException
{
// get pdf filename
String inPdfName;
if(args.length != 1 ) {
System.out.println("\nCommand line format: java PDFExtractXMP pdf-file");
return;
} else {
// message
System.out.println("\nPDF data extraction using Adobe PDF Libraries.");
inPdfName = new String(args[0]);
PDFExtract(inPdfName);
}
}
/**
* TODO: Add Method PDFExtract to open and extract data from a PDF file
*/
public static void PDFExtract(String inPdfName)
throws FileNotFoundException, IOException
{
// open PDF
System.out.println("\nOpen PDF Document ... ");
PDFDocument doc = null;
FileInputStream inPdfFile = new FileInputStream(inPdfName);
try {
doc = PDFFactory.openDocument(inPdfFile);
} catch (IOException e) {
System.out.println("Error opening PDF file :" + inPdfName);
System.out.println(e);
}
if(doc == null) {
System.out.println("Cannot open PDF file : " + inPdfName);
// Stop here: every call below needs an open document.
return;
}
System.out.println( inPdfName + " was successfully opened.");
try {
System.out.println("Document version is " + doc.getVersion());
} catch (Exception e) {
System.out.println("Error getting document version: " + e);
}
try {
System.out.println("Number of pages is " + doc.getNumberOfPages());
} catch (Exception e) {
System.out.println("Error getting number of Pages " + e );
}
// Export the XMP metadata
System.out.println("\nExtracting document metadata ...");
String DocMetadataFile = "DocMetadata.xml";
boolean b = false;
InputStream inputStream;
inputStream = doc.exportXMP();
if(inputStream == null)
System.out.println("No document level metadata was exported.");
else {
System.out.println("Document metadata was exported.");
try {
b = saveFile(inputStream, DocMetadataFile);
} catch (Exception e) {
System.out.println("Error saving metadata file.");
System.out.println(e);
}
if(b == true)
System.out.println ("Document metadata was saved to file : " + DocMetadataFile);
else
System.out.println("Document metadata was not saved.");
}
}
/**
method to save InputStream to a file.
*/
public static boolean saveFile(InputStream is, String filePath)
throws Exception
{
boolean retVal=false;
byte[] buffer = new byte[10240];
FileOutputStream outStream = null;
try
{
outStream = new FileOutputStream(filePath);
int len=0;
while (true)
{
len = is.read(buffer);
if (len == -1)
break;
outStream.write(buffer, 0, len);
}
outStream.close();
retVal = true;
}catch (IOException io) {
System.out.println("Writing the array of bytes into the file "
+ filePath + " failed.");
throw new Exception(
"Writing the array of bytes into the file "+ filePath +
" failed in saveFile");
}
return retVal;
}
// save text string to a file.
public static boolean saveStringToFile(String text, String filePath)
{
boolean b = false;
try {
BufferedWriter outTxtFile = new BufferedWriter(new FileWriter(filePath));
outTxtFile.write(text, 0, text.length());
outTxtFile.close();
b = true;
} catch (IOException e) {
System.out.println("Error saving text file.");
}
return b;
}
}
/******* End of file *********************************/
Note that before you run this, you may have to manually set the runtime arguments for the class. To do this, click "Run -> Run Configurations", ensure you have selected the correct class, then select the "Arguments" tab and enter the path to the PDF document as per the instructions in exercise 1. Your console output should appear as follows:

Now navigate to the directory where the file was saved and open it. You should find a valid XMP file (XML file).
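One way to confirm that the saved file really is well-formed XML built on RDF is to parse it with the JDK's built-in DOM parser and list the properties inside each rdf:Description element. The sketch below is only an illustration: the class name XmpInspector, the method listProperties, and the SAMPLE document are hypothetical, and the properties in your own DocMetadata.xml will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XmpInspector {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical sample for illustration only; not output from the lab PDF.
    static final String SAMPLE =
        "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
      + "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">"
      + "<rdf:Description rdf:about=\"\""
      + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
      + " xmlns:xmp=\"http://ns.adobe.com/xap/1.0/\">"
      + "<dc:format>application/pdf</dc:format>"
      + "<xmp:CreatorTool>Example Tool</xmp:CreatorTool>"
      + "</rdf:Description>"
      + "</rdf:RDF>"
      + "</x:xmpmeta>";

    // Parses the XML (throwing if it is not well formed) and returns the
    // qualified names of the element children of every rdf:Description.
    static List<String> listProperties(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document dom = factory.newDocumentBuilder().parse(xml);
        List<String> names = new ArrayList<String>();
        NodeList descriptions = dom.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            NodeList children = descriptions.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // With one argument, inspect that file; otherwise use the sample.
        InputStream in = (args.length == 1)
            ? new FileInputStream(args[0])
            : new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        try {
            for (String name : listProperties(in)) {
                System.out.println(name);
            }
        } finally {
            in.close();
        }
    }
}
```

Passing DocMetadata.xml as the single argument lists the properties found in the exported file. Properties stored as attributes on rdf:Description, which XMP also allows, are not listed by this sketch.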

Discussions:
- there are other methods to set() XMP metadata
- libraries are also available in other programming languages (C, CPP, AS3 etc.)
If you found a valid XMP file, congratulations!
Proceed to exercise 4