Java PDF – Generating PDF Files Using Individual Template Components


EDIT: Updated flow diagram to better explain the (likely unnecessary) complexity of what I'm doing.

My company is attempting to create complex PDF files using iText for Java (the free 2.1 line). The documents are built piece by piece from individual "template" files, which are filled using AcroForms and added to the final document one after another using the PdfStamper class.

[Image: visual example]

The current design runs a loop over the templates in the order they need to be added, together with the custom logic needed to fill each one. Each iteration of the loop does the following:

1. Creates a PdfReader to open the template file
2. Creates a PdfStamper that reads from the PdfReader and writes to a "template buffer"
3. Fills the AcroForm fields and measures the height of the template by getting the location of an "end" AcroField
4. Closes the PdfReader and PdfStamper
5. Creates a PdfReader to read a "working buffer" that stores the current final document in progress
6. Creates a PdfStamper that reads from the PdfReader and writes to a "storage buffer"
7. Closes the PdfReader and opens a new PdfReader on the "template buffer"
8. Imports the page from the "template buffer" and adds it to the PdfContentByte of the PdfStamper
9. Closes the PdfReader and PdfStamper
10. Swaps the "storage buffer" with the "working buffer" in order to be ready to repeat
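The height measurement mentioned in the list above could be sketched roughly as follows. This assumes that the positions of the "end" field come from AcroFields.getFieldPositions("end"), which in the iText 2.1 line returns a float array in groups of five values, [page, llx, lly, urx, ury]. The class name TemplateHeight, the method usedHeight, and the choice of the field's lower edge as the end of the content are assumptions made for illustration only.

```java
// Hypothetical helper, not part of iText: derives the used height of a
// template from the position of its "end" field.
final class TemplateHeight {

    /**
     * @param positions one group of [page, llx, lly, urx, ury], as assumed
     *                  to be returned by AcroFields.getFieldPositions("end")
     * @param pageTop   y-coordinate of the top edge of the template page
     * @return distance from the top of the page to the lower edge of the field
     */
    static float usedHeight(float[] positions, float pageTop) {
        if (positions == null || positions.length < 5) {
            throw new IllegalArgumentException("expected [page, llx, lly, urx, ury]");
        }
        float lowerLeftY = positions[2]; // lly of the first widget of the field
        return pageTop - lowerLeftY;
    }
}
```

For a Letter-sized template (top edge at 792 points) whose "end" field has its lower edge at y = 500, this gives a used height of 292 points.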

Here is a diagram of the above process, which is performed in each iteration of the loop that processes the templates:

[Image: flow diagram]

However, as discovered through this Stack Overflow question (where example code can also be viewed) and the response by iText author Bruno Lowagie, this way of using the PdfStamper can cause significant issues. "Abusing" the PdfStamper by creating and closing a stamper too many times can corrupt the resulting file in ways that only affect some programs, producing a seemingly good document that may fail in certain contexts.

What is the alternative? Mr. Lowagie's response suggests that there is a simpler or more direct way to use PdfStampers, though I do not quite understand it myself yet. Could this be done using only a single stamper? Could it be done without using a rotating series of buffers?

Best Answer

I'm always disappointed when I read "we are using iText 2.1" because that's really not a wise choice as explained here, but this is a question about design, so here is a possible approach:

[Image: schema of the proposed approach]

You create a new document with Document document = new Document(); (step 1), you create a PdfWriter instance (step 2), you open the document (step 3), and you add content in a loop (step 4):

1. You have different templates, and by templates we mean: existing PDF documents with fillable fields (AcroForms). You fill them out using PdfStamper and AcroFields (see your code on Stack Overflow). This results in separate "flattened" form snippets kept in memory.
2. If you want to keep these snippets together, you can do so by creating a Document/PdfWriter instance to create a new PDF in memory that combines all the snippets that belong together. You get a snippet like this: PdfImportedPage snippet = writer.getImportedPage(reader, 1); and you add the snippet to the writer using the addTemplate() method.
3. You get the combined result using PdfImportedPage combined = writer.getImportedPage(reader, 1); then you wrap the result in an image like this: Image image = Image.getInstance(combined); and you add the image to the document: document.add(image);

Step 2 could be omitted: you could add the different snippets straight to the document that was initially created. Repeat steps 1 to 3 as many times as needed, then close the document (step 5).

Omitting step 2 will result in a lower XObject nesting count, but keeping step 2 isn't problematic.

In pseudo code, we'd have:

[1.] The outer loop (the large part to the right of the schema, marked PdfWriter)

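// imports shared by all three listings (iText 2.1.x package names)
import java.io.*;
import com.lowagie.text.*;
import com.lowagie.text.pdf.*;

// step 1
Document document = new Document();
// step 2: os is the OutputStream that receives the final PDF
PdfWriter writer = PdfWriter.getInstance(document, os);
// step 3
document.open();
// step 4: add one combined unit per set of parameters
for (int i = 0; i < parameters.length; i++) {
    document.add(getSnippetCombination(writer, parameters[i]));
}
// step 5
document.close();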

[2.] The creation of one unit (the arrow marked PdfWriter in the middle)

public Image getSnippetCombination(PdfWriter w, Parameters parameters)
        throws IOException, DocumentException {
    // step 1
    Document document = new Document();
    // step 2
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, baos);
    // step 3
    document.open();
    // step 4: place every filled snippet on the canvas of the in-memory PDF
    PdfContentByte canvas = writer.getDirectContent();
    for (int i = 0; i < parameters.getNumberOfSnippets(); i++) {
        canvas.addTemplate(getSnippet(writer, parameters.getSnippet(i)),
            parameters.getX(i), parameters.getY(i));
    }
    // step 5
    document.close();
    // Convert PDF in memory to Form XObject wrapped in Image object
    PdfReader reader = new PdfReader(baos.toByteArray());
    PdfImportedPage page = w.getImportedPage(reader, 1);
    return Image.getInstance(page);
}
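The values behind parameters.getY(i) have to be computed somewhere. A minimal sketch of one way to do it, assuming every imported snippet page has the same height as the target page, has its content at the top, and comes with a known "used height" (for instance measured from an "end" field): each snippet is then shifted down by the total height consumed by the snippets before it. The class SnippetLayout and the method yOffsets are hypothetical names, not part of iText.

```java
// Hypothetical helper, not part of iText: computes the vertical offset to
// pass to addTemplate() for each snippet so that the snippets stack from
// the top of the page downwards.
final class SnippetLayout {

    /**
     * @param usedHeights height actually occupied by each snippet, in points
     * @return y offset for each snippet; 0 for the first one, then
     *         increasingly negative as earlier snippets consume space
     */
    static float[] yOffsets(float[] usedHeights) {
        float[] offsets = new float[usedHeights.length];
        float consumed = 0f;
        for (int i = 0; i < usedHeights.length; i++) {
            offsets[i] = 0f - consumed; // shift down by the space already used
            consumed += usedHeights[i];
        }
        return offsets;
    }
}
```

With used heights of 100, 250 and 50 points, the offsets are 0, -100 and -350. This sketch does not check whether the stack still fits on one page.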

[3.] Filling out data in separate snippets (the arrows marked PdfStamper)

public PdfTemplate getSnippet(PdfWriter w, Snippet snippet)
        throws IOException, DocumentException {
    // Using PdfStamper to fill out the fields
    PdfReader reader = new PdfReader(snippet.getBytes());
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfStamper stamper = new PdfStamper(reader, baos);
    stamper.setFormFlattening(true);
    AcroFields form = stamper.getAcroFields();
    // fill out the fields; you've already implemented this
    stamper.close();
    // read the flattened result back and return it as a template
    PdfReader filled = new PdfReader(baos.toByteArray());
    return w.getImportedPage(filled, 1);
}

There may be better solutions, for instance involving XFA, but I don't know if that's feasible as I don't know if the templates (the light blue part in my schema) are always the same. It would also involve creating new templates in the XML Forms Architecture.
