EDIT: Updated flow diagram to better explain the (likely unnecessary) complexity of what I'm doing.
At the company I work for, we are attempting to create complex PDF files using Java iText (the free version 2.1 line). The documents are built piece by piece from individual "template" files, which are added to the final document one after another using the PdfStamper class and filled in using AcroForms.
The current design performs a loop which runs for each template that needs to be added in order (as well as the custom logic needed to fill each). In each iteration of the loop, it does the following:
- Creates a PdfReader to open the template file
- Creates a PdfStamper that reads from the PdfReader and writes to a "template buffer"
- Fills AcroForm fields, and measures the height of the template by getting the location of an "end" AcroField
- Closes the PdfReader and PdfStamper
- Creates a PdfReader to read a "working buffer" that stores the current final document in progress
- Creates a PdfStamper that reads from the PdfReader and writes to a "storage buffer"
- Closes the PdfReader, then opens a new PdfReader to the "template buffer"
- Imports the page from the "template buffer" and adds it to the ContentByte of the PdfStamper
- Closes the PdfReader and PdfStamper
- Swaps the "storage buffer" with the "working buffer" in order to be ready to repeat
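
The height measurement described in the steps above can be sketched in plain Java. This is only an illustration under stated assumptions: that the position of the "end" field arrives as a float array laid out as [page, llx, lly, urx, ury] (the layout iText 2.1 uses for AcroFields.getFieldPositions), and that the template's content runs from the top of the page down to the lower edge of that field. The class and method names are hypothetical and not taken from the original code.

```java
public class TemplateHeight {

    /**
     * Computes how much vertical space a template occupies, measured from the
     * top of the page down to the lower edge of its "end" marker field.
     *
     * @param endFieldPosition assumed layout: [page, llx, lly, urx, ury]
     * @param pageTop          y coordinate of the top edge of the page, in points
     * @return the used height in points
     */
    public static float usedHeight(float[] endFieldPosition, float pageTop) {
        if (endFieldPosition == null || endFieldPosition.length < 5) {
            throw new IllegalArgumentException("expected [page, llx, lly, urx, ury]");
        }
        // Index 2 holds the lower-left y coordinate of the field rectangle.
        float lowerLeftY = endFieldPosition[2];
        return pageTop - lowerLeftY;
    }

    public static void main(String[] args) {
        // Hypothetical values: a page whose top edge is at y = 842 (A4 in points)
        // and an "end" field whose rectangle spans y = 600 to y = 620.
        float[] endField = {1f, 36f, 600f, 136f, 620f};
        System.out.println(usedHeight(endField, 842f)); // prints 242.0
    }
}
```

Keeping this arithmetic separate from the PDF handling makes it easy to check on its own, whatever the surrounding stamper design ends up being.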
Here is a diagram visually explaining the above process, which is performed once per template, in each iteration of the loop:
However, as discovered through this Stack Overflow question (where example code can also be viewed) and the response by iText author Bruno Lowagie, this way of using the PdfStamper can cause significant issues. "Abusing" the PdfStamper by creating and closing stampers too many times can cause corruption in the resulting file that only affects some programs, generating a seemingly good document that may fail in certain contexts.
What is the alternative? Mr. Lowagie's response suggests that there is a simpler or more direct way to use PdfStampers, though I do not quite understand it myself yet. Could this be done using only a single stamper? Could it be done without using a rotating series of buffers?
Best Answer
I'm always disappointed when I read "we are using iText 2.1" because that's really not a wise choice as explained here, but this is a question about design, so here is a possible approach:
You create a new document, Document document = new Document(); (step 1), you create a PdfWriter instance (step 2), you open the document (step 3), and you add content in a loop (step 4):
1. You fill out the individual templates using PdfStamper and AcroFields (see your code on StackOverflow). This results in separate "flattened" form snippets kept in memory.
2. You use a new Document/PdfWriter instance to create a new PDF in memory that combines all the snippets that belong together. You get a snippet like this: PdfImportedPage snippet = writer.getImportedPage(reader, 1); and you add the snippet to the writer using the addTemplate() method.
3. You read the combined result back like this: PdfImportedPage combined = writer.getImportedPage(reader, 1); then you wrap the result in an image like this: Image image = Image.getInstance(combined); and you add the image to the document: document.add(image);
Step 2 could be omitted. You could add the different snippets straight to the document that is initially created. Repeat steps 1 to 3 as many times as needed, and close the document (step 5).
Omitting step 2 will result in a lower XObject nesting count, but keeping step 2 isn't problematic.
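
If the snippets are positioned by hand with addTemplate() rather than wrapped in images, each one needs a vertical position. The following is a minimal sketch of that bookkeeping only, assuming the snippets are stacked top to bottom and that a new page is started whenever the next snippet no longer fits. SnippetLayout, Placement and layOut are hypothetical helpers for illustration, not part of iText.

```java
import java.util.ArrayList;
import java.util.List;

public class SnippetLayout {

    /** Where one snippet goes: zero-based page index and the y of its top edge. */
    public static final class Placement {
        public final int page;
        public final float top;

        public Placement(int page, float top) {
            this.page = page;
            this.top = top;
        }
    }

    /**
     * Stacks snippets of the given heights from pageTop downwards and moves
     * to a fresh page when a snippet would cross pageBottom.
     */
    public static List<Placement> layOut(float[] heights, float pageTop, float pageBottom) {
        List<Placement> placements = new ArrayList<>();
        int page = 0;
        float cursor = pageTop; // y of the next free position on the current page
        for (float height : heights) {
            if (height > pageTop - pageBottom) {
                throw new IllegalArgumentException("snippet taller than the usable page area");
            }
            if (cursor - height < pageBottom) {
                page++;           // not enough room left: continue on a new page
                cursor = pageTop;
            }
            placements.add(new Placement(page, cursor));
            cursor -= height;
        }
        return placements;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: usable area from y = 806 down to y = 36.
        for (Placement p : layOut(new float[] {300f, 300f, 300f}, 806f, 36f)) {
            System.out.println("page " + p.page + ", top " + p.top);
        }
    }
}
```

When the snippets are instead added with document.add(image) as in step 3, iText's own document layout should take care of this positioning, which is one argument for the image-based route.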
In pseudo code, we'd have:
[1.] The outer loop (the large part to the right of the schema, marked PdfWriter)
[2.] The creation of one unit (the arrow marked PdfWriter in the middle)
[3.] Filling out data in separate snippets (the arrows marked PdfStamper)
There may be better solutions, for instance involving XFA, but I don't know if that's feasible as I don't know if the templates (the light blue part in my schema) are always the same. It would also involve creating new templates in the XML Forms Architecture.