C# – Removing PDF invisible objects with iTextSharp

citextsharppdf

Is possible to use iTextSharp to remove from a PDF document objects that are not visible (or at least not being displayed)?

More details:

1) My source is a PDF page containing images and text (maybe some vectorial drawings) and embedded fonts.

2) There's an interface to design multiple 'crop boxes'.

3) I must generate a new PDF that contains only what is inside the crop boxes. Anything else must be removed from resulting document (indeed I may accept content which is half inside and half outside, but this is not the ideal and it should not appear anyway).

My solution so far:

I have successfully developed a solution that creates new temporary documents, each one containing the content of each crop box (using writer.GetImportedPage and contentByte.AddTemplate to a page that is exactly the size of the crop box). Then I create the final document and repeat the process, using the AddTemplate method do position each "cropped page" in the final page.

This solution has 2 big disadvantages:

  • the size of the document is the [original size] * [number of crop boxes], since the entire page is there, stamped many times! (invisible, but it's there)
  • the invisible text may still be accessed by selecting all (CTRL+A) within Reader and pasted.

So, I think I need to iterate through PDF objects, detect if it is visible or not, and delete it. At the time of writing, I am trying to use pdfReader.GetPdfObject.

Thanks for the help.

Best Answer

If the PDF which you are trying is a template/predefined/fixed then you can remove that object by calling RemoveField.

PdfReader pdfReader = new PdfReader(../Template_Path.pdf"));
PdfStamper pdfStamperToPopulate = new PdfStamper(pdfReader, new FileStream(outputPath, FileMode.Create));
AcroFields pdfFormFields = pdfStamperToPopulate.AcroFields;
pdfFormFields.RemoveField("fieldNameToBeRemoved");