JavaScript – How to program a text search and replace in PDF files


How can I programmatically search and replace text in a large number of PDF files? I would like to remove a URL that has been added to a set of files. I was able to remove the link using JavaScript under Batch Processing in Adobe Acrobat Pro, but the link text remains. I have seen recommendations to use text touchup, which works manually, but I don't want to edit 1300 files by hand.

Best Answer

Finding text in a PDF is inherently hard because of the graphical nature of the document format: the letters you are searching for may not be contiguous in the file. That said, CAM::PDF has some search-and-replace capabilities and heuristics. Give it a try and see if it works on your PDFs.
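
To make the "not contiguous" point concrete, here is a small sketch. The content-stream fragment is made up for illustration (real streams are usually compressed as well), and tj_text is a hypothetical helper that only handles a single simple TJ array. It shows how a kerning adjustment can split a URL so that a literal search of the raw bytes misses it:

```shell
# A simplified, uncompressed content-stream fragment (illustrative only).
# The TJ operator draws an array of string pieces separated by kerning numbers.
stream='BT /F1 12 Tf [(www.exa) -15 (mple.com)] TJ ET'

# Hypothetical helper: join the string pieces of one TJ array,
# dropping the brackets, parentheses and kerning numbers.
tj_text() {
    sed -e 's/^[^[]*\[//' -e 's/\][^]]*$//' \
        -e 's/)[^(]*(//g' -e 's/^(//' -e 's/)$//'
}

# A literal search of the raw stream does not find the URL...
if printf '%s\n' "$stream" | grep -qF 'www.example.com'; then
    echo "raw stream: match"
else
    echo "raw stream: no match"
fi

# ...even though the text drawn on the page contains it.
printf '%s\n' "$stream" | tj_text
```

A tool that rewrites page text has to cope with splits like this, which is why results can vary from one PDF producer to another.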

To install and run:

 $ cpan install CAM::PDF
 # start a new terminal if this is your first cpan module
 $ changepagestring.pl input.pdf oldtext newtext output.pdf
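
Since the question is about roughly 1300 files, the single-file command can be wrapped in a loop. The following is only a sketch. It assumes the changepagestring.pl script from the CAM::PDF distribution is on your PATH and takes its arguments in the order shown above. The function name replace_in_dir and the REPLACE_CMD override are hypothetical, added here for illustration.

```shell
# Hypothetical helper: run the replace command on every PDF in src_dir and
# write the results to dst_dir, so the original files stay untouched.
# REPLACE_CMD defaults to changepagestring.pl (assumed to be on PATH).
replace_in_dir() {
    src_dir=$1
    dst_dir=$2
    old_text=$3
    new_text=$4
    failures=0

    mkdir -p "$dst_dir" || return 1

    for pdf in "$src_dir"/*.pdf; do
        [ -e "$pdf" ] || continue    # the glob matched nothing
        out="$dst_dir/$(basename "$pdf")"
        if ! "${REPLACE_CMD:-changepagestring.pl}" "$pdf" "$old_text" "$new_text" "$out"; then
            echo "failed: $pdf" >&2
            failures=$((failures + 1))
        fi
    done

    # Succeed only if every file was processed without error.
    [ "$failures" -eq 0 ]
}
```

After sourcing the function, a call such as replace_in_dir ./pdfs ./cleaned oldtext newtext would process the whole directory. Writing to a separate output directory makes it easy to spot-check a few results before discarding the originals, which is worth doing because the replacement may not succeed on every PDF.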