C# (or VB6) Convert Word doc to Tiff

c# ms-word tiff vb6

I'm working on a VB6 application that is used by over a hundred users. It generates a Word document, then saves a TIFF image of that document in a database. Currently, it simply sets the printer to Microsoft Office Document Image Writer, "prints" the document to a set location, then imports the resulting TIFF file into the database. However, the organization is in the process of upgrading everyone to Office 2007, and this means that Microsoft Office Document Image Writer is going away. So I'd like to know how hard it would be to convert from Word to TIFF programmatically.
We're already bringing in a C# (.NET 3.5) control library via COM, so that seems like a good place to put the functionality. At some point I'll be converting the whole app to .NET 3.5, so I'd prefer that any new code already live there, leaving less to convert.

EDIT: I appreciate the suggestions, but I'd really like to try and do this without using expensive third-party components. It's just hard to get the money guys to see the merit of spending thousands of dollars to fix something that used to work for free. Plus, I'm genuinely interested in what it would take to roll it myself. A bit masochistic, I know, but I got into programming because I'm cursed with a desire to know how things work… 🙂

Thanks for all your help!

Best Answer

As far as I know (and a quick Google search seems to confirm this), the specifications for both the TIFF format and the binary DOC format are freely available on the web. So you could write code that reads the DOC file and populates an object model, then write more code that outputs that object model as a TIFF image. Be warned that this would be a fairly big and complex project; I'm thinking man-months rather than man-weeks.
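To give a feel for the output half only, here is a minimal sketch of writing a TIFF by hand. It assumes the page has already been rasterized to 8-bit grayscale pixels, which is the hard part and is not shown. Python is used purely for brevity; the same byte layout could be written from C# with a `BinaryWriter`. The function name `encode_gray_tiff` and the 200 dpi default are illustrative choices, not anything from the question.

```python
import struct

# TIFF field type codes (from the TIFF 6.0 specification)
SHORT, LONG, RATIONAL = 3, 4, 5


def encode_gray_tiff(width, height, pixels, dpi=200):
    """Return the bytes of a single-page, uncompressed 8-bit grayscale TIFF.

    `pixels` holds width * height bytes, row by row (0 = black, 255 = white).
    """
    if len(pixels) != width * height:
        raise ValueError("pixels must hold exactly width * height bytes")

    # File layout: header | pixel strip | pad | X/Y resolution | IFD
    data_offset = 8
    pad = len(pixels) % 2                  # keep later offsets word-aligned
    xres_offset = data_offset + len(pixels) + pad
    yres_offset = xres_offset + 8
    ifd_offset = yres_offset + 8

    # (tag, field type, value or offset); tags must be in ascending order
    entries = [
        (256, LONG, width),                # ImageWidth
        (257, LONG, height),               # ImageLength
        (258, SHORT, 8),                   # BitsPerSample
        (259, SHORT, 1),                   # Compression: none
        (262, SHORT, 1),                   # Photometric: BlackIsZero
        (273, LONG, data_offset),          # StripOffsets (one strip)
        (277, SHORT, 1),                   # SamplesPerPixel
        (278, LONG, height),               # RowsPerStrip
        (279, LONG, len(pixels)),          # StripByteCounts
        (282, RATIONAL, xres_offset),      # XResolution (stored elsewhere)
        (283, RATIONAL, yres_offset),      # YResolution (stored elsewhere)
        (296, SHORT, 2),                   # ResolutionUnit: inch
    ]

    out = bytearray()
    out += struct.pack("<2sHI", b"II", 42, ifd_offset)   # little-endian header
    out += bytes(pixels)
    out += b"\x00" * pad
    out += struct.pack("<II", dpi, 1) * 2                # dpi/1 for X and Y
    out += struct.pack("<H", len(entries))
    for tag, field_type, value in entries:
        # Each entry holds one value; in little-endian order a SHORT packed
        # as 4 bytes is already left-justified, as the format requires.
        out += struct.pack("<HHII", tag, field_type, 1, value)
    out += struct.pack("<I", 0)                          # no further pages
    return bytes(out)
```

Writing the result of `encode_gray_tiff(width, height, pixels)` to a file opened in binary mode should give a file that ordinary image viewers can open. A multi-page document would chain one IFD per page instead of ending with a zero offset.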

But just think of some of the complexities: tables, formatting, character sets, spacing, embedded content, etc. Eek! I guess this is why it is normally the job of expensive third-party libraries or professional document management systems.

Out of interest, might this be the time to move away from proprietary document formats and store the document in the DB as something more manageable?
