Google-docs – How to export text from Google Doc as HTML

cms, export, google-docs, html, links

I have a body of text in a Google Doc containing a large number of links. I need to get the text – and all the links – into my client's CMS.

Unfortunately, it seems Google Docs no longer allows users to export HTML. I've tried "Download as" > "Web page (.html, zipped)" and opening that file in TextWrangler to clean it up, but the links are all scrambled – e.g. a link to twitter.com/sree becomes:

<a href="https://www.google.com/url?q=https://twitter.com/sree&amp;sa=D&amp;ust=1465095908840000&amp;usg=AFQjCNHpFpNdY6Hsr5xrZZlF5vCGTGIt6w">Sree Sreenivasan</a>

Rather than go in and redo all the links manually, is there any way to get the HTML code I need from the Google Doc?

Best Answer

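You can use a regular expression like this in TextWrangler (in the Find dialog, with the Grep option enabled):

<a href="https://www\.google\.com/url\?q=(.*?)&amp;sa(.*?)">(.*?)</a>

and replace with:

<a href="$1">$3</a>

The groups are non-greedy (.*?) because the exported HTML often puts many links on a single line, and a greedy (.*) would match everything from the first link to the last one.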
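
If there are many files to clean, the same idea can be scripted. The following is a minimal Python sketch, not part of the original answer. It assumes every wrapped link has the form shown in the question (an href pointing at https://www.google.com/url with the real address in the q parameter), and the function name unwrap_google_links is made up for illustration. Unlike the plain find-and-replace, it also decodes percent-encoded characters in the target URL.

```python
import html
import re
from urllib.parse import parse_qs, urlsplit

# Matches href="..." values that point at Google's redirect endpoint.
REDIRECT_HREF = re.compile(r'href="(https://www\.google\.com/url\?[^"]*)"')


def unwrap_google_links(page):
    """Replace Google redirect hrefs with the target URL from the q parameter."""

    def _unwrap(match):
        # The attribute value is HTML-escaped (&amp;), so unescape it first.
        redirect = html.unescape(match.group(1))
        # parse_qs also decodes percent-encoding in the parameter values.
        query = parse_qs(urlsplit(redirect).query)
        target = query.get("q", [None])[0]
        if target is None:
            # No q parameter: leave the link exactly as it was.
            return match.group(0)
        # Re-escape the target so the attribute stays valid HTML.
        return 'href="{}"'.format(html.escape(target, quote=True))

    return REDIRECT_HREF.sub(_unwrap, page)
```

To use it, read the exported .html file into a string, pass it to unwrap_google_links, and write the result back out. Links that do not go through the redirect are left untouched.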