Google-sheets – How to auto generate the date a URL was last updated

google sheets

I'm trying to find out if it's possible to fetch the date a URL was last updated. So column A I'd have the URL and column B the data would auto generate. I'm guessing this would be done using some sort of script, but I don't know where to start. Googling isn't helping much.

Best Answer

Yes, this can be done with a script. Keep in mind that Google services are subject to various quotas; if you ask it to do too much, it may well refuse. How much is too much depends on circumstances: e.g., if some of the websites take a long time to reach, the script may have problems with maximum execution time (6 minutes).

Here is the script. It takes URLs from the column A of the active sheet, and records the "Last-Modified" date returned by the server in the B column. Each URL is tested for having http at the beginning, to avoid making weird fetch calls when the column happens to contain something else.

To refresh the data periodically, you may want to run the script daily with a trigger.

function testurl() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var range = sheet.getRange(1, 1, sheet.getLastRow(), 2);
  var urls = range.getValues();
  for (var i = 0; i < urls.length; i++) {
    if (/^http/.test(urls[i][0])) {
      urls[i][1] = UrlFetchApp.fetch(urls[i][0]).getHeaders()['Last-Modified'];
    }
  }
  range.setValues(urls);
}

Sample output:

+---+----------------------------------+-------------------------------+
|   |                A                 |               B               |
+---+----------------------------------+-------------------------------+
| 1 | http://blog.stackoverflow.com    | Tue, 29 Mar 2016 18:02:37 GMT |
| 2 | http://webapps.stackexchange.com | Sun, 03 Apr 2016 00:50:59 GMT |
+---+----------------------------------+-------------------------------+