Google-sheets – Periodically updating a URL in the ImportXML command with a script

google sheetsgoogle-apps-scriptgoogle-sheets-timestampimportxml

I am using importxml to extract prices from multiple websites. My main concern is to be able to refresh the prices whenever I want and not be dependent on Google's hourly update. I am using this script to append a timestamp at the end of the URL so that Google is forced to re-fetch each time.

function onOpen() {
    var e = SpreadsheetApp.getActiveSpreadsheet();
    var t = [{
        name: "Refresh This Sheet",
        functionName: "Refresh"
    }, {
        name: "Refresh All Sheets",
        functionName: "RefreshAll"
    }];
    e.addMenu("Looker", t)
}

function onInstall(e) {
    onOpen(e)
}

function Refresh(e) {
    var t = SpreadsheetApp.getActiveSheet();
    if (typeof e !== "undefined") {
        t = e
    }
    var n = t.getDataRange();
    var r = find("importxml", n);
    for (var i = 0; i < r.length; i++) {
        r[i].setValue(r[i].getFormula().replace(/"[^"]"/, "?apply_formatting=true&refresh=" + Date.now() + '","'))
    }
}

function RefreshAll() {
    var e = SpreadsheetApp.getActiveSpreadsheet().getSheets();
    for (var t = 0; t < e.length; t++) {
        Refresh(e[t])
    }
}

function find(e, t) {
    var n = t.getFormulas();
    var r = [];
    for (var i = 0; i < n.length; i++) {
        for (var s = 0; s < n[i].length; s++) {
            if (n[i][s].indexOf(e) > -1) {
                r.push(t.getCell(i + 1, s + 1))
            }
        }
    }
    return r
}

I am using the function refresh to update the values, but the problem is it's not replacing the timestamp at the end of the URL; instead it creates a unique timestamp and appends it to the URL each time.

What am I missing here? I believe I am not using the operators in replace correctly.

replace(/"[^"]"/,"?apply_formatting=true&refresh="+Date.now()+'","'

Best Answer

Issues

One problem with your script is that the regular expression /"[^"]"/ matches only a single character between double quote marks. To replace more characters between double quotes, use /"[^"]*"/ instead. The site regex101 is very helpful for debugging regular expressions.

Also, I don't quite see the logic of this replacement: to me it seems that it would kill the hostname of the website, not just change the parameter. Finally, I don't see why the updated formula is set with .setValue instead of setFormula.

Solution

Here is a replacement that works. Suppose this is a formula in your spreadsheet that you want to update, say in cell F2.

=importxml("http://example.com?apply_formatting=true&refresh=1234567890987", "//div[42]")

Here is a function that updates it:

function update() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var cell = sheet.getRange("F2");
  cell.setFormula(cell.getFormula().replace(/\?[^"]*"/, '?apply_formatting=true&refresh='+Date.now()+'"'));
}