Google Sheets – Bulk ImportHTML Script

google sheetsgoogle-apps-script

With great help, this script is working to pull in the 2nd table on the below link. The table cannot be scraped with IMPORTHTML unlike the first table (Regular Season works, Playoffs doesn't).

https://www.basketball-reference.com/players/c/curryst01/gamelog/2016

function myFunction() {
  const url = "https://www.basketball-reference.com/players/c/curryst01/gamelog/2016"; // This URL is from your question.
  const sheetName = "Sheet1";  // Please set the destination sheet name.

  const html = UrlFetchApp.fetch(url).getContentText();
  const tables = [...html.matchAll(/<table[\s\S\w]+?<\/table>/g)];
  if (tables.length > 2) {
    const ss = SpreadsheetApp.getActiveSpreadsheet();
    Sheets.Spreadsheets.batchUpdate({ requests: [{ pasteData: { html: true, data: tables[8][0], coordinate: { sheetId: ss.getSheetByName(sheetName).getSheetId() } } }] }, ss.getId());

    Sheets.Spreadsheets.batchUpdate({ requests: [{ pasteData: { html: true, data: tables[7][0], coordinate: { sheetId: ss.getSheetByName(sheetName).getSheetId() } } }] }, ss.getId());
    
    
    return;
  }
  throw new Error("Expected table cannot be retrieved.");
}`enter preformatted text here`

Now, I have two more issues I'm looking to solve.

  1. Is there a way to tweak this code so both tables 8 and 7 are loading at once? I think my code here fills in with table 8, but then is replaced by table 7. I think because of the batchupdate?

  2. Is there a way to load more than one URL at once. Basically, I want to replicate this code so that 50 URLs can load in the tables at once.

I was previously just using a bunch of IMPORTHTML functions, but because the Playoffs table is a Javascript object, I'm struggling to figure it out.

Best Answer

 You are on the right track as the code is using batchUpdate. I suggest you to spend some time learning about how to handle Arrays in JavaScript. I.E. There is no need to include two batchUpdate as it's possible to put the request object of both in a single one, actually in a single batchUpdate it's possible to include 50 pasteData.

Your guess about that both "paste" operations are working and that the second overwrites the first is correct. The overwrites happens because the coordinate property only specifies the sheetId but not the rowIndex and columnIndex properties.

Try replacing:

Sheets.Spreadsheets.batchUpdate({ requests: [{ pasteData: { html: true, data: tables[8][0], coordinate: { sheetId: ss.getSheetByName(sheetName).getSheetId() } } }] }, ss.getId());

Sheets.Spreadsheets.batchUpdate({ requests: [{ pasteData: { html: true, data: tables[7][0], coordinate: { sheetId: ss.getSheetByName(sheetName).getSheetId() } } }] }, ss.getId());

by

Sheets.Spreadsheets.batchUpdate(
{ requests: [
{ pasteData: { html: true, data: tables[8][0], coordinate: { 
sheetId: ss.getSheetByName(sheetName).getSheetId(), 
rowId: 1,
columnId: 1
} } },

{ pasteData: { html: true, data: tables[7][0], coordinate: { 
sheetId: ss.getSheetByName(sheetName).getSheetId(),
rowIndex: 30,
columnIndex: 1
 } } }

] }, ss.getId()
);

Complete code:

function myFunction() {
  const url = "https://www.basketball-reference.com/players/c/curryst01/gamelog/2016"; // This URL is from your question.
  const sheetName = "Sheet1";  // Please set the destination sheet name.

  const html = UrlFetchApp.fetch(url).getContentText();
  const tables = [...html.matchAll(/<table[\s\S\w]+?<\/table>/g)];
  if (tables.length > 2) {
    const ss = SpreadsheetApp.getActiveSpreadsheet();
    Sheets.Spreadsheets.batchUpdate({
      requests: [
        {
          pasteData: {
            html: true, data: tables[8][0], coordinate: {
              sheetId: ss.getSheetByName(sheetName).getSheetId(),
              rowIndex: 1,
              columnIndex: 1
            }
          }
        },
        {
          pasteData: {
            html: true, data: tables[7][0], coordinate: {
              sheetId: ss.getSheetByName(sheetName).getSheetId(),
              rowIndex: 30,
              columnIndex: 1
            }
          }
        }
      ]
    }, ss.getId());


    return;
  }
  throw new Error("Expected table cannot be retrieved.");
}

Resources

Related Topic