Google-sheets – =IMPORTHTML forces content to date format

google sheetsgoogle-sheets-arrayformulagoogle-sheets-arraysimporthtml

I am a beginner with code and formulas here, so I was wondering if anyone could do exactly what was done on this question:

Trying to use Google Sheets importHTML() to import a table. It forces content to a date format

to this table:

https://www.actionnetwork.com/nfl/nfl-against-the-spread-standings

In short, IMPORTHTML changes sports records to date format. Changing the numbers format to plain text does not work when importing websites, as the format is preset. However, there is a workaround using the ARRAYFORMULA and REGEXREPLACE. When you open the link scroll down to Aurielle's response for the single formula version. I have tried to implement the workaround for the NFL record link above, but I can't quite find the right formula mostly because, again, I'm a beginner and I don't understand these formulas. I was wondering if anyone understood this better and could give me the correct formula for this specific table of NFL records.

I would greatly appreciate any help!

Best Answer

There are likely to be many solutions to your question. Even in this answer, I would expect that the structure could be simplified - possiblky along the lines of the answer by Rubén in Trying to use Google Sheets importHTML() to import a table. It forces content to a date format. In that answer, Rubén constructed a formula in which column() was a variable. This would be a convenient enhancement to this answer.

The answer is in two parts:

  • Part#1 - variables

    • Cell A1: the URL https://www.actionnetwork.com/nfl/nfl-against-the-spread-standings
    • Cell A2-A12 - (11 cells) contain the XPath for specific elements of the output formulae.
      • A2-Headers: //*[@id="__next"]/div/main/div/div[2]/div/table/thead/tr[1]
      • A3-Team Names: //*/div/main/div/div[2]/div/table/tbody/tr/td[1]
      • A4-Overall: //*/div/main/div/div[2]/div/table/tbody/tr/td[2]
      • A5-Home: //*/div/main/div/div[2]/div/table/tbody/tr/td[3]
      • A6-Away: //*/div/main/div/div[2]/div/table/tbody/tr/td[4]
      • A7-ATS: //*/div/main/div/div[2]/div/table/tbody/tr/td[5]
      • A8-ATS Home: //*/div/main/div/div[2]/div/table/tbody/tr/td[6]
      • A9-ATS-Away: //*/div/main/div/div[2]/div/table/tbody/tr/td[7]
      • A10-Ov/Un: //*/div/main/div/div[2]/div/table/tbody/tr/td[8]
      • A11-Ov/Un:Home //*/div/main/div/div[2]/div/table/tbody/tr/td[9]
      • A12-Ov/Un-Away: //*/div/main/div/div[2]/div/table/tbody/tr/td[10]
  • Part#2 - Output

    • Cell C1 - Headers/Column Titles
      • =importxml(A1,A2)
    • Cell C2 - Team names Column
      • =ARRAYFORMULA(trim(transpose(query(transpose(query({importxml(A1,A3)})),,COLUMNS(query({importxml(A1,A3)}))))))
    • Cells D2-L2 (9 columns) - column data
      • Overall: =arrayformula(substitute(ARRAYFORMULA(SUBSTITUTE(trim(transpose(query(transpose(query({importxml(A1,A4)})),,COLUMNS(query({importxml(A1,A4)})))))," "," → "))," → ",""))
      • Home: =arrayformula(substitute(ARRAYFORMULA(SUBSTITUTE(trim(transpose(query(transpose(query({importxml(A1,A5)})),,COLUMNS(query({importxml(A1,A5)})))))," "," → "))," → ",""))
      • Away: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A6)})),,COLUMNS(query({importxml(A1,A6)})))))," "," → ") )," → ",""))
      • ATS: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A7)})),,COLUMNS(query({importxml(A1,A7)})))))," "," → ") )," → ",""))
      • ATS Home: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A8)})),,COLUMNS(query({importxml(A1,A8)})))))," "," → ") )," → ",""))
      • ATS Away: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A9)})),,COLUMNS(query({importxml(A1,A9)})))))," "," → ") )," → ",""))
      • Ov/Un: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A10)})),,COLUMNS(query({importxml(A1,A10)})))))," "," → ") )," → ",""))
      • Ov/Un Home: =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A11)})),,COLUMNS(query({importxml(A1,A11)})))))," "," → ") )," → ",""))
      • Ov/Un Away" =arrayformula(substitute(ARRAYFORMULA( SUBSTITUTE( trim(transpose(query(transpose(query({importxml(A1,A12)})),,COLUMNS(query({importxml(A1,A12)})))))," "," → ") )," → ",""))

URL and XPaths

Paths


Output

Highlighted cells indicate a cell containing a formula.

Output


Credit: Max Makhrov: How to use CONCAT in QUERY?