Google-sheets – Import Price from Gamestop Into Google Sheets

google sheetsimportxml

I'm trying to obtain a price from a given Gamestop.com URL and have it imported into a Google Sheets document, but I can't seem to make it work. I've tried to use the importXML formula but don't seem to be getting to syntax to work correctly.

For example, I'm trying to get the price from this URL: https://www.gamestop.com/xbox-360/games/call-of-duty-modern-warfare-2/74392 but am not able to do so properly. Every syntax that I attempt to enter returns:

"Imported XML cannot be parsed" or "No content".

This is what I'm currently trying but it's not working:

=IMPORTXML("https://www.gamestop.com/xbox-360/games/call-of-duty-modern- 
 warfare-2/74392","class='ats-prodBuy- price'/html/body/form/div[3]/div[2]/div[2]/div[1]/div/div/div[2]/div/div[5]/div[2]/div[2]/div[2]/h3")

What do I need to do differently to get it working properly?

Best Answer

Two things to note - as to why your formula isn't working - 1 being that the syntax is wrong - which i will explain - and 2- which is not your fault - in that the way their populate the data is likely from some framework using templates.

As far as importing your price - I cant guarantee how reliable this will be forever - but i tested it with a few different products and it seemed to always give back the right price -

=TRIM(regexextract(regexreplace(concatenate(IMPORTXML(A1,"//head")),"\n"," "), "(\$\W\d+\S+)+"))

What I am doing above is pulling in the tags from the head section of the site, then using concatenate to join all the tags together.

The regexreplace portion is to remove all the carriage returns and newlines - simply because they annoy me to death, and its impossible to work with a cell that is bigger than what you can see.

 The final regexextract (\$\W\d+\S+) pattern means the following: 

 \$ is an escaped dollar sign, since $ is reserved otherwise to
 mean end of line

 so it looks for that symbol followed by a special character '\W' 

 followed by any digit '\d' 

 and then any characters following up until the next space '\S+'

For your xpath syntax - note that the fact that i used head is actually an edge case - and most sites dont add their data into the meta data of the website - we just got lucky in that one - however for other sites if you wanted to fetch the field using the class name and the h3 element you would instead structure your formula like this:

=importxml("url", "//*[@class='ats-prodBuy-price']")

or 

=importxml("url", "//h3")

although the h3 by itself could easily be annoying as it is likely not the only h3 found on the page in terms of most common scenarios

Lastly I dont have a good explanation as to why the price shows up with an asterisk - or why it loads into the head when i use importxml but is not technically visible in the html itself but there you go

enter image description here