Google-sheets – How to extract the price using importxml from an AngularJS site

google sheetsimportxmlregexextract

I need to extract the price from this website using importXML on Google Spreadsheets.

Here is a product
https://en-sa.wadi.com/apple-iphone-6s-16-gb-4g-rose-gold-with-facetime-61616.html

The issue is they are using AngularJS, so the typical method is not working.

This is what I was using earlier, but it's no longer working.

=REGEXEXTRACT(REGEXREPLACE(ImportXML("https://en-sa.wadi.com/sony-xperia-z3-16-gb-4g-lte-black-dual-sim-355.html", "//div[@class='info-module other-sellers']//p"), "{{ctrl.selectedSupplier.suppliers.length-1}}", ""),"[0-9]+")

Best Answer

When the data you're after is in a script, you can get all script content with xPath //script, join the results (so you don't have to rely on the script being 7th or 11th in the source), and parse the results with regexextract. The following formula returns 2447 as of now:

=regexextract(regexextract(join(" ", IMPORTXML("https://en-sa.wadi.com/apple-iphone-6s-16-gb-4g-rose-gold-with-facetime-61616.html", "//script")), """offerPrice"":\d+"), "\d+")

Here, the first regular expression """offerPrice"":\d+" matches "offerPrice":2447 (the quote character is doubled inside the string). The second regexextract keeps only the number 2447.

The above assumes that "offerPrice" is the thing you want. To extract the regular price, use """price"":\d+" instead.

Related Topic