Google Sheets – SUM Row of Numbers After Extracting with REGEX

google-sheets, google-sheets-arrayformula

I have columns of data that are of the form:

A1: "IDjohn / 35.00 / california"
A2: "IDmike / 25.00 / oregon"
A3: "IDrebecca / 40.00 / ohio"
B1: "IDchang / 20.00 / washington"
B2: "IDwill / 25.00 / delaware"

Each cell is quite dense with info. I'm trying to find a formula that would sum a whole column after extracting the number between the slashes. So the sum of column A would be 100.00 and the sum of column B would be 45.00.

Is such a thing possible? Or do I need to rework the data format into a more parse-able format?

Best Answer

Use regexextract, like this:

=arrayformula(sum(0+iferror(regexextract(A1:A3, "[\d\,\.]+"))))

How this works:

  • The regular expression extracts the first group of characters that consists of digits 0-9, commas, or dots (such as 12,345.67, or 1.23, or just 1). This is a basic number match; you may need a stricter number regular expression, for example if the ID before the first slash can contain digits.

  • The wrapper iferror is needed in case there is no such group (maybe the cell is empty), because regexextract has the annoying habit of showing #N/A instead of just giving an empty string.

  • Adding 0 forces Sheets to treat the result as a number.

  • Arrayformula applies the extraction to every cell in the range, and sum then adds up the results.
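
As a rough sanity check outside Sheets, the steps above can be sketched in Python. This is only an illustration under assumptions: the function name sum_first_numbers and the extra blank cell are made up for the example, and Python's re module stands in for the regex engine Sheets uses.

```python
import re

# Sample cells copied from the question; the empty string stands in for a blank cell.
column_a = [
    "IDjohn / 35.00 / california",
    "IDmike / 25.00 / oregon",
    "IDrebecca / 40.00 / ohio",
    "",
]
column_b = ["IDchang / 20.00 / washington", "IDwill / 25.00 / delaware"]

# Same character class as the Sheets formula: digits, commas, dots.
NUMBER = re.compile(r"[\d,.]+")


def sum_first_numbers(cells):
    """Hypothetical helper: sum the first number-like run found in each cell."""
    total = 0.0
    for cell in cells:
        match = NUMBER.search(cell)  # like regexextract: first match only
        if match is None:            # like iferror: a miss contributes nothing
            continue
        # Like "0+": turn the text into a number (thousands commas removed first).
        # float() raises ValueError if the run is not a number, e.g. a lone ".".
        total += float(match.group().replace(",", ""))
    return total


print(sum_first_numbers(column_a))  # 100.0
print(sum_first_numbers(column_b))  # 45.0
```

The totals match the 100.00 and 45.00 expected in the question, and the blank cell is skipped rather than causing an error.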

A slightly shorter alternative is regexreplace:

=arrayformula(sum(0+regexreplace(A1:A3, "[^\d\,\.]", "")))

Here, regexreplace removes everything that is not a digit, comma, or period. An advantage is that regexreplace does not produce #N/A when nothing matches. A disadvantage is that there is a greater chance of getting wrong results if, say, the third field (the state) or the ID has some digits 0-9 in it. Those digits would get joined onto the number you want.
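
The risk with the replace approach is easy to see in a small Python sketch. The helper name strip_non_numeric and the two cells containing stray digits are invented for illustration; they are not from the question's data.

```python
import re


def strip_non_numeric(cell):
    # Mirrors the regexreplace call: delete everything except digits, commas, dots.
    return re.sub(r"[^\d,.]", "", cell)


# A cell from the question: only the amount survives.
clean = strip_non_numeric("IDjohn / 35.00 / california")

# Hypothetical cells with digits outside the amount field.
digits_in_last_field = strip_non_numeric("IDjohn / 35.00 / area51")
digits_in_id = strip_non_numeric("IDjohn7 / 35.00 / ohio")

print(clean)                 # 35.00
print(digits_in_last_field)  # 35.0051
print(digits_in_id)          # 735.00
```

Neither of the last two results raises an error, so a bad total would go unnoticed. If the fields may contain digits, the regexextract version is the safer choice.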