Google-sheets – Extracting text from a capture group while using an arrayformula to consider multiple criteria

google sheetsgoogle-sheets-arrayformularegexregexextract

I have a page which contains a list of regular expressions in the format "some text (thing I want) but only if (thing is matched here)" that are fairly varied. There are 29 of them so far, and I'd prefer not to put them into a single formula, though that is my fallback. A sample of what I'm doing is here.

I have tried several techniques, for example I used textjoin() to concatenate all my conditions and am able to correctly get a match – that is just the true/false that this is valid – but I am unable to then perform the corresponding extract because I don't know what row I've matched on. I thought this would be the best way to go, but other formulas like VLOOKUP can't be used with a regular expression so I'm uncertain how to obtain that data.

The closest I've gotten is shown here, that returns the thing I want but the other groups as well.

=textjoin("",true,arrayformula(if(iserror(REGEXEXTRACT(E2,'Criteria'!B2:B)),"",choose(1,REGEXEXTRACT(E2,'Criteria'!B2:B)))))

I'm using textjoin so that the result isn't overwritten by "" by non matches on other lines and my expectation was that choose would restrict the textjoin only to the first element but this is not the output I'm seeing.

Thoughts on how to extract only the pattern match for "thing I want"?

Best Answer

You would need to alter your "regular expression" in the sample regex

from: ((?:[[:alpha:]]+\s?)+)
to: (?:[[:alpha:]]+\s?)+

Following that, your formula will work just fine.

=textjoin("",true,arrayformula(if(iserror(REGEXEXTRACT(B2,Criteria!$A$2:$A)),"",choose(1,REGEXEXTRACT(B2,Criteria!$A$2:$A)))))

enter image description here

You could also use [A-Za-z] instead of [:alpha:] as shown in cell C2.


BUT
If you also change your formula and use the Arrayformula function in a different way, you can use just one formula for all rows.

=ArrayFormula(IFERROR(REGEXEXTRACT(B2:B,Criteria!$A$2:$A)))

enter image description here