Google-sheets – Matching a cell value to a category from patterns table

google sheetsgoogle-sheets-queryregex

I'm trying to categorize a cell value in Google Spreadsheets, based on a categories table, so that depending on the value in the cell, the correct category will be determined.

Given a categories table like this:

A       B
Label   Category    
bla 1   cat1    
bla 2   cat2    
...

then VLOOKUP can find exact matches:

A            B
Description  Category
bla 1        =VLOOKUP(A10, A$2:B$6, 2, False)
bla 2        =VLOOKUP(A11, A$2:B$6, 2, False)

However, VLOOKUP only works because the categories table contains exact matches of the possible values for the cell. What I want is the categories table to contain patterns:

A        B 
Pattern  Category
bla.*1   cat1
bla.*2   cat2
foo      cat1

This would be trivial to do using Google Apps Script where I could easily write a loop that checks for a match on any of a set regular expressions, but I'm curious to find how to do it using spreadsheet builtin functions.

The docs for VLOOKUP indicate to use Query if matching by pattern, but I don't think that Query can be used for this, because Query matches on a specific regexp, whereas my categories table has many. I tried "=query(A$1:B$6, select F where A matches G" but that does not work.

So is there a way to do this or is a custom function (using GAS) my only option? Basically I'm looking for a way to say "find the category in table for which the cell of interest (not in the category table) matches the category's regular expression".

Best Answer

Contrary to what I had concluded earlier, "where 'string' matches B" works:

Description   Expected Category Inferred Category
bla 1          cat1             =query(A$1:B$3, "select C where '" & A8 & "' matches B limit 1")
bla 2          cat2             =query(A$1:B$3, "select C where '" & A9 & "' matches B limit 1")
bla blue 2     cat2             =query(A$1:B$3, "select C where '" & A10 & "' matches B limit 1")
bla green 1    cat1             =query(A$1:B$3, "select C where '" & A11 & "' matches B limit 1")
foo            cat1             =query(A$1:B$3, "select C where '" & A12 & "' matches B limit 1")

produces the expected output. You have to limit the result to 1 in case there are more than one matches.