Google Sheets – How to Convert ‘4.8 Million’ into an Actual Number

formulas, google sheets

I have a sheet whose numeric records are a mix of three kinds of values (plain numbers, millions, and billions):

260,000
4.8 million
870 million
1.1 billion

The first one is handled just fine as a number, but values written out like 4.8 million are difficult to convert into numbers.

I need a formula that converts a numeric string, such as any of the four examples, into a regular number. If it helps, the large numbers have at most one digit after the decimal point, and no named amount is larger than a billion.

If I were doing this in JavaScript, I would check whether the string contains the substring 'illion', split it into an array on the space, and then multiply the first element by 1000000 or 1000000000 depending on the second word. I'm not sure how to do the same in a spreadsheet formula.
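The split-and-multiply idea described above can be sketched as follows. This is only an illustration, written in Python rather than JavaScript; the function name parse_amount and the multipliers table are hypothetical, and the sketch assumes the inputs look like the four examples (nothing larger than a billion).

```python
def parse_amount(text: str) -> float:
    """Convert strings such as '260,000' or '4.8 million' to a number."""
    # Assumption: only these two named amounts occur in the data.
    multipliers = {"million": 1_000_000, "billion": 1_000_000_000}
    # Split on whitespace: ['4.8', 'million'] or just ['260,000'].
    parts = text.strip().lower().split()
    # Remove thousands separators before converting the numeric part.
    number = float(parts[0].replace(",", ""))
    # Scale only when a second word containing 'illion' is present.
    if len(parts) > 1 and "illion" in parts[1]:
        number *= multipliers[parts[1]]
    return number


if __name__ == "__main__":
    for sample in ["260,000", "4.8 million", "870 million", "1.1 billion"]:
        print(sample, "->", parse_amount(sample))
```

Because the arithmetic is done in floating point, the results are best compared with a small tolerance rather than for exact equality.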

Best Answer

Optimized Array Formula

This formula is faster than my comparable array formula at the end of this answer. I favor shorter code where possible, but apparently shorter code doesn't always mean shorter execution time.

=LET(rng,A1:A, arr,ARRAY_CONSTRAIN(rng,MAX(FILTER(ROW(rng), rng<>"")),1),
  m,INDEX(IFS(arr="",,ISNUMBER(arr),,1,IFERROR(REGEXEXTRACT(arr, " (\w)illion")))),
  v,IF(m="",,REGEXEXTRACT(arr, "([\d\.]*) ")),
  ARRAYFORMULA(IFS(m="m",v*10^6, m="b",v*10^9, 1,arr)))
  1. rng references Column A
  2. arr stores an array created by removing any empty rows after the last populated cell in rng.
  3. m holds an array of multiplier codes, one per value in arr. Each entry is:
    1. blank, if the value in arr is a number or a blank,
    2. a single word character \w, if one is captured by the regular expression (\w)illion, that is, the letter immediately before illion. Presumably it is the m from million or the b from billion, but other incidental captures aren't a concern,
    3. lastly, blank again if a string in Column A doesn't match (\w)illion: REGEXEXTRACT raises an error in that case, and IFERROR turns it into a blank. This is also not a concern.
  4. v stores an array of the same dimensions as m. It is blank where m is blank, and holds the number extracted from arr wherever m successfully captured a single word character.
  5. Finally, inside an ARRAYFORMULA, IFS evaluates multiple conditions, returning the value corresponding to the first true condition.
    1. Conditions one and two check whether m is the letter "m" or "b", returning v*10^6 or v*10^9 respectively.
    2. If both previous conditions are false, the original value from arr (Column A) is returned, whether it is a text string, a number, or anything else. To achieve this, the last condition is set to 1, which is equivalent to TRUE, so whenever that condition is reached it always succeeds. This fallback is why most errors generated in the earlier stages are not relevant: if there is no "m" or "b", the formula returns the original value.
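For readers who find the steps easier to follow outside a spreadsheet, here is a rough Python sketch of the same logic. It is not a translation Sheets would accept, only an illustration: the names convert_cell, convert_column, and MULTIPLIERS are hypothetical, and the extra guard for an empty numeric capture is an assumption added for this sketch.

```python
import re

# Letter captured in front of "illion" mapped to its scale factor.
MULTIPLIERS = {"m": 10**6, "b": 10**9}


def convert_cell(cell):
    """Mirror the m / v / IFS steps for a single cell value."""
    # Blanks and real numbers pass through untouched.
    if cell == "" or isinstance(cell, (int, float)):
        return cell
    # m: the letter in front of "illion", or None when nothing matches.
    match = re.search(r" (\w)illion", cell)
    m = match.group(1) if match else None
    if m not in MULTIPLIERS:
        # Fallback branch: return the original value unchanged.
        return cell
    # v: the run of digits and dots that ends at a space.
    number_text = re.search(r"([\d.]*) ", cell).group(1)
    if not number_text:
        # Assumption for this sketch: no leading number means no conversion.
        return cell
    return float(number_text) * MULTIPLIERS[m]


def convert_column(cells):
    """Convert a column, ignoring blank rows after the last populated one."""
    last = max((i for i, c in enumerate(cells) if c != ""), default=-1)
    return [convert_cell(c) for c in cells[: last + 1]]


if __name__ == "__main__":
    column = [260000, "4.8 million", "870 million", "1.1 billion", "n/a", "", ""]
    print(convert_column(column))
```

As in the formula, anything that is not recognised as a million or billion amount (here the string "n/a") comes back exactly as it went in.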

Earlier Answer

Martin's answer is a good approach. Originally I created a slight variation of it, adding only some code reduction and extending it to work as an array formula. I found later that changing the approach resulted in better performance.

Credit: Martin's Original Formula:

=IF(A1="","",IF(ISNUMBER(A1),A1,VALUE(REGEXEXTRACT(A1,"[0-9\.]*"))*
   IF(REGEXMATCH(A1,"million"),1000000,IF(REGEXMATCH(A1,"billion"),1000000000,1))))
Variation on Martin's Formula:
=IFS(A1="",, ISNUMBER(A1),A1, 1,REGEXEXTRACT(A1,"[\d\.]*") * 10^ 
   (REGEXMATCH(A1,"m")*6 + REGEXMATCH(A1,"b")*9))
Array Formula Version:
=LET(rng,A1:A, INDEX(
   IFS(rng="",, ISNUMBER(rng),rng, 1,REGEXEXTRACT(rng,"[\d\.]*") * 10^ 
     (REGEXMATCH(rng,"m")*6 + REGEXMATCH(rng,"b")*9))))
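The shorter variations rely on a small arithmetic trick: each REGEXMATCH result counts as 0 or 1, so the exponent works out to 0, 6, or 9. A minimal Python sketch of that idea follows; the function name scale_by_exponent is hypothetical, and plain substring tests stand in for REGEXMATCH.

```python
import re


def scale_by_exponent(text: str) -> float:
    """Scale the leading number by 10**0, 10**6, or 10**9."""
    # Leading run of digits and dots, as in REGEXEXTRACT(A1, "[\d\.]*").
    number = float(re.match(r"[\d.]*", text).group())
    # Booleans behave as 0 or 1, so each test adds its exponent or nothing.
    exponent = ("m" in text) * 6 + ("b" in text) * 9
    return number * 10 ** exponent


if __name__ == "__main__":
    for sample in ["12", "4.8 million", "870 million", "1.1 billion"]:
        print(sample, "->", scale_by_exponent(sample))
```

Note the trade-off that comes with the brevity: any text containing an "m" or a "b" is scaled, and a string containing both letters would get an exponent of 15, so this form suits data that is known to contain only plain numbers, millions, and billions.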