I asked a question here about JS, regex, quantifiers and global search. I finally understand how this works, but let's take a concrete example and then I'll write my question.
Based on the same example:
var str = 'ddd';
var r = /d*/g;
console.log(str.match(r))
it outputs this array: ["ddd", ""]
I understand that the first item in the array is there because it matches the letter d,
and the last item (that empty string) is there because it matches the end of the string, which is nothing, so * makes sense because it matches 0 or more occurrences…
So, my questions are:
- Why is this happening?
- Why does it have to query the end of the string to finally obtain a true match?
In my opinion, the end of the string (ddd) should not be queried, because it's not as if my string contains an empty space at the end, like 'ddd '. If my string were empty, a match would be logical, but not in this case. My logic here is this:
for every character in the string, do this search/regex (d*) …so why does it continue with the end of the string? It should stop at the last character of my string, which in this case is d
…
Best Answer
match is just a wrapper for exec, per ES5 15.5.4.10, step 8(f)(i). For a global regex, match repeatedly calls exec until exec returns a null value.
When we look at exec, we see that each call to exec with a global regex updates the regex object's lastIndex after the match is made.
However (here's the real mechanical answer), lastIndex is only reset by exec when it is strictly greater than the length of the string. (Note i > length, not i >= length.)
Therefore, there will be a final attempt to match the substring whose left bound is lastIndex and whose right bound is the end of the string. Since the final match is done when those positions are identical, a last match attempt is always done on the empty string.
As it happens, d* matches the empty string (since * matches zero or more), so that match is included in the match results.
I cannot offer a surefire explanation for why the matching does not stop when lastIndex equals the string length. My guess is that this final empty-string check is necessary to match zero-length terminal regexes like /$/, which would never match if considered against a non-zero-length substring.
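
To make the mechanics above visible, here is a rough sketch that re-creates the global match loop by hand and records lastIndex before each exec call. The helper name traceGlobalMatch is made up for illustration; it is not part of any spec or library, and it only approximates what match does internally.

```javascript
// Hypothetical helper (not a built-in): calls exec in a loop, the way a
// global match does, and records where each attempt started.
function traceGlobalMatch(str, regex) {
  const steps = [];
  regex.lastIndex = 0;
  while (true) {
    const start = regex.lastIndex;
    const m = regex.exec(str);
    if (m === null) {
      // exec returned null, so the loop ends.
      steps.push({ start: start, match: null });
      break;
    }
    steps.push({ start: start, match: m[0] });
    // A zero-length match leaves lastIndex where it was, so bump it by one
    // to avoid looping forever (match performs an equivalent step).
    if (m[0] === '') {
      regex.lastIndex += 1;
    }
  }
  return steps;
}

console.log(traceGlobalMatch('ddd', /d*/g));
// [ { start: 0, match: 'ddd' },
//   { start: 3, match: '' },     <- attempt at the end of the string
//   { start: 4, match: null } ]  <- lastIndex > length, exec gives up

console.log('ddd'.match(/$/g)); // [ '' ]
```

The second attempt starts at index 3, which equals the string length, and d* happily matches the empty string there. Only once lastIndex has moved past the length does exec return null. The last line shows the /$/ case from the guess above: it matches only at that end position.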