How to Distinguish Between Keywords and Identifiers

language-design lexer parsing

I'm aware that most modern languages reserve their keywords so that they cannot be used as identifiers.

Reserved words aside, let's assume a language that allows keywords to be used as identifiers (for example, in Ruby a keyword can be used as a method name). During compilation, how would you deal with this ambiguity?

The lexing phase doesn't seem like a good fit, because the lexer would have to consider the tokens around each keyword. The parsing phase doesn't seem like a good fit either, since ideally the parser would work with tokens that are unambiguous.

If I had to design it myself, I suppose I would have the lexer yield an ambiguous token, then have another level that considers the ambiguous token in the context of the tokens around it (e.g. does the ambiguous token follow a def keyword? Then it must be an identifier). Then I would hand the unambiguous tokens to the parser.
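To make that idea concrete, here is a minimal sketch of such an intermediate pass. Everything in it is made up for illustration: the Token struct, the KEYWORDS list, and the two context rules (after def, after a dot) are assumptions, not how any real compiler does it.

```ruby
# Hypothetical token type and keyword list, for illustration only.
Token = Struct.new(:type, :text)
KEYWORDS = %w[def begin end if].freeze

# Stage 1: the lexer tags every keyword-shaped word as :ambiguous.
def lex(source)
  source.scan(/[A-Za-z_]\w*|[().]/).map do |text|
    type =
      if KEYWORDS.include?(text)
        :ambiguous
      elsif text.match?(/\A[A-Za-z_]/)
        :identifier
      else
        :punct
      end
    Token.new(type, text)
  end
end

# Stage 2: resolve each ambiguous token by looking at the token before it.
def disambiguate(tokens)
  previous = nil
  tokens.map do |token|
    resolved = token
    if token.type == :ambiguous
      after_def = previous && previous.type == :keyword && previous.text == "def"
      after_dot = previous && previous.text == "."
      type = (after_def || after_dot) ? :identifier : :keyword
      resolved = Token.new(type, token.text)
    end
    previous = resolved
    resolved
  end
end

p disambiguate(lex("def begin")).map(&:type)  # => [:keyword, :identifier]
```

A pass like this only looks one token back, so it would need more rules (or more lookbehind) for every extra context in which a keyword may appear as a name.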

In languages that allow keywords to be used as identifiers, what is the standard way for the compiler to tell them apart?

Best Answer

Note that in Ruby you cannot call a method with such a name directly, i.e. without a receiver. For example, you cannot do

begin()

You can do

obj.begin()

This works because, in that position, the grammar can have rules like:

*Arguments* :
    "(" ")"

*MemberExpression* :
    *MemberExpression* "." *IdentifierName*

*CallExpression* :
    *MemberExpression* *Arguments*

(Rules unrelated to the example are left out for brevity.)

Recognizing this only requires separating the rule Identifier from IdentifierName:

*Identifier*:
    *IdentifierName* **but not reserved word**

*IdentifierName*:
    //Rules for identifier names here
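As a rough sketch of how a hand-written parser could apply this split: the lexer below emits every word as an IdentifierName, and only the parser decides, by position, whether reserved words are allowed. The RESERVED list, the method names, and the array-based tree are assumptions for illustration, and using a plain Identifier as the base case of MemberExpression stands in for the rules left out above.

```ruby
require "strscan"

# Hypothetical reserved-word list; a real language would have its own.
RESERVED = %w[begin end def if].freeze

# Lexer: every word is an IdentifierName. It never decides keyword vs. identifier.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif (word = scanner.scan(/[A-Za-z_]\w*/))
      tokens << [:identifier_name, word]
    elsif (punct = scanner.scan(/[().]/))
      tokens << [:punct, punct]
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

# Identifier : IdentifierName but not reserved word
def parse_identifier(tokens)
  type, text = tokens.shift
  raise ArgumentError, "expected an identifier" unless type == :identifier_name
  raise ArgumentError, "'#{text}' is reserved here" if RESERVED.include?(text)
  [:identifier, text]
end

# MemberExpression : MemberExpression "." IdentifierName
def parse_member_expression(tokens)
  node = parse_identifier(tokens)
  while tokens.first == [:punct, "."]
    tokens.shift
    type, name = tokens.shift
    raise ArgumentError, "expected a name after '.'" unless type == :identifier_name
    node = [:member, node, name] # reserved words are accepted in this position
  end
  node
end

# CallExpression : MemberExpression Arguments, with Arguments : "(" ")"
def parse_call_expression(tokens)
  callee = parse_member_expression(tokens)
  raise ArgumentError, "expected '('" unless tokens.shift == [:punct, "("]
  raise ArgumentError, "expected ')'" unless tokens.shift == [:punct, ")"]
  [:call, callee]
end

p parse_call_expression(tokenize("obj.begin()"))
```

With this sketch, "obj.begin()" parses into a call on a member node, while "begin()" is rejected in parse_identifier because begin appears where a plain Identifier is required.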

If a statement starts with begin, as in

begin()

then the parser has already committed to a rule like

*Block*:
    "begin" *indent* *statement* *outdent* "end"

Ruby doesn't try to figure out what you meant; it simply treats this as the start of a block.
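You can check this behavior yourself with a few lines of Ruby. The Job class here is just a made-up example; the point is the difference between the call with a receiver and the bare begin().

```ruby
class Job
  # A keyword is accepted as a method name right after `def`.
  def begin
    :started
  end
end

job = Job.new
p job.begin # => :started (a receiver is present, so `begin` is a method name)

# Without a receiver, `begin` opens a block that needs a matching `end`,
# so the source text "begin()" on its own does not parse.
bare_call_is_error =
  begin
    eval("begin()")
    false
  rescue SyntaxError
    true
  end
p bare_call_is_error
```

Inside the class you can still reach the method as self.begin, since the receiver again gives the parser the prefix it needs.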

But for method names that follow a receiver or some other prefix, it is easy to allow keywords in the grammar; JavaScript, for example, does this too.

Grammar examples are taken from ECMA-262.