Parsing – How to Add Precedence to LALR Parser Like in YACC

parsing

Please note, I am asking about writing LALR parser, not writing rules for LALR parser.

What I need is…

…to mimic YACC precedence definitions. I don't know how it is implemented, and below I describe what I've done and read so far.


For now I have basic LALR parser written. Next step — adding precedence, so 2+3*4 could be parsed as 2+(3*4).

I've read about precedence parsers, however I don't see how to fit such model into LALR. I don't understand two points:

  • how to compute when insert parenthesis generator

  • how to compute how many parenthesis the generator should create

I insert generators when the symbols is taken from input and put at the stack, right? So let's say I have something like this (| denotes boundary between stack and input):

  1. ID = 5 | + ..., at this point I add open, so it gives
  2. ID = < 5 | + ..., then I read more input
  3. ID = < 5 + | 5 ... and more
  4. ID = < 5 + 5 | ; ... and more
  5. ID = < 5 + 5 ; | ...

At this point I should have several reduce moves in normal LALR, but the open parenthesis does not match so I continue reading more input. Which does not make sense.

So this was when problem.

And about count, let's say I have such data < 2 + < 3 * 4 >. As human I can see that the last generator should create 2 parenthesis, but how to compute this? After all there could be two scenarios:

  • ( 2 + ( 3 *4 )) — parenthesis is used to show the outcome of generator

  • or (2 + (( 3 * 4 ) ^ 5) because there was more input

Please note that in both cases before 3 was open generator, and after 4 there was close generator. However in both cases, after reading 4 I have to reduce, so I have to know what generator "creates".

Best Answer

Consider the following rules:

[1] E -> E + E
[2] E -> E * E
[3] E -> num

Without precedence, it would produce a shift/reduce error when encountering the second operator. The table looks something like this:

    num     +       *       $       E
0   S2      ---     ---     ---     1
1   ---     S3      S4      Accept  ---
2   ---     R3      R3      R3      ---
3   S2      ---     ---     ---     5
4   S2      ---     ---     ---     6
5   ---     R1/S3   R1/S4   R1      ---
6   ---     R2/S3   R2/S4   R2      ---

With precedence, you are saying which of "shift" and "reduce" you want in the case of conflicts.

  1. If the first operator has a lower precedence, you shift.
  2. If the first operator has a higher precedence, you reduce.
  3. If the operators have equal precedences, and both are
    • left associative, you reduce.
    • right associative, you shift.
    • otherwise, you fail.

Cleaning up the errors in the table, you get something like this:

    num     +       *       $       E
0   S2      ---     ---     ---     1
1   ---     S3      S4      Accept  ---
2   ---     R3      R3      R3      ---
3   S2      ---     ---     ---     5
4   S2      ---     ---     ---     6
5   ---     R1      S4      R1      ---
6   ---     R2      R2      R2      ---
Related Topic