Algorithms – Precedence of Functions in Shunting-Yard Algorithm

algorithmsoperator precedenceshunting-yard

I am working through the Shunting-yard algorithm, as described by wikipedia.

The description of the algorithm when dealing with operators is as follows:

If the token is an operator, o1, then:

while there is an operator token, o2, at the top of the operator
stack, and either
o1 is left-associative and its precedence is less than or equal to
that of o2, or

o1 is right associative, and has precedence less than that of o2,
then pop o2 off the operator stack, onto the output queue;

push o1 onto the operator stack.

However, they give the following example:

Input: sin max 2 3 / 3 * 3.1415

When the algorithm hits the / token, the description of what should happen is as follows:

Token |        Action       |   Output (in RPN) |   Operator Stack
...
/     | Pop token to output | 2 3 max           | / sin 
...

They are popping the function token max off the stack and putting into the queue. According to their algorithm, this would seem to mean the function token is both an operator, and has a precedence less than that of the / operator.

There is no explanation as to whether or not this is the case. So, for the Shunting-yard algorithm, what is the precedence of a function? Are functions right or left associative? Or is wikipedia just incomplete/inaccurate?

Best Answer

I believe the direct answer is simply that functions are not operators. From the page you linked:

If the token is a function token, then push it onto the stack.

This is all it needs to say, since the function case (prefix to postfix) is much simpler than the operator case (infix to postfix).

For the follow-up questions: The notions of precedence and associativity are only needed because of the inherit ambiguity in any expression with multiple infix operators. Function tokens are already using prefix notation, so they simply don't have that problem. You don't need to know whether sin or max has "higher precedence" to figure out that max needs to be evaluated first; it's already clear from the order of the tokens. That's why computers prefer pre/postfix notation to begin with, and why we have this algorithm for converting infix to pre/postfix.

You do need to have some kind of rule for where a function's arguments start and end when no parenthesis are present, so you could say that functions "take precedence" over operators or vice versa. But unlike infix operators, a single consistent rule for all functions is enough to make their compositions completely unambiguous.

Related Solutions

Algorithm to go from infix notation to a tree

Treat the comma as an infix operator. Then

max(1,3,7)*4+5

becomes

The comma should have a lower precedence than your calculation operators (+ - * / etc.).

Incorporating functions into a Shunting-Yard algorithm implementation

I use a two state, two stack model similar to the Shunting Yard algorithm. The two states are the unary state and the binary state. In each state, you handle tokens (or issue errors) and then either continue the same state or switch to the other state.

In the unary state, if you find an open paren, (, it is operator grouping, and, you stay in the unary state. (In the unary state, an identifier is a variable reference you push it onto the operand stack, and switch to the binary state; a minus, -, is a unary negation, and you stay in unary state.)

In the binary state, if you find an open paren, it is a function invocation (and you switch to the unary state). Differentiating between the two states also help you deal with unary and pre/post-fix operators. (Here a minus, -, is a binary subtraction and you switch to unary state.)

ok, for the expression (*p[i++])(), p is a pointer to pointer to function, right? So,

(Note that the empty parameter list expression should only be allowed when function calls are actually present; there are various ways and places for detecting the various possible errors.)

The actual tokens handled in unary state and in binary state are determined by the language you're trying to parse. I allow separate code sections for the unary state and the binary state. There seems to be no point in mixing them and then you don't need a state variable either (the code section just knows which one it is).

Best Answer

Related Solutions

Algorithm to go from infix notation to a tree

Incorporating functions into a Shunting-Yard algorithm implementation

Related Topic