Why is x=x++ undefined?

c language-design

It's undefined because it modifies x twice between sequence points. The standard says it's undefined, therefore it's undefined.
That much I know.

But why?

My understanding is that forbidding this allows compilers to optimize better. This could have made sense when C was invented, but now seems like a weak argument.
If we were to reinvent C today, would we do it this way, or can it be done better?
Or maybe there's a deeper problem, that makes it hard to define consistent rules for such expressions, so it's best to forbid them?

So suppose we were to reinvent C today. I'd like to suggest simple rules for expressions such as x=x++, which seem to me to work better than the existing rules.
I'd like to get your opinion on the suggested rules compared to the existing ones, or other suggestions.

Suggested Rules:

  1. Between sequence points, order of evaluation is unspecified.
  2. Side effects take place immediately.

There's no undefined behavior involved. Expressions evaluate to this value or that, but surely won't format your hard disk (strangely, I've never seen an implementation where x=x++ formats the hard disk).

Example Expressions

  1. x=x++ – Well defined, doesn't change x.
    First, x is incremented (immediately when x++ is evaluated), then its old value is stored in x.

  2. x++ + ++x – Increments x twice, evaluates to 2*x+2.
    Though either side may be evaluated first, the result is either x + (x+2) (left side first) or (x+1) + (x+1) (right side first).

  3. x = x + (x=3) – Unspecified, x set to either x+3 or 6.
    If the left operand x is read first, the result is x+3 (using the old value of x). It's also possible that x=3 is evaluated first, so it's 3+3. In either case, the x=3 assignment happens immediately when x=3 is evaluated, so the value it stores is then overwritten by the outer assignment.

  4. x+=(x=3) – Well defined, sets x to 6.
    You could argue that this is just shorthand for the expression above.
    But I'd say that += must be executed after x=3, and not in two parts (read x, evaluate x=3, add and store new value).

What's the Advantage?

Some comments raised this good point.
I certainly don't think expressions such as x=x++ should be used in any normal code.
Actually, I'm much more strict than that – I think the only good usage for x++ is as x++; alone.

However, I think the language rules must be as simple as possible. Otherwise programmers just don't understand them. The rule forbidding changing a variable twice between sequence points is certainly a rule most programmers don't understand.

A very basic rule is this:
If A is valid, and B is valid, and they're combined in a valid way, the result is valid.
x is a valid L-value, x++ is a valid expression, and = is a valid way to combine an L-value and an expression, so how come x=x++ isn't legal?
The C standard makes an exception here, and this exception complicates the rules. You can search stackoverflow.com and see how much this exception confuses people.
So I say – get rid of this confusion.

=== Summary of Answers ===

  1. Why do that?
    I tried to explain in the section above – I want C rules to be simple.

  2. Potential for optimization:
    This does take some freedom from the compiler, but I didn't see anything that convinced me that it might be significant.
    Most optimizations can still be done. For example, a=3;b=5; can be reordered, even though the standard specifies the order. Expressions such as a=b[i++] can still be optimized similarly.

  3. You can't change the existing standard.
    I admit, I can't. I never thought I could actually go ahead and change standards and compilers. I only wanted to consider whether things could have been done differently.

Best Answer

Maybe you should first answer the question of why it should be defined. Is there any advantage in programming style, readability, maintainability or performance in allowing such expressions with additional side effects? Is

y = x++ + ++x;

more readable than

y = 2*x + 2;
x += 2;

Keep in mind, too, that such a change is extremely fundamental and would break the existing code base.