Why is PHP’s method of comparing different types bad?

comparison, dynamic-typing, language-design, PHP, programming-languages

I'm working on designing a new programming language and trying to decide how it will handle variable comparisons. I've used many different kinds of languages, including PHP for years, and I've personally had zero bugs related to its comparison operations, other than situations where 0 == false. Despite this, I've heard a lot of negativity towards the way it compares values of different types.

For example, in PHP:

 2  <  100      # True
"2" < "100"     # True
"2" <  100      # True

In Python, the same comparisons go like this:

 2  <  100      # True
"2" < "100"     # False
"2" <  100      # False in Python 2; raises TypeError in Python 3

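For reference, here is a minimal sketch of how the two behaviors can be spelled out explicitly in Python 3. The helper name `numeric_less` is hypothetical and only for illustration. Note that Python 3 raises `TypeError` for the mixed `str`/`int` case instead of returning `False`.

```python
def numeric_less(a, b):
    """Compare two values numerically after explicit conversion (hypothetical helper)."""
    return float(a) < float(b)

# Lexicographic: the character "2" sorts after "1", so "2" < "100" is False.
lexicographic = "2" < "100"

# Numeric: the explicit conversion makes the intent visible at the call site.
numeric = numeric_less("2", "100")

# Python 3 refuses to order a str against an int.
try:
    mixed = "2" < 100
except TypeError:
    mixed = "TypeError"

print(lexicographic, numeric, mixed)
```
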
I don't see any value in Python's implementation (how often do you really need to see which of two strings is lexicographically greater?), and I see almost no risk in PHP's method and a lot of value. I know people claim it can create errors, but I don't see how. Is there ever really going to be a situation where you are testing if (100 == "100") and you don't want the string to be treated as a number? And if you really did, you could use === (which I've also heard people complain about, but without any substantial reason).

So, my question is, not counting some of PHP's weird conversion and comparison rules dealing with 0's and nulls and strings mixed with characters and numbers, are there any substantial reasons that comparing ints and strings like this is bad, and are there any real reasons having a === operator is bad?

Best Answer

The biggest problem is that an equivalence relation, the mathy term for things like ==, is supposed to satisfy three laws:

  1. reflexivity: a == a
  2. symmetry: a == b means b == a
  3. transitivity: a == b and b == c means a == c

All of these are very intuitive and expected. And PHP doesn't follow them.

'0'==0  // true
 0==''  // true (before PHP 8; false from PHP 8 on)
'0'=='' // false, AHHHH

So it's not actually an equivalence relation, which is a pretty distressing realization for some mathy people (including me).
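The failure can be checked mechanically. The sketch below is not PHP. It is a small Python model that hard-codes the three results listed above (pre-PHP 8 behavior) and searches for transitivity violations. The names `loose_eq` and `transitivity_violations` are hypothetical, and the lookup table only covers these three values.

```python
from itertools import product

# PHP values written as strings so that '0' (string) and 0 (int) stay distinct.
VALUES = ["'0'", "0", "''"]

# Pairs that PHP's loose == reports as equal, copied from the examples above.
# Assumption: pre-PHP 8 semantics. "'0'" == "''" is absent because it is false.
LOOSE_EQUAL_PAIRS = {("'0'", "0"), ("0", "''")}

def loose_eq(a, b):
    """Hypothetical model of PHP's == restricted to VALUES."""
    return a == b or (a, b) in LOOSE_EQUAL_PAIRS or (b, a) in LOOSE_EQUAL_PAIRS

def transitivity_violations(values, eq):
    """Return every (a, b, c) with a == b and b == c but not a == c."""
    return [
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if eq(a, b) and eq(b, c) and not eq(a, c)
    ]

violations = transitivity_violations(VALUES, loose_eq)
print(violations)
```

Every violation reported passes through `0` as the middle element, which matches the chain `'0' == 0 == ''` in the example.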

It also hints at one of the things people really hate about implicit casts: they often behave unexpectedly when combined with the mundane. Because the conversions aren't grounded in any principle, they amount to an arbitrary set of rules; weird stuff happens, and it all has to be specified case by case.

Basically, we sacrifice consistency, and the developer has to shoulder the extra burden of making sure there are no funny (and expensive) conversions happening behind the scenes. To quote this article:

Language consistency is very important for developer efficiency. Every inconsistent language feature means that developers have one more thing to remember, one more reason to rely on the documentation, or one more situation that breaks their focus. A consistent language lets developers create habits and expectations that work throughout the language, learn the language much more quickly, more easily locate errors, and have fewer things to keep track of at once.

EDIT:

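Another gem I stumbled across:

  NULL == 0   // true
  NULL < -1   // true

Since -1 < 0 as well, no single ordering can satisfy all of these comparisons. So if you try to sort anything, the result is not well-defined: it depends entirely on the order in which the comparisons are made. For example, with bubble sort: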

  bubble_sort([NULL, -1, 0]) // [NULL, -1, 0]
  bubble_sort([0, -1, NULL]) // [-1, 0, NULL]
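The two results can be reproduced with a small Python model. This is a sketch, not PHP itself. `php_greater` is a hypothetical stand-in for PHP's `>` that covers only integers and NULL (written as `None`). It assumes that when either side is NULL, both sides are compared as booleans.

```python
def php_greater(a, b):
    """Hypothetical model of PHP's > for ints and NULL (None here).

    Assumption: when either side is NULL, both sides are compared as booleans.
    """
    if a is None or b is None:
        return bool(a) > bool(b)
    return a > b

def bubble_sort(items, greater):
    """Plain bubble sort that swaps adjacent items when greater(left, right)."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        for i in range(end):
            if greater(result[i], result[i + 1]):
                result[i], result[i + 1] = result[i + 1], result[i]
    return result

first = bubble_sort([None, -1, 0], php_greater)
second = bubble_sort([0, -1, None], php_greater)
print(first, second)
```

Both inputs contain the same three values, yet `None` ends up at opposite ends, because the sort never finds a pair that the comparator considers out of order.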