Coding Style – Is Changing Variable Types Mid-Procedure Bad?

coding-style dynamic-typing type-systems weak-typing

In Python (and occasionally PHP) where variables do not have fixed types, I'll frequently perform 'type transformations' on a variable part-way through my code's logic. I'm not (necessarily) talking about simple casts, but about functions that change the type of a variable while leaving it basically representing the same value or data.

For example, I might write code like this when doing a web request, using the response variable to store first an addinfourl object, then the string content of that object, and finally the dictionary returned by parsing the string as JSON:

response = urlopen(some_url)
assert response.info().type == 'application/json'

response = response.read()
logger.debug('Received following JSON response: ' + response)

response = json.loads(response)
do_something(response)

(Okay, it's a slightly contrived example, but I think it demonstrates the idea fine). My feeling is that this is better than using three separate variable names because a) it conveys that the response variable contains basically the same information, just 'transformed' into a different type, and b) it conveys that the earlier objects aren't going to be needed any further down the function, since by reassigning over their variable I've made them unavailable to later code.

The tradeoff, I guess, is that the reader could become confused about what type the variable is at any given moment.

Is this bad style, and should I be using three different variables in the above example instead of reassigning to one? Also, is this standard practice in dynamically typed languages, or not? I haven't seen enough of other people's code to know.

Best Answer

I'll go out on a limb and say: No, this is a terrible idea.

It's just a special case of reusing a variable, which is a bad idea - mainly because it makes it hard to understand what a variable contains at any given point in the program flow. See e.g. Should I reuse variables?

About your points: they are valid; it's just that reusing the variable is not a good solution :-).

a) it conveys that the response variable contains basically the same information, just 'transformed' into a different type

Providing this information is a good idea. However, don't do this by using the same variable, because then you obscure the fact that the information was transformed. Rather, use names with a common pre-/postfix. In your example:

rawResponse = urlopen(some_url)
[...]
jsonResponse = rawResponse.read()
[...]
responseData = json.loads(jsonResponse)
[...]

This makes it clear that the variables are closely related, but also that they do not contain the same data.

b) it conveys that the earlier objects aren't going to be needed any further down the function, since by reassigning over their variable I've made them unavailable to later code.

Again, communicating this "no longer needed" is good, but don't do it by reusing the variable: the reassignment will usually be hard to see, so you only confuse the reader.

Rather, if a variable lives long after its last use, that is an indication the method/function is too long. Split the part with the short-lived variables into a sub-function; that makes the code easier to read, and limits the variable lifetime.
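A minimal sketch of that suggestion, written in Python 3 to match the question. The helper name `parse_json_body` and the in-memory stand-in for the HTTP response are hypothetical and only serve the illustration:

```python
import io
import json


def parse_json_body(raw_response):
    """Read a file-like response and return the decoded JSON data.

    The intermediate text only lives inside this helper, so the caller
    never sees a half-transformed value.
    """
    json_text = raw_response.read().decode("utf-8")
    return json.loads(json_text)


# Stand-in for the object urlopen() would return (assumption for this demo).
fake_response = io.BytesIO(b'{"status": "ok", "items": [1, 2, 3]}')
response_data = parse_json_body(fake_response)
print(response_data["status"])
```

The caller ends up with a single well-named variable, and the short-lived intermediate goes out of scope when the helper returns.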

Note: I usually go one step further than not reusing variables and try to assign each variable only once (i.e. never change the value, treating it as immutable). This is an idea mainly from functional languages, but I have found it can make code much clearer. Of course, in non-functional languages you sometimes need to change a variable (the obvious example being a loop variable), but once you start looking, you'll see that in most cases a "fresh" variable makes for more readable and less bug-prone code.
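As a rough illustration of the assign-once habit in Python: the language does not enforce single assignment at runtime, but `typing.Final` (Python 3.8+) lets an external checker such as mypy flag a rebinding. The variable names and the input value below are made up for the example:

```python
from typing import Final

# Each step gets a fresh name that is bound exactly once.
raw_text: Final = "  42  "            # hypothetical input
stripped_text: Final = raw_text.strip()
answer: Final = int(stripped_text)

print(answer)
```

Each name keeps one meaning and one type for its whole lifetime, which is the property the note is after.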

String Concatenation

Obviously this is not a problem in PHP because there are separate string concatenation (.) and addition (+) operators.

^JavaScript

var a = 5;
var b = "10"
var incorrect = a + b; // "510"
var correct = a + Number(b); // 15
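For comparison with the question's own language, a short Python sketch of the same situation. Python refuses to mix the two types instead of coercing, so the conversion has to be spelled out either way:

```python
a = 5
b = "10"

try:
    a + b  # int + str is rejected rather than silently coerced
except TypeError as exc:
    error_message = str(exc)

as_number = a + int(b)   # numeric addition after an explicit conversion
as_text = str(a) + b     # string concatenation after an explicit conversion

print(error_message)
print(as_number, as_text)
```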

String Comparison

Often in computer systems "5" is greater than "10" because the strings are compared character by character rather than as numbers. Not so in PHP, which, even if both operands are strings, recognizes that they are numeric and removes the need for a cast:

^JavaScript

console.log("5" > "10" ? "true" : "false"); // true

^PHP

echo "5" > "10" ? "true" : "false";  // false!
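Python sides with the JavaScript behaviour here. A small sketch, using the same two values, of the string comparison and the explicit numeric one:

```python
a, b = "5", "10"

# Strings are compared character by character, so "5" sorts after "1".
lexicographic = a > b

# An explicit conversion is needed to compare the numeric values.
numeric = int(a) > int(b)

print(lexicographic, numeric)
```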

Function signature typing

PHP implements a bare-bones type-checking on function signatures, but unfortunately it's so flawed it's probably rarely usable.

I thought I might be doing something wrong, but a comment on the docs confirms that built-in types other than array cannot be used in PHP function signatures (this describes PHP 5; scalar type declarations were added in PHP 7), though the error message is misleading.

^PHP

function testprint(string $a) {
    echo $a;
}

$test = 5;
testprint((string)5); // "Catchable fatal error: Argument 1 passed to testprint()
                      //  must be an instance of string, string given" WTF?

And unlike any other language I know, even if you use a type it understands, null can no longer be passed to that argument (must be an instance of array, null given). How stupid.

Boolean interpretation

[Edit]: This one is new. I thought of another case, and again the logic is reversed from JavaScript.

^JavaScript

console.log("0" ? "true" : "false"); // True, as expected. Non-empty string.

^PHP

echo "0" ? "true" : "false"; // False! This one probably causes a lot of bugs.
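Python again matches the JavaScript result: any non-empty string is truthy, including "0". A brief sketch of what you get with and without an explicit conversion:

```python
text = "0"

truthy_as_string = bool(text)        # non-empty string, so truthy
truthy_as_number = bool(int(text))   # the integer 0 is falsy
truthy_empty = bool("")              # only the empty string is falsy

print(truthy_as_string, truthy_as_number, truthy_empty)
```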

So in conclusion, the only useful case I can think of is... (drumroll)

Type truncation

In other words, when you have a value of one type (say, string), you want to interpret it as another type (int), and you want to force it into one of the valid values of that type:

$val = "test";
$val2 = "10";
$intval = (int)$val; // 0
$intval2 = (int)$val2; // 10
$boolval = (bool)$intval; // false
$boolval2 = (bool)$intval2; // true
$props = (array)$myobject; // associative array of $myobject's properties

I can't see what upcasting (to a type that encompasses more values) would really ever gain you.
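Python has no silent equivalent of this kind of truncation: `int("test")` raises `ValueError`. A rough sketch of getting the same fallback explicitly, where the helper name `to_int_or_zero` is hypothetical and only the two string cases shown above are covered (PHP's handling of partly numeric strings is different):

```python
def to_int_or_zero(text):
    """Return int(text), or 0 when the string is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return 0


int_val = to_int_or_zero("test")
int_val2 = to_int_or_zero("10")
bool_val = bool(int_val)
bool_val2 = bool(int_val2)

print(int_val, int_val2, bool_val, bool_val2)
```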

So while I disagree with your proposed use of typing (you are essentially proposing static typing, but with the ambiguity that an error would be thrown only when a value is force-cast into a type, which would cause confusion), I think it's a good question, because apparently casting has very little purpose in PHP.

Why dynamically typed languages do not let the developer specify the type

The point of having static typing is the ability to prove statically that your program is correct with regard to types (note: not completely correct in all senses). If you have a static type system throughout, you can detect type errors most of the time.

If you only have partial type information, you can only check the small pieces of a call graph where type info happens to be complete. But you have spent time and effort to specify type information for incomplete parts, where it can't help you but could give a false sense of security.

To express type information, you need a part of the language that cannot be excessively simple. Soon you'll find out that info like int is not enough; you'll want something like List<Pair<Int, String>>, then parametric types, etc. It can be confusing enough even in the rather simple case of Java.

Then you'll need to handle this information during both the translation phase and the execution phase, because it's silly to check only for static errors; the user is going to expect that the type constraints always hold if specified at all. Dynamic languages are not too fast as they are, and such checks will slow them down even more. A static language can spend serious effort checking types because it only does that once; a dynamic language can't.

Now imagine adding and maintaining all of this just so that people sometimes optionally used these features, only detecting a small fraction of type errors. I don't think it's worth the effort.

The very point of dynamic languages is to have a very small and very malleable framework, within which you can easily do things that are much more involved when done in a static language: various forms of monkey-patching that are used for metaprogramming, mocking and testing, dynamic replacement of code, etc. Smalltalk and Lisp, both very dynamic, took it to such an extreme as to ship environment images instead of building from source. But when you want to ensure that particular data paths are type-safe, add assertions and write more unit tests.
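A small Python sketch of that last piece of advice: an assertion guarding one data path, plus unit tests that pin the behaviour down. The function `total_quantity` and the test class are hypothetical examples, and note that `assert` statements are skipped when Python runs with `-O`:

```python
import unittest


def total_quantity(quantities):
    """Sum a list of integer quantities, guarding the expected type."""
    assert all(isinstance(q, int) for q in quantities), "quantities must be ints"
    return sum(quantities)


class TotalQuantityTest(unittest.TestCase):
    def test_sums_integers(self):
        self.assertEqual(total_quantity([1, 2, 3]), 6)

    def test_rejects_strings(self):
        # A stray string is caught at the boundary, not deep inside sum().
        with self.assertRaises(AssertionError):
            total_quantity([1, "2"])
```

The tests can be run with `python -m unittest` from the directory containing the file.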

Update from 2020: Some dynamic languages now support partial typing of sorts. Python allows type hints, to be used by external tools like mypy. TypeScript allows mixing with type-oblivious JavaScript. Still, the points above mostly hold.
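To make the Python case concrete, here is a minimal sketch of type hints. The function name `decode` and the sample payload are made up; the annotations are ignored at runtime and only matter to an external checker:

```python
import json
from typing import Any, Dict


def decode(json_text: str) -> Dict[str, Any]:
    """Parse a JSON object; the annotations are hints, not runtime checks."""
    return json.loads(json_text)


payload = '{"id": 7}'
data = decode(payload)
print(data["id"])

# With default settings a checker such as mypy would typically reject
#     payload = decode(payload)
# as an incompatible assignment, because `payload` was inferred as str.
# Python itself would run that line without complaint.
```

This ties back to the original question: reusing one variable for several types is exactly the pattern such a checker tends to complain about.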
