Coding Style – Is Changing Variable Types Mid-Procedure Bad?

coding-style, dynamic-typing, type-systems, weak-typing

In Python (and occasionally PHP) where variables do not have fixed types, I'll frequently perform 'type transformations' on a variable part-way through my code's logic. I'm not (necessarily) talking about simple casts, but about functions that change the type of a variable while leaving it basically representing the same value or data.

For example, I might write code like this when doing a web request, using the response variable to store first an addinfourl object, then the string content of that object, and finally the dictionary returned by parsing the string as JSON:

response = urlopen(some_url)
assert response.info().type == 'application/json'

response = response.read()
logger.debug('Received following JSON response: ' + response)

response = json.loads(response)
do_something(response)

(Okay, it's a slightly contrived example, but I think it demonstrates the idea fine.) My feeling is that this is better than using three separate variable names because a) it conveys that the response variable contains basically the same information, just 'transformed' into a different type, and b) it conveys that the earlier objects aren't going to be needed any further down the function, since by reassigning over their variable I've made them unavailable to later code.

The tradeoff, I guess, is that the reader could become confused about what type the variable is at any given moment.

Is this bad style, and should I be using three different variables in the above example instead of reassigning to one? Also, is this standard practice in dynamically typed languages, or not? I haven't seen enough of other people's code to know.

Best Answer

I'll go out on a limb and say: yes, this is bad style, and reusing the variable like this is a terrible idea.

It's just a special case of reusing a variable, which is a bad idea, mainly because it makes it hard to understand what a variable contains at any given point in the program flow. See e.g. the question "Should I reuse variables?".

About your points: they are valid; it's just that reusing the variable is not a good solution :-).

a) it conveys that the response variable contains basically the same information, just 'transformed' into a different type

Providing this information is a good idea. However, don't do this by using the same variable, because then you obscure the fact that the information was transformed. Rather, use names with a common prefix or suffix. In your example:

rawResponse = urlopen(some_url)
[...]
jsonResponse = rawResponse.read()
[...]
responseData = json.loads(jsonResponse)
[...]

This makes it clear that the variables are closely related, but also that they do not contain the same data.

b) it conveys that the earlier objects aren't going to be needed any further down the function, since by reassigning over their variable I've made them unavailable to later code.

Again, communicating this "no longer needed" is good, but don't do it by reusing the variable: the reassignment will usually be hard to see, so you only confuse the reader.

Rather, if a variable lives long after its last use, that is an indication the method/function is too long. Split the part with the short-lived variables into a sub-function; that makes the code easier to read, and limits the variable lifetime.
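As a rough sketch of that advice applied to the example from the question: the short-lived objects can be confined to a small helper, so the caller only ever sees the parsed data. The names `fetch_json` and `opener` are made up for illustration, the names follow Python's snake_case rather than the camelCase used above, and the code assumes Python 3's `urllib.request` (the question's snippet looks like Python 2).

```python
import json
import logging
from urllib.request import urlopen

logger = logging.getLogger(__name__)


def fetch_json(url, opener=urlopen):
    """Fetch `url` and return the parsed JSON payload.

    `fetch_json` and `opener` are hypothetical names; `opener` only exists
    so the function can be exercised without a real network call.
    """
    # raw_response and json_text never leave this function, which is what
    # communicates "not needed further down" to the reader.
    with opener(url) as raw_response:
        content_type = raw_response.info().get_content_type()
        assert content_type == 'application/json'
        json_text = raw_response.read().decode('utf-8')

    logger.debug('Received following JSON response: %s', json_text)
    return json.loads(json_text)
```

The calling code then shrinks to something like `response_data = fetch_json(some_url)` followed by `do_something(response_data)`, with one name per type and no leftovers in scope.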

Note: I usually go one step further than not reusing variables and try to assign each variable only once (i.e. never change the value; treat it as immutable). This is an idea mainly from functional languages, but I have found it can make code much clearer. Of course, in non-functional languages you sometimes need to change a variable (the obvious example being a loop variable), but once you start looking, you'll see that in most cases a "fresh" variable makes for more readable and less bug-prone code.
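To illustrate the assign-once habit, here is a small sketch with a made-up function, `count_words`, where each intermediate result gets its own name instead of overwriting `lines` at every step. Python does not enforce single assignment, so this is purely a convention.

```python
def count_words(lines):
    """Count the words in the non-blank lines of `lines`.

    Hypothetical example: every name is bound exactly once, so each one
    means the same thing wherever it appears in the function.
    """
    stripped = [line.strip() for line in lines]
    non_empty = [line for line in stripped if line]
    words_per_line = [len(line.split()) for line in non_empty]
    return sum(words_per_line)
```

If a later step turns out to need the stripped lines again, they are still available under a name that says what they are.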
