If nulls are evil, what should be used when a value can be meaningfully absent?

anti-patterns, null

This is one of the rules that are being repeated over and over and that perplex me.

Nulls are evil and should be avoided whenever possible.

But, but – from my naivety, let me scream – sometimes a value CAN be meaningfully absent!

Let me ask this using an example from the anti-pattern-ridden, horrible code I'm working on at the moment. At its core, this is a turn-based multiplayer web game in which both players' turns run simultaneously (as in Pokemon, and unlike chess).

After each turn the server broadcasts a list of updates to client side JS code. The Update class:

public class GameUpdate
{
    public Player player;
    // other stuff
}

This class is serialized to JSON and sent to connected players.

Most updates naturally have a Player associated with them – after all, it is necessary to know which player made which move this turn. However, some updates can't have a Player meaningfully associated with them. Example: the game has been forcibly tied because the limit on turns without actions has been exceeded. For such updates, I think it is meaningful to leave the player null.

Of course, I could "fix" this code by utilizing inheritance:

public class GameUpdate
{
    // stuff
}

public class GamePlayerUpdate : GameUpdate
{
    public Player player;
    // other stuff
}

However, I fail to see how this is of any improvement, for two reasons:

  • The JS code will now simply receive an object without Player as a defined property, which is the same as if it was null since both cases require checking if the value is present or absent;
  • Should another nullable field be added to the GameUpdate class, I would need multiple inheritance to continue with this design – but MI is evil in its own right (according to more experienced programmers) and, more importantly, C# doesn't have it, so I can't use it.

I have an inkling that this piece of code is one of the very many that would make an experienced, good programmer scream in horror. At the same time, I can't see how this null is hurting anything here, or what should be done instead.

Could you explain this issue to me?

Best Answer

Lots of things are better to return than null.

  • An empty string ("")
  • An empty collection
  • An "optional" or "maybe" monad
  • A function that quietly does nothing
  • An object full of methods that quietly do nothing
  • A meaningful value that rejects the incorrect premise of the question
    (which is what made you consider null in the first place)

The trick is realizing when to do this. Null is really good at blowing up in your face but only when you dot off it without checking for it. It's not good at explaining why.

When you don't need to blow up, and you don't want to check for null, use one of these alternatives. It may seem weird to return a collection if it only ever has 0 or 1 elements but it's really good at letting you silently deal with missing values.

Monads sound fancy but here all they're doing is letting you make it obvious that the valid sizes for the collection are 0 and 1. If you have them, consider them.
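As a rough illustration of the 0-or-1 collection idea, here is a minimal C# sketch. The names EmptyCollectionSketch, FindMover and Describe, and the dictionary of movers per turn, are all hypothetical and not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyCollectionSketch
{
    // Hypothetical lookup: yields zero or one names instead of returning null.
    public static IEnumerable<string> FindMover(
        IReadOnlyDictionary<int, string> moversByTurn, int turn)
    {
        return moversByTurn.TryGetValue(turn, out var name)
            ? new[] { name }
            : Enumerable.Empty<string>();
    }

    // No null check needed: the projection simply runs zero times
    // when nobody moved on that turn.
    public static List<string> Describe(
        IReadOnlyDictionary<int, string> moversByTurn, int turn)
    {
        return FindMover(moversByTurn, turn)
            .Select(n => n + " moved")
            .ToList();
    }
}
```

The caller never branches on presence. A missing value just produces an empty list.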

Any of these do a better job of making your intent clear than null does. That's the most important difference.

It may help to understand why you are returning at all rather than simply throwing an exception. Exceptions are essentially a way to reject an assumption (they are not the only way). When you ask for the point where two lines intersect you are assuming they intersect. They might be parallel. Should you throw a "parallel lines" exception? Well you could but now you have to handle that exception elsewhere or let it halt the system.

If you'd rather stay where you are you can bake the rejection of that assumption into a kind of value that can express either results or rejections. It's not that weird. We do it in algebra all the time. The trick is to make that value meaningful so we can understand what happened.
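The line intersection case could be sketched in C# roughly as follows. This is only one way to do it: the types Intersection, PointResult and ParallelResult and the Lines.Intersect helper are invented for illustration, and the sketch assumes non-vertical lines written as y = slope * x + offset.

```csharp
using System;

// Hypothetical result type: either a point, or a meaningful rejection
// of the assumption that the lines cross at exactly one point.
public abstract class Intersection
{
    public abstract string Describe();
}

public sealed class PointResult : Intersection
{
    public double X { get; }
    public double Y { get; }
    public PointResult(double x, double y) { X = x; Y = y; }
    public override string Describe() => $"lines cross at ({X}, {Y})";
}

public sealed class ParallelResult : Intersection
{
    public override string Describe() => "equal slopes: no single crossing point";
}

public static class Lines
{
    // Assumption: each line is y = slope * x + offset (no vertical lines).
    public static Intersection Intersect(
        double slope1, double offset1, double slope2, double offset2)
    {
        if (slope1 == slope2)
        {
            // Simplification: identical lines are reported the same way.
            return new ParallelResult();
        }
        // Solve slope1 * x + offset1 = slope2 * x + offset2 for x.
        double x = (offset2 - offset1) / (slope1 - slope2);
        return new PointResult(x, slope1 * x + offset1);
    }
}
```

Neither branch returns null or throws. The caller can call Describe() on whatever comes back, and the parallel case explains itself.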

If the returned value can express results and rejections it needs to be sent to code that can handle both. Polymorphism is really powerful to use here. Rather than simply trading null checks for isPresent() checks you can write code that behaves well in either case. Either by replacing empty values with default values or by silently doing nothing.
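One way to get the "silently doing nothing" behavior is an object whose methods are deliberately empty. The C# sketch below is illustrative only. INotifier, RecordingNotifier, SilentNotifier and TurnRunner are made-up names, not anything from the question's code.

```csharp
using System.Collections.Generic;

public interface INotifier
{
    void Notify(string message);
}

// A real implementation: remembers every message it was given.
public sealed class RecordingNotifier : INotifier
{
    public List<string> Sent { get; } = new List<string>();
    public void Notify(string message) => Sent.Add(message);
}

// The do-nothing implementation: same interface, no behavior.
public sealed class SilentNotifier : INotifier
{
    public void Notify(string message) { }
}

public static class TurnRunner
{
    // The caller never checks for null or "is present";
    // it behaves well with either implementation.
    public static void EndTurn(INotifier notifier, int turn)
    {
        notifier.Notify($"Turn {turn} ended");
    }
}
```

Passing a SilentNotifier where no notification is wanted keeps EndTurn free of null checks.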

The problem with null is it can mean far too many things. So it just ends up destroying information.


In my favorite null thought experiment I ask you to imagine a complex Rube Goldbergian machine that signals encoded messages using colored light by picking up colored light bulbs off a conveyor belt, screwing them into a socket, and powering them up. The signal is controlled by the different colors of the bulbs that are placed on the conveyor belt.

You've been asked to make this hideously expensive thing compliant with RFC3.14159 which states that there should be a marker between messages. The marker should be no light at all. It should last for exactly one pulse. The same amount of time that a single colored light normally shines.

You can't just leave spaces between the messages because the contraption sets off alarms and halts if it can't find a bulb to put in the socket.

Everyone familiar with this project shudders at the thought of touching the machinery or its control code. It is not easy to change. What do you do? Well, you can dive in and start breaking things. Maybe update this thing's hideous design. Yeah, you could make it so much better. Or you could talk to the janitor and start collecting burned out bulbs.

That's the polymorphic solution. The system doesn't even need to know anything has changed.


It's really nice if you can pull that off. By encoding a meaningful rejection of an assumption into a value you don't have to force the question to change.

How many apples will you owe me if I give you 1 apple? -1. Because I owed you 2 apples from yesterday. Here the assumption of the question, "you will owe me", is rejected with a negative number. Even though the question was wrong, meaningful info is encoded in this answer. By rejecting it in a meaningful way the question isn't forced to change.

However, sometimes changing the question actually is the way to go. If the assumption can be made more reliable, while still useful, consider doing that.

Your problem is that, normally, updates come from players. But you've discovered a need for an update that doesn't come from player 1 or player 2. You need an update that says time has expired. Neither player is saying this so what should you do? Leaving player null seems so tempting. But it destroys information. You're trying to encode knowledge in a black hole.

The player field doesn't tell you who just moved. You think it does but this problem proves that's a lie. The player field tells you where the update came from. Your time expired update is coming from the game clock. My recommendation is give the clock the credit it deserves and change the name of the player field to something like source. Then the value can be player 1, player 2, or clock.

This way if the server is shutting down and has to suspend the game it can send out an update that reports what's happening and lists itself as the source of the update.
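A possible shape for that renamed field, sketched in C#. Everything here is an assumption made for illustration: the IUpdateSource interface, the GameClock and GameServer classes, the Message property, and the simplified Player stand-in (the real Player class isn't shown in the question).

```csharp
using System;

// Hypothetical: anything that can originate an update.
public interface IUpdateSource
{
    string Name { get; }
}

// Simplified stand-in for the question's Player class.
public sealed class Player : IUpdateSource
{
    public string Name { get; }
    public Player(string name) { Name = name; }
}

public sealed class GameClock : IUpdateSource
{
    public string Name => "clock";
}

public sealed class GameServer : IUpdateSource
{
    public string Name => "server";
}

public class GameUpdate
{
    public IUpdateSource Source { get; }
    public string Message { get; }

    public GameUpdate(IUpdateSource source, string message)
    {
        // Every update must say where it came from; null is rejected up front.
        Source = source ?? throw new ArgumentNullException(nameof(source));
        Message = message;
    }
}
```

With this shape, a forced tie might be built as new GameUpdate(new GameClock(), "Turn limit exceeded"), and the client can always read the source name without checking for absence.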

Does this mean that you should never just leave something out? No. But you need to consider how well nothing can be assumed to really mean a single, well-known, good default value. Nothing is really easy to overuse. So be careful when you use it. There is a school of thought that embraces favoring nothing. It's called convention over configuration.

I like that myself but only when the convention is clearly communicated in some way. It should not be a way to haze the newbies. But even if you're using that, null is still not the best way to do it. Null is a dangerous nothing. Null pretends to be something you can use right up until you use it, then it blows a hole in the universe (breaks your semantic model). Unless blowing a hole in the universe is the behavior you need why are you messing with this? Use something else.
