Java – Rationale for Preferring Local Variables Over Instance Variables

javarefactoring

The codebase I'm working on frequently uses instance variables to share data between various trivial methods. The original developer is adamant that this adheres to the best practices stated in the Clean Code book by Uncle Bob/Robert Martin: "The first rule of functions is that they should be small." and "The ideal number of arguments for a function is
zero (niladic). (…) Arguments are hard. They take a lot of conceptual power."

An example:

public class SomeBusinessProcess {
  @Inject private Router router;
  @Inject private ServiceClient serviceClient;
  @Inject private CryptoService cryptoService;

  private byte[] encodedData;
  private EncryptionInfo encryptionInfo;
  private EncryptedObject payloadOfResponse;
  private URI destinationURI;

  public EncryptedResponse process(EncryptedRequest encryptedRequest) {
    checkNotNull(encryptedRequest);

    getEncodedData(encryptedRequest);
    getEncryptionInfo();
    getDestinationURI();
    passRequestToServiceClient();

    return cryptoService.encryptResponse(payloadOfResponse);
  }

  private void getEncodedData(EncryptedRequest encryptedRequest) {
    encodedData = cryptoService.decryptRequest(encryptedRequest, byte[].class);
  }

  private void getEncryptionInfo() {
    encryptionInfo = cryptoService.getEncryptionInfoForDefaultClient();
  }

  private void getDestinationURI() {
    destinationURI = router.getDestination().getUri();
  }

  private void passRequestToServiceClient() {
    payloadOfResponse = serviceClient.handle(destinationURI, encodedData, encryptionInfo);
  }
}

I would refactor that into the following using local variables:

public class SomeBusinessProcess {
  @Inject private Router router;
  @Inject private ServiceClient serviceClient;
  @Inject private CryptoService cryptoService;

  public EncryptedResponse process(EncryptedRequest encryptedRequest) {
    checkNotNull(encryptedRequest);

    byte[] encodedData = cryptoService.decryptRequest(encryptedRequest, byte[].class);
    EncryptionInfo encryptionInfo = cryptoService.getEncryptionInfoForDefaultClient();
    URI destinationURI = router.getDestination().getUri();
    EncryptedObject payloadOfResponse = serviceClient.handle(destinationURI, encodedData,
      encryptionInfo);

    return cryptoService.encryptResponse(payloadOfResponse);
  }
}

This is shorter, it eliminates the implicit data coupling between the various trivial methods and it limits the variable scopes to the minimum required. Yet despite these benefits I still cannot seem to convince the original developer that this refactoring is warranted, as it appears to contradict the practices of Uncle Bob mentioned above.

Hence my questions:
What is the objective, scientific rationale to favor local variables over instance variables? I can't quite seem to put my finger on it. My intuition tells me that hidden couplings are bad and that a narrow scope is better than a broad one. But what is the science to back this up?

And conversely, are there any downsides to this refactoring that I have possibly overlooked?

Best Answer

What is the objective, scientific rationale to favor local variables over instance variables?

Scope isn't a binary state, it's a gradient. You can rank these from largest to smallest:

Global > Class > Local (method) > Local (code block, e.g. if, for, ...)

Edit: what I call a "class scope" is what you mean by "instance variable". To my knowledge, those are synonymous, but I'm a C# dev, not a Java dev. For the sake of brevity, I've lumped all statics into the global category since statics are not the topic of the question.

The smaller the scope, the better. The rationale is that variables should live in the smallest scope possible. There are many benefits to this:

  • It forces you to think about the current class' responsibility and helps you stick to SRP.
  • It enables you to not have to avoid global naming conflicts, e.g. if two or more classes have a Name property, you're not forced to prefix them like FooName, BarName, ... Thus keeping your variable names as clean and terse as possible.
  • It declutters the code by limiting the available variables (e.g. for Intellisense) to those that are contextually relevant.
  • It enables some form of access control so your data can't be manipulated by some actor you don't know about (e.g. a different class developed by a colleague).
  • It makes the code more readable as you ensure that the declaration of these variables tries to stay as close to the actual usage of these variables as is possible.
  • Wantonly declaring variables in an overly wide scope is often indicative of a developer who doesn't quite grasp OOP or how to implement it. Seeing overly widely scoped variables acts as a red flag that there's probably something going wrong with the OOP approach (either with the developer in general or the codebase in specific).
  • (Comment by Kevin) Using locals forces you to do things in the right order. In the original (class variable) code, you could wrongly move passRequestToServiceClient() to the top of the method, and it would still compile. With locals, you could only make that mistake if you passed an uninitialized variable, which is hopefully obvious enough that you don't actually do it.

Yet despite these benefits I still cannot seem to convince the original developer that this refactoring is warranted, as it appears to contradict the practices of Uncle Bob mentioned above.

And conversely, are there any downsides to this refactoring that I have possibly overlooked?

The issue here is that your argument for local variables is valid, but you've also made additional changes which are not correct and cause your suggested fix to fail the smell test.

While I understand your "no class variable" suggestion and there's merit to it, you've actually also removed the methods themselves, and that's a whole different ballgame. The methods should have stayed, and instead you should've altered them to return their value rather than store it in a class variable:

private byte[] getEncodedData() {
    return cryptoService.decryptRequest(encryptedRequest, byte[].class);
}

private EncryptionInfo getEncryptionInfo() {
    return cryptoService.getEncryptionInfoForDefaultClient();
}

// and so on...

I do agree with what you've done in the process method, but you should've been calling the private submethods rather than executing their bodies directly.

public EncryptedResponse process(EncryptedRequest encryptedRequest) {
    checkNotNull(encryptedRequest);

    byte[] encodedData = getEncodedData();
    EncryptionInfo encryptionInfo = getEncryptionInfo();

    //and so on...

    return cryptoService.encryptResponse(payloadOfResponse);
}

You'd want that extra layer of abstraction, especially when you run into methods that need to be reused several times. Even if you don't currently reuse your methods, it's still a matter of good practice to already create submethods where relevant, even if only to aid code readability.

Regardless of the local variable argument, I immediately noticed that your suggested fix is considerably less readable than the original. I do concede that wanton use of class variables also detracts from code readability, but not at first sight compared to you having stacked all the logic in a single (now long-winded) method.

Related Topic