Refactoring – Should Variables Live in the Smallest Scope Possible?

local variablerefactoringscope

According to the accepted answer on "Rationale to prefer local variables over instance variables?", variables should live in the smallest scope possible.
Simplify the problem into my interpretation, it means we should refactor this kind of code:

public class Main {
    private A a;
    private B b;

    public ABResult getResult() {
        getA();
        getB();
        return ABFactory.mix(a, b);
    }

    private getA() {
      a = SomeFactory.getA();
    }

    private getB() {
      b = SomeFactory.getB();
    }
}

into something like this:

public class Main {
    public ABResult getResult() {
        A a = getA();
        B b = getB();
        return ABFactory.mix(a, b);
    }

    private getA() {
      return SomeFactory.getA();
    }

    private getB() {
      return SomeFactory.getB();
    }
}

but according to the "spirit" of "variables should live in the smallest scope as possible", isn't "never have variables" have smaller scope than "have variables"? So I think the version above should be refactored:

public class Main {
    public ABResult getResult() {
        return ABFactory.mix(getA(), getB());
    }

    private getA() {
      return SomeFactory.getA();
    }

    private getB() {
      return SomeFactory.getB();
    }
}

so that getResult() doesn't have any local variables at all. Is that true?

Best Answer

No. There are several reasons why:

Variables with meaningful names can make code easier to comprehend.
Breaking up complex formulas into smaller steps can make the code easier to read.
Caching.
Holding references to objects so that they can be used more than once.

And so on.

Related Solutions

Python Refactoring – How to Refactor a ‘God Class’

Having written and then refactored a python "God object", I sympathize. What I did is break the original object down into sub sections based upon methods. For instance, the original looked like this pseudo code:

method A():
    self.bla += 1

method B():
    self.bla += 1

do stuff():
    self.bla = 1
    method A()
    method B()
    print self.bla

The stuff method is a self contained "unit" of work. I migrated it out to a new class which the original instantiates. This pulled out the necessary properties as well. Some were used only by the sub class and could move straight across. Others were shared, and got moved into a shared class.

The "God object" creates a new copy of the shared class at start up, and each of the new sub classes accepts a pointer as part of their init method. For example, here's a stripped version of the mailer:

#!/usr/bin/env python
# -*- coding: ascii -*-
'''Functions for emailing with dirMon.'''

from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email.Utils import COMMASPACE, formatdate
from email import Encoders
import os
import smtplib
import datetime
import logging

class mailer:
    def __init__(self,SERVER="mail.server.com",FROM="support@server.com"):
        self.server = SERVER
        self.send_from = FROM
        self.logger = logging.getLogger('dirMon.mailer')

    def send_mail(self, send_to, subject, text, files=[]):
        assert type(send_to)==list
        assert type(files)==list
        if self.logger.isEnabledFor(logging.DEBUG):
            self.logger.debug(' '.join(("Sending email to:",' '.join(send_to))))
            self.logger.debug(' '.join(("Subject:",subject)))
            self.logger.debug(' '.join(("Text:",text)))
            self.logger.debug(' '.join(("Files:",' '.join(files))))
        msg = MIMEMultipart()
        msg['From'] = self.send_from
        msg['To'] = COMMASPACE.join(send_to)
        msg['Date'] = formatdate(localtime=True)
        msg['Subject'] = subject
        msg.attach( MIMEText(text) )
        for f in files:
            part = MIMEBase('application', "octet-stream")
            part.set_payload( open(f,"rb").read() )
            Encoders.encode_base64(part)
            part.add_header('Content-Disposition', 'attachment; filename="%s"' % os.path.basename(f))
            msg.attach(part)
        smtp = smtplib.SMTP(self.server)
        mydict = smtp.sendmail(self.send_from, send_to, msg.as_string())
        if self.logger.isEnabledFor(logging.DEBUG):
            self.logger.debug("Email Successfully Sent!")
        smtp.close()
        return mydict

It is created once and shared between the different classes that need mailing capabilities.

So for you, create a class larry with the properties and methods you need. Everywhere the client says larry = blah replace it with larryObj.larry = blah. This migrates things to sub projects without breaking the current interface.

The only other thing to do is look for "units of work". If you were going to turn part of the "God Object" into it's own method, do so. But, put the method outside it. This forces you to create an interface between the components.

Laying that groundwork allows everything else to follow it. For example, a piece of the helper object demonstrating how it interfaces with the mailer:

#!/usr/bin/env python
'''This module holds a class to spawn various subprocesses'''
import logging, os, subprocess, time, dateAdditionLib, datetime, re

class spawner:
    def __init__(self, mailer):
        self.logger = logging.getLogger('dirMon.spawner')
        self.myMailer = mailer

Concentrate on the smallest individual unit of work possible, and move it out. This is easier to do, and lets you play with the setup quickly. Don't look at properties for moving stuff, they are ancillary to the tasks being done with them in most cases. Whatever is left over after you have dealt with the methods should probably stay in the original object, since it is part of the shared state.

But, the new objects should now accept the properties they need as init variables, not touching the caller objects property. They then return any necessary values, which the caller can use to update the shared properties as necessary. This helps to decouple the objects and makes for a more robust system.

TDD Red-Green-Refactor: Testing Methods That Become Private

Units

I think I can pinpoint exactly where the problem started:

I figured, I'll need a method that finds all the non-numerical fields in a line.

This should be immediately followed with asking yourself "Will that be a separate testable unit to gatherNonNumericColumns or part of the same one?"

If the answer is "yes, separate", then your course of action is simple: that method needs to be public on an appropriate class, so it can be tested as a unit. Your mentality is something like "I need to test drive out one method and I also need to test drive out another method"

From what you say though, you figured that the answer is "no, part of the same". At this point, your plan should no longer be to fully write and test findNonNumericFields then write gatherNonNumericColumns. Instead, it should be simply to write gatherNonNumericColumns. For now, findNonNumericFields should just be a likely part of the destination you have in mind when you're choosing your next red test case and doing your refactoring. This time your mentality is "I need to test drive out one method, and while I do so I should keep in mind that my finished implementation will probably include this other method".

Keeping a short cycle

Doing the above should not lead to the problems you describe in your penultimate paragraph:

Because if I start out by writing a test for a public method, there will be several minutes (or hours, or even days in very complex cases) before I get all the details in the private methods to work so that the test testing the public method passes.

At no point does this technique require you to write a red test which will only turn green when you implement the entirety of findNonNumericFields from scratch. Much more likely, findNonNumericFields will start as some code in-line in the public method you're testing, which will be built up over the course of several cycles and eventually extracted during a refactoring.

Roadmap

To give an approximate roadmap for this particular example, I don't know the exact test cases you used, but say you were writing gatherNonNumericColumns as your public method. Then most likely the test cases would be the same as the ones you wrote for findNonNumericFields, each one using a table with only one row. When that one-row scenario was fully implemented, and you wanted to write a test to force you to extract out the method, you'd write a two-row case which would require you to add your iteration.