Coding Style – Best Way to Split Up Large Methods with Dependent Subtasks

coding-stylemethods

According to this question and just about every style guide I've ever read, large methods are bad. However, suppose I have a method that does the following:

  1. Receive a JSON String from some service and store it as a Dictionary
  2. For various parameters in the Dictionary, creates connections to other services, such as using AWS keys to open connections to AWS services
  3. Downloads, Authenticates, or Manipulates data based on values in the Dictionary and new data received from step 2.
  4. Creates a file on the local directory and writes various values to it, dependent on step 2 and 3.
  5. Pushes the file from step 4 to another location

Clearly, this should not all be one large method. However, I'm not sure how to split this up into smaller methods without passing around a ton of parameters. For example, the first method is simple because I only need to tell the program one thing, and only one thing is returned. However, starting at step 3, I will either have to pass the entire Dictionary to the new method, or make the Dictionary an instance or global variable.

So, in short, here is my question. When breaking up a large method like this, what is the better alternative: should I make variables which need to be used by multiple methods instance/global variables (whichever is appropriate for this particular task), or should I pass create new methods which take many parameters?

To me, it seems like the first option is much worse, because it seems to break encapsulation: each method created in this way now has some strange dependency on a global or instance variable which is outside of its scope. However, the second method feels really messy.

Best Answer

There's nothing wrong with passing a very large and complex parameter, as long as the structure and content of that parameter is well-defined and easy to understand. What you want to avoid is parameters that are easy to mix up, having parameters whose purpose is not clear, or depending too much on instance state rather than parameters. Even having lots of parameters is okay if you find an intuitive way of grouping them.

In fact, if I'm reading your question right, it sounds like you already know how to break up this task in a relatively simple way that only requires pure functions. Since Javascript is my main language, I'm imagining the code would look something like this:

var jsonStr = fetchJSONStringFromService();
var dict = JSON.parse(jsonStr);

var serviceConnections = createServiceConnections(dict);

var processedData = doVariousDataProcessingThings(dict);

var dataToWrite = serializeStuff(serviceConnections, processedData);
var fs = require('fs');
fs.writeFile('file/path', dataToWrite);

pushFileToOtherServer('file/path');

It's possible you left out some details that would make the real code significantly more complicated than this, but I'm assuming this is in the right ballpark. As you can see, it's already very easy to follow; the only thing missing is documentation of what structure the various objects being passed around are expected to have.

To answer your question directly, this sounds like a situation where you're better off aiming for stateless, pure, referentially transparent functions that only manipulate POJSOs, network connections and files. Even if some of the methods take more parameters than I've guessed above. That way each step will be straightforward to unit test (once you make all the required mocks), and very easy to change, move around or replace in the future without breaking the other steps. If any of the functions in your real code seem to take "too many" parameters, there's probably a way to break them into smaller functions or combine the parameters into meaningful objects until you end up with functions that are easy to understand.

Since your question implies this dict object is fairly large and complex, a possible improvement on top of the above would be to add a step where you break up the dict into smaller dictionaries or other data structures. In particular, I presume there's a subset of information in dict that needs to be used by createServiceConnections(), so it would be ideal to pass only that subset rather than the whole thing.

For completeness, I'm not saying that objects with instance variables are never useful. In fact, in the sample code above, it's likely that serviceConnections and fs are very stateful objects, which is fine because they represent inherently stateful things, unlike a blob of data.

Related Topic