Python – How to refactor functions

functionspythonrefactoringsingle-responsibility

Let's say I want to have a function to load a project's data from both a json and a file path, the first idea which comes to my mind would be having a couple of functions such as:

def load_project_from_json(json_data):
    ...

def load_project_from_path(path):
    with open(path) as f:
        load_project_from_json(f.read())

but then I was asking myself if maybe would be more convenient to achieve both tasks with just a single function, for instance:

def load_project(path_or_json)

Many interrogants come to my mind, for instance:

  • how do you know when it's convenient to explode a function into multiple ones or not?

  • how can i judge whether a function is a "good" function or not?

  • is it the function following SRP?

Question: Which systematic method (set of steps or rules) could help me out to extract all responsabilities from any function? The main reason is learning how to judge the quality of any function (no matter if we're talking about free functions or class methods). That way I'll be able to say whether it's necessary or not to refactor it.

NS: When I say systematic method think in terms of simplifying equations, if you follow a set of rules you'll be able to reach a point where the equation can't be simplified further, similarly I'd like to know which rules I could apply to know when a function can't be refactored any further. I hope the analogy makes sense

Best Answer

Having one function load from a file, and another load from JSON makes absolute sense. If your application finds itself in a situation where this is ambiguous -- and in multiple places -- then the third function can be added to remove the replication of logic determining which format it is: File or JSON.

Basically, you'll have all three where load_project(file_or_json) detects whether or not it's a file and calls the appropriate method:

def load_project(path_or_json)
  if os.path.exists(path_or_json):
     return load_project_from_path(path_or_json)
  else:
     return load_project_from_json(path_or_json)

Benefits:

  1. Centralized location of testing whether or not something is a file path or JSON

  2. It's handy to use when the caller doesn't know the difference

  3. You still have explicit functions when you do know the format

  4. You can optimize the test. Instead of calling out to the file system, maybe you decide to do a quick Regex test to see if it is JSON (after all, how many file names will start with { or [ characters?)

  5. You still adhere to the Single Responsibility Principal, because the load_project function is responsible for determining the format. The other two functions are responsible for loading from the specific formats.

Related Topic