Python – Preferred way to expand a command line script to be used as a library in Python

argparse python

I have a useful Python script that I've been invoking from the command line. It has a decent number of options, maybe 20, and it's not unusual to run the script with six or seven flags. The rest of the input comes via stdin.

Now I have some other Python code from which I'd like to call this useful little utility. Three options I can think of are:

  1. I can use subprocess.call and invoke my little script
  2. (A little better) I can cobble together a command line and then pass it as a list of strings to argparse
  3. I can totally refactor the program so that the entry point of my utility is a Python function call and then have my command line utility just call this function. In principle this seems like the responsible thing to do, but it does leave me managing two separate interfaces to my function. For example, I have to decide whether I want to let argparse know the default values for my options or have the function know the defaults (or have two sets of defaults). Any validation I do using, say, ArgumentParser.add_mutually_exclusive_group will not apply when my tool is run as a library instead of a command line script.

Is there a standard paradigm for creating a single interface in Python that is well-suited to being invoked both from Python and from the command line?

Best Answer

Unless there is a good reason not to, I would definitely advocate a spin on option 3. As @jonsharp mentions, breaking up your utility into clean units of functionality is a good way to ensure testability. Even the smallest scripts can eventually morph into a much larger program, and making sure that you have an extensible API sooner rather than later will save a lot of headaches down the road.

The way I'd approach this is:

  1. Break up your code into logical methods with clean and clear I/O
  2. Add unit tests. Having them is never a bad thing.
  3. Rather than putting your logic directly under if __name__ == '__main__', create a main() (or similar) function that serves as your entry point
  4. Use setuptools' setup() function to define the script entry point in your setup.py file.

For example:

from setuptools import setup
setup(
    name='mypackage',
    version='0.1',
    entry_points={
        'console_scripts': [ 'myscript = mypackage.mymodule:main' ],
    }
)

Now, not only is all of your code (including main()) easily unit testable, but you still have your console entry point once you've run python setup.py install or python setup.py develop (or, with current tooling, pip install . or pip install -e .).
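To make the shape of such a module concrete, here is a minimal sketch of what mypackage/mymodule.py might contain. The function transform() and the flags --upper and --prefix are hypothetical stand-ins for the real utility's logic and options; the point is only that main() parses arguments and delegates to a plain function that other Python code can import directly.

```python
import argparse
import sys


def transform(lines, upper=False, prefix=""):
    """Library entry point (hypothetical): plain arguments in, a list out."""
    result = []
    for line in lines:
        text = line.rstrip("\n")
        if upper:
            text = text.upper()
        result.append(prefix + text)
    return result


def build_parser():
    """Build the command line parser; the flags here are illustrative only."""
    parser = argparse.ArgumentParser(description="Hypothetical line transformer.")
    parser.add_argument("--upper", action="store_true", help="upper-case each line")
    parser.add_argument("--prefix", default="", help="text to prepend to each line")
    return parser


def main(argv=None, stdin=None, stdout=None):
    """Console entry point: parse flags, read stdin, delegate to transform()."""
    # With argv=None, argparse falls back to sys.argv[1:], which is what
    # the generated console script relies on.
    args = build_parser().parse_args(argv)
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    for line in transform(stdin, upper=args.upper, prefix=args.prefix):
        print(line, file=stdout)
    return 0
```

Library callers would use transform(some_lines, upper=True) directly, while tests can exercise the command line path with main(["--upper"], stdin=..., stdout=...) without spawning a process.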

"Any validation I do using, say, ArgumentParser.add_mutually_exclusive_group will not apply when my tool is run as a library instead of a command line script."

Depending on how your API is designed, you may need to add some extra validation to input parameters, but that should likely be there to prevent unexpected input anyways.
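One possible way to avoid two sets of defaults and two sets of checks is sketched below. It assumes a hypothetical function summarize() with made-up options verbose, quiet and width. The function owns the defaults and repeats the mutual-exclusion check, while the parser is created with argument_default=argparse.SUPPRESS so that options the user did not pass are simply absent and the function's defaults apply.

```python
import argparse


def summarize(text, verbose=False, quiet=False, width=80):
    """Hypothetical library function that owns its defaults and validation."""
    # Mirrors add_mutually_exclusive_group for callers that bypass argparse.
    if verbose and quiet:
        raise ValueError("verbose and quiet are mutually exclusive")
    if width <= 0:
        raise ValueError("width must be positive")
    if quiet:
        return ""
    clipped = text[:width]
    return f"{clipped} ({len(text)} chars)" if verbose else clipped


def build_parser():
    # SUPPRESS keeps unset options out of the namespace entirely, so the
    # defaults declared on summarize() are the only ones that exist.
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--verbose", action="store_true")
    group.add_argument("--quiet", action="store_true")
    parser.add_argument("--width", type=int)
    return parser


def options_from_argv(argv):
    """Turn a list of flags into keyword arguments for summarize()."""
    return vars(build_parser().parse_args(argv))
```

The command line path then becomes summarize(text, **options_from_argv(argv)). The command line user still gets argparse's error message for conflicting flags, and a library caller gets a ValueError for the same mistake.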

Edit: The only time I would generally use subprocess is when I'm calling into a non-Python application, or into another Python script that I don't own or don't have the time to refactor, and the latter only as a last resort. Most well-written Python utilities will expose both command line utilities and an internal API.
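For that last-resort case, a minimal sketch of the subprocess route might look like the following. The helper name run_script is hypothetical; it assumes the script reads its input from stdin and writes its result to stdout, as the utility in the question does.

```python
import subprocess
import sys


def run_script(script_args, stdin_text):
    """Run a Python script in a child process, feeding it stdin_text."""
    completed = subprocess.run(
        [sys.executable, *script_args],  # same interpreter as the caller
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

A call would then look something like run_script(["some_script.py", "--some-flag"], "input data\n"), where the script name and flag are placeholders. Everything crosses the process boundary as text, which is part of why importing a function is usually the nicer option when it is available.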