Python Class Design – Impact of Dynamically Generating Classes

class-designpython

I have a set of classes that represent different objects (tables in a database):

class ObjA:
    # some class specific attributes and methods

    def refresh(self):
        # implementation
        pass

class ObjB:
    # some class specific attributes and methods

    def refresh(self):
        # implementation
        pass

and they all have a refresh method. Now I have a task manager (like Luigi) that is used to call the refresh method on each class and the definitions look like:

from somewhere import ObjA, ObjB


class RefreshObjA(TaskManager):
    def run(self):
        ObjA.refresh()


class RefreshObjB(TaskManager):
    def run(self):
        ObjB.refresh()

TaskManager here is a parent class provided by the manager module. As you can see, the class definitions all have an identical pattern and naming convention, and it stands to reason that these can be dynamically generated as generateRefreshClasses([ObjA, ObjB, ...]).

However, I have been resisting this and writing each one out explicitly because

  • All the information re: the classes is available when the code is written i.e. I'm not reliant on any external/user generated input which is probably where dynamic generation is useful. This use is purely to save space/keystrokes/repetition.
  • One doesn't know if some ObjX might need modifications/additional requirements and will need to be careful to remove it from the dynamic generator, lest it is overwritten.
  • It is harder to reason where (as in file and line no:) an execution error occurs if the code is dynamically generated.
  • These classes might be imported elsewhere in the code base and I lose out on the static inspection based checking provided by the IDE.

Are these valid reasons to continue to individually define the classes (there might be about 20 or so of them) or are there other benefits to dynamic generation that might make these concerns small in comparison? Would dynamic generation of classes be considered acceptable/good practice in such an application?

Best Answer

You seem to have an appreciation of the trade-offs involved here. As you say, since you know all the information about your classes in advance, it doesn't quite make sense to generate them dynamically.

One alternative to reduce your boring duplicated boilerplate code is to use Python's wonderful support for metaprogramming. If all your refresh methods follow a similar template but the details vary due to the names and types of your classes' attributes, it makes sense to use a metaclass to generate this method when the class is created. In other words, you keep your class statements, and you still declare your class members statically in code, but the information about the refresh boilerplate is kept in one place (in the template that is used by the metaclass). (You could get the same effect using a regular base class and introspection; without more details about your code, I can't give you a clear assessment of what approach would be a good fit for your problem.)

A second option would be to use a code generator. This typically takes the form of a script which takes (for example) some XML or YAML defining your objects as input, and writes out a valid Python file into your build directory. Don't check the generated code into source control; instead just run the generator script every time you build the code. This way you still get file/line information when debugging, and the resulting code is typically not too hard to understand because it's mostly boilerplate. On the other hand, you can't directly edit the file once you've diagnosed a problem because it'll be overwritten next time you run a build - you have to modify the generator. So this approach still has many of the advantages and disadvantages of full-blown runtime class creation.

Another question you should consider asking is "Do we really know everything about these objects in advance?". 20 similar classes is a lot, and it suggests to me that a future requirements change may in fact need you to load these classes at runtime. While you should of course follow YAGNI, you can still design your code in such a way that this change would be easy to make in future. This is another way in which metaprogramming can help you out.

Related Topic