Python Programming – Pre-Initializing Attributes vs. Adding Later

Tags: functional-programming, object-oriented-design, programming-practices, python

I'm sorry if this is an absolutely sophomoric question, but I'm curious what the best practices are out there, and I can't seem to find a good answer on Google.

In Python, I usually use an empty class as a super-catchall data structure container (sort of like a JSON file), and add attributes along the way:

class DataObj:
    "Catch-all data object"
    def __init__(self):
        pass

def processData(inputs):
    data = DataObj()
    data.a = 1
    data.b = "sym"
    data.c = [2,5,2,1]
    return data

This gives me a tremendous amount of flexibility, because the container object can essentially store anything. So if new requirements crop up, I'll just add it as another attribute to the DataObj object (which I pass around in my code).

However, recently it has been impressed upon me (by FP programmers) that this is an awful practice, because it makes it very hard to read the code. One has to go through all the code to figure out what attributes DataObj actually has.

Question: How can I rewrite this for greater maintainability without sacrificing flexibility?

Are there any ideas from functional programming that I can adopt?

I'm looking for best-practices out there.

Note: one idea is to pre-initialize the class with all the attributes that one expects to encounter, e.g.

class DataObj:
    "Catch-all data object"
    def __init__(self):
        self.a = 0
        self.b = ""
        self.c = []

def processData(inputs):
    data = DataObj()
    data.a = 1
    data.b = "sym"
    data.c = [2,5,2,1]

Is this actually a good idea? What if I don't know what my attributes are a priori?

Best Answer

How can I rewrite this for greater maintainability without sacrificing flexibility?

You don't. The flexibility is precisely what causes the problem. If any code anywhere may change what attributes an object has, maintainability is already in pieces. Ideally, every class has a set of attributes that's set in stone after __init__ and the same for every instance. That's not always possible or sensible, but it should be the case whenever you don't have really good reasons for avoiding it.
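As a rough sketch of what "set in stone after __init__" could look like: the class below takes every attribute as a constructor argument. The use of `__slots__` to make Python reject stray attributes is my own addition, not something the answer prescribes, and `process_data` is a hypothetical stand-in for the question's `processData`.

```python
class DataObj:
    """Container whose attribute set is fixed once __init__ returns."""

    # Optional (an assumption of this sketch): with __slots__, assigning
    # an attribute that is not listed here raises AttributeError.
    __slots__ = ("a", "b", "c")

    def __init__(self, a, b, c):
        # Every instance gets exactly these three attributes.
        self.a = a
        self.b = b
        self.c = c


def process_data(inputs):
    # Hypothetical stand-in for processData; values taken from the question.
    return DataObj(a=1, b="sym", c=[2, 5, 2, 1])


data = process_data(None)

try:
    data.d = "oops"  # not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True
```

A reader now only has to look at `__init__` to learn what a `DataObj` holds, which is the readability complaint from the question.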

one idea is to pre-initialize the class with all the attributes that one expects to encounter

That's not a good idea. Sure, then the attribute is there, but may have a bogus value, or even a valid one that covers up for code not assigning the value (or a misspelled one). AttributeError is scary, but getting wrong results is worse. Default values in general are fine, but to choose a sensible default (and decide what is required) you need to know what the object is used for.
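To illustrate the point about defaults covering up mistakes, here is a small sketch. The class names, the `count` attribute, and the deliberate typo are all made up for the example.

```python
class Preinitialized:
    def __init__(self):
        self.count = 0  # placeholder default, chosen without knowing the use case


class Bare:
    pass


def fill(obj):
    # Deliberate typo: the author meant "count".
    obj.coutn = 42
    return obj


# The placeholder hides the typo: reading .count succeeds with a bogus value.
masked = fill(Preinitialized()).count

# Without the placeholder, the same typo surfaces immediately.
try:
    fill(Bare()).count
    failed_loudly = False
except AttributeError:
    failed_loudly = True
```

In the first case the program keeps running with a wrong value; in the second it stops at the first read, which is usually easier to debug.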

What if I don't know what my attributes are a priori?

Then you're screwed in any case and should use a dict or list instead of hardcoding attribute names. But I take it you meant "... at the time I write the container class". Then the answer is: "You can edit files in lockstep, duh." Need a new attribute? Add a frigging attribute to the container class. There's more code using that class and it doesn't need that attribute? Consider splitting things up into two separate classes (use mixins to stay DRY), or make it optional if it makes sense.
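For the case where the names really are unknown until runtime, a plain dict might look like the sketch below. The `collect_fields` helper and the sample pairs are hypothetical.

```python
def collect_fields(pairs):
    """Build a record from (name, value) pairs whose names are not known up front."""
    record = {}
    for name, value in pairs:
        record[name] = value
    return record


record = collect_fields([("a", 1), ("b", "sym"), ("c", [2, 5, 2, 1])])

# The available names can be inspected at runtime instead of being guessed.
known = sorted(record)

# A missing key is handled explicitly rather than through attribute access.
missing = record.get("d")
```

The keys are data here, not identifiers baked into the code, so nothing pretends the structure is known when it is not.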

If you're afraid of writing repetitive container classes: Apply metaprogramming judiciously, or use collections.namedtuple if you don't need to mutate the members after creation (your FP buddies would be pleased).
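A minimal sketch of the collections.namedtuple route, reusing the field names and values from the question:

```python
from collections import namedtuple

# The field list is declared once, in one place.
DataObj = namedtuple("DataObj", ["a", "b", "c"])

data = DataObj(a=1, b="sym", c=[2, 5, 2, 1])

# Instances cannot be reassigned in place; _replace returns a new tuple.
updated = data._replace(a=10)

try:
    data.a = 2
    mutable = True
except AttributeError:
    mutable = False
```

Note that only the fields themselves are frozen: the list stored in `c` is still a mutable object, so use a tuple there if full immutability matters.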