Everything in Python is an object, including classes.
This means you can reference classes, pass them around as arguments, store them in attributes, names, lists, dictionaries, and so on.
This is perfectly normal in Python:
class_map = {
'foo': A,
'bar': SomeOtherClass,
'baz': YetAnother,
}
instance = class_map[some_variable]()
Now it depends on some_variable which class is picked to create an instance; the class_map dictionary values are all classes.
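To make the dispatch above concrete, here is a runnable sketch with hypothetical placeholder classes standing in for A and the others:

```python
class Circle:
    """Hypothetical stand-in for A."""

class Square:
    """Hypothetical stand-in for SomeOtherClass."""

# The dictionary values are the class objects themselves, not instances.
class_map = {
    'circle': Circle,
    'square': Square,
}

shape_name = 'circle'
instance = class_map[shape_name]()  # look up the class, then call it
```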
Classes are themselves instances of a type, which is called their metaclass. You can produce a new class by calling type() with a name, a sequence of base classes, and a mapping defining the attributes of the class:
type('DynamicClass', (), {'foo': 'bar'})
creates a new class object with a foo attribute set to 'bar', for example. The class produced can itself then be used to create instances. So classes are produced by metaclasses, just as instances are produced by classes.
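A quick sketch of the three-argument type() call and the class it produces:

```python
# type(name, bases, namespace) builds a class object dynamically.
DynamicClass = type('DynamicClass', (), {'foo': 'bar'})

obj = DynamicClass()         # the new class can itself create instances
print(obj.foo)               # bar
print(type(DynamicClass))    # <class 'type'>, i.e. its metaclass
```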
You can produce your own metaclasses by inheriting from type, opening up a weird and wonderful world of class behaviour.
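As a minimal illustration (the names here are mine, not from the original), a metaclass that stamps an attribute onto every class it creates:

```python
class Meta(type):
    def __new__(mcls, name, bases, namespace):
        # This runs when the *class* is created, not when instances are.
        namespace.setdefault('created_by', mcls.__name__)
        return super().__new__(mcls, name, bases, namespace)

class Widget(metaclass=Meta):
    pass

print(Widget.created_by)   # Meta
```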
Calling an unbound method by passing in a separate instance is not really a good example of using classes as objects. All you did was use the initial reference (the class name) to look up the method, then pass in an instance of the class as the first parameter to stand in for self.
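For reference, that pattern looks like this (the class here is a hypothetical example of mine):

```python
class Greeter:
    def greet(self):
        return 'hello'

g = Greeter()

# Looking the method up on the class and passing the instance explicitly
# is equivalent to the ordinary bound call:
assert Greeter.greet(g) == g.greet()
```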
I think it's good to go back to the actual definition of the Liskov substitution principle:
Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.
— Liskov substitution principle
Note, the principle only refers to properties of objects. And, implicitly, only public properties, because the Liskov substitution principle is interested in behavioral typing — typing according to observable properties.
With that in mind…
Answer 1
With this in mind, how can any python classes that accept a nonzero number of parameters for their constructors be said to comply with the LSP?
There are two parts to this. First, __init__ is not a constructor. __new__ is the constructor: the method that actually constructs a new class instance from whole cloth. __init__ is just a private method that is called automatically on the new instance after __new__ returns it. And since it's private, it's not part of your type and not subject to the Liskov substitution principle.
What about __new__ then? All the parameters of __init__ are by default implicitly parameters of __new__, so am I just kicking the can down the road? No: __new__ is a static method, so it's not a property of an instance of object; it's part of object itself. So it's not subject to the Liskov substitution principle for instances of object either.
(Answer 1 Digression)
Here's where it gets kind of interesting (to me, at least). object is a class, but in Python classes are objects. They're instances of type, which is called their metaclass. So while __new__ is a static method, in a sense that makes it an instance method of object itself.1 So it is subject to the Liskov substitution principle for instances of type. And if we look at the definition of __new__ in type, we see:
__new__(*args, **kwargs) method of builtins.type instance
Create and return a new object. See help(type) for accurate signature.
So type's __new__ accepts any and all arguments. Since many classes' __init__ methods, and thus their __new__ methods, don't accept arbitrary arguments, those class objects are kind of in violation of the Liskov substitution principle as instances of type. But… as you pointed out later in your question,
New exceptions cannot be thrown by the methods in the subtype, except if they are subtypes of exceptions thrown by the methods of the supertype.
— Liskov substitution principle
And that's exactly what __new__ does. If you call type.__new__ with different arguments than it expects, it throws a TypeError:
>>> type.__new__()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type.__new__(): not enough arguments
Which means that all subtypes of type (i.e., all class objects) are free to throw their own TypeErrors in __new__, and callers are obligated to handle it. And that's exactly what object.__new__ does, but under different conditions:
>>> object.__new__(object, 'foo', (), {}) # This would be valid for type.__new__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: object() takes no arguments
So, by keeping the base type's precondition maximally weak (the argument list accepts anything) and opting to instead validate "at runtime" by throwing an exception, the __new__ method is able to meet the Liskov substitution principle as a property of instances of type.2
This means we really can't just instantiate any arbitrary type in Python (either by calling __new__ or by just calling the type) without knowing the target type, unless we're prepared to catch and handle TypeErrors, and I think that tracks with most programmers' intuitions.
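That intuition can be sketched as a small helper (the function name is mine, purely for illustration):

```python
def try_instantiate(cls, *args):
    """Call cls(*args); return None when the signature doesn't match.

    We have to be prepared for TypeError, since any class is free to
    reject arguments in its __new__ or __init__.
    """
    try:
        return cls(*args)
    except TypeError:
        return None

print(try_instantiate(dict))            # {}
print(try_instantiate(object, 'foo'))   # None: object() takes no arguments
```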
Answer 2
collections.Counter, however, is a direct subclass of dict. While it is mostly an extension of dict rather than a modification of behaviours already defined in dict, this isn't true for collections.Counter.fromkeys.
The answer here is similar to the previous answer. Since fromkeys is a class method, it's not really a property of instances of dict and not subject to the Liskov substitution principle.3
But then, what about if we look at dict as a class object? Do we run into the same complications we did with object.__new__? No, we don't, because Counter and dict don't have any sort of hierarchical relationship as class objects; they're both direct instances of type. We can't assume anything about their fromkeys methods because they didn't inherit them from type.
On the other hand, in Python a class does inherit all its parents' properties, which includes static and class methods like fromkeys. So Counter has to do something with fromkeys. It could attempt to hide the method, e.g., by replacing it with a descriptor that always throws an AttributeError, or even just by setting the property to None. The author of Counter chose to keep the method visible and to throw NotImplementedError instead, perhaps to signal that the method is intentionally unusable.4
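You can see the difference in behaviour directly:

```python
from collections import Counter

# dict.fromkeys works as usual...
print(dict.fromkeys('ab', 0))    # {'a': 0, 'b': 0}

# ...but Counter deliberately disables its inherited fromkeys.
try:
    Counter.fromkeys('ab', 0)
except NotImplementedError as exc:
    print(type(exc).__name__)    # NotImplementedError
```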
In the end, the Liskov substitution principle is just an attempt to formalize something very intuitive: don't surprise the users of your code. In that sense, it may be seen as a necessary condition for "good code" (whatever that is), but not a sufficient condition.
1 This is a slight lie. __new__ is not an instance method of object because it doesn't take the receiver (cls, a.k.a. self) as a parameter. It's a static method, so it's essentially just a function property of object, but it doesn't make a difference to the discussion.
2 I put “at runtime” in scare quotes because technically everything in Python is at runtime, but hopefully the distinction is clear.
3 It is possible to call static and class methods through instances, e.g.,
d1 = dict()
d2 = d1.fromkeys(itr)
This just dispatches the method to type(d1), which is dict. As far as I know, it's pretty well accepted that this is a quirk of Python, and we still think of those methods as properties of the type and not properties of the instance. But I suppose, in the strictest sense, that is a violation of the Liskov substitution principle.
4 According to the docs for NotImplementedError, this is exactly how that exception is not meant to be used.
Note: It should not be used to indicate that an operator or method is not meant to be supported at all — in that case either leave the operator/method undefined or, if a subclass, set it to None.
But, I suppose the standard library is allowed to contradict itself.
Best Answer
Properly structured is subjective. :)
Generally, the more moving parts you have (i.e. mutable state), the harder it is to reason about your code. There are more pieces to fit into your mental model of the code, but there are additional concerns as well, such as ensuring the derived values stay coherent when a value is updated, or that your code is correct if run in a concurrent context. Therefore, I would consider leaving the calculation of the derived attributes in a method a good default, since it minimizes the state required.
Not all calculations are equal, however. It might be too computationally expensive to re-calculate the derived values every time they are needed (e.g. in a tight loop), at which point you might want to consider caching. The good thing is that if the computation is hidden in a method, caching simply becomes an implementation detail, leaving the rest of your code unaware of this optimization. If your use case warrants it, you may even use the Decorator pattern to implement this caching, decoupling the actual, interesting calculation from the technical details of caching.
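In Python, one low-ceremony way to get that caching, assuming the underlying values don't change after first access (and Python 3.8+), is functools.cached_property; the class here is a hypothetical example:

```python
import functools

class Dataset:
    def __init__(self, values):
        self.values = list(values)

    @functools.cached_property
    def mean(self):
        # Computed once on first access, then stored on the instance;
        # callers remain unaware of the optimization.
        return sum(self.values) / len(self.values)

d = Dataset([1, 2, 3, 4])
print(d.mean)   # 2.5
```

Note the caveat: because the result is cached on the instance, mutating self.values afterwards will not refresh mean.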
More specifically for Python, you can define your derived value as a property instead of a method. You get the benefit of keeping state to a minimum while still seemingly using an attribute (if that makes sense in your use case). That being said, a property is generally expected not to perform an expensive operation, so if that is your case, I would generally prefer to keep the method.
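A minimal sketch of a derived value as a property (the class and attribute names are hypothetical):

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Derived on demand: no stored state that could go stale.
        return self.width * self.height

r = Rectangle(3, 4)
print(r.area)   # 12
r.width = 5
print(r.area)   # 20, stays coherent automatically
```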