According to Pycharm (and thus assume according by PEP) i should state all instance attributes directly inside __init__
. In my case it does not seem suitable to me. I have something like:
class Handler(object):
def __init__(self):
self.start_index = # parse
# Do stuff
def parse(self):
parse_header()
# Do stuff
parse_body()
def parse_header():
raise NotImplementedError
def parse_body():
raise NotImplementedError
class StringHandler(Handler):
def parse_header():
self.string_index = # parse
self.string_count = # parse
self.style_index = # parse
self.style_count = # parse
# More here
def parse_body():
self.strings = # parse
self.styles = # parse
# More here
class IntHandler(Handler):
def parse_header():
self.int_index = # parse
self.int_count = # parse
# More here
def parse_body():
self.ints = # parse
# More here
Given that there are objects (quite a lot of them) that all derive from Handler
class, the structure of each object it quite obvious for anyone who reads the code. To me it seems like unnecessary cluttering to declare __init__
just for sake of calling parent __init__
and declare couple of parameters as None
. Usually parse
-ing functions just parse whatever is needed and store that to attributes, thus adding such __init__
would sometimes almost double the size of the class (code-wise).
edit: I searched for this problem, but I did not found suitable answer, for example while searching stackexcahnge, the questions were either too broad to get useful answer ( How would you know if you've written readable and easily maintainable code? ) or asking different question ( Is it a good practice to declare instance variables as None in a class in Python? )
Best Answer
Yes, you should assign all attributes of your object in the
__init__
method.The most important reason to do this is tool support. Many Python IDEs can see the assignments in the
__init__
and use those attributes for type-based autocompletion. Sphinx-doc can list those attributes in your reference documentation.There's also a more fundamental reason: Simple is good, so don't create complicated objects that change their shape during execution. Given a
StringHandler
instance I may or may not be able to read thehandler.strings
attribute. If I try that beforehandler.parse()
has been executed, I will get anAttributeError
. Afterwards I would get the value. To avoid exceptions, consumers of your class would have togetattr(handler, 'strings', None)
with a default value which is quite tedious.In general, constructors should create fully-initialized objects that are immediately usable. In some cases it is necessary to create mutable objects that have an empty, default state, and later transition to a state where they contain a value. However, this should be the exception and is rarely a good design. See also Are init() methods a code smell? and Start Method vs. Setting up everything in constructor which discuss this issue for Java.
In your case the
parse()
method is such a two-phase initialization method that must be called before the object can really be used. This raises the questions:What is the point of these objects, if they're just a container for whatever values the
parse()
function produces? As you are using them, you might as well use a dict instead of a class.Is an empty
Handler
meaningful in any way?Why is the initialization in
parse()
separate from the other initialization?Or are you using those attributes only for communication between the
parse_x
functions, not for external consumers?Possible alternative designs:
The handlers specify how a field should be parsed, but do not actually contain the value. The
parse
method returns the value. Because the handlers do not contain any data, it may not be necessary to use classes for this – simple functions could are sufficient.The handlers represent a parsed value. There is no “empty” state. The parsing either happens in the constructor, or in separate functions. Because these objects are just dumb structs, you could define them via
namedtuple()
when parsing is separate.