Python Coding Style – Is It Okay to Have Multiple Classes in One File?

coding-style, python

I'm new to the Python world after years of Java and PHP. While the language itself is pretty straightforward, I'm struggling with some 'minor' issues that I can't wrap my head around, and to which I couldn't find answers in the numerous documents and tutorials I've read thus far.

To the experienced Python practitioner, this question might seem silly, but I really want an answer to it so I can go further with the language:

In Java and PHP (although not strictly required), you are expected to write each class in its own file, with the file named after the class as a best practice.

But in Python, or at least in the tutorials I've checked, it is ok to have multiple classes in the same file.

Does this rule hold in production, deployment-ready code, or is it done just for the sake of brevity in educational code?

Best Answer

Is it ok to have multiple classes in the same file in Python?

Yes, from both a philosophical perspective and a practical one.

In Python, a module is a namespace that exists once in memory.
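
A minimal sketch of what "exists once in memory" means in practice: repeated imports hand back the same cached module object from `sys.modules`. The names `first` and `second` are only illustrative.

```python
import importlib
import sys

# Import the same module twice, by two different routes.
import collections.abc as first
second = importlib.import_module("collections.abc")

# Both names refer to the single module object cached in sys.modules.
same_object = first is second
cached = sys.modules["collections.abc"] is first

# A class defined in the module is likewise created only once.
same_class = first.Sized is second.Sized

print(same_object, cached, same_class)
```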

Say we had the following hypothetical directory structure, with one class defined per file:

                    Defines
 abc/
 |-- callable.py    Callable
 |-- container.py   Container
 |-- hashable.py    Hashable
 |-- iterable.py    Iterable
 |-- iterator.py    Iterator
 |-- sized.py       Sized
 ... 19 more

All of these classes are available from the collections module (collections.abc in Python 3). There are, in fact, 25 in total, and they are all defined in a single standard library module, _collections_abc.py.

There are a couple of issues here that I believe make _collections_abc.py superior to the hypothetical one-class-per-file directory structure.

  • In a directory listing, these files are sorted alphabetically. You could sort them in other ways, but I am not aware of a feature that sorts files by semantic dependencies. The _collections_abc module source is organized by dependency.
  • In non-pathological cases, both modules and class definitions are singletons, occurring once each in memory. There would be a bijective mapping of modules onto classes - making the modules redundant.
  • The increasing number of files makes it less convenient to casually read through the classes (unless you have an IDE that makes it simple) - making it less accessible to people without tools.
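
As a rough illustration of the single-module layout, the sketch below lists the public classes that `collections.abc` exposes and prints the module name each one reports as its home. The exact count depends on the Python version, so no particular number is assumed here.

```python
import collections.abc
import inspect

# Collect the public classes exposed by collections.abc.
public_classes = {
    name: obj
    for name, obj in inspect.getmembers(collections.abc, inspect.isclass)
    if not name.startswith("_")
}

# Gather the module name each class reports via __module__.
defining_modules = {cls.__module__ for cls in public_classes.values()}

print(len(public_classes), "public classes, reported modules:", defining_modules)
for name in sorted(public_classes):
    print(" ", name)
```

One import line (`from collections.abc import Iterable, Sized`) reaches any of them, which is the convenience the bullets above describe.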

Are you prevented from breaking groups of classes into different modules when you find it desirable from a namespacing and organizational perspective?

No.

From the Zen of Python, which reflects the philosophy and principles under which the language grew and evolved:

Namespaces are one honking great idea -- let's do more of those!

But let us keep in mind that it also says:

Flat is better than nested.

Python is incredibly clean and easy to read. It encourages you to read it. Putting every class in a separate file discourages reading, which goes against the core philosophy of Python. Look at the structure of the Standard Library: the vast majority of modules are single-file modules, not packages. I would submit to you that idiomatic Python code is written in the same style as the CPython standard library.

Here's the actual code from the abstract base class module. I like to use it as a reference for the denotation of various abstract types in the language. Note that this excerpt is from the Python 2 version of the module (it uses `__metaclass__` and `next`), and that the module's imports from `abc` and its `_hasattr` helper are not shown.
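
One way to check the single-file claim for any given module is to ask the import system whether it is a package. This is only a sketch: `is_package` is a hypothetical helper name, and the four modules sampled are an arbitrary handful, not a survey of the Standard Library.

```python
import importlib.util


def is_package(module_name):
    """Return True if the named module is a package rather than a single file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ModuleNotFoundError(module_name)
    # Packages carry a list of submodule search locations; plain modules have None.
    return spec.submodule_search_locations is not None


for name in ("textwrap", "heapq", "json", "email"):
    kind = "package" if is_package(name) else "single-file module"
    print(f"{name}: {kind}")
```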

Would you say that each of these classes should require a separate file?

class Hashable:
    __metaclass__ = ABCMeta

    @abstractmethod
    def __hash__(self):
        return 0

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Hashable:
            try:
                for B in C.__mro__:
                    if "__hash__" in B.__dict__:
                        if B.__dict__["__hash__"]:
                            return True
                        break
            except AttributeError:
                # Old-style class
                if getattr(C, "__hash__", None):
                    return True
        return NotImplemented


class Iterable:
    __metaclass__ = ABCMeta

    @abstractmethod
    def __iter__(self):
        while False:
            yield None

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Iterable:
            if _hasattr(C, "__iter__"):
                return True
        return NotImplemented

Iterable.register(str)


class Iterator(Iterable):

    @abstractmethod
    def next(self):
        'Return the next item from the iterator. When exhausted, raise StopIteration'
        raise StopIteration

    def __iter__(self):
        return self

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Iterator:
            if _hasattr(C, "next") and _hasattr(C, "__iter__"):
                return True
        return NotImplemented


class Sized:
    __metaclass__ = ABCMeta

    @abstractmethod
    def __len__(self):
        return 0

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Sized:
            if _hasattr(C, "__len__"):
                return True
        return NotImplemented


class Container:
    __metaclass__ = ABCMeta

    @abstractmethod
    def __contains__(self, x):
        return False

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Container:
            if _hasattr(C, "__contains__"):
                return True
        return NotImplemented


class Callable:
    __metaclass__ = ABCMeta

    @abstractmethod
    def __call__(self, *args, **kwds):
        return False

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Callable:
            if _hasattr(C, "__call__"):
                return True
        return NotImplemented

So should they each have their own file?

I hope not.

These files are not just code - they are documentation on the semantics of Python.

They are maybe 10 to 20 lines each on average. Why should I have to go to a completely separate file to see another 10 lines of code? That would be highly impractical. Further, there would be nearly identical boilerplate imports in each file, adding more redundant lines of code.

I find it quite useful to know that there is a single module where I can find all of these Abstract Base Classes, instead of having to look over a list of modules. Viewing them in context with each other allows me to better understand them. When I see that an Iterator is an Iterable, I can quickly review what an Iterable consists of by glancing up.
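
The relationship described above can be checked directly. Here is a short sketch in Python 3, where `Countdown` is a made-up class that inherits from no ABC at all:

```python
from collections.abc import Iterable, Iterator, Sized


class Countdown:
    """Hypothetical iterator; note that it inherits from no ABC."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


# Iterator is declared as a subclass of Iterable in the same module.
print(issubclass(Iterator, Iterable))   # True

# __subclasshook__ recognizes Countdown structurally, by its methods.
print(issubclass(Countdown, Iterator))  # True
print(issubclass(Countdown, Sized))     # False, because there is no __len__

print(list(Countdown(3)))               # [3, 2, 1]
```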

I sometimes wind up having a couple of very short classes. They stay in the file, even if they need to grow larger over time. Sometimes mature modules have over 1000 lines of code. But Ctrl-F is easy, and some IDEs make it easy to view an outline of the file, so no matter how large the file, you can quickly go to whatever object or method you're looking for.

Conclusion

My recommendation, in the context of Python, is to keep related and semantically similar class definitions in the same file. If the file grows so large as to become unwieldy, then consider a reorganization.