Managing Complex Software Code Base – Best Practices

large-scale-project, object-oriented, object-oriented-design

I often create programs both for myself and others using various object-oriented programming languages. These programs are usually relatively small (a few thousand lines at most). Recently, however, I have been attempting larger projects, such as complete game engines. When doing so, I often seem to run into a road block: complexity.

In my smaller projects, it is easy to keep a mental map of how every part of the program works. With it, I can be fully aware of how any change will affect the rest of the program, avoid bugs very effectively, and see exactly how a new feature should fit into the code base. When I attempt to create larger projects, however, I find it impossible to keep a good mental map, which leads to very messy code and numerous unintended bugs.

In addition to this "mental map" issue, I find it hard to keep my code decoupled from other parts of itself. For example, if in a multiplayer game there is a class to handle the physics of player movement and another to handle networking, then I see no way to have one of these classes not rely on the other to get player movement data to the networking system to send it over the network. This coupling is a significant source of the complexity that interferes with a good mental map.

Lastly, I often find myself coming up with one or more "manager" classes that coordinate other classes. For example, in a game a class would handle the main tick loop and would call update methods in the networking and player classes. This goes against a principle I have found in my research, namely that each class should be unit-testable and usable independently of others, since any such manager class by its very purpose relies on most of the other classes in the project. Additionally, a manager class's orchestration of the rest of the program is a significant source of non-mental-mappable complexity.

Taken together, these problems prevent me from writing high-quality, bug-free software of a substantial size. What do professional developers do to effectively deal with this problem? I am especially interested in OOP answers targeted at Java and C++, but this sort of advice is probably very general.

Notes:

  • I have tried using UML diagrams, but that only seems to help with the first problem, and even then only when it is in regard to class structure rather than (for example) ordering of method calls with regard to what is initialized first.

Best Answer

I often seem to run into a road block: complexity.

There are entire books written on this subject. Here is a quote from one of the most important books ever written on software development, Steve McConnell's Code Complete:

Managing complexity is the most important technical topic in software development. In my view, it's so important that Software's Primary Technical Imperative has to be managing complexity.

As an aside, I would highly recommend reading the book if you have any interest in software development at all (which I assume you do, since you've asked this question). At the very least, read its excerpt about design concepts.


For example, if in a multiplayer game there is a class to handle the physics of player movement and another to handle networking, then I see no way to have one of these classes not rely on the other to get player movement data to the networking system to send it over the network.

In this particular case, I would consider your PlayerMovementCalculator class and your NetworkSystem class to be completely unrelated to each other; one class is responsible for calculating player movement, and the other is responsible for network I/O. They could perhaps even live in separate, independent modules.

However, I would certainly expect there to be at least some additional bit of wiring or glue somewhere outside of those modules which mediates data and/or events/messages between them. For example, you might write a PlayerNetworkMediator class using the Mediator Pattern.
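A minimal Java sketch of that wiring might look like the following. Only the three class names come from the text above; the Position type, the callback, the message format and every method name are assumptions made for illustration, and the network class merely records messages rather than touching a socket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical value type, not part of the original answer.
final class Position {
    final double x;
    final double y;
    Position(double x, double y) { this.x = x; this.y = y; }
}

// Knows about physics only. It reports each new position to a callback
// without knowing who is on the other end.
class PlayerMovementCalculator {
    private Position position = new Position(0, 0);
    private Consumer<Position> onMoved = p -> { };

    void setOnMoved(Consumer<Position> listener) { this.onMoved = listener; }

    void step(double dx, double dy) {
        position = new Position(position.x + dx, position.y + dy);
        onMoved.accept(position);
    }
}

// Knows about network I/O only. This stand-in records outgoing messages
// instead of writing them to a real socket.
class NetworkSystem {
    final List<String> sent = new ArrayList<>();
    void send(String message) { sent.add(message); }
}

// The glue: the only class that knows both sides exist.
class PlayerNetworkMediator {
    PlayerNetworkMediator(PlayerMovementCalculator movement, NetworkSystem network) {
        movement.setOnMoved(p -> network.send("POS " + p.x + " " + p.y));
    }
}

class MediatorDemo {
    public static void main(String[] args) {
        PlayerMovementCalculator movement = new PlayerMovementCalculator();
        NetworkSystem network = new NetworkSystem();
        new PlayerNetworkMediator(movement, network);
        movement.step(1.5, 2.0);
        System.out.println(network.sent);
    }
}
```

With this shape, PlayerMovementCalculator and NetworkSystem can each be unit-tested alone, and the mediator can be tested with simple stand-ins for both.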

Another possible approach might be to de-couple your modules using an Event Aggregator.
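As a rough illustration, a very small single-threaded event aggregator could be sketched in Java as below. The EventAggregator API (subscribe/publish) and the fields of the event class are assumptions for this example, not a reference to any particular library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical event type; the field names are assumptions.
final class PlayerPositionChangedEvent {
    final int playerId;
    final double x;
    final double y;
    PlayerPositionChangedEvent(int playerId, double x, double y) {
        this.playerId = playerId;
        this.x = x;
        this.y = y;
    }
}

// Publishers and subscribers depend on this class and on the event types,
// never on each other. Not thread-safe; handlers run on the caller's thread.
class EventAggregator {
    private final Map<Class<?>, List<Consumer<Object>>> handlers = new HashMap<>();

    <T> void subscribe(Class<T> type, Consumer<? super T> handler) {
        handlers.computeIfAbsent(type, k -> new ArrayList<>())
                .add(event -> handler.accept(type.cast(event)));
    }

    // Delivers the event to handlers registered for its exact class.
    void publish(Object event) {
        List<Consumer<Object>> list =
                handlers.getOrDefault(event.getClass(), Collections.emptyList());
        for (Consumer<Object> h : list) {
            h.accept(event);
        }
    }
}

class EventAggregatorDemo {
    public static void main(String[] args) {
        EventAggregator events = new EventAggregator();
        // The networking side subscribes without referencing the physics code.
        events.subscribe(PlayerPositionChangedEvent.class,
                e -> System.out.println("send: player " + e.playerId + " at " + e.x + "," + e.y));
        // The physics side publishes without referencing the networking code.
        events.publish(new PlayerPositionChangedEvent(7, 3.0, 4.0));
    }
}
```

The trade-off compared with a dedicated mediator is that the wiring becomes implicit: it is less obvious from reading the code who reacts to a given event.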

In the case of asynchronous programming, such as the logic involved with network sockets, you might expose Observables to tidy up the code which 'listens' to those notifications.

Asynchronous programming doesn't necessarily mean multi-threaded either; it's more about program structure and flow control (although multi-threading is the obvious use case for asynchrony). Observables may be useful in one or both of those modules to allow unrelated classes to subscribe to change notifications.

For example:

  • NetworkMessageReceivedEvent
  • PlayerPositionChangedEvent
  • PlayerDisconnectedEvent

etc.
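To make the idea concrete, here is a hand-rolled Java stand-in for an observable carrying one of those events. It is only a sketch: a real project might prefer an existing reactive library, and the Observable class, the payload field and the onBytesArrived method are all invented for this example. Notifications are delivered synchronously on the calling thread, which also shows that this structure does not require multi-threading.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hand-rolled stand-in for an observable; not a library type.
class Observable<T> {
    private final List<Consumer<? super T>> observers = new CopyOnWriteArrayList<>();

    // Returns a handle that removes the observer again when run.
    Runnable subscribe(Consumer<? super T> observer) {
        observers.add(observer);
        return () -> observers.remove(observer);
    }

    // Notifies every current observer, in subscription order.
    void emit(T value) {
        for (Consumer<? super T> o : observers) {
            o.accept(value);
        }
    }
}

// Hypothetical event; the payload field is an assumption.
final class NetworkMessageReceivedEvent {
    final String payload;
    NetworkMessageReceivedEvent(String payload) { this.payload = payload; }
}

class NetworkSystem {
    // Exposed so unrelated classes can listen without NetworkSystem knowing them.
    final Observable<NetworkMessageReceivedEvent> messageReceived = new Observable<>();

    // Would be called by the socket-reading code; here it is called directly.
    void onBytesArrived(String text) {
        messageReceived.emit(new NetworkMessageReceivedEvent(text));
    }
}

class ObservableDemo {
    public static void main(String[] args) {
        NetworkSystem network = new NetworkSystem();
        Runnable unsubscribe = network.messageReceived.subscribe(
                e -> System.out.println("received: " + e.payload));
        network.onBytesArrived("hello");   // delivered to the observer
        unsubscribe.run();
        network.onBytesArrived("ignored"); // nobody is listening any more
    }
}
```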


Lastly, I often find myself coming up with one or more "manager" classes that coordinate other classes. For example, in a game a class would handle the main tick loop and would call update methods in the networking and player classes. This goes against a principle I have found in my research, namely that each class should be unit-testable and usable independently of others, since any such manager class by its very purpose relies on most of the other classes in the project. Additionally, a manager class's orchestration of the rest of the program is a significant source of non-mental-mappable complexity.

While some of this certainly comes down to experience, the name Manager in a class often indicates a design smell.

When naming a class, consider the functionality that class is responsible for, and let its name reflect what it does.

The problem with Managers in code is a bit like the problem with Managers in the workplace. Their purpose tends to be vague and poorly understood even by themselves; most of the time we're just better off without them altogether.

Object-Oriented programming is primarily about behaviour. A class is not a data entity, but a representation of some functional requirement in your code.

If you can name a class based on the functional requirement it fulfils, you'll reduce your chance of ending up with some kind of bloated God Object, and are more likely to have a class whose identity and purpose in your program is clear.

Furthermore, it should be more obvious when extra methods and behaviour start creeping in where they really don't belong, because the name will start to look wrong, i.e. you'll have a class which is doing a whole bunch of things which aren't reflected by its name.

Lastly, avoid the temptation of writing classes whose names look like they belong in an entity relationship model. The problem with class names such as Player, Monster, Car, Dog, etc. is that they imply nothing about their behaviour, and only seem to describe a collection of logically related data or attributes. Object-oriented design isn't data modelling, it's behaviour modelling.

For example, consider two different ways of modelling a Monster and Player calculating damage:

class Monster : GameEntity {
    dealDamage(...);
}

class Player : GameEntity {
    dealDamage(...);
}

The problem here is that you might reasonably expect Player and Monster to have a whole bunch of other methods which are probably totally unrelated to the amount of damage these entities might do (movement, for example); you're on the path to the God Object mentioned above.

A more naturally Object-Oriented approach is to identify the name of the class based on its behaviour, for example:

class MonsterDamageDealer : IDamageDealer {
    dealDamage(...) { }
}

class PlayerDamageDealer : IDamageDealer {
    dealDamage(...) { }
}

With this type of design, your Player and Monster objects probably don't have any methods associated with them because those objects contain the data needed by your whole application; they are probably just simple data entities which live inside a repository and only contain fields/properties.
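A compact Java sketch of that split is shown below, under several assumptions: the original leaves the dealDamage signature elided, so the generic parameter, the return type, the entity fields and the damage formulas here are all invented for illustration.

```java
// Plain data entities: fields only, no behaviour.
// Field names and default values are assumptions.
final class Player {
    int health = 100;
    int strength = 10;
}

final class Monster {
    int health = 50;
    int clawDamage = 7;
}

// Behaviour lives behind an interface named after what it does.
// The type parameter is an assumption added so implementations stay stateless.
interface IDamageDealer<A> {
    // Returns the damage produced by one attack from the given attacker.
    int dealDamage(A attacker);
}

final class PlayerDamageDealer implements IDamageDealer<Player> {
    @Override
    public int dealDamage(Player attacker) {
        return attacker.strength * 2; // made-up formula
    }
}

final class MonsterDamageDealer implements IDamageDealer<Monster> {
    @Override
    public int dealDamage(Monster attacker) {
        return attacker.clawDamage; // made-up formula
    }
}

class DamageDemo {
    public static void main(String[] args) {
        Player player = new Player();
        Monster monster = new Monster();

        // The behavioural classes hold no state; they only read and write entities.
        monster.health -= new PlayerDamageDealer().dealDamage(player);
        player.health -= new MonsterDamageDealer().dealDamage(monster);

        System.out.println("monster health: " + monster.health);
        System.out.println("player health: " + player.health);
    }
}
```

Because the damage classes keep no state of their own, each can be tested by constructing an entity with known field values and checking the returned number.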

This approach is usually known as an Anemic Domain Model, which is considered an anti-pattern in Domain-Driven Design (DDD), but the S.O.L.I.D principles naturally lead you toward a clean separation between 'shared' data entities (perhaps in a repository), and modular (preferably stateless) behavioural classes in your application's object graph.

SOLID and DDD are two different approaches to OO design; while they cross over in many ways, they tend to pull in opposing directions with regard to class identity and separation of data and behaviour.


Going back to McConnell's quote above - managing complexity is the reason why software development is a skilled profession rather than a mundane clerical chore. Before McConnell wrote his book, Fred Brooks wrote a paper on the subject which neatly sums up the answer to your question - There is No Silver Bullet to managing complexity.

So while there's no single answer, you can make life easier or harder for yourself depending on the way you approach it:

  • Remember KISS, DRY and YAGNI.
  • Understand how to apply the S.O.L.I.D Principles of OO Design/Software Development.
  • Also understand Domain-Driven Design even if there are places where the approach conflicts with SOLID principles; SOLID and DDD tend to agree with each other more than they disagree.
  • Expect your code to change - write automated tests to catch the fallout of those changes. (You don't have to follow TDD in order to write useful automated tests - indeed, some of those tests might be integration tests using "throwaway" console apps or test harness apps.)
  • Most importantly - be pragmatic. Don't slavishly follow any guidelines; the opposite of complexity is simplicity, so if in doubt (again) - KISS.