Why is UML not used in most free software (e.g. on Linux)?

free software, uml

I'm trying to understand why UML is not used in most free software projects. For example, my Debian/Linux system probably has more than ten thousand free software packages, and I cannot name even one that has been developed using an explicit UML framework and methodology. Qt, GCC, the Linux kernel, bash, GNU make, OCaml, Gnome, Unison, lighttpd, libonion, and docker are all free software projects which (AFAIK) don't mention UML at all.

(My guess is that UML is very well suited for formal subcontracting of development tasks, and that is not how free software is developed)

Notice that while I did read some material about UML, I don't claim to have a good understanding of it.

Actually, I cannot easily name a free software project where UML has been used (except perhaps some UML tools implemented as free software). Perhaps OpenStack is an exception (something there mentions UML).

(Even old free software projects could have adopted UML after they were started, but they did not.)


Some colleagues working on Papyrus mentioned that most free software projects did not start from any explicitly (and deeply enough) formalized model. Also, UML looks much more tied to Java than it claims to be (I am not entirely sure it would make sense for OCaml, Common Lisp, Haskell, or JavaScript, and perhaps not even for C++11). Perhaps agile software development is not very UML friendly.

See also this answer to a somewhat related question. Martin Fowler's blog post "Is Design Dead?" is insightful.

PS. I don't think it is mainly a matter of opinion; there should be some objective reason, and some essential characteristic of free software, that explains why. I tend to guess that UML is only useful for formalized subcontracting, and is useful only when some part of the developed software is hidden, as in proprietary projects. If that is true, UML would be incompatible with free software development.

NB: I am not a UML fan myself. I don't define UML as paper documentation only, but also as a [meta-]data format for software tools.

Best Answer

There are different ways to use UML. Martin Fowler calls these UML modes and identifies four: UML as Notes, UML as Sketch, UML as Blueprint, and UML as a Programming Language.

UML as a Programming Language never really took off. There has been some work in this area under different names, like Model Driven Architecture or Model Based Software Engineering. In this approach, you create highly detailed models of your software system and generate the code from those models. There may be some use cases where this approach is useful, but not for general software and especially not outside of large companies that can afford the tools that power this approach. It's also a time-consuming process - I can type the code for a class faster than I can create all of the graphical models necessary to implement it.
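To make the "generate the code from those models" idea concrete, here is a minimal sketch in Python. It is not how any real MDA tool works: the `Attribute` and `ClassModel` types and the `generate_source` function are hypothetical names invented for this illustration, and the "model" is a plain in-memory structure rather than a UML/XMI file.

```python
from dataclasses import dataclass, field


# Hypothetical, deliberately tiny "model" types (not part of any UML tool).
@dataclass
class Attribute:
    name: str
    type_name: str


@dataclass
class ClassModel:
    name: str
    attributes: list = field(default_factory=list)


def generate_source(model):
    """Render a ClassModel as Python dataclass source code."""
    lines = [
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {model.name}:",
    ]
    if not model.attributes:
        # A class body cannot be empty, so emit a placeholder.
        lines.append("    pass")
    for attr in model.attributes:
        lines.append(f"    {attr.name}: {attr.type_name}")
    return "\n".join(lines) + "\n"


# The model has to spell out every detail that the generated code will contain.
account = ClassModel(
    "Account",
    [Attribute("owner", "str"), Attribute("balance", "float")],
)
source = generate_source(account)
print(source)
```

Even in this toy case the model carries the same information as the generated class, which is one way to see why typing the class directly is often faster.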

UML as a Blueprint is often indicative of a "big design up front" project. It doesn't have to be, of course. The model can be fully described for a particular increment, as well. But the idea is that the time is spent creating a design in the form of UML models that are then handed off to someone to convert into code. All of the details are spelled out and the conversion to code tends to be more mechanical.

UML as Sketch and UML as Notes are similar in nature, but differ in when they are used. Using UML as Sketch means that you sketch out designs using UML notations before or while writing the code. The diagrams are unlikely to be complete; instead, they focus on the particular aspects of the design that you need to communicate to others. UML as Notes is similar, but the models are created after the code to aid in understanding the code base.

When you're considering this, I think everything above is true for any kind of modeling notation. You can apply it to entity-relationship diagrams, IDEF diagrams, business process modeling notation, and so on. Regardless of the modeling notation, you can choose when you apply it (before as a specification, after as an alternative representation) and how much detail (full detail to key aspects).


The other side of this is open source culture.

Often, open source projects start off to solve a problem that an individual (or, today, a company) is experiencing. If it's being launched by an individual, the number of developers is 1. In this case, the communication overhead is extremely low and there's little need to communicate about the requirements and design. In a company, there's likely to be a small team. In this instance, you'll likely need to communicate design possibilities and discuss trade-offs. However, once you have made your design decisions, you need to either maintain your models as your code base changes over time or throw them away. In Agile Modeling terms, "document continuously" and maintain a "single source of information".

As a brief aside, there is the idea that code is design and that models are just alternate views of the design. Jack Reeves wrote three essays on code as design, and there are discussions on the C2 wiki as well, covering the ideas that the source code is the design, the design is the source code, and source code and modeling. If you subscribe to this belief (which I do), then the source code is the reality, and any diagrams should exist only to make it easier to understand the code and, more importantly, the rationale behind why the code is what it is.

A successful open source project, like the ones that you mention, has contributors around the world. These contributors tend to be technically competent in the technologies that power the software and are likely also to be users of the software. They can read source code just as easily as models, and can use tools (IDEs and reverse engineering tools) to understand the code (including generating models, if they feel the need). They can also create sketches of the flow on their own.
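As an illustration of generating a model from code, the following is a minimal sketch that uses Python's standard `inspect` module to emit PlantUML class-diagram text for a couple of classes. The `Shape` and `Circle` classes and the `to_plantuml` helper are hypothetical examples, not taken from any of the projects discussed here, and real reverse engineering tools do far more than this.

```python
import inspect
import math


# Hypothetical example classes to be "reverse engineered".
class Shape:
    def area(self):
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def to_plantuml(classes):
    """Return PlantUML class-diagram text for the given classes."""
    lines = ["@startuml"]
    for cls in classes:
        lines.append(f"class {cls.__name__} {{")
        # List only public functions defined directly on this class.
        for name, member in vars(cls).items():
            if inspect.isfunction(member) and not name.startswith("_"):
                lines.append(f"  +{name}()")
        lines.append("}")
        # Draw an inheritance arrow to each base that is also in the diagram.
        for base in cls.__bases__:
            if base in classes:
                lines.append(f"{base.__name__} <|-- {cls.__name__}")
    lines.append("@enduml")
    return "\n".join(lines)


print(to_plantuml([Shape, Circle]))
```

Because a diagram like this can be regenerated from the code at any time, there is little reason to store and maintain it alongside the source.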


Of the four modes that Fowler describes, I don't think you'll find an open source project, or very many projects anywhere, that are using modeling languages as programming languages or blueprints. This leaves notes and sketch as possible uses for UML. Notes would be created by the contributor for the contributor, so you probably wouldn't find them uploaded anywhere. Sketches diminish in value as the code becomes more complete and likely wouldn't be maintained as that would just take effort on the part of contributors.

Many open source projects don't make models available because the models don't add value. However, that doesn't mean that models weren't created by someone early in the project or that individuals haven't created their own models of the system. It's just more time-effective to maintain one source of design information: the source code.

If you want to find people exchanging design information, I'd recommend looking at the forums or mailing lists used by contributors. Often, these forums and mailing lists serve as the design documentation for projects. You may not find formal UML, but you may find some kind of graphical representation of design information and models there. You can also pop into chat rooms or other communication channels for the project - if you see people talking about design decisions, they may be communicating with graphical models. But those models likely won't become part of a repository, since they aren't valuable once they have served their purpose in communication.
