Git – How Git Was Designed

design, git, linux, version control

My workplace recently switched to Git and I've been loving (and hating!) it. I really do love it, and it is extremely powerful. The only part I hate is that sometimes it's too powerful (and maybe a bit terse/confusing).

My question is… How was Git designed? After using it for just a short time, you get the feeling that it can handle many obscure workflows that other version control systems could not. But it also feels elegant underneath. And fast!

This is no doubt due in part to Linus's talent. But I'm wondering, was the overall design of git based on something? I've read about BitKeeper, but the accounts are scant on technical details. The compression, the graphs, getting rid of revision numbers, emphasizing branching, stashing, remotes… Where did it all come from?

Linus really knocked this one out of the park and on pretty much the first try! It's quite good to use once you're past the learning curve.

Best Answer

Git was not so much designed as evolved.

Take a look for yourself. Clone the official git repository, open it in gitk (or your favorite graphical git log viewer), and look at its earliest revisions.

You will see it originally had only the very core functionality (the object database and the index). Everything else was done by hand. However, this small core was designed to be easily automated via shell scripting. The early users of git wrote their own shell scripts to automate common tasks; little by little, these scripts were incorporated into the git distribution (see 839a7a0 for an early example). Every time there was a new need, the scripts were adapted to allow for it. Much later, several of these scripts would be rewritten in C.

This combination of a clean, orthogonal core (which you can still use directly if you have the need) with an upper layer that grew organically over it is what gives git its power. Of course, it is also what gives it the large number of oddly named commands and options.


The compression, the graphs, getting rid of revision numbers, emphasizing branching, stashing, remotes... Where did it all come from?

A lot of that was not there in the beginning.

While each object was individually compressed, and duplicates were avoided by their naming, the "pack" files which are responsible for the high compression we are used to seeing in git did not exist. The philosophy in the beginning was "disk space is cheap".
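To make the "individually compressed, duplicates avoided by their naming" point concrete, here is a minimal Python sketch of a loose-object store. It is not git's code: the function names (`write_loose_object`, `read_loose_object`, `count_objects`) are made up for this illustration. What it does mirror is the way git names a blob by the SHA-1 of a short header plus the content and compresses each object on its own with zlib.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(store: Path, content: bytes, kind: str = "blob") -> str:
    """Store one object, compressed on its own, under the SHA-1 of its bytes."""
    # Git hashes a small header ("<type> <size>\0") followed by the content.
    data = f"{kind} {len(content)}".encode() + b"\0" + content
    object_id = hashlib.sha1(data).hexdigest()
    # Fan-out layout like .git/objects: the first two hex digits name a directory.
    path = store / object_id[:2] / object_id[2:]
    if not path.exists():
        # Identical content always maps to the same name, so it is written once.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(data))
    return object_id


def read_loose_object(store: Path, object_id: str) -> bytes:
    """Return the content of a stored object, without its header."""
    data = zlib.decompress((store / object_id[:2] / object_id[2:]).read_bytes())
    _header, _, content = data.partition(b"\0")
    return content


def count_objects(store: Path) -> int:
    """Count the object files currently in the store."""
    return sum(1 for p in store.rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        first = write_loose_object(store, b"hello world\n")
        second = write_loose_object(store, b"hello world\n")  # duplicate content
        print(first)
        print(first == second, count_objects(store))
```

Storing the same bytes twice yields the same name and a single file, which is the deduplication described above. Nothing here compares one object against another, though: that cross-object delta compression is what the later "pack" files added.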

If by "the graphs" you mean graphical viewers like gitk, they appeared later (AFAIK, the first one was gitk). AFAIK, BitKeeper also had a graphical history viewer.

Getting rid of revision numbers, and in fact git's core concept of using a content-addressed filesystem to store its objects, mostly came from monotone. At that time, monotone was slow; had that not been the case, it is possible Linus would have used it instead of creating git.
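A rough Python sketch of why content addressing makes revision numbers unnecessary: if a commit's name is a hash of its snapshot, its parent's name, and its message, then two people who build the same history independently arrive at the same names, with no central counter involved. The `toy_commit` format below is deliberately simplified and is an assumption of this sketch. Real git commits point at a tree object and also record author and committer details.

```python
import hashlib
from typing import Optional


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 over a "<type> <size>\\0" header plus the body, as git does for its objects."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def toy_commit(snapshot: bytes, parent: Optional[str], message: str) -> str:
    """Name a commit by hashing its snapshot ID, parent ID, and message.

    Simplified for illustration; this is not git's real commit format.
    """
    lines = [f"snapshot {object_id('blob', snapshot)}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return object_id("commit", body.encode())


if __name__ == "__main__":
    # Two developers record the same two-step history on separate machines.
    alice_root = toy_commit(b"v1", None, "initial import")
    alice_tip = toy_commit(b"v2", alice_root, "fix typo")
    bob_root = toy_commit(b"v1", None, "initial import")
    bob_tip = toy_commit(b"v2", bob_root, "fix typo")
    print(alice_tip == bob_tip)  # same content and ancestry give the same name

    # Changing anything in an ancestor changes the name of every descendant.
    other_root = toy_commit(b"v1 (edited)", None, "initial import")
    other_tip = toy_commit(b"v2", other_root, "fix typo")
    print(other_tip == alice_tip)
```

Because the parent's name is part of what gets hashed, a commit name identifies the whole history behind it, which a per-repository counter cannot do once there are several clones.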

Emphasizing branching is somewhat unavoidable in a distributed version control system, since each clone acts as a separate branch.

Stashing (git stash) is, IIRC, quite recent. The reflogs, which it uses, were not there in the beginning.

Even remotes were not there initially. Originally, you copied the objects by hand using rsync.

One by one, each of these features was added by someone. Not all of them (perhaps not even most of them) were written by Linus. Whenever someone feels a need that git does not fulfill, they can build a new feature on top of git's core "plumbing" layer and propose it for inclusion. If it is good, it will probably be accepted, enhancing git's utility (and its command-line complexity) even further.