Git – How to Version a Dozen Libraries Worked on in Parallel

Tags: branching, dependencies, git, scm, tagging

We work on projects, but we reuse a lot of code between them and have many libraries that contain our common code. As we implement new projects we find more ways to factor out common code and put it into libraries. The libraries depend on each other, and the projects depend on the libraries. Each project, and all libraries used in that project, need to use the same version of all the libraries they refer to. Once we release a piece of software we have to fix bugs and maybe add new features for many years, sometimes for decades. We have about a dozen libraries, changes often cut across more than two of them, and several teams work on several projects in parallel, making concurrent changes to all these libraries.

We have recently switched to Git and set up a repository for each library and each project. We use Stash as a common repository, do new work on feature branches, then make pull requests and merge them only after review.

Many of the issues we have to deal with in projects require us to make changes across several libraries and the project's specific code. These often include changes to library interfaces, some of which are incompatible. (If you think this sounds fishy: We interface with hardware, and hide specific hardware behind generic interfaces. Almost every time we integrate some other vendor's hardware we run into cases our current interfaces did not anticipate, and so have to refine them.) For example, imagine a project P1 using the libraries L1, L2, and L3. L1 also uses L2 and L3, and L2 uses L3 as well. The dependency graph looks like this:

   <-------L1<--+
P1 <----+  ^    |
   <-+  |  |    |
     |  +--L2   |
     |     ^    |
     |     |    |
     +-----L3---+
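
The same graph can be written down as data, which makes it possible to ask which components sit downstream of a given library. This is only a minimal sketch: the `DEPENDS_ON` mapping mirrors the diagram above, while the helper name `dependents_of` is made up for illustration and is not part of any existing tooling.

```python
# Direct dependencies, mirroring the diagram: each key uses the listed libraries.
DEPENDS_ON = {
    "P1": {"L1", "L2", "L3"},
    "L1": {"L2", "L3"},
    "L2": {"L3"},
    "L3": set(),
}


def dependents_of(target, graph=DEPENDS_ON):
    """Return every component that uses `target`, directly or transitively."""
    found = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for name, deps in graph.items():
            # `name` is affected if it depends on something already affected.
            if current in deps and name not in found:
                found.add(name)
                frontier.append(name)
    return found


if __name__ == "__main__":
    print(sorted(dependents_of("L3")))
```

For L3 this yields L1, L2, and P1, i.e. every other component in the diagram.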

Now imagine a feature for this project requires changes in P1 and L3 which change the interface of L3. Now add projects P2 and P3 into the mix, which also refer to these libraries. We cannot afford to switch them all to the new interface, run all the tests, and deploy the new software. So what's the alternative?

1. implement the new interface in L3
2. make a pull request for L3 and wait for the review
3. merge the change
4. create a new release of L3
5. start working on the feature in P1 by making it refer to L3's new release, then implement the feature on P1's feature branch
6. make a pull request, have this reviewed, and merged

(I just noticed that I forgot to switch L1 and L2 to the new release. And I don't even know where to stick this in, because it would need to be done in parallel with P1…)
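
To make the ordering problem concrete, here is a rough sketch of the sequence in which components would have to move to a new L3 if each one may only be updated after everything it depends on. It assumes Python 3.9+ for the standard-library `graphlib` module; the function name `update_order` is hypothetical, and the graph is the one from the diagram.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key depends on the components in its set (same graph as in the diagram).
DEPENDS_ON = {
    "P1": {"L1", "L2", "L3"},
    "L1": {"L2", "L3"},
    "L2": {"L3"},
    "L3": set(),
}


def update_order(graph):
    """List components so that each one comes after everything it depends on."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(update_order(DEPENDS_ON))
```

With this graph the order is L3, L2, L1, P1, which would mean three library releases before the feature branch in P1 can refer to a consistent set of versions.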

This is a tedious, error-prone, and very long process for implementing a single feature. It requires two independent reviews (which makes the change much harder to review), does not scale at all, and is likely to put us out of business because we get so bogged down in process that we never get anything done.

But how do we employ branching and tagging in order to create a process that allows us to implement new features in new projects without too much overhead?

Best Answer

Kind of pointing out the obvious here, but maybe it's worth mentioning.

Usually, git repos are tailored per lib/project because they tend to be independent. You update your project, and don't care about the rest. Other projects depending on it will simply update their lib whenever they see fit.

However, your case seems to involve tightly coupled components, so that one feature usually affects many of them, and the whole has to be packaged as a bundle. Since implementing a feature/change/bug fix often requires adapting many different libraries/projects at once, perhaps it makes sense to put them all in the same repo.

There are strong advantages/drawbacks to this.

Advantages:

- Traceability: the branch shows everything changed in every project/lib related to this feature/bug.
- Bundling: just pick a tag, and you'll get all the sources right.

Drawbacks:

- Merging: it's sometimes already tough with a single project. With different teams working on shared branches, be ready to brace for impact.
- Dangerous "oops" factor: if one employee messes up the repository by making some mistake, it might impact all projects & teams.

It's up to you to know if the price is worth the benefit.

EDIT:

It would work like this:

1. Feature X must be implemented.
2. Create branch feature_x.
3. All developers involved work on this branch in parallel, probably in dedicated directories related to their project/lib.
4. Once it's done, review it, test it, package it, whatever.
5. Merge it back into master. This may be the tough part, since in the meantime feature_y and feature_z may have been added too. It becomes a "cross-team" merge, which is why it is a serious drawback.
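
If each project/lib really does live in its own top-level directory, the files touched on feature_x can be grouped by component to see how far the feature reaches before the merge. This is only a sketch under that assumption: the function name `group_by_component` and the file paths are hypothetical, and the list of paths is assumed to come from something like `git diff --name-only master...feature_x`.

```python
from collections import defaultdict


def group_by_component(changed_paths):
    """Map each top-level directory to the changed files beneath it."""
    groups = defaultdict(list)
    for path in changed_paths:
        component, _, rest = path.partition("/")
        # Files at the repository root have no component directory.
        key = component if rest else "(root)"
        groups[key].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical output of: git diff --name-only master...feature_x
    changed = [
        "L3/include/device.h",
        "L3/src/device.cpp",
        "P1/src/main.cpp",
        "README.md",
    ]
    for component, files in sorted(group_by_component(changed).items()):
        print(component, len(files))
```

A summary like this gives reviewers a quick view of which teams' code a branch touches, which is also a hint at how painful the cross-team merge is likely to be.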

Just for the record: I think this is a bad idea in most cases and should be done cautiously, because the cost of merging is usually higher than the cost of handling this through dependency management and proper feature tracking.
