Git Dependencies – Git Repository Structure for Interdependent Projects

Note: I've seen several other questions about repository organization, but I haven't found any that address this dependency issue.

Current Structure

At the moment, we have several distributions (which must remain independent repositories, since they are more or less client-specific and can't be shared), each of which has a library repository as a submodule. That library repository, in turn, depends on a few more layers, though these lower layers are pretty much set in stone.

This is, roughly, what the current structure looks like (Distro is an ever-growing list of distributions):

Distro
+-Library Collection
  +-Utility wrapper
    +-Inner core

The problem here is the Library Collection repository. It includes a few dozen libraries at the moment and continues to grow at an accelerating rate. As a result, it is becoming cumbersome to maintain, and I expect Git to start experiencing performance issues in the coming years.

Additionally, most distributions include code that might need to end up in Library Collection in the future, though this is difficult to predict and is largely dependent on external factors.

We'd like to find a way to split Library Collection in such a way that each distribution can pick-and-choose whatever individual libraries it needs, while keeping project organization (mostly CMake) fairly straightforward. Ideally, it would look something like this:

Distro
+-Lib A
+-Lib B
+-Utility wrapper
  +-Inner core

Dependencies

In the current structure, every repository depends on the next level down (sometimes, though rarely, two levels down); that's why this repository structure evolved the way it has.

In the ideal structure above, this nested-dependency system breaks down. Lib A and Lib B both depend on Utility Wrapper, which is now at their level. We don't want to make Utility Wrapper a submodule of both Lib A and Lib B, because this will lead to quite a lot of redundant data when Distro is cloned (not to mention headaches while debugging).

Additionally, there's a chance Lib B may depend on Lib A, though this varies between libraries.

While taking a first crack at splitting the libraries, I had to resort to working on individual libraries within a distribution repository, so the Distro level could work out dependencies and figure out include directories and linking (we use CMake for all of this, if it changes anything).

Question

What's the best way to organize these repositories, so that Lib A can be developed/tested independently of Distro, even though it requires Utility Wrapper and, possibly, some other libraries at the same level?

Best Answer

Source control should not manage dependencies.

Git submodules are great for sharing library code as long as the parent and child repositories evolve together, and often enough that managing this relationship in source control makes sense.

In your case, a need arises to develop these independently. Extra care must be taken to ensure breaking changes are not introduced, especially if you are dealing with multiple distros and clients.

You haven't specified a technology stack, but the solution to this problem is dependency management. If this were .NET, NuGet would be your go-to tool. For Java it would be something like Maven (and there are others too). JavaScript has npm, Ruby has RubyGems, etc.

Keep each library in its own repository, but break the parent-child link in source control. Each library needs a package release with a version number associated with it (related: Semantic Versioning). Each distro needs to specify exactly which version of the topmost library it needs. That library, in turn, needs to specify which version of the next library down it needs.

Distro A

  • Library Collection v1.0.3
    • Utility Wrapper v1.3.7
      • Inner Core v2.7.0

Distro B

  • Library Collection v1.2.0
    • Utility Wrapper v1.8.12
      • Inner Core v2.13.2
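
As a rough illustration of the pinning scheme above, here is a minimal Python sketch. The `INDEX` dictionary and the `pinned_chain` function are hypothetical names made up for this example; a real package manager would read this information from package manifests rather than a hard-coded table.

```python
# Hypothetical package index: (name, version) -> exact versions of its dependencies.
# The entries mirror the Distro A / Distro B example above.
INDEX = {
    ("Library Collection", "1.0.3"): {"Utility Wrapper": "1.3.7"},
    ("Library Collection", "1.2.0"): {"Utility Wrapper": "1.8.12"},
    ("Utility Wrapper", "1.3.7"): {"Inner Core": "2.7.0"},
    ("Utility Wrapper", "1.8.12"): {"Inner Core": "2.13.2"},
    ("Inner Core", "2.7.0"): {},
    ("Inner Core", "2.13.2"): {},
}


def pinned_chain(name, version):
    """Return the package followed by everything it pins, top-down."""
    chain = [(name, version)]
    for dep_name, dep_version in INDEX[(name, version)].items():
        chain.extend(pinned_chain(dep_name, dep_version))
    return chain


# Each distro only names the topmost library and its version.
distro_a = pinned_chain("Library Collection", "1.0.3")
distro_b = pinned_chain("Library Collection", "1.2.0")
```

Here `distro_a` ends with Inner Core 2.7.0 and `distro_b` with Inner Core 2.13.2, even though neither distro mentions Inner Core directly.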

A good package manager can resolve the dependency graph and install the correct versions of each library.
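
To make "resolve the dependency graph" a bit more concrete, here is a simplified Python sketch for the split-library layout from the question, where Lib A and Lib B both need Utility Wrapper. The package versions, the `INDEX` table, and the `resolve` function are all assumptions for illustration. Real resolvers also handle version ranges, which this sketch does not. It uses `graphlib` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical index: (name, version) -> exact versions of its dependencies.
INDEX = {
    ("Lib A", "1.0.0"): {"Utility Wrapper": "1.3.7"},
    ("Lib B", "2.1.0"): {"Lib A": "1.0.0", "Utility Wrapper": "1.3.7"},
    ("Utility Wrapper", "1.3.7"): {"Inner Core": "2.7.0"},
    ("Inner Core", "2.7.0"): {},
}


def resolve(requirements):
    """Return (name, version) pairs in install order, dependencies first.

    Raises ValueError if two packages pin different versions of the same library.
    """
    chosen = {}  # name -> selected version
    graph = {}   # name -> names it depends on
    pending = list(requirements.items())
    while pending:
        name, version = pending.pop()
        if name in chosen:
            if chosen[name] != version:
                raise ValueError(
                    f"conflict: {name} pinned to {chosen[name]} and {version}"
                )
            continue  # already handled, so shared libraries appear only once
        chosen[name] = version
        deps = INDEX[(name, version)]
        graph[name] = set(deps)
        pending.extend(deps.items())
    # static_order() yields each node after all of its dependencies.
    order = TopologicalSorter(graph).static_order()
    return [(name, chosen[name]) for name in order]


install_order = resolve({"Lib A": "1.0.0", "Lib B": "2.1.0"})
```

Utility Wrapper shows up once in `install_order` even though both libraries require it, which is the duplication the submodule approach could not avoid. If two libraries pinned different versions of it, `resolve` would raise instead of silently installing both.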

Source control and dependency management are very different beasts, and require different tools.
