I'm a big fan of Git sub-modules. I like being able to track a dependency along with its version, so that I can roll back to a previous version of a project and have the corresponding version of the dependency to build safely and cleanly. Moreover, it's easier to release our libraries as open source projects, as the history for the libraries is separate from that of the applications that depend on them (and which are not going to be open sourced).
I'm setting up a workflow for multiple projects at work, and I was wondering how it would be if we took this approach to a bit of an extreme instead of having a single monolithic project. I quickly realized there is a potential can of worms in really using sub-modules.
Suppose we have a pair of applications, studio and player, and dependent libraries core, graph and network, where dependencies are as follows:
- core is standalone
- graph depends on core (sub-module at ./libs/core)
- network depends on core (sub-module at ./libs/core)
- studio depends on graph and network (sub-modules at ./libs/graph and ./libs/network)
- player depends on graph and network (sub-modules at ./libs/graph and ./libs/network)
Suppose that we're using CMake and that each of these projects has unit tests and all the works. Each project (including studio and player) must be able to be compiled standalone to perform code metrics, unit testing, etc.
The thing is, if I do a recursive git submodule fetch, I get the following directory structure:
studio/
studio/libs/ (sub-module depth: 1)
studio/libs/graph/
studio/libs/graph/libs/ (sub-module depth: 2)
studio/libs/graph/libs/core/
studio/libs/network/
studio/libs/network/libs/ (sub-module depth: 2)
studio/libs/network/libs/core/
Notice that core is cloned twice in the studio project. Aside from this wasting disk space, I have a build system problem because I'm building core twice and I potentially get two different versions of core.
Question
How do I organize sub-modules so that I get the versioned dependency and standalone build without getting multiple copies of common nested sub-modules?
Possible solution
If the library dependency is somewhat of a suggestion (i.e. in a "known to work with version X" or "only version X is officially supported" fashion) and potential dependent applications or libraries are responsible for building with whatever version they like, then I could imagine the following scenario:
- Have the build system for graph and network tell them where to find core (e.g. via a compiler include path). Define two build targets, "standalone" and "dependency", where "standalone" is based on "dependency" and adds the include path to point to the local core sub-module.
- Introduce an extra dependency: studio on core. Then, studio builds core, sets the include path to its own copy of the core sub-module, then builds graph and network in "dependency" mode.
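As a rough illustration of the library side of this scenario, here is a minimal sketch of what graph's CMakeLists.txt could look like. It swaps the two build targets for a single hypothetical option, GRAPH_STANDALONE, and assumes core exposes a CMake target named core that publishes its include path; the file and directory names are made up for the example.

```cmake
# graph/CMakeLists.txt: sketch only; file names and paths are hypothetical.
cmake_minimum_required(VERSION 3.13)
project(graph LANGUAGES CXX)

# Default to "standalone" when graph is the top-level project, and to
# "dependency" when a parent project pulled it in with add_subdirectory().
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
  set(graph_standalone_default ON)
else()
  set(graph_standalone_default OFF)
endif()
option(GRAPH_STANDALONE "Build against graph's own core sub-module"
       ${graph_standalone_default})

# Only configure the nested copy of core when nobody else has provided one.
if(GRAPH_STANDALONE AND NOT TARGET core)
  add_subdirectory(libs/core)
endif()

add_library(graph src/graph.cpp)              # hypothetical source file
target_include_directories(graph PUBLIC include)

# Linking the core target also propagates its include path, assuming core
# declares it with target_include_directories(core PUBLIC ...).
target_link_libraries(graph PUBLIC core)
```

The same pattern would apply to network. In "dependency" mode the nested libs/core folder is never touched, so it can stay unfetched.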
The resulting folder structure looks like:
studio/
studio/libs/ (sub-module depth: 1)
studio/libs/core/
studio/libs/graph/
studio/libs/graph/libs/ (empty folder, sub-modules not fetched)
studio/libs/network/
studio/libs/network/libs/ (empty folder, sub-modules not fetched)
However, this requires some build system magic (I'm pretty confident this can be done with CMake) and a bit of manual work when updating versions (updating graph might also require updating core and network to get a compatible version of core in all projects).
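For the application side, the following is a minimal sketch of a top-level CMakeLists.txt for studio under this scheme. It assumes that core, graph and network each define a library target of the same name, and that graph and network skip their nested core when a core target already exists; the source file name is hypothetical.

```cmake
# studio/CMakeLists.txt: sketch only; paths and source names are hypothetical.
cmake_minimum_required(VERSION 3.13)
project(studio LANGUAGES CXX)

# studio's own copy of core comes first, so the core target exists before
# graph and network are configured.
add_subdirectory(libs/core)

# Assumed: each library builds in "dependency" mode here and leaves its
# nested libs/core sub-module unfetched and unused.
add_subdirectory(libs/graph)
add_subdirectory(libs/network)

add_executable(studio src/main.cpp)
target_link_libraries(studio PRIVATE graph network)
```

With this layout there is exactly one core target in the build, pinned by studio's own sub-module reference.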
Any thoughts on this?
Best Answer
I'm very late to this party, but your question still doesn't seem to have a complete answer, and it's a pretty prominent hit on Google.
I have the exact same problem with C++/CMake/Git/Submodules and I have a similar problem with MATLAB/Git/Submodules, which gets some extra weirdness because MATLAB isn't compiled. I came across this video recently, which seems to propose a "solution". I don't like the solution, because it essentially means throwing away submodules, but it does eliminate the problem. It is just as @errordeveloper recommends. Each project has no submodules. To build a project, create a super-project to build it, and include it as a sibling to its dependencies.
So your project for developing graph might look like:
and then your project for studio could be:
The super-projects are just a main CMakeLists.txt and a bunch of submodules. But none of the projects have any submodules themselves.
The only cost I see to this approach is the proliferation of trivial "super-projects" that are just dedicated to building your real projects. And if someone gets hold of one of your projects, there is no easy way to tell what its dependencies are without also finding the super-project. That might make it sit really ugly on GitHub, for example.
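To make the idea concrete, here is a minimal sketch of what the main CMakeLists.txt of a super-project for studio might contain. The super-project name and the sibling directory names are assumptions for illustration; each directory is taken to be a submodule that defines a target of the same name and has no submodules of its own.

```cmake
# studio-super/CMakeLists.txt: hypothetical super-project. Every directory
# below is assumed to be a submodule checked out as a sibling.
cmake_minimum_required(VERSION 3.13)
project(studio_super LANGUAGES CXX)

# Dependencies first, so each target exists before its dependents link to it.
add_subdirectory(core)
add_subdirectory(graph)
add_subdirectory(network)
add_subdirectory(studio)
```

A super-project for developing graph alone would presumably list only core and graph. The exact versions of each dependency are then pinned by the super-project's submodule references rather than by the libraries themselves.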