Git – How to determine when to open a new branch or a new repository when using Git

branching, git, version control

I am currently working on a project to build a product for company X. My team sells a similar service to many companies with similar products; about 40% of the code is reused and about 60% is custom features built for a particular company. We had no version control until I recently suggested using Git. We are happy that, since introducing Git, we no longer need to copy and paste each other's work repeatedly, but we are now unsure whether a new project should be opened as a new branch of the previous product's repository (i.e. one repository holding the products for all companies) or as a new standalone repository (i.e. one repository per product).

Normally we do not merge the products of different companies, but once or twice a year, when we find a good, commonly wanted feature, we like to add it to every company's product as a gift. Given this, is it better to open a new repository or just start a new branch?

Best Answer

"40% of the code is reused and 60% customized feature made to particular company"

It might take some time to get to the point where you can do this, but I think a goal should be to extract that 40% into its own repository. This becomes the main library/framework on which you build.

Then, that other 60% can be in dedicated repositories on a per-client basis.

It sounds like your code base qualifies as "legacy," so I highly recommend picking up Working Effectively With Legacy Code by Michael Feathers for yourself and your team. It provides techniques for bringing code under test, which can also be used for creating a plan to extract your code out into a library.

How would you go about doing this starting now? Here's what I'd do:

  1. Make every client its own repository (or, if you're using something like GitHub or GitLab to manage the Git repositories and you have several repositories for a client, you can create an organization per client, but we'll assume one repository per client).
  2. Take a class that I reuse a lot and extract it out (assuming an object-oriented approach, though a functional one should have similar principles). We'll use this to start the library repository (this makes it useful from day 1). If you don't have full classes yet, then that's where Feathers' book comes in, as it shows you how to make classes out of non-classed code. While we're at it, we can add some basic testing to ensure it still does what we expect it to do.
  3. Commit this extraction as its own thing, and you can use it as the template to replicate the change across all repositories.
  4. As I come across shared parts, I extract them out and add them to this library repository.
  5. As the library grows, and I find things that are used by some clients, but not others, I can work toward a modular approach, where larger features are extracted out into their own repositories and included as "modules" or "plugins" to the main system. Alternatively, if there are themes for particular groups of items (such as a bunch that deal with authentication or payment processing), I can extract those out to their own repositories. How you go, exactly, in this stage depends on the nature of your code and the direction you want to go with it.
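
Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the class names (`PriceCalculator`, `ClientXPriceCalculator`), the module paths in the comments, and the discount rule are all hypothetical. It shows a reused class extracted into the library, one client layering its customization on top, and a basic test that pins down the behavior so the extraction can be checked.

```python
from dataclasses import dataclass


# --- would live in the shared library repository (hypothetical: corelib/pricing.py) ---
@dataclass(frozen=True)
class LineItem:
    name: str
    unit_price_cents: int
    quantity: int


class PriceCalculator:
    """Shared pricing logic that every client product reuses."""

    def subtotal(self, items: list[LineItem]) -> int:
        return sum(item.unit_price_cents * item.quantity for item in items)

    def discount(self, subtotal_cents: int) -> int:
        # Default: no discount. Client repositories may override this hook.
        return 0

    def total(self, items: list[LineItem]) -> int:
        subtotal = self.subtotal(items)
        return subtotal - self.discount(subtotal)


# --- would live in one client's repository (hypothetical: client_x/pricing.py) ---
class ClientXPriceCalculator(PriceCalculator):
    """Client-specific customization layered on the shared class."""

    def discount(self, subtotal_cents: int) -> int:
        # Hypothetical rule: 10% off orders of 100.00 or more.
        return subtotal_cents // 10 if subtotal_cents >= 10_000 else 0


# --- a basic test that pins down behavior before and after the extraction ---
def check_extraction() -> None:
    items = [LineItem("widget", 2_500, 4), LineItem("gadget", 1_000, 1)]
    assert PriceCalculator().total(items) == 11_000
    assert ClientXPriceCalculator().total(items) == 9_900


if __name__ == "__main__":
    check_extraction()
    print("extraction behaves as expected")
```

In practice the two halves would sit in separate repositories, with the client repository pulling in the library as a dependency (for example through a submodule or a package manager); they share a file here only to keep the sketch self-contained.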

The advantage of this system is that all you need to do is add that shared item into the library and wire up any hooks necessary to make it work, and it's available to all of the clients, no copy-pasta needed!
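
One possible way to "wire up hooks" is a small feature registry in the library that each client repository opts into by name. The sketch below assumes that approach; the registry, the `feature` decorator, the `audit_stamp` feature, and the client feature lists are all invented for illustration and are not something the original system is known to have.

```python
from typing import Callable, Dict, List

# --- shared library side (hypothetical: corelib/features.py) ---
FeatureHandler = Callable[[dict], dict]
_REGISTRY: Dict[str, FeatureHandler] = {}


def feature(name: str) -> Callable[[FeatureHandler], FeatureHandler]:
    """Register a shared feature under a name that clients can enable."""
    def decorator(func: FeatureHandler) -> FeatureHandler:
        _REGISTRY[name] = func
        return func
    return decorator


@feature("audit_stamp")
def add_audit_stamp(record: dict) -> dict:
    # A commonly wanted feature, written once in the library.
    return {**record, "audited": True}


def run_features(record: dict, enabled: List[str]) -> dict:
    """Apply each enabled feature in order; unknown names fail loudly."""
    for name in enabled:
        if name not in _REGISTRY:
            raise KeyError(f"unknown feature: {name}")
        record = _REGISTRY[name](record)
    return record


# --- client side (hypothetical: each client repository lists what it enables) ---
CLIENT_X_FEATURES: List[str] = ["audit_stamp"]
CLIENT_Y_FEATURES: List[str] = []

if __name__ == "__main__":
    order = {"id": 1}
    print(run_features(order, CLIENT_X_FEATURES))
    print(run_features(order, CLIENT_Y_FEATURES))
```

With this shape, giving a new feature to every client means adding one function to the library and one name to each client's list, rather than porting the change between branches.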

Why this instead of branches? Technically, branches aren't much different from standalone repositories, but convention differentiates between them. A repository is a complete project (with references of some sort to its dependencies, via submodules or scripts), while a branch is generally a given state of that project. You branch when you want to add a feature, fix a bug, or make some other kind of change, but the system itself is still considered the same one. You make a new repository when the project is considered a different one (in this case, for a different client, with its own customizations and whatnot).

A good example of this is Oracle's OpenOffice vs. LibreOffice. LibreOffice is a fork of OpenOffice, started by many of the same people who had worked on OpenOffice. Even when LibreOffice was first forked and was technologically identical to OpenOffice, it was considered a different product. The team then stripped the Oracle branding and started LibreOffice down its own development path, which is now distinct from OpenOffice's. The changes made to LibreOffice could have gone into a branch of OpenOffice, but because LibreOffice is considered a different project, it needed its own repository.
