How to Prevent Developers from Committing to the Wrong Branch in DVCS

dvcs, training, version control

The problem

I am on a software project with about 10 developers, and we share source code via Mercurial. We have a development and a production branch per release. Repeatedly during the course of the project, source code from one release (e.g. v2) has got into the patch and maintenance branches of an earlier release (e.g. v1).

This results in either time spent backing out the wrong commit or, if we don't notice that the code has gone into the wrong branch, wrong (possibly non-QA'd) code being deployed from that branch.

Our branch and merge design/method

               v1-test   v1-patch1   v1-patch2
               ^---------^-----------^                v1-prod
              /         / \           \
-----------------------/   \           \              v1-dev
              \             \           \
               --------------------------\            v2-dev
                             \       \    \ 
                              ^-------^-------------  v2-prod
                              v2-test v2-patch1      

Hence we work on a release development branch until it's deemed ready, then branch it off into a single testing/UAT/production branch, where all releases and maintenance are done. Tags are used to build releases of this branch. While v1 is being tested, a branch will have been made for v2 and developers will start working on new features.

What tends to happen is that a developer commits work due for v2-dev branch into v1-dev or v1-prod, or worse, they merge v2-dev into v1-prod (or similar such mistakes).

We tell most developers not to access the -prod branches; however, code still sneaks in. A group of more senior developers 'look after' the -prod branches.

It should be noted that while v2 has just started development, there may still be some quite hefty patches going into v1 to fix issues. I.e. v1 may not just be getting the odd small patch.

What we've tried so far

  • Having a separate -prod branch, with gatekeepers. A -prod branch should raise warnings through its name and most developers don't need to ever be in that branch. This has not really reduced the problem.
  • Raised awareness of this problem amongst the developers, to try and make them more vigilant. Again this has not been very successful.

Possible reasons I see for developers committing to the wrong branch

  • Too complex a branch design
  • Having active development in multiple branches in parallel. (The project does exhibit symptoms of using the avalanche-model.)
  • Developers don't understand the DVCS well enough

Questions I've read which were somewhat relevant

I've read this question on not committing to the wrong branch and I feel that the answers regarding visual cues may be helpful. However I am not entirely convinced that the problems we're experiencing are not symptoms of a more fundamental problem.

We can easily incorporate visual cues into the command line; however, about half the team use Eclipse, and I'm unsure how to incorporate visual cues there.

Question

What methods, in the form of software, project management or governance can we use to reduce (ideally stop) commits to the wrong branch taking up our time or dirtying our deployed code?

Specific comment on the reasons I believe may be contributing as outlined above would be appreciated, but this shouldn't limit your reply.

Best Answer

The problem is that you are changing the meaning of a branch partway through the process.

Initially, the v1 dev branch is for development. All new features go there. At some point in the future, it becomes a maintenance branch for the v1 release branch. This is the crux of the problem.

It's not that the developers are sloppy; it's that the permissions and roles of the branch are sloppy and subject to change.

What you need to do is establish what role each branch has, and maintain that role. If the role changes, branch.

For example:

 developer
  commits    |   |  |   |    |     |   |     |
             v   v  v   v    v     v   v     v
 dev  +--+---------------------+------------------->
         |           ^    ^    |           ^    ^
         |           |    |    |           |    |
 v1      +----+------+----+    |           |    |
           prod  patches       |           |    |
                               |           |    |
                               |           |    |
 v2                            +-----+-----+----+
                                  prod  patches

In this model, developers always commit to dev. If you are building a patch, you check the patch into that release's branch (or better yet, branch the release branch for a patch and then merge it back into the release branch).
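If you want the tooling to back this up rather than relying on vigilance, one option is a Mercurial `pretxncommit` hook written in Python. The following is only a minimal sketch under stated assumptions: the branch names (`v1`, `v2`), the gatekeeper logins and the function names are hypothetical, and it assumes Mercurial's documented convention that a Python hook is called as `hook(ui, repo, hooktype, **kwargs)` and that a true return value rejects the commit.

```python
# Sketch of a commit guard for release branches.
# Branch names, gatekeeper logins and function names are illustrative
# assumptions; adapt them to your own repository.

RELEASE_BRANCHES = {"v1", "v2"}   # hypothetical release branch names
GATEKEEPERS = {"alice", "bob"}    # hypothetical senior developers


def _text(value):
    """Mercurial on Python 3 returns bytes; normalise to str."""
    return value.decode("utf-8", "replace") if isinstance(value, bytes) else value


def login_of(user):
    """Extract 'alice' from 'Alice Example <alice@example.com>' or 'alice'."""
    address = user.split("<")[-1].rstrip("> ")
    return address.split("@")[0].strip().lower()


def commit_allowed(branch, user):
    """Policy: anyone may commit outside release branches;
    only gatekeepers may commit to a release branch."""
    if branch not in RELEASE_BRANCHES:
        return True
    return login_of(user) in GATEKEEPERS


def check_branch(ui, repo, hooktype, node=None, **kwargs):
    """Hook entry point. Returning True makes Mercurial abort the commit."""
    ctx = repo[node]
    branch = _text(ctx.branch())
    user = _text(ctx.user())
    if commit_allowed(branch, user):
        return False
    message = "commit to '%s' rejected: only gatekeepers may commit there\n" % branch
    ui.warn(message.encode("utf-8"))
    return True
```

It would be registered in `hgrc` under `[hooks]` with a line such as `pretxncommit.branchguard = python:/path/to/branchguard.py:check_branch` (the path and hook name are placeholders). Because the commit username is set on the client, a local hook like this is a safety net rather than real access control; the same check could be run on the shared repository with a `pretxnchangegroup` hook if you need enforcement.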

One article that you should read (and 'should' is probably an understatement) is Advanced SCM Branching Strategies by Stephen Vance.

In this paper, I first define branching in a general sense. I then discuss various strategies for branching, starting with the obvious and moving up to several that are more appropriate for larger development efforts. Along the way, I discuss the pros and cons of each strategy, using them to motivate the changes that compose the more complex strategies...

In this article, he identifies five roles that branches may have. Sometimes a branch may fill two roles and roles do not necessarily need a new branch as long as the role policies do not change mid branch (you will occasionally see mention of "branch on incompatible policy").

These roles are:

  1. Mainline. This is where branches are made from. Always branching from the mainline makes merges easier since the two branches will have a common ancestor that isn't branch upon branch upon branches.
  2. Development. This is where developers check in code. One may have multiple development branches to isolate high risk changes from the ones that are routine and mundane.
  3. Maintenance. Bug fixes on an existing production environment.
  4. Accumulation. When merging two branches, one may not want to risk destabilizing the mainline. So branch the mainline, merge the branches into the accumulator and merge back to the mainline once things are settled.
  5. Packaging. Packaging a release happens in the packaging branches. This often becomes the release and serves to isolate the release effort from development. See How to deal with undesired commits that break long-running release builds? for an example of where the packaging conflicts with development.
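To make "branch on incompatible policy" concrete, here is a small sketch that records the roles a branch was created with and flags any attempt to use it for different ones. The role names come from the list above; the class, the function and the `v1-dev` example are my own illustration, not something from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    MAINLINE = "mainline"
    DEVELOPMENT = "development"
    MAINTENANCE = "maintenance"
    ACCUMULATION = "accumulation"
    PACKAGING = "packaging"


@dataclass(frozen=True)
class Branch:
    """A branch together with the roles it was created to fill."""
    name: str
    roles: frozenset


def role_changed(branch, wanted_roles):
    """True when the intended use differs from the roles the branch was
    created with, i.e. the point at which you should branch instead."""
    return frozenset(wanted_roles) != branch.roles


# Hypothetical example mirroring the question: v1-dev starts as a
# development branch and is later reused for maintenance.
v1_dev = Branch("v1-dev", frozenset({Role.DEVELOPMENT}))
print(role_changed(v1_dev, {Role.DEVELOPMENT}))   # still development work
print(role_changed(v1_dev, {Role.MAINTENANCE}))   # policy changed: branch
```

The point is not the code itself but that the roles are written down once, at branch creation, and never edited afterwards.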

In your example, you've got a cascading mainline (this is a problem: it makes merges more difficult. What happens if you want to merge a fix for v1 into v2 and v3?) and a dev branch that becomes a maintenance branch (a change of policy, which is also a problem).

OK, you say, that's great, but this was written for Perforce, which is a centralized VCS, and I'm using a DVCS.

Let's look at the git-flow model and see how it applies.

The master branch (blue) is the release branch, used for tagging. It is not the mainline. The mainline is actually the develop branch (yellow). The release branches (green) fill the packaging role. Low-risk development happens in the mainline; high-risk development happens in the feature branches (pink). In this model, accumulation is done in the develop branch. Maintenance is handled as 'hot fixes', which are shown in red.
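The same mapping can be written down as a lookup from branch name to role. This is only a sketch: the role labels follow the description above, while the `release/`, `feature/` and `hotfix/` prefixes are assumed naming conventions (they are common git-flow defaults, but your repository may use others), and `roles_for` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Branch name pattern -> roles it fills in the git-flow model.
# The prefixes are assumed naming conventions, not a requirement.
GIT_FLOW_ROLES = [
    ("master", ["release tagging"]),
    ("develop", ["mainline", "low-risk development", "accumulation"]),
    ("release/*", ["packaging"]),
    ("feature/*", ["high-risk development"]),
    ("hotfix/*", ["maintenance"]),
]


def roles_for(branch):
    """Return the roles of the first pattern matching the branch name,
    or an empty list if the name fits no known pattern."""
    for pattern, roles in GIT_FLOW_ROLES:
        if fnmatchcase(branch, pattern):
            return roles
    return []


for name in ["master", "develop", "release/2.0", "feature/login", "hotfix/1.0.1"]:
    print(name, "->", ", ".join(roles_for(name)))
```

A branch whose name matches none of the patterns has no declared role, which is a useful signal that someone is about to invent one ad hoc.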

While the role policies aren't an exact match (each product has its own slightly different lifecycle), they are a match.

Doing this should simplify your branching policy and make it easier for everyone involved.