SVN Tagging – Tags or Specify Revision?

svn tagging

At my company, we have your typical svn structure. Each project has branches, tags and trunk.

repo
-Project A
 * trunk
 * branches
 * tags
-Project B
 * trunk
 * branches
 * tags
-Project C
 * trunk
 * branches
 * tags
-Project D
 * trunk
 * branches
 * tags
-Application 1
 * trunk
 * branches
 * tags

So Project A is core functionality that pretty much all of our other projects use. D may depend on C and B as well, and Application 1 uses them all. Note that we have many more libraries and applications than I've illustrated above, but it's all done the same way. The assemblies generated by building Project A are included as externals in the library or application projects that need them. We've got quite a few libraries like this.

What we've been doing after a release is tagging the RC branch we created for each project used by the application, so that each RC branch becomes a tag.

Obviously this is a fair amount of work, as we ensure that tags only point to other tags: you start by creating the tag for Project A, make the other projects point to that, and so on. The idea is that we want to branch from the tags should a patch release be required before our next feature release.
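Since tags may only point to other tags, the tagging order follows the dependency graph. Here is a minimal sketch of deriving that order, assuming Python 3.9+ for `graphlib`. The dependency map is a guess based on the description above (B and C are assumed to depend only on A), not our real configuration:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: project -> projects it pulls in as externals.
# B and C depending only on A is an assumption made for illustration.
DEPENDS_ON = {
    "Project A": set(),
    "Project B": {"Project A"},
    "Project C": {"Project A"},
    "Project D": {"Project B", "Project C"},
    "Application 1": {"Project A", "Project B", "Project C", "Project D"},
}


def tagging_order(depends_on):
    """Return the projects ordered so each comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


order = tagging_order(DEPENDS_ON)
print(order)  # Project A comes first, Application 1 last
```

The relative order of Project B and Project C is not fixed, because neither depends on the other.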

Given the amount of effort that can be involved getting this all together, one proposal is to tag only the application's RC and for all the externals point to a specific revision. If we need to create a branch to fix one of the libraries, branch from that revision and do the work then.
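To make the proposal concrete, here is a rough sketch of what the pinned externals could look like. It assumes the Subversion 1.5+ externals syntax (`URL@PEG LOCALPATH`, with `^/` meaning the repository root); the paths, directory names and revision numbers are made up for illustration:

```python
def pinned_external(repo_relative_url, revision, local_dir):
    """Format one svn:externals line pinned to a peg revision."""
    if revision <= 0:
        raise ValueError("revision must be a positive integer")
    return f"{repo_relative_url}@{revision} {local_dir}"


# Hypothetical paths and revision numbers, not taken from our repository.
pins = [
    ("^/ProjectA/trunk", 1234, "libs/ProjectA"),
    ("^/ProjectB/trunk", 1240, "libs/ProjectB"),
]

# One externals definition per line, ready to be written to a file.
externals = "\n".join(pinned_external(*pin) for pin in pins)
print(externals)
```

The resulting text could be saved to a file and applied to the application's RC with something like `svn propset svn:externals -F externals.txt .`, where `externals.txt` is again just a placeholder name.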

I'm not against this, as I would like to not have to spend the time getting all the tags for all the projects set up, but I wanted to ask: can anyone think of any pitfalls of this approach? Is there any reason to create these tags over just pinning to a specific revision?

Best Answer

It depends on how cleanly all these elements are separated.

To clarify exactly how I see things, I'll try to describe this in general terms.

As you highlighted, you can deal with this in two ways:

A. Single Product Approach
The entire space is one project; the pieces may be fairly independent, but the repo doesn't recognize that.

In this case, a few things can be observed:

  1. The same release number of any RC is applied to all object products. So if the repo moves from 1.2 to 1.3 via 1.2.1 and 1.2.3, there is a corresponding revision of each element, i.e. project1.1.2.1.so as well as project2.1.2.1.so.

  2. The tags of all projects and the application carry the same version number.

  3. Between two versions of the single repo, the tag keeps advancing with overall progress, whether or not projectA or projectB itself has any changes.

  4. In this case, when you take a branch from any point in trunk, the entire trunk must be mirrored.

  5. Assuming that testing has been good, all projects and the application with the same version/tag are always compatible with each other.

B. Independent Product Approach
Here each project is an independent product:

  1. Each project must have its own repo (or at least its own trunk).

  2. Tagging is done per project. Each part of the repo produces its own end product, such as projectA.1.2.1.so or projectB.1.5.lib.

  3. In this case, when you branch, the branch is always only for the given project, and any tags before or after the merge apply only to that project.

  4. Each project can have its own number of tags, depending on the actual number of bugs within it.

  5. Since each project advances at a different pace, you always need to find out which set of tags works with which tags of the other projects. For example, projectA-1.2 works with projectB-1.2 all the way up to projectB-2.3, but breaks with projectB-2.5 or higher. Such nested dependencies are what you now need to manage explicitly through integration testing.
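The bookkeeping in point 5 can be sketched as a lookup of tested version ranges. This is only an illustration: the `VERIFIED` table and the function names are hypothetical, and the single entry mirrors the projectA-1.2 / projectB example above.

```python
def parse_version(text):
    """Turn '1.2' into (1, 2) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical integration-test record: for a given (project, version), the
# inclusive range of each dependency's versions that was verified to work.
VERIFIED = {
    ("projectA", "1.2"): {"projectB": ("1.2", "2.3")},
}


def is_verified(project, version, other, other_version):
    """Return True only if the pairing falls inside a tested range."""
    ranges = VERIFIED.get((project, version), {})
    if other not in ranges:
        return False
    low, high = ranges[other]
    return parse_version(low) <= parse_version(other_version) <= parse_version(high)


print(is_verified("projectA", "1.2", "projectB", "2.0"))  # True, inside the range
print(is_verified("projectA", "1.2", "projectB", "2.5"))  # False, outside the range
```

Anything outside a recorded range is treated as unverified rather than known-broken, which is the safer default when the table is incomplete.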

So what are the pros and cons?

  1. In simple terms, the single product approach is clumsy because all elements are copied when a branch is created. If someone changes something in another person's module before committing, it turns into a conflict, and often only much later, when the other person decides to commit or merge. This problem grows as the number of people on the team grows.

  2. One of the primary reasons the single product approach helps is its simplicity. Ideally, if you have a single authority for release management, every full release is tested to see not only that the pieces work individually but also that all versions work together. Integration testing is always more painful when independent products grow and you need to keep tabs on which versions of each product work together; that is a job in itself.

  3. Support cost increases in the multiple product situation. Suppose projectB-1.2 and projectB-1.7 depend on different compatible versions of projectA, namely projectA-0.8 and projectA-0.9. If projectA has a bug that affects all its clients (irrespective of compatibility), it has to be fixed twice, in both projectA-0.8 and projectA-0.9. This doubles the bug-support effort for projectA.

  4. Further, suppose something in projectA-0.8 is found to be a bug (say, one already solved in projectA-0.9), but you realize that fixing it in 0.8 will help some of its clients and break others. In this case projectA-0.8 needs to be branched again so that both groups are supported.

  5. Last but most important: when people work in independent systems, the communication between them decreases. They base their work on their own individual tests. Later, when the application folks need to put all the modules together, they have to spend effort bringing the pieces together and they discover a lot of integration issues. Further, if responsibility for bugs is not assigned properly based on root cause, the system will tend to accumulate tech debt over time.

In a nutshell, the more you break things up into projects/modules, the more nested-dependency issues arise.

However, this complexity is worth it if the repo is very large. In a single large repo, people may end up stuck dealing with conflicts or merges every day.

One practical solution
At our place this has been a long-running debate among all of us.

A simple approach we try to follow is this. Initially we all start with a single product approach. As APIs and certain key objects become fairly stable, we slowly move them out to another repo so that they become a black box for everyone. This approach balances the initial need to stay in close communication and integration with the longer-term need to keep the repo manageable.

Sorry, did I write too much?

NOTE: From your question, I felt that your current practice and the proposed one are both slightly different from what I described; however, I have used the above scenarios since they are well-outlined methods and so easier to follow.