I've been dealing with the problem of scaling CI at my company while trying to figure out which approach to take for CI across multiple branches. There is a similar question on Stack Overflow, Multiple feature branches and continuous integration. I've started a new one because I'd like more discussion and want to include some analysis in the question.
So far I've found two main approaches I can take (there may be others):
- Multiple sets of jobs (talking about Jenkins/Hudson here), one per branch
  - Write tooling to manage the extra jobs
    - Create/modify/delete jobs in bulk
    - Custom settings for each job per branch (SCM URL, duplicated dependency-management repos)
    - Some examples of people tackling this problem with shell tools, Ant scripts and the Jenkins CLI. See:
      - http://jenkins.361315.n4.nabble.com/Multiple-branches-best-practice-td2306578.html
      - http://jenkins.361315.n4.nabble.com/Is-it-possible-to-handle-multiple-branches-where-some-jobs-should-run-on-each-one-without-duplicatin-td954729.html
      - http://jenkins.361315.n4.nabble.com/Parallel-development-with-branches-td1013013.html
      - Configure or Create hudson job automatically
  - Will cause more load on your CI cluster
    - Feedback cycle for devs slows down (if the infrastructure cannot handle the new load)
- Multiple sets of jobs for just two branches (dev & stable)
  - Manage the two sets manually (if you change the config of a job, be sure to change it in the other branch too)
    - A PITA, but at least there are only a few to manage
  - Other branches won't get a full test suite run before they get pushed to dev
    - Unsatisfied devs. Why should a dev care about CI scaling problems? They have a simple request: when I branch, I would like to test my code. Simple.
So it seems that if I want to provide devs with CI for their own custom branches, I need special tooling for Jenkins (API, shell scripts or something similar) and have to handle scaling. Or I can tell them to merge more often to dev and live without CI on custom branches. Which one would you choose, or are there other options?
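To give a feel for what the "special tooling" in the first approach could look like, here is a minimal Python sketch. It assumes job configs can be generated from a single template per job, with only the branch and SCM URL varying. The template, the `__` naming scheme and the function names are all hypothetical, and a real Jenkins `config.xml` has many more elements than shown. The sketch only renders configs and works out which generated jobs are stale; actually pushing them to the server is left out.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical, heavily trimmed job template (an assumption for illustration,
# not a complete or valid Jenkins config.xml).
JOB_TEMPLATE = Template("""<project>
  <description>$job for branch $branch</description>
  <scm>
    <remote>$scm_url</remote>
    <branch>$branch</branch>
  </scm>
</project>""")


def job_name(base_job, branch):
    """Build a per-branch job name: 'unit-tests' + 'feature/login' -> 'unit-tests__feature-login'."""
    return f"{base_job}__{branch.replace('/', '-')}"


def render_jobs(base_jobs, branches, scm_url):
    """Return {job_name: config_xml} for every (job, branch) pair."""
    configs = {}
    for branch in branches:
        for base in base_jobs:
            configs[job_name(base, branch)] = JOB_TEMPLATE.substitute(
                job=escape(base),
                branch=escape(branch),
                scm_url=escape(scm_url),
            )
    return configs


def stale_jobs(existing_job_names, base_jobs, live_branches):
    """Generated jobs whose branch no longer exists (candidates for deletion).

    Jobs without the '__' separator are treated as hand-made and left alone.
    """
    wanted = {job_name(b, br) for b in base_jobs for br in live_branches}
    return sorted(n for n in existing_job_names if "__" in n and n not in wanted)
```

Each rendered config could then be fed to whatever mechanism you already use to create or update jobs, and the stale list to whatever deletes them. Note that this only solves the job management half of the problem; the extra load on the CI cluster is still there.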
Best Answer
When you talk about scaling CI, you're really talking about scaling the use of your CI server to handle all your feature branches along with your mainline. Initially this looks like a good approach, as the developers in a branch get all the advantages of the automated testing that the CI jobs include. However, you run into problems managing the CI server jobs (as you have discovered) and, more importantly, you aren't really doing CI. Yes, you are using a CI server, but you aren't continuously integrating the code from all of your developers.
Performing real CI means that all of your developers are committing regularly to the mainline. Easy to say, but the hard part is doing it without breaking your application. I highly recommend you look at Continuous Delivery, especially the Keeping Your Application Releasable section in Chapter 13: Managing Components and Dependencies. The main points are:
They are pretty self-explanatory, except branch by abstraction. This is just a fancy term for:
The following paragraph from the Branches, Streams, and Continuous Integration section in Chapter 14: Advanced Version Control summarises the impacts.
It takes quite a mind shift to give up feature branches, and you will always get resistance. In my experience this resistance is based on developers not feeling safe committing code to the mainline, and this is a reasonable concern. This in turn usually stems from a lack of knowledge, confidence or experience with the techniques listed above, and possibly from a lack of confidence in your automated tests. The former can be solved with training and developer support. The latter is a far more difficult problem to deal with. However, branching doesn't provide any real extra safety; it just defers the problem until the developers feel confident enough with their code.
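To make branch by abstraction a little more concrete, here is a minimal Python sketch of the general idea: callers depend on an abstraction, the old implementation stays on the production path, and the new implementation is committed to the mainline alongside it but is only used once a switch is flipped. The tax example, class names and `use_new` flag are all hypothetical, chosen purely for illustration.

```python
from typing import Protocol


class TaxCalculator(Protocol):
    """The abstraction the rest of the code depends on."""

    def tax_cents(self, amount_cents: int) -> int: ...


class LegacyTaxCalculator:
    """Existing behaviour: 20% tax, truncated to whole cents."""

    def tax_cents(self, amount_cents: int) -> int:
        return amount_cents * 20 // 100


class NewTaxCalculator:
    """Work in progress: lives on the mainline, but is off the production
    path until the switch below is turned on. Rounds half up instead of
    truncating."""

    def tax_cents(self, amount_cents: int) -> int:
        return (amount_cents * 20 + 50) // 100


def make_tax_calculator(use_new: bool) -> TaxCalculator:
    # The only place that knows which implementation is live.
    return NewTaxCalculator() if use_new else LegacyTaxCalculator()


def invoice_total_cents(amount_cents: int, calculator: TaxCalculator) -> int:
    # Calling code only sees the abstraction, so it doesn't change when
    # the implementation is swapped.
    return amount_cents + calculator.tax_cents(amount_cents)
```

With `use_new=False` everything behaves as before, so the half-finished `NewTaxCalculator` can be committed and run through the same CI jobs as everyone else's code without a feature branch. Once it is ready, the switch is flipped and the legacy class can be deleted.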