Continuous Integration – How to Avoid CI-Driven Development

c++, continuous integration, open source

I'm working on a very large research-led open-source project with a number of other regular contributors. Because the project is now quite big, a consortium (composed of two full-time employees and a few members) is in charge of maintaining the project, the continuous integration (CI), etc. However, they don't have time to integrate external contributions.

The project is composed of a "core" framework of about half a million lines of code, a number of "plugins" that are maintained by the consortium, and several external plugins, most of which we aren't even aware of.

Currently, our CI builds the core, and the maintained plugins.

One of the big issues we face is that most contributors (especially the occasional ones) don't build 90% of the maintained plugins. So when they propose refactoring changes in the core (which happens quite regularly these days), they only check that the code compiles on their own machine before opening a pull request on GitHub.

The code works, they're happy, and then the CI finishes building and the problems start: compilation failed in a consortium-maintained plugin that the contributor did not build on their machine.

That plugin might depend on third-party libraries, such as CUDA, and the contributor does not want to, does not know how to, or for hardware reasons simply cannot compile the broken plugin.

So then, one of three things happens:

  • The PR stays forever in the limbo of never-to-be-merged PRs.

  • The contributor greps for the renamed variable in the source of the broken plugin, changes the code, pushes to their branch, waits for the CI to finish compiling, usually gets more errors, and repeats the process until the CI is happy.

  • One of the two already-overbooked permanent staff in the consortium lends a hand and tries to fix the PR on their machine.

None of those options is viable, but we just don't know how to do it differently. Have you ever been confronted with a similar situation in your projects? If so, how did you handle it? Is there a solution I'm not seeing here?

Best Answer

CI-driven development is fine! It is a lot better than not running tests and merging broken code! However, there are a few things you can do to make this easier on everyone involved:

  • Set expectations: Have contribution documentation that explains that CI often finds additional issues, and that these will have to be fixed before a merge. Perhaps explain that smallish, local changes are more likely to work well – so splitting a large change into multiple PRs can be sensible.

  • Encourage local testing: Make it easy to set up a test environment for your system. A script that verifies that all dependencies have been installed? A Docker container that's ready to go? A virtual machine image? Does your test runner have mechanisms that allow more important tests to be prioritized?

  • Explain how to use CI for themselves: Part of the frustration is that this feedback only comes after submitting a PR. If the contributors set up CI for their own repositories, they'll get earlier feedback and produce fewer CI notifications for other people.

  • Resolve all PRs, either way: If something cannot be merged because it is broken, and if there's no progress towards getting the problems fixed, just close it. These abandoned open PRs just clutter up everything, and any feedback is better than just ignoring the issue. It is possible to phrase this very nicely, and make it clear that of course you'd be happy to merge when the problems are fixed. (see also: The Art of Closing by Jessie Frazelle, Best Practices for Maintainers: Learning to say no)

    Also consider making these abandoned PRs discoverable so that someone else can pick them up. This may even be a good task for new contributors, if the remaining issues are more mechanical and don't need deep familiarity with the system.

From a long-term perspective, the fact that changes so often break seemingly unrelated functionality could mean that your current design is a bit problematic. For example, do the plugin interfaces properly encapsulate the internals of your core? C++ makes it easy to accidentally leak implementation details, but it also makes it possible to create strong abstractions that are very difficult to misuse. You can't change this overnight, but you can shepherd the long-term evolution of the software towards a less fragile architecture.
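To make the encapsulation point a bit more concrete, here is a minimal sketch of one way a core class could hide its internals from plugins, using the pimpl idiom plus a deprecated forwarding shim for a renamed method. Everything in it is hypothetical: `Mesh`, `vertex_count`, `numVertices`, and `plugin_count_vertices` are made-up names for illustration, not anything from your project, and the file split is only indicated by comments.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// --- core/mesh.h (hypothetical): the only header plugins include ---
class Mesh {
public:
    Mesh();
    ~Mesh();
    Mesh(Mesh&&) noexcept;
    Mesh& operator=(Mesh&&) noexcept;

    void add_vertex(double x, double y, double z);
    std::size_t vertex_count() const;

    // Assumed scenario: the method used to be called numVertices().
    // Keeping the old name as a forwarding shim lets plugins that were
    // not updated keep compiling, with a warning, for a release or two.
    [[deprecated("use vertex_count() instead")]]
    std::size_t numVertices() const { return vertex_count(); }

private:
    struct Impl;                  // defined only inside the core
    std::unique_ptr<Impl> impl_;  // plugins never see the data layout
};

// --- core/mesh.cpp (hypothetical): free to change without touching plugins ---
struct Mesh::Impl {
    std::vector<double> coords;   // flat x, y, z storage
};

Mesh::Mesh() : impl_(std::make_unique<Impl>()) {}
Mesh::~Mesh() = default;
Mesh::Mesh(Mesh&&) noexcept = default;
Mesh& Mesh::operator=(Mesh&&) noexcept = default;

void Mesh::add_vertex(double x, double y, double z) {
    impl_->coords.insert(impl_->coords.end(), {x, y, z});
}

std::size_t Mesh::vertex_count() const {
    return impl_->coords.size() / 3;
}

// --- some_plugin.cpp (hypothetical): depends on the public interface only ---
std::size_t plugin_count_vertices(const Mesh& mesh) {
    return mesh.vertex_count();
}
```

With a layout like this, a refactoring that renames or restructures the members of `Mesh::Impl` cannot break a plugin, because plugins can only reach the public methods. A rename of a public method shows up in plugin builds as a deprecation warning rather than a hard error, which gives occasional contributors a way to land the core change first and lets the plugin updates follow separately. Whether this is worth the extra indirection depends on how hot the affected code paths are.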
