REST – How to manage the underlying codebase for a versioned API

api-versioning, rest, versioning

I've been reading up on versioning strategies for REST APIs, and one thing none of them appears to address is how you manage the underlying codebase.

Let's say we're making a bunch of breaking changes to an API – for example, changing our Customer resource so that it returns separate forename and surname fields instead of a single name field. (For this example, I'll use the URL versioning solution since it makes the concepts easy to follow, but the question applies equally to content negotiation or custom HTTP headers.)

We now have an endpoint at http://api.mycompany.com/v1/customers/{id}, and another, incompatible endpoint at http://api.mycompany.com/v2/customers/{id}. We are still releasing bugfixes and security updates to the v1 API, but all new feature development now focuses on v2. How do we write, test and deploy changes to our API server? I can see at least two solutions:

  • Use a source control branch/tag for the v1 codebase. v1 and v2 are developed and deployed independently, with revision control merges used as necessary to apply the same bugfix to both versions – similar to how you'd manage codebases for native apps when developing a major new version whilst still supporting the previous version.

  • Make the codebase itself aware of the API versions, so you end up with a single codebase that includes both the v1 customer representation and the v2 customer representation. Treat versioning as part of your solution architecture instead of a deployment issue – probably using some combination of namespaces and routing to make sure requests are handled by the correct version.
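
To make the second option concrete, here is a rough sketch of what a version-aware codebase might look like. Everything in it is an assumption for illustration: the `CUSTOMERS` store, the handler names, the `ROUTES` table and the `dispatch` helper are hypothetical, and a real service would use its web framework's router rather than a hand-rolled dictionary.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stored record; the field names are assumptions for illustration.
CUSTOMERS = {42: {"forename": "Ada", "surname": "Lovelace"}}


def get_customer_v1(customer_id: int) -> dict:
    """v1 representation: a single combined name field."""
    record = CUSTOMERS[customer_id]
    return {"id": customer_id, "name": f"{record['forename']} {record['surname']}"}


def get_customer_v2(customer_id: int) -> dict:
    """v2 representation: separate forename and surname fields."""
    record = CUSTOMERS[customer_id]
    return {
        "id": customer_id,
        "forename": record["forename"],
        "surname": record["surname"],
    }


# Routing table keyed by (version, resource). In a larger codebase each handler
# would probably live in its own versioned package (for example api.v1, api.v2).
ROUTES: Dict[Tuple[str, str], Callable[[int], dict]] = {
    ("v1", "customers"): get_customer_v1,
    ("v2", "customers"): get_customer_v2,
}


def dispatch(path: str) -> dict:
    """Resolve a path such as '/v1/customers/42' to the matching handler."""
    version, resource, raw_id = path.strip("/").split("/")
    handler = ROUTES[(version, resource)]
    return handler(int(raw_id))
```

Here `dispatch("/v1/customers/42")` and `dispatch("/v2/customers/42")` read the same stored record but return different representations, which is the coexistence this option relies on.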

The obvious advantage of the branch model is that it's trivial to delete old API versions – just stop deploying the appropriate branch/tag – but if you're running several versions, you could end up with a really convoluted branch structure and deployment pipeline. The "unified codebase" model avoids this problem, but (I think?) would make it much harder to remove deprecated resources and endpoints from the codebase when they're no longer required. I know this is probably subjective since there's unlikely to be a simple correct answer, but I'm curious to understand how organisations who maintain complex APIs across multiple versions are solving this problem.

Best Answer

I've used both of the strategies you mention. Of the two, I favor the second approach because it is simpler, in use cases that support it. That is, if the versioning needs are simple, then go with the simpler software design:

  • A small number of changes, low-complexity changes, or an infrequent change schedule
  • Changes that are largely orthogonal to the rest of the codebase: the public API can exist peacefully with the rest of the stack without requiring "excessive" (for whatever definition of that term you choose to adopt) branching in code

I did not find it overly difficult to remove deprecated versions using this model:

  • Good test coverage meant that ripping out a retired API and its backing code caused no (well, minimal) regressions
  • A good naming strategy (API-versioned package names or, somewhat uglier, API versions in method names) made it easy to locate the relevant code
  • That said, cross-cutting concerns are harder; modifications to core backend systems to support multiple APIs have to be weighed very carefully. At some point, the cost of versioning the backend (see the comment on "excessive" above) outweighs the benefit of a single codebase.

The first approach is certainly simpler from the standpoint of reducing conflict between co-existing versions, but the overhead of maintaining separate systems tended to outweigh the benefit of reducing version conflict. That said, it was dead simple to stand up a new public API stack and start iterating on a separate API branch. Of course, generational loss set in almost immediately, and the branches turned into a mess of merges, merge conflict resolutions, and other such fun.

A third approach is at the architectural layer: adopt a variant of the Facade pattern, and abstract your APIs into public-facing, versioned layers that each talk to the appropriate Facade instance, which in turn talks to the backend via its own set of APIs. Your Facade (I used an Adapter in my previous project) becomes its own package, self-contained and testable, and allows you to migrate frontend APIs independently of the backend, and of each other.
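
As a minimal sketch of that layering, assuming an Adapter per API version: the class names below (`Customer`, `CustomerBackend`, `CustomerAdapterV1`, `CustomerAdapterV2`, `PublicApi`) are hypothetical and not taken from any real project, and the in-memory backend stands in for whatever internal API your backend actually exposes.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Customer:
    """Backend model (hypothetical); it knows nothing about API versions."""
    id: int
    forename: str
    surname: str


class CustomerBackend:
    """Stand-in for the real backend service and its internal API."""

    def __init__(self) -> None:
        self._rows = {7: Customer(7, "Grace", "Hopper")}

    def find_customer(self, customer_id: int) -> Customer:
        return self._rows[customer_id]


class CustomerAdapter(Protocol):
    """What every versioned adapter must provide."""

    def render(self, customer: Customer) -> dict: ...


class CustomerAdapterV1:
    """Maps the backend model to the v1 shape (single name field)."""

    def render(self, customer: Customer) -> dict:
        return {"id": customer.id, "name": f"{customer.forename} {customer.surname}"}


class CustomerAdapterV2:
    """Maps the backend model to the v2 shape (separate name fields)."""

    def render(self, customer: Customer) -> dict:
        return {
            "id": customer.id,
            "forename": customer.forename,
            "surname": customer.surname,
        }


class PublicApi:
    """Thin versioned layer: one adapter in front of the shared backend."""

    def __init__(self, backend: CustomerBackend, adapter: CustomerAdapter) -> None:
        self._backend = backend
        self._adapter = adapter

    def get_customer(self, customer_id: int) -> dict:
        return self._adapter.render(self._backend.find_customer(customer_id))


backend = CustomerBackend()
api_v1 = PublicApi(backend, CustomerAdapterV1())
api_v2 = PublicApi(backend, CustomerAdapterV2())
```

In this arrangement, retiring v1 would amount to deleting `CustomerAdapterV1` and its wiring; the backend and the v2 adapter are untouched, and each adapter can be unit-tested against a plain `Customer` value.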

This will work if your API versions tend to expose the same kinds of resources, but with different structural representations, as in your fullname/forename/surname example. It gets slightly harder if they start relying on different backend computations, as in, "My backend service has returned incorrectly calculated compound interest that has been exposed in public API v1. Our customers have already patched this incorrect behavior. Therefore, I cannot update that computation in the backend and have it apply until v2. Therefore we now need to fork our interest calculation code." Luckily, those tend to be infrequent: practically speaking, consumers of RESTful APIs favor accurate resource representations over bug-for-bug backwards compatibility, even amongst non-breaking changes on a theoretically idempotent GETted resource.
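
When a computation does have to fork, one way to keep it contained is to select the calculation by API version. The sketch below is only illustrative: the nature of the v1 bug (simple interest returned where compound interest was intended) is an invented assumption, as are the function names and the `INTEREST_BY_VERSION` table.

```python
from decimal import Decimal
from typing import Callable, Dict

InterestFn = Callable[[Decimal, Decimal, int], Decimal]


def interest_legacy(principal: Decimal, rate: Decimal, years: int) -> Decimal:
    """Assumed v1 behaviour, kept bug-for-bug: interest is never compounded."""
    return principal * rate * years


def interest_corrected(principal: Decimal, rate: Decimal, years: int) -> Decimal:
    """Assumed v2 behaviour: interest compounded once per year."""
    return principal * ((1 + rate) ** years - 1)


# The fork lives in one table, so retiring v1 means deleting one entry
# and one function rather than hunting for version checks in the backend.
INTEREST_BY_VERSION: Dict[str, InterestFn] = {
    "v1": interest_legacy,
    "v2": interest_corrected,
}


def interest_for(version: str, principal: Decimal, rate: Decimal, years: int) -> Decimal:
    """Look up the calculation pinned to the given API version and apply it."""
    return INTEREST_BY_VERSION[version](principal, rate, years)
```

With a principal of 1000, a rate of 0.05 and 2 years, the v1 path yields 100 while the v2 path yields 102.5, so clients that already compensated for the old figure keep seeing it until they move to v2.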

I'll be interested to hear your eventual decision.