Deployment Practices – Risks of Pushing Untested Code to Production

Tags: deployment, performance, testing

Firstly, let me clearly state that I am completely against this. I have real issues with unnecessary, obsolete, untested, shitty code, especially when it ends up on a production system. My background is in build and release management/DevOps, and I am now working as a Technical Consultant supporting an outsourced development team.

The scenario is this:

We have an MI system that publishes data from approximately 200 source tables into 200 destination tables by way of a single 'dynamic' stored procedure that generates and executes MERGE statements (one per table). The content of these MERGE statements is driven by various configuration tables (source-to-destination mappings at the column level).
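For anyone trying to picture it, here is a much-simplified sketch of what a configuration-driven MERGE generator of this kind might look like; the table, column, and procedure names are invented for illustration and are not the actual system:

```sql
-- Hypothetical, much-simplified sketch of a configuration-driven MERGE generator.
-- The real system handles ~200 table pairs with column-level mappings in config tables.
CREATE PROCEDURE dbo.usp_PublishTable
    @MappingId INT
AS
BEGIN
    DECLARE @SourceTable SYSNAME,
            @TargetTable SYSNAME,
            @KeyColumn   SYSNAME,
            @Sql         NVARCHAR(MAX);

    -- Look up the source/destination pair from a (hypothetical) configuration table.
    SELECT @SourceTable = SourceTable,
           @TargetTable = TargetTable,
           @KeyColumn   = KeyColumn
    FROM   dbo.PublishMapping
    WHERE  MappingId = @MappingId;

    -- Build and execute a MERGE for this table pair (reduced to a single mapped column here).
    SET @Sql = N'MERGE ' + QUOTENAME(@TargetTable) + N' AS tgt
USING ' + QUOTENAME(@SourceTable) + N' AS src
    ON tgt.' + QUOTENAME(@KeyColumn) + N' = src.' + QUOTENAME(@KeyColumn) + N'
WHEN MATCHED THEN
    UPDATE SET tgt.SomeColumn = src.SomeColumn
WHEN NOT MATCHED THEN
    INSERT (' + QUOTENAME(@KeyColumn) + N', SomeColumn)
    VALUES (src.' + QUOTENAME(@KeyColumn) + N', src.SomeColumn);';

    EXEC sys.sp_executesql @Sql;
END;
```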

While I am not a complete fan of this way of doing things, it has been in place since release 1 (years now), it works, it is well tested, and it is surprisingly well written given what it does and how it does it.

There is a separate project that has a goal of reviewing the system as a whole and making performance improvements where possible. Absolutely no issue with the goal here.

As part of this project, there has been a discussion that perhaps the dynamic MERGE statements might be more performant as static stored procedures; perhaps there could be improvements from cached execution plans, and so on. I am not a DBA, but the general idea sounds feasible. We could do an in-depth analysis of each table and tune where possible (for example, add additional indexes).
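Purely for illustration, the static equivalent for a single (hypothetical) table pair might look something like the following; the supposed gain is a compiled, cached plan per procedure and the freedom to tune each table individually:

```sql
-- Hypothetical static version for one table pair; names are invented.
-- Each such procedure gets its own cached execution plan and can be tuned in isolation.
CREATE PROCEDURE dbo.usp_PublishCustomer
AS
BEGIN
    MERGE dbo.DimCustomer AS tgt
    USING dbo.StgCustomer AS src
        ON tgt.CustomerKey = src.CustomerKey
    WHEN MATCHED THEN
        UPDATE SET tgt.CustomerName = src.CustomerName
    WHEN NOT MATCHED THEN
        INSERT (CustomerKey, CustomerName)
        VALUES (src.CustomerKey, src.CustomerName);
END;
GO

-- Per-table tuning becomes possible, e.g. a supporting index on the join key:
CREATE INDEX IX_StgCustomer_CustomerKey ON dbo.StgCustomer (CustomerKey);
```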

While converting some of the dynamic elements might make sense, it may not be suited to all of them, and creating the artefacts to convert everything to static could be overhead that is not necessary. So, to support this, we discussed an extension to the existing configuration framework that allows a per-table switch between dynamic and static. All good here so far.
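Again purely as a sketch, with invented names, the proposed switch could be little more than a flag on the mapping configuration that a dispatching procedure checks before choosing a path:

```sql
-- Hypothetical extension to the configuration framework: a per-table switch
-- between the existing dynamic path and a new static procedure.
ALTER TABLE dbo.PublishMapping
    ADD UseStaticProc  BIT NOT NULL CONSTRAINT DF_PublishMapping_UseStaticProc DEFAULT (0),
        StaticProcName SYSNAME NULL;
GO

CREATE PROCEDURE dbo.usp_PublishDispatch
    @MappingId INT
AS
BEGIN
    DECLARE @UseStatic BIT, @ProcName SYSNAME;

    SELECT @UseStatic = UseStaticProc,
           @ProcName  = StaticProcName
    FROM   dbo.PublishMapping
    WHERE  MappingId = @MappingId;

    IF @UseStatic = 1 AND @ProcName IS NOT NULL
        EXEC @ProcName;                        -- new static path (e.g. dbo.usp_PublishCustomer)
    ELSE
        EXEC dbo.usp_PublishTable @MappingId;  -- existing dynamic MERGE path
END;
```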

We then got onto the subject of testing and validation of the performance at scale. Generally the approach to this was fine, but there was going to be a delay in getting a suitable environment with a suitable set of data (we are talking billions of rows/terabytes of data).

I have recently found out that the intention is to implement the extension to the configuration framework and push this live, with no real testing and no validation of the performance gains. The icing on the cake, for me, is that it will not actually be enabled.

Our deployment process is currently fairly complex, and releases to production are done in a standard manner regardless of the payload. There is a release scheduled in a few months, and these changes will go live with it; the release after that has no go-live schedule. All changes such as this are done as a DACPAC deployment, so whether there are changes or not, the process will compare the DACPAC against production for each component.

So between this release and the next, these changes will sit in production, untested but unused, while the analysis is performed. Once that is complete, the next release will update the configuration data and any supporting artefacts.

I appear to be fighting a losing battle trying to convince the third party's (technical) project manager that this is not a good approach, and have simply asked why it could not be released in the next release, once the analysis has been done (if it is still applicable by then).

I am genuinely curious as to whether I am being too subjective about this and it's not really a big issue. I have just sent another email to a wide circle of people clearly stating my concerns, and it will be interesting to see the response; I am guessing it will be along the lines of "we are doing this anyway".

Any thoughts or comments?

Thanks

Best Answer

Your exact technical solution and implementation seem somewhat niche. I did, however, work for a company where putting something live in a turned-off state was common. In our case it was a WinForms application with a flag in the database or config file that could turn the feature off and on, but I'm hoping some of my experiences can help with the problems you're seeing here.
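To give a feel for it, the flag was nothing more exotic than something like this (names invented), which the application read at runtime:

```sql
-- Rough sketch of a database-driven feature flag; table and feature names are made up.
CREATE TABLE dbo.FeatureFlag
(
    FeatureName NVARCHAR(100) NOT NULL PRIMARY KEY,
    IsEnabled   BIT           NOT NULL DEFAULT (0)
);

-- Ship the feature dark...
INSERT INTO dbo.FeatureFlag (FeatureName, IsEnabled)
VALUES (N'NewPublishPath', 0);

-- ...and flip it on later without a redeployment.
UPDATE dbo.FeatureFlag
SET    IsEnabled = 1
WHERE  FeatureName = N'NewPublishPath';
```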

THE GAIN

Once the code is put live, even if it isn't used, it is a tick in a box and the deployment has at least been proven. This, however, is the only upside.

THE PAIN

Minor revisions break the new functionality

Once the big change goes in, sometimes even the simplest subsequent change can cause it to unravel. Because the big change isn't actually live, it often gets missed in testing or isn't considered in later designs.

YAGNI (You Aren't Gonna Need It)

You shouldn't be writing code unless you need it, and if you need it, you should deploy it. If you've delayed turning the new code on, new requirements may have come to light in the meantime, meaning the code could and should have been written better.

Technical debt

This is in some ways the worst outcome. The new code is never turned on. Bigger priorities come in and the new code is parked. It is too much of a risk to back it out, and so it remains in the deployment, clogging up the code base and hogging resources.

Performance

Even though the new feature isn't visible, related processing sometimes still takes place behind the scenes, meaning the application degrades. As far as the user is concerned this is lose-lose: there is no new functionality, and yet the system is performing more slowly.

The switchover fails

Due to minor subsequent changes or unforeseen circumstances on the live system, sometimes the switchover simply fails. The best scenario here is that you can just turn it off again, but if data has been affected you could have a major issue on your hands.

SUMMARY

As with all things of this ilk, if it comes off, it looks like sheer brilliance: the release has gone in ahead of time and a magic switch makes the new functionality just work. But it rarely happens like this, and even when it does, the occasional failures should immediately set alarm bells ringing that this is poor deployment practice.
