Blue-Green Deployment in a Data Architecture

Data Intensive Dreamer
3 min read · May 15, 2023


Photo by Gabriel Bassino on Unsplash

Blue-green deployment is a technique used in software deployment, mainly with microservices, that reduces downtime and the risk of failure by deploying new code in parallel with the old code.
Once the new code has been tested and verified in the blue environment, traffic is redirected, sometimes gradually, from the old environment (green) to the new one. This allows the updated code to be released with minimal disruption to end users. One of the benefits of blue-green deployment is the ability to roll back quickly if any issue shows up in the new code.

Imagine you have a super cool smartphone app, and you are deploying a new version of the super critical micro-service that runs on the backend.
You can deploy it in the blue env and gradually move the calls from the green env (which still runs the old code) to the blue one. You can start with some test calls, maybe coming from test accounts (this can be done with proper call tagging), or you can do the switch gradually: start with 1% of the calls and then move on to the rest, as in the sketch below.
This obviously offers enormous advantages: you can detect bugs early and, in case of problems, you can quickly roll back to the old version.
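A minimal sketch of what that gradual switch could look like at the routing layer. The TEST_ACCOUNTS set, the BLUE_TRAFFIC_PERCENT value and the route_request function are all hypothetical names, just to illustrate the idea of tag-based plus percentage-based routing:

```python
import random

# Hypothetical list of tagged test accounts that should always hit the new code
TEST_ACCOUNTS = {"qa-account-1", "qa-account-2"}

# Start by sending only 1% of real traffic to the blue environment
BLUE_TRAFFIC_PERCENT = 1.0

def route_request(request: dict) -> str:
    """Decide whether a call goes to the blue (new) or green (old) backend."""
    if request.get("account_id") in TEST_ACCOUNTS:
        return "blue"  # tagged test traffic always exercises the new version
    if random.uniform(0.0, 100.0) < BLUE_TRAFFIC_PERCENT:
        return "blue"  # a small, configurable slice of real traffic
    return "green"     # everything else stays on the old version
```

Raising BLUE_TRAFFIC_PERCENT step by step (1%, 10%, 50%, 100%) is the gradual switch, and dropping it back to 0 is the instant rollback.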

But what happens if you would like to use this technique in the data domain? Imagine you have data arriving from various sources, DBs, CRMs, etc., and you have a single point of ingestion for those data. Let's say you are using Pub/Sub with Dataflow, or maybe Kafka with Kafka Streams, or whatever.
You cannot redirect the traffic gradually, or, more precisely, you cannot do it easily. Even if you do it, it can lead to serious issues. Imagine that row-id 123 is the first record that goes through your blue Dataflow job. It will be acknowledged (committed) on Pub/Sub and will not be available to be resubmitted into the green pipeline in case of an issue. Now extend that issue to thousands of records every second, and here you have the biggest problem of your life.
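To make the acknowledgement problem concrete, here is a minimal Pub/Sub pull-subscriber sketch (the project and subscription names are hypothetical). Once a message is acked on a subscription, Pub/Sub will not redeliver it on that subscription, so the green pipeline can never reprocess record 123 if the blue pipeline acked it and then turned out to be buggy:

```python
from google.cloud import pubsub_v1

# Hypothetical project and subscription names; green and blue would share this
# single subscription if we tried to "redirect traffic" the micro-service way.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "ingestion-sub")

def callback(message):
    # ... blue pipeline processing would happen here ...
    # Once acked, the message will not be redelivered on this subscription,
    # so the green pipeline has no way to re-read it if blue misbehaved.
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()  # block and keep pulling messages
```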

What you should do here is find another solution. What I was thinking is that, in this case, the blue pipeline can, for a short period of time, just be a parallel pipeline that reads a duplicate of the data that is also going to the green pipeline (with Pub/Sub, for example, each pipeline can consume from its own subscription on the same topic). You may say that this will lead to duplicated records down the road. Totally true, and that's why you should handle it: the blue pipeline can be configured to write to parallel structures as well. This also gives you the chance to run automatic diff checks on both sets of results, as in the sketch below. Once you reach a configured level of confidence, the blue pipeline can replace the green one!
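A minimal sketch of what that automatic diff check could look like, assuming both pipelines key their output by a record id and that some hypothetical fetch step has already loaded the green results and the blue (parallel-table) results into dictionaries. The function names and the confidence threshold are illustrative, not a prescribed implementation:

```python
def diff_pipeline_outputs(green_rows: dict, blue_rows: dict) -> dict:
    """Compare green and blue pipeline outputs keyed by record id."""
    missing_in_blue = green_rows.keys() - blue_rows.keys()
    extra_in_blue = blue_rows.keys() - green_rows.keys()
    mismatched = {
        rid for rid in green_rows.keys() & blue_rows.keys()
        if green_rows[rid] != blue_rows[rid]
    }
    total = len(green_rows) or 1  # avoid division by zero on an empty window
    match_ratio = 1.0 - (len(missing_in_blue) + len(mismatched)) / total
    return {
        "missing_in_blue": missing_in_blue,
        "extra_in_blue": extra_in_blue,
        "mismatched": mismatched,
        "match_ratio": match_ratio,
    }

# Hypothetical promotion rule: replace green with blue only after the match
# ratio stays above a configured confidence level for long enough.
CONFIDENCE_THRESHOLD = 0.999

def ready_to_promote(report: dict) -> bool:
    return report["match_ratio"] >= CONFIDENCE_THRESHOLD and not report["extra_in_blue"]
```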

I think this is the only way you have to test new features on critical jobs like streaming jobs. Yes, you surely have a TEST env to run tests, and that's great! You should write unit tests and then run all kinds of tests in the TEST env. But if, like me, you do not have the same load and variety of data there... well, you only have one choice...

What do you think? Is this something you would do? Please let me know by commenting on this article! I am always looking for different points of view!



Written by Data Intensive Dreamer

A dreamer in love with data engineering and streaming data pipeline development.
