ML Model Deployment Strategies

Deploying an ML model to production comes with certain challenges, including:

Concept Drift & Data Drift: Concept drift occurs when the relationship between the input features & the target output changes, whereas data drift is a change in the distribution of the input data over time. Both can lead to a decline in the model's performance.
Software Engineering Issues: When we deploy an ML model, we need to think about factors such as the latency & throughput required, whether predictions are served in real time or in batches, how to log the results for monitoring, and how to maintain the security & privacy of the data.
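To make the data drift challenge a little more concrete, here is a minimal sketch of one way to compare a feature's training distribution with what the model sees in production. It uses the Population Stability Index (PSI), which is our choice for illustration and not something the strategies below depend on. The function name, the equal-width binning and the 0.2 alert threshold are assumptions; 0.2 is only a commonly quoted rule of thumb.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between two numeric samples, using equal-width bins over the reference range."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference sample is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training data vs. shifted production data.
training = [i / 100 for i in range(1000)]
production = [x + 5 for x in training]

psi = population_stability_index(training, production)
drift_suspected = psi > 0.2  # assumed threshold, tune it for your own data
```

A check like this only looks at the inputs, so it can hint at data drift but says nothing about concept drift. For that we still need to compare predictions against actual outcomes.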

Once we have trained a Machine Learning model, the best way to deploy it in production depends on a number of factors:

The acceptable downtime of our Machine Learning Solution.
The operational cost & the human involvement in deploying the model.
The ease with which we can roll back the model in case of a drift.
Whether there is a need to test with production traffic or not.

Now that we understand the challenges in deploying a model, let's take a look at the different deployment strategies.

Recreate Deployment

In this strategy, we scale down the existing model version before scaling up the new one. Because it takes time to shut down the current version and bring up the new one, the recreate technique is slow and causes downtime for the ML solution. On the other hand, only one version of the model runs at any time, which makes this strategy very straightforward to use. Recreate Deployment does not scale well and is best suited for small-scale applications.
We should use Recreate Deployment when we can afford downtime for the product or when the new version does not need to be backward compatible with the old one.
Example: Machine Learning applications where we run predictions in batches, since the switch can happen between two batch runs.
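As a rough illustration of why recreate causes downtime, here is a toy sketch. `PredictionService` and `recreate_deploy` are hypothetical names, and the sleep only stands in for the time a real system would need to provision and load the new version.

```python
import time


class PredictionService:
    """Hypothetical stand-in for one deployed model version."""

    def __init__(self, version):
        self.version = version
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


def recreate_deploy(current, new_version, startup_seconds=0.0):
    """Stop the current version, then start the new one. Returns (service, downtime)."""
    current.stop()                       # scale down the old version first
    downtime_started = time.monotonic()
    time.sleep(startup_seconds)          # placeholder for provisioning and model loading
    replacement = PredictionService(new_version)
    replacement.start()                  # only now can requests be served again
    downtime = time.monotonic() - downtime_started
    return replacement, downtime


old = PredictionService("v1")
old.start()
new, downtime = recreate_deploy(old, "v2", startup_seconds=0.01)
```

Between `current.stop()` and `replacement.start()` nothing is serving requests, which is exactly the window a batch job can tolerate and a real-time service usually cannot.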

Shadow Deployment

The Shadow Deployment technique is used when we already have an ML model running in production. We use it to run the new model alongside the existing one. The prediction from the existing model is returned to the application, while the response from the new model is only saved for testing and for comparing the outcomes. We need sufficient monitoring to assess performance, and we must operate more servers for the new prediction service.
We should use Shadow Deployment when we want to test the new version on actual production data without disrupting the existing users.
Example: In Machine Learning applications where we want to forecast business performance or growth, we can use shadow deployment to compare the predicted values from the new model with the actual growth.
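The shadow idea can be sketched in a few lines. This is only an illustration under simple assumptions: `ShadowRouter` is a hypothetical name, both models are plain callables, and the comparison log is kept in memory where a real setup would write to a monitoring store.

```python
class ShadowRouter:
    """Serves the live model's prediction and records the shadow model's on the side."""

    def __init__(self, live_model, shadow_model):
        self.live_model = live_model
        self.shadow_model = shadow_model
        self.comparison_log = []

    def predict(self, features):
        live_prediction = self.live_model(features)
        try:
            shadow_prediction = self.shadow_model(features)
        except Exception:
            # A failing shadow model must never affect what the user receives.
            shadow_prediction = None
        self.comparison_log.append(
            {"features": features, "live": live_prediction, "shadow": shadow_prediction}
        )
        return live_prediction  # the application only ever sees the live result


# Hypothetical models: the existing one doubles the input, the new one triples it.
router = ShadowRouter(live_model=lambda x: x * 2, shadow_model=lambda x: x * 3)
result = router.predict(5)
```

In practice the shadow call would usually run asynchronously so that it does not add latency to the live request. We keep it synchronous here to keep the sketch short.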

Gradual Ramp-Up with Monitoring

The next two deployment strategies release the new model to a certain percentage of users and then, based on performance monitoring, make it available to 100% of users.

Canary Deployment

In Canary Deployment, the old & new versions both run in production & serve the application. The major difference between canary & shadow deployment is that in shadow, the response from the new model is used only for performance monitoring, whereas here it is actually served to users. The new model version is first made available to a small subset of users and then gradually exposed to the entire user base.
We should use Canary Deployment when we want to test the new version on actual production data & at the same time evaluate how existing users respond to the model. It lets us spot problems early, before they have large consequences for the application, and with no downtime.
Example: Machine Learning applications that serve as recommendation systems, such as content or product recommendation. We can compare how different users interact with the different models & then determine which one was more effective in providing recommendations.
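One common way to pick the canary users is to hash a stable user ID into a bucket between 0 and 99 and send the lowest buckets to the new model. The sketch below assumes that approach. `CanaryRouter`, `bucket_for` and the 5% default are hypothetical choices for illustration, not part of any particular serving framework.

```python
import hashlib


def bucket_for(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


class CanaryRouter:
    """Sends a configurable percentage of users to the canary model."""

    def __init__(self, stable_model, canary_model, canary_percent=5):
        self.stable_model = stable_model
        self.canary_model = canary_model
        self.set_canary_percent(canary_percent)

    def set_canary_percent(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("canary percent must be between 0 and 100")
        self.canary_percent = percent

    def route(self, user_id):
        return "canary" if bucket_for(user_id) < self.canary_percent else "stable"

    def predict(self, user_id, features):
        model = self.canary_model if self.route(user_id) == "canary" else self.stable_model
        return model(features)


router = CanaryRouter(stable_model=lambda f: "old", canary_model=lambda f: "new")
router.set_canary_percent(25)   # ramp up once monitoring looks healthy
router.set_canary_percent(0)    # or roll back by sending everyone to the stable model
```

Because the bucket depends only on the user ID, a user who is in the canary group at 5% stays in it at 25%, so nobody flips back and forth between models during the ramp-up.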

A/B Testing Deployment

In A/B Testing Deployment, we run two or more versions of the model at the same time. We divide the users into as many groups as we have models & then decide on the best model based on its performance & the users' interactions. With A/B testing we can quickly discard the low-performing models with no downtime.
We can use A/B Testing Deployment when we have a couple of models that provide similar results. With this technique, we can determine the best model using production data & user responses.
Example: As with Canary Deployment, A/B testing can be used for recommendation systems such as content or product recommendation.
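Here is a small sketch of the bookkeeping behind an A/B test: users are assigned to a variant in a stable way and we track one success metric per variant. The class name `ABTest`, the hash-based assignment and the "conversion" metric are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict


class ABTest:
    """Assigns users to model variants and tracks a conversion rate per variant."""

    def __init__(self, variants):
        self.variants = list(variants)
        self.shown = defaultdict(int)
        self.converted = defaultdict(int)

    def assign(self, user_id):
        # Hashing keeps each user on the same variant across requests.
        digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
        return self.variants[int(digest, 16) % len(self.variants)]

    def record(self, user_id, converted):
        variant = self.assign(user_id)
        self.shown[variant] += 1
        self.converted[variant] += int(converted)

    def conversion_rates(self):
        # Variants that have not been shown to anyone yet are left out.
        return {
            v: self.converted[v] / self.shown[v]
            for v in self.variants
            if self.shown[v] > 0
        }


experiment = ABTest(["model_a", "model_b"])
experiment.record("user-1", converted=True)
experiment.record("user-2", converted=False)
rates = experiment.conversion_rates()
```

The raw rates alone are not enough to declare a winner. Before discarding a model we would normally check that the difference is statistically significant for the amount of traffic each variant received.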

Blue-Green Deployment

Blue-Green Deployment starts from the existing prediction service, the blue version. We then build a new prediction service, the green version, as a staging environment. Once performance and functionality testing in the green environment is complete, the router switches traffic from the old version to the new one. This strategy incurs additional costs because two environments have to be maintained. The benefit of a blue-green deployment is that it enables simple rollback: if something goes wrong, we can simply point the router back to the blue version.
We can use Blue-Green Deployment when the application cannot afford any downtime and backward compatibility is required.
Example: Real-time prediction systems such as fraud or anomaly detection.
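The router switch and the rollback can be sketched as follows. `BlueGreenRouter` and its method names are hypothetical, and the two "environments" are just callables here, whereas in a real deployment each would be a full serving stack behind a load balancer.

```python
class BlueGreenRouter:
    """Keeps two environments and points all traffic at exactly one of them."""

    def __init__(self, blue_model):
        self.environments = {"blue": blue_model, "green": None}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def stage(self, model):
        """Deploy a new version into the idle environment for testing."""
        self.environments[self.idle] = model

    def switch(self):
        """Send all traffic to the idle environment. Calling it again rolls back."""
        if self.environments[self.idle] is None:
            raise RuntimeError("nothing is staged in the idle environment")
        self.live = self.idle

    def predict(self, features):
        return self.environments[self.live](features)


router = BlueGreenRouter(blue_model=lambda f: "v1")
router.stage(lambda f: "v2")     # green is built and tested while blue keeps serving
router.switch()                  # cut over: all traffic now goes to green
after_cutover = router.predict(None)
router.switch()                  # rollback: blue is still running, so this is instant
after_rollback = router.predict(None)
```

The old environment is left running after the cutover, which is what makes the rollback fast and also what makes this strategy more expensive.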

Below is a brief table of differences based on the explanation of each method and the four main considerations we mentioned above for picking a deployment strategy. I hope this article helps you select the right deployment strategy for your Machine Learning applications.

Consideration	Recreate	Shadow	A/B Testing	Canary	Blue/Green
Leads to Downtime	Yes	No	No	No	No
Possibility of Rollback	Yes but with downtime	No need for a rollback	Yes, fast	Yes, fast	Yes, very fast
Testing with production traffic	No	Yes	Yes	Yes	No
Extra costs of deployment	No	Yes, for testing the new model with production data.	No	No	Yes, need to maintain two separate environments.
