Most Common Kubernetes Deployment Strategies

Deploying applications reliably in Kubernetes requires more than just pushing a new image. Choosing the right deployment strategy can mean the difference between seamless updates and costly downtime. In this post, I'll walk through the most common Kubernetes deployment strategies, when to use each one, and how to configure them.

Prerequisites

Before diving in, you should be familiar with the following:

  • Basic Kubernetes concepts: Pods, Deployments, Services, and Labels
  • How to use kubectl to manage resources
  • YAML manifest syntax for Kubernetes objects

Goals

By the end of this guide, you will:

  • Understand six different deployment strategies and how they work
  • Know which strategy fits each use case
  • Have ready-to-use YAML examples for each approach

Rolling Update (Default Strategy)

The rolling update is the default strategy in Kubernetes and one of the most widely used. It allows you to upgrade your application gradually, replacing old pods with new ones without causing downtime.

If you don't configure a strategy section in your deployment manifest, Kubernetes applies these defaults:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 25%
    maxSurge: 25%

What do these parameters mean?

  • maxUnavailable is the maximum number of pods that can be down at any given time during the update. With 4 replicas, only 1 pod can be unavailable at a time.
  • maxSurge is the maximum number of additional pods that can be created above the desired replica count. With 4 replicas, at most 1 extra pod can exist during the rollout.
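The arithmetic is easier to follow with a replica count that doesn't divide evenly. The fragment below is a sketch that assumes a hypothetical Deployment with 10 replicas (a number picked only for illustration). Kubernetes rounds a percentage maxUnavailable down and a percentage maxSurge up:

```yaml
# Hypothetical Deployment spec fragment; 10 replicas chosen for illustration
replicas: 10
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 25%   # 25% of 10 = 2.5, rounded down to 2 pods
    maxSurge: 25%         # 25% of 10 = 2.5, rounded up to 3 pods
# During the rollout: at least 8 pods stay available, at most 13 pods exist in total
```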

You can also use integers instead of percentages for more precise control. For example, setting both to 1 means Kubernetes adds at most one extra pod and takes at most one pod out of service at any moment. The rollout is slow but cautious, which is useful when you want to minimize risk at the cost of speed.

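Important: Health checks (startup, readiness, and liveness probes) are essential for rolling updates to work correctly. The readiness probe matters most here: without one, Kubernetes treats a pod as ready as soon as its containers start, so the rollout can move on and send traffic to a pod that cannot serve it yet.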
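Putting the pieces together, a Deployment with integer rollout limits and a readiness probe might look like the sketch below. The image name, port 8080, and the /healthz path are assumptions for illustration; substitute whatever your application actually exposes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod out of service at a time
      maxSurge: 1         # at most one pod above the desired count
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.1.0   # placeholder image
          ports:
            - containerPort: 8080                    # assumed port
          readinessProbe:
            httpGet:
              path: /healthz                         # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
```

The rollout only counts a new pod as available once this probe succeeds, which is what keeps traffic away from pods that are still starting up.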

Useful Rollout Commands

The kubectl rollout subcommand is your friend during upgrades:

# Check upgrade status
kubectl rollout status deployment/my-app

# Pause the rollout (e.g., if you spot a bug mid-deploy)
kubectl rollout pause deployment/my-app

# Resume a paused rollout
kubectl rollout resume deployment/my-app

# Roll back to the previous version
kubectl rollout undo deployment/my-app

# Restart all pods (e.g., to pick up a changed ConfigMap or Secret)
kubectl rollout restart deployment/my-app

The status, undo, and restart commands work with Deployment, StatefulSet, and DaemonSet objects, while pause and resume apply only to Deployments. In practice, many teams prefer a GitOps approach (e.g., ArgoCD or Flux) for rollbacks, but these built-in commands are invaluable when building Kubernetes operators or responding to incidents.

Recreate Strategy

The recreate strategy is straightforward but potentially disruptive: when an upgrade is triggered, all existing pods are immediately terminated before new ones are created.

strategy:
  type: Recreate

When does this make sense?

  • Development environments, where downtime is acceptable and you want fast, clean deploys every time you build a new image.
  • Resource-constrained nodes, where the node can only run a single pod. A rolling update would leave the new pod stuck in Pending. Recreate terminates the old pod first.
  • ReadWriteOnce volumes, where the volume (like those used by Grafana) can be mounted by only one node at a time. A rolling update can fail when the new pod is scheduled on a different node and tries to attach the volume before the old pod has released it.
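As a sketch of the volume case, here is a single-replica Grafana Deployment that uses Recreate. The PVC name grafana-data is hypothetical and is assumed to already exist with the ReadWriteOnce access mode.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  strategy:
    type: Recreate               # old pod is terminated before the new one starts
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest   # pin a specific tag in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/grafana
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: grafana-data       # hypothetical ReadWriteOnce PVC
```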

Note: If you're running a stateful application, you'd typically use a StatefulSet instead of a Deployment. A StatefulSet has no Recreate type (its updateStrategy is either RollingUpdate or OnDelete), but the same idea applies when every instance must run the same version: stop the old pods first, then start the new ones.
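One way to approximate this pattern on a StatefulSet, which has no Recreate type of its own, is the OnDelete update strategy. This is a sketch of the relevant fragment only, not a complete manifest:

```yaml
# Fragment of a StatefulSet spec (not a Deployment)
updateStrategy:
  type: OnDelete   # pods keep running the old spec until you delete them yourself
```

With OnDelete, changing the pod template does nothing until the pods are deleted, so you decide when the old instances stop and the new ones start. The default, RollingUpdate, replaces pods one at a time instead.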

Blue-Green Deployment

In larger organizations, minimizing risk during production updates is critical. Blue-green deployments address this by maintaining two separate environments, "blue" (currently live) and "green" (the new version), and switching traffic between them.

The color names are arbitrary. What matters is the pattern: one deployment serves real traffic while the other is on standby, ready to take over.

How It Works with Native Kubernetes

Each pod gets an additional label (e.g., replica: blue or replica: green). The Service selector points to only one label at a time:

# Service selector pointing to blue
selector:
  app: my-app
  replica: blue

When the green version is ready and verified (by your QA team or automated tests), switching traffic is as simple as updating the selector:

selector:
  app: my-app
  replica: green

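Kubernetes updates the Service's endpoints (the Endpoints and EndpointSlice objects) and routes new connections to the green pods within moments. Connections that are already open to blue pods stay there until they close. If anything goes wrong, you switch back to blue just as easily. Once you're confident in the new version, you can tear down the old deployment.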
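For context, the selector fragments above belong to a Service, and each color is its own Deployment. A minimal sketch follows, assuming a hypothetical my-app-green Deployment, a placeholder image, and container port 8080:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    replica: blue        # change to "green" to cut traffic over
  ports:
    - port: 80
      targetPort: 8080   # assumed container port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green     # hypothetical name; the blue twin differs only in label and image
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      replica: green
  template:
    metadata:
      labels:
        app: my-app
        replica: green
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:2.0.0   # placeholder image
          ports:
            - containerPort: 8080
```

Because the Service matches on both labels, the green pods receive no traffic until the selector changes, which leaves room to test them directly first.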

Bonus use case: Blue-green is also useful when you need even distribution of connections across instances. With a rolling update, new instances may receive zero connections initially while old ones stay saturated.

The main downside is resource overhead. You need to run two complete instances of your application at the same time, and only one of them is serving real traffic.

Canary Deployment

A canary deployment lets you shift only a small percentage of traffic to a new version, observe its behavior, and gradually increase traffic if everything looks healthy.

This approach minimizes blast radius. If a bug exists, only a fraction of your users are affected.

Manual Canary with Native Kubernetes

With native Kubernetes, you control traffic distribution by adjusting replica counts across two deployments sharing the same Service:

# 10 pods for v1 + 1 pod for v2 = 1 of 11 pods, so ~9% of traffic goes to v2
v1 deployment: replicas: 10
v2 (canary):   replicas: 1

# Shift to 50%
v1 deployment: replicas: 10
v2 (canary):   replicas: 10

# Complete the rollout
v1 deployment: replicas: 0
v2 (canary):   replicas: 10
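In manifest form, the first step might look like the sketch below. The Deployment names, the version label, and the image tags are assumptions for illustration. The important detail is that the Service selects only on app, so it spreads traffic across the pods of both versions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-v1            # hypothetical name
spec:
  replicas: 10
  selector:
    matchLabels: {app: my-app, version: v1}
  template:
    metadata:
      labels: {app: my-app, version: v1}
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0   # placeholder image
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-v2            # hypothetical name for the canary
spec:
  replicas: 1
  selector:
    matchLabels: {app: my-app, version: v2}
  template:
    metadata:
      labels: {app: my-app, version: v2}
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:2.0.0   # placeholder image
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app              # no version label, so both Deployments receive traffic
  ports:
    - port: 80
      targetPort: 8080       # assumed container port
```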

This works but is cumbersome. You're managing traffic by scaling pods rather than by percentage. For more fine-grained control, tools like Argo Rollouts or Flagger let you define canary steps with weight-based routing:

# Argo Rollouts example
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: { duration: 2m }
        - setWeight: 40
        - pause: { duration: 5m }
        - setWeight: 100

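This approach is much cleaner. Instead of juggling replica counts, you define explicit traffic percentages and pause between each step to run tests or observe metrics before proceeding. One caveat: unless you connect Argo Rollouts to a traffic router (a service mesh or a supported ingress controller), it approximates each weight by adjusting replica counts, so with 5 replicas a weight of 20 means one canary pod.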
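The canary steps live inside a Rollout resource, which takes the place of the Deployment. A fuller sketch is shown below; the image name is a placeholder, and it assumes the Argo Rollouts controller and its CRDs are installed in the cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:                  # same pod template you would put in a Deployment
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:2.0.0   # placeholder image
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: { duration: 2m }
        - setWeight: 40
        - pause: { duration: 5m }
        - setWeight: 100
```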

Tip: Canary deployments are only as good as your observability. Make sure you have monitoring in place (Prometheus, Grafana, or similar) to detect issues during the rollout. Without metrics, you're flying blind.

A/B Testing Deployment

An A/B testing deployment is similar to canary in that you run two versions side by side, but the traffic routing is based on request attributes rather than percentages. You route users to different versions based on things like headers, cookies, geographic location, or user segments.

This requires a service mesh such as Istio, or an ingress controller that supports header-based routing, because native Kubernetes Services select pods only by label and cannot match on request headers.

Example with Istio

Suppose you want to route internal testers to a new version while keeping production users on the stable one. You can define a VirtualService that matches on a custom header:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - match:
        - headers:
            end-user:
              exact: testing
      route:
        - destination:
            host: my-app
            subset: v2
    - route:
        - destination:
            host: my-app
            subset: v1

Requests with the end-user: testing header go to v2, while everything else stays on v1. This is powerful for compliance scenarios as well, where you might need to route users from specific regions to deployments that meet local data regulations.
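The v1 and v2 subsets referenced by the VirtualService have to be defined somewhere, and in Istio that is the job of a DestinationRule. A minimal sketch, assuming the pods carry a version label (the label key and values are an assumption about how your Deployments are labeled):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
    - name: v1
      labels:
        version: v1   # assumed pod label on the stable Deployment
    - name: v2
      labels:
        version: v2   # assumed pod label on the new Deployment
```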

The tradeoff is complexity. You need Istio or a similar tool running in your cluster, which adds operational overhead.

Shadow Deployment

A shadow deployment (sometimes called traffic mirroring) runs the new version alongside the production version, but the new version never serves responses to real users. Instead, production traffic is duplicated and sent to the shadow instance for testing purposes. The shadow's responses are analyzed and then discarded.

This is useful when you want to validate a new version against real-world traffic patterns without any risk to users.

Example with Istio

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - route:
        - destination:
            host: my-app
            subset: v1
      mirror:
        host: my-app
        subset: v2
      mirrorPercentage:
        value: 100.0

In this example, 100% of traffic is mirrored to v2, but all real responses come from v1. You can adjust mirrorPercentage to mirror only a portion of traffic if you want to reduce load on the shadow instance.
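For instance, to mirror roughly one request in ten, only the last part of the route changes. This is a fragment of the VirtualService above, with 10.0 chosen purely for illustration:

```yaml
      mirror:
        host: my-app
        subset: v2
      mirrorPercentage:
        value: 10.0   # mirror about 10% of requests instead of all of them
```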

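Note: Shadow deployments are great for catching performance regressions and unexpected errors, but they won't help you test features that depend on user interaction (like UI changes). They work best for backend services, APIs, and data processing pipelines. Keep in mind that the shadow still executes every mirrored request, so make sure it cannot cause real side effects such as duplicate writes to a production database or duplicate emails.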

Choosing the Right Strategy

The right strategy depends on your application's requirements, your team's risk tolerance, and the tools you have available. Here's a quick comparison:

| Strategy       | Downtime | Risk          | Complexity     | Best For                         |
|----------------|----------|---------------|----------------|----------------------------------|
| Rolling Update | None     | Low to Medium | Low            | Most production workloads        |
| Recreate       | Yes      | High          | Low            | Dev environments, RWO volumes    |
| Blue-Green     | None     | Low           | Medium         | High-stakes production updates   |
| Canary         | None     | Very Low      | Medium to High | Gradual, data-driven rollouts    |
| A/B Testing    | None     | Low           | High           | User-segment testing, compliance |
| Shadow         | None     | Very Low      | High           | Validating against real traffic  |

Which Strategy for Which Use Case?

Stateless applications are the simplest case. A rolling update works well since there's no state to keep in sync across pods.

Stateful applications often benefit from the recreate strategy to ensure all instances run the same version. In most cases, you'd also want a StatefulSet instead of a Deployment.

High-traffic applications that can't afford to route all users to an untested version should consider canary deployments. The ability to shift traffic gradually gives you time to detect issues before they affect everyone.

Mission-critical and zero-downtime applications are good candidates for blue-green deployments. You can validate the new version completely before switching traffic. Shadow deployments also fit here, especially for backend services where you want to test with real traffic patterns without any user impact.

Batch processing and background jobs usually don't need anything fancy. A recreate or rolling update is typically sufficient since there's less concern about user-facing downtime.

Conclusion

There's no single "best" deployment strategy. Each one makes a tradeoff between simplicity, risk, resource cost, and the level of control you get over the rollout process. For most teams, the default rolling update covers the majority of use cases. When you need more control, canary and blue-green strategies are the natural next steps. A/B testing and shadow deployments add power but also require tools like Istio or Argo Rollouts.

The key takeaway is this: always pair your deployment strategy with proper health checks and observability. No strategy can save you if you can't detect that something went wrong.

| Strategy       | When to Reach for It                                 |
|----------------|------------------------------------------------------|
| Rolling Update | Default choice, works for most workloads             |
| Recreate       | Single-instance apps, RWO volumes, dev environments  |
| Blue-Green     | Zero-risk cutover with full pre-validation           |
| Canary         | Gradual rollout with metrics-driven promotion        |
| A/B Testing    | Route traffic by user attributes or headers          |
| Shadow         | Test with real traffic, zero user impact             |
