Kubernetes — Deployment

A production-ready Kubernetes Deployment manifest with 3 replicas, CPU and memory resource limits, HTTP readiness and liveness health probes, and environment variables sourced from a Kubernetes Secret.


Overview

A Kubernetes Deployment is the standard way to run a stateless application on a cluster. It wraps a ReplicaSet — which ensures the desired number of Pod replicas is always running — and adds rolling-update semantics on top. When you change the image tag or any Pod spec field and apply the manifest, Kubernetes incrementally replaces old Pods with new ones, keeping the service online throughout the update.

This example runs three replicas of a Node.js application behind a /health HTTP endpoint. It declares resource requests and limits so the scheduler can make informed placement decisions and so runaway processes cannot starve other workloads. Sensitive connection strings are injected from a Kubernetes Secret rather than hardcoded in the manifest, keeping credentials out of version control.

Apply this manifest with kubectl apply -f deployment.yaml. Check rollout status with kubectl rollout status deployment/my-app and roll back with kubectl rollout undo deployment/my-app if needed.

Full YAML (copy-paste ready)

deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
        - name: my-app
          image: ghcr.io/myorg/my-app:1.2.3
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: production
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: my-app-secrets
                  key: database-url
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20

Key sections explained

selector.matchLabels and pod template labels

The selector.matchLabels block tells the Deployment which Pods it owns. This must exactly match the labels defined in spec.template.metadata.labels — if they diverge, Kubernetes will reject the manifest with a validation error. In this example, both use app: my-app. The label system is how Kubernetes loosely couples resources: Services use the same label selector to decide which Pods receive traffic, and the Deployment uses it to count and manage its replicas.

The version: v1 label is optional but useful for canary deployments and traffic splitting with a service mesh such as Istio. Adding labels to the pod template costs nothing and makes it much easier to filter pods with kubectl get pods -l app=my-app,version=v1.
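
For example, a Service routing traffic to these Pods uses the same label selector. This is a sketch; the Service name and port mapping are illustrative, not defined elsewhere in this example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # illustrative Service name
spec:
  selector:
    app: my-app           # matches the pod template labels above
  ports:
    - port: 80            # port the Service exposes inside the cluster
      targetPort: 3000    # the containerPort from the Deployment
```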

Resource requests vs. limits

resources.requests tells the Kubernetes scheduler how much CPU and memory this container needs in order to run. The scheduler uses requests — not limits — to decide which node has enough free capacity to host the Pod. resources.limits is the maximum the container is allowed to consume at runtime: if it exceeds the memory limit, the container is OOM-killed; if it exceeds the CPU limit, it is throttled. Setting requests without limits means a container can potentially consume all resources on a node, which is dangerous in a shared cluster.

CPU values are expressed in millicores (100m = 0.1 of one CPU core). Memory values use binary suffixes: Mi = mebibytes, Gi = gibibytes. A good starting point is to set requests low (based on observed idle usage) and limits 2–4x higher to handle traffic spikes. Tune these values using kubectl top pods after your workload has been running under real load.

Readiness probe vs. liveness probe

These two probes serve different purposes and must not be confused. The readiness probe controls whether a Pod receives traffic. When it fails, Kubernetes removes the Pod from the Service's endpoint list — it stops getting requests but keeps running. This is the right way to handle temporary unavailability such as a database reconnect, a cold start, or loading a large model into memory. The liveness probe determines whether a Pod is alive. When it fails, Kubernetes kills and restarts the container. Use the liveness probe only to detect a truly stuck or deadlocked process that cannot recover on its own.

The initialDelaySeconds gives the container time to start before Kubernetes begins probing. Set this to slightly longer than your typical cold-start time. periodSeconds controls how often the check runs. In this manifest, readiness is checked every 10 seconds starting 5 seconds after container start, and liveness is checked every 20 seconds starting at 15 seconds.
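
If cold-start time varies widely, a single initialDelaySeconds either wastes time or still fires too early. On Kubernetes 1.20+ a startupProbe can cover the startup window instead; liveness and readiness probes are held off until it succeeds. A sketch with illustrative thresholds, placed alongside the other probes in the container spec:

```yaml
          startupProbe:
            httpGet:
              path: /health
              port: 3000
            failureThreshold: 30   # allow up to 30 x 5s = 150s to start
            periodSeconds: 5
```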

valueFrom.secretKeyRef for injecting secrets

Instead of hardcoding the database URL in the manifest (which would end up in version control), valueFrom.secretKeyRef pulls the value from a Kubernetes Secret at runtime. The container sees it as a normal environment variable, but the value is never stored in the Deployment YAML. The Secret named my-app-secrets must exist in the same namespace before you apply this Deployment; otherwise, the container will fail to start with a CreateContainerConfigError and an event noting the missing Secret. See the ConfigMap + Secret example for how to create that Secret.
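
As a quick sketch, the referenced Secret could be created from a manifest like the one below. The connection string is a placeholder; substitute your real value, and prefer a secret manager or kubectl create secret over committing this file to version control:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secrets    # must match secretKeyRef.name in the Deployment
type: Opaque
stringData:               # plain-text values; Kubernetes base64-encodes them on write
  database-url: postgres://user:CHANGE_ME@db-host:5432/app   # placeholder
```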

Pinning the image tag

This manifest uses ghcr.io/myorg/my-app:1.2.3 — a specific, immutable tag — rather than :latest. The :latest tag is mutable: a new push can change what image it points to, and your Deployment may silently start running a different version than intended. Pinning to a concrete version, or better yet a full digest like ghcr.io/myorg/my-app@sha256:abc123..., makes deployments reproducible and auditable. Use a CI pipeline with the Docker Build & Push workflow to automate updating the tag on every release.

Tips & variations

Add a rolling update strategy

By default, Kubernetes uses RollingUpdate with maxSurge: 25% and maxUnavailable: 25%. Percentages are rounded in the safe direction: maxSurge rounds up and maxUnavailable rounds down, so for a 3-replica deployment this means at most 1 extra Pod is created and 0 Pods may be made unavailable during an update. To make that zero-downtime behavior explicit regardless of replica count, add:

rolling update strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0

Mount a ConfigMap as a volume

If your app reads config from a file rather than environment variables, you can mount a ConfigMap directly as a file inside the container using volumes and volumeMounts. See the ConfigMap + Secret example for the full pattern.
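
A minimal sketch of that pattern, assuming a ConfigMap named my-app-config containing a config.json key (both names are illustrative):

```yaml
    spec:
      containers:
        - name: my-app
          # ...image, ports, probes as above...
          volumeMounts:
            - name: config
              mountPath: /etc/my-app   # config.json appears at /etc/my-app/config.json
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: my-app-config        # illustrative ConfigMap name
```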

Use a Helm chart for reusability

If you deploy the same application to multiple environments (dev, staging, production), consider templating this Deployment with Helm. The Helm values.yaml example shows how image tags, replica counts, and resource limits can be varied per environment without duplicating the entire manifest.
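
A hedged sketch of what a per-environment values file might look like; the key names depend entirely on how the chart's templates are written:

```yaml
# values-production.yaml (illustrative keys)
replicaCount: 3
image:
  repository: ghcr.io/myorg/my-app
  tag: "1.2.3"
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```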