Kubernetes — Deployment
A production-ready Kubernetes Deployment manifest with 3 replicas, CPU and memory resource limits, HTTP readiness and liveness health probes, and environment variables sourced from a Kubernetes Secret.
Overview
A Kubernetes Deployment is the standard way to run a stateless application on a cluster. It wraps a ReplicaSet — which ensures the desired number of Pod replicas are always running — and adds rolling-update semantics on top. When you change the image tag or any Pod spec field and apply the manifest, Kubernetes will incrementally replace old Pods with new ones, keeping the service online throughout the update.
This example runs three replicas of a Node.js application behind a /health HTTP endpoint.
It declares resource requests and limits so the scheduler can make informed placement decisions and so
runaway processes cannot starve other workloads. Sensitive connection strings are injected from a
Kubernetes Secret rather than hardcoded in the manifest, keeping credentials out of version control.
Apply the manifest with kubectl apply -f deployment.yaml. Check rollout status with
kubectl rollout status deployment/my-app and roll back with
kubectl rollout undo deployment/my-app if needed.
Full YAML (copy-paste ready)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
        - name: my-app
          image: ghcr.io/myorg/my-app:1.2.3
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: production
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: my-app-secrets
                  key: database-url
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
Key sections explained
selector.matchLabels and pod template labels
The selector.matchLabels block tells the Deployment which Pods it owns. This must exactly
match the labels defined in spec.template.metadata.labels — if they diverge, Kubernetes
will reject the manifest with a validation error. In this example, both use app: my-app.
The label system is how Kubernetes loosely couples resources: Services use the same label selector to
decide which Pods receive traffic, and the Deployment uses it to count and manage its replicas.
The version: v1 label is optional but useful for canary deployments and traffic splitting
with a service mesh such as Istio. Adding labels to the pod template costs nothing and makes it much
easier to filter pods with kubectl get pods -l app=my-app,version=v1.
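To illustrate the loose coupling, a Service routing traffic to these Pods would reuse the same label selector. This is a sketch, not part of the manifest above; the Service name and port 80 are illustrative assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical Service name
  namespace: default
spec:
  selector:
    app: my-app           # matches the Pod template labels above
  ports:
    - port: 80            # port exposed inside the cluster
      targetPort: 3000    # the containerPort of the Pods
```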
Resource requests vs. limits
resources.requests tells the Kubernetes scheduler how much CPU and memory this container
needs in order to run. The scheduler uses requests — not limits — to decide which node has enough free
capacity to host the Pod. resources.limits is the maximum the container is allowed to
consume at runtime: if it exceeds the memory limit, the container is OOM-killed; if it exceeds the CPU
limit, it is throttled. Setting requests without limits means a container can potentially consume all
resources on a node, which is dangerous in a shared cluster.
CPU values are expressed in millicores (100m = 0.1 of one CPU core). Memory values use
binary suffixes: Mi = mebibytes, Gi = gibibytes. A good starting point is
to set requests low (based on observed idle usage) and limits 2–4x higher to handle traffic spikes.
Tune these values using kubectl top pods after your workload has been running under real load.
Readiness probe vs. liveness probe
These two probes serve different purposes and must not be confused.
The readiness probe controls whether a Pod receives traffic. When it fails, Kubernetes removes the Pod from the Service's endpoint list — it stops getting requests but keeps running. This is the right way to handle temporary unavailability such as a database reconnect, a cold start, or loading a large model into memory.
The liveness probe determines whether a Pod is alive. When it fails, Kubernetes kills and restarts the container. Use the liveness probe only to detect a truly stuck or deadlocked process that cannot recover on its own.
The initialDelaySeconds gives the container time to start before Kubernetes begins probing.
Set this to slightly longer than your typical cold-start time. periodSeconds controls how
often the check runs. In this manifest, readiness is checked every 10 seconds starting 5 seconds after
container start, and liveness is checked every 20 seconds starting at 15 seconds.
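If startup time varies a lot, one option worth knowing (a sketch, not part of the manifest above) is a startupProbe, which suspends the readiness and liveness probes until the container has started once. The allowed startup window is failureThreshold × periodSeconds:

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 3000
  failureThreshold: 30   # tolerate up to 30 failed checks...
  periodSeconds: 2       # ...2s apart, i.e. up to 60s to start
```

With this in place, the liveness probe's initialDelaySeconds can stay small, since liveness checking only begins after the startup probe succeeds.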
valueFrom.secretKeyRef for injecting secrets
Instead of hardcoding the database URL in the manifest (which would end up in version control),
valueFrom.secretKeyRef pulls the value from a Kubernetes Secret at runtime. The container
sees it as a normal environment variable, but the value is never stored in the Deployment YAML.
The Secret named my-app-secrets must exist in the same namespace before you apply this
Deployment — otherwise, the Pod will be stuck in CreateContainerConfigError until the Secret appears. See the
ConfigMap + Secret example for
how to create that Secret.
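As a sketch, the referenced Secret could be created from a manifest like this. The connection string is a placeholder; in practice, avoid committing even this file to version control:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secrets
  namespace: default
type: Opaque
stringData:   # stringData accepts plain text; Kubernetes base64-encodes it on write
  database-url: postgres://user:password@db-host:5432/mydb   # placeholder value
```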
Pinning the image tag
This manifest uses ghcr.io/myorg/my-app:1.2.3 — a specific, immutable tag — rather than
:latest. The :latest tag is mutable: a new push can change what image it
points to, and your Deployment may silently start running a different version than intended. Pinning
to a concrete version, or better yet a full digest like
ghcr.io/myorg/my-app@sha256:abc123..., makes deployments reproducible and auditable.
Use a CI pipeline with the
Docker Build & Push workflow
to automate updating the tag on every release.
Tips & variations
Add a rolling update strategy
By default, Kubernetes uses RollingUpdate with maxSurge: 25% and
maxUnavailable: 25%. Percentages are converted to absolute Pod counts, with maxSurge
rounding up and maxUnavailable rounding down, so for a 3-replica Deployment that works out
to at most 1 extra Pod during an update and 0 Pods unavailable. To make the zero-downtime
behavior explicit regardless of replica count, add:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
Mount a ConfigMap as a volume
If your app reads config from a file rather than environment variables, you can mount a ConfigMap
directly as a file inside the container using volumes and volumeMounts.
See the ConfigMap + Secret example
for the full pattern.
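A minimal sketch of that pattern, assuming a ConfigMap named my-app-config exists with a config.json key (both names are illustrative):

```yaml
spec:
  containers:
    - name: my-app
      volumeMounts:
        - name: config            # must match the volume name below
          mountPath: /etc/my-app  # config.json appears at /etc/my-app/config.json
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: my-app-config       # hypothetical ConfigMap name
```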
Use a Helm chart for reusability
If you deploy the same application to multiple environments (dev, staging, production), consider templating this Deployment with Helm. The Helm values.yaml example shows how image tags, replica counts, and resource limits can be varied per environment without duplicating the entire manifest.
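As a rough sketch, a per-environment values file might expose just the knobs that vary; the file name and structure here are illustrative assumptions, not a specific chart's schema:

```yaml
# values.production.yaml (hypothetical)
image:
  repository: ghcr.io/myorg/my-app
  tag: "1.2.3"
replicaCount: 3
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```

Each environment then gets its own values file, applied with something like helm upgrade --install my-app ./chart -f values.production.yaml, while the templates themselves stay identical.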