Kubernetes for Web Developers: Deploying Containerized Applications | SoniNow Blog

Tags: kubernetes, k8s, containers, orchestration, devops

Published: 2026-06-23

Read time: 3 mins

Kubernetes has a reputation for a steep learning curve, but the core concepts that web developers need to deploy applications are approachable. You don't need to understand the entire control plane or become a cluster administrator to ship a containerized web app on Kubernetes. You just need to know how to model your application as a set of Kubernetes resources: Pods, Deployments, Services, Ingress, ConfigMaps, and Secrets.

Pods: The Atomic Unit of Deployment

A Pod is the smallest deployable unit in Kubernetes: one or more containers that share a network namespace and can share storage volumes. For most web apps, you'll run a single application container per Pod, optionally with a sidecar for logging, service mesh proxying, or secrets injection:

apiVersion: v1
kind: Pod
metadata:
  name: web-app
  labels:
    app: web
    tier: frontend
spec:
  containers:
    - name: app
      image: myapp:v1.2.3
      ports:
        - containerPort: 3000
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "512Mi"
          cpu: "500m"
      readinessProbe:
        httpGet:
          path: /health
          port: 3000
        initialDelaySeconds: 5

Never run bare Pods in production. Use a Deployment: it manages ReplicaSets, handles rolling updates, and replaces Pods that fail or are deleted.
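As a rough sketch, the Pod above could be wrapped in a Deployment like the one below. The name `web-deployment` matches the target of the HPA later in this post; the replica count and the rolling-update settings are illustrative assumptions, not recommendations for every workload.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  labels:
    app: web
spec:
  # Assumed starting count; if an HPA manages this Deployment,
  # the HPA will adjust the replica count at runtime.
  replicas: 2
  selector:
    # Must match the labels in the Pod template below.
    matchLabels:
      app: web
      tier: frontend
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # assumption: never drop below the desired count
      maxSurge: 1         # assumption: add one extra Pod at a time
  template:
    metadata:
      labels:
        app: web
        tier: frontend
    spec:
      containers:
        - name: app
          image: myapp:v1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
```

The Pod template is the same spec as the bare Pod, so moving from one to the other is mostly a matter of nesting it under `spec.template` and adding a matching `selector`.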

Services and Ingress: Routing Traffic to Your App

A Service provides a stable network endpoint for a set of Pods. The most common type for web workloads is ClusterIP with an Ingress controller handling external traffic:

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
    tier: frontend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP

The Ingress resource routes external HTTP/HTTPS traffic to that Service. Most production clusters use an NGINX ingress controller or a cloud-provider-native one (an Application Load Balancer on AWS, Cloud Load Balancing on GCP):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
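The manifest above serves plain HTTP and relies on the cluster's default ingress class. A hedged variant that pins the controller and terminates HTTPS might look like this. The class name `nginx` and the Secret name `web-tls` are assumptions for illustration; the Secret would need to be a `kubernetes.io/tls` Secret that already exists in the same namespace.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  # Assumption: an ingress controller registered under the class "nginx".
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      # Hypothetical Secret holding tls.crt and tls.key for the host.
      secretName: web-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
```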

ConfigMaps and Secrets: Managing Configuration

Separate configuration from container images using ConfigMaps for non-sensitive data and Secrets for credentials, API keys, and database passwords:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"

---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgresql://..."
  API_KEY: "sk-..."

Reference these in your Deployment:

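# Placed under spec.template.spec.containers[] in the Deployment.
# Every key in the ConfigMap and Secret becomes an environment variable.
envFrom:
  - configMapRef:
      name: app-config
  - secretRef:
      name: app-secrets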

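By default, Secrets are stored in etcd unencrypted; the base64 you see in a Secret manifest is an encoding, not encryption. Protect them by enabling encryption at rest in your cluster, for example with a KMS provider.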
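On a self-managed control plane, encryption at rest is configured with an `EncryptionConfiguration` file passed to the API server through `--encryption-provider-config`. The sketch below assumes a KMS v2 plugin; the plugin name and socket path are hypothetical placeholders. Managed services such as EKS and GKE expose this as a cluster setting instead, so you would not write this file yourself there.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # The first provider is used to encrypt new writes.
      - kms:
          apiVersion: v2
          name: my-kms-plugin                        # hypothetical plugin name
          endpoint: unix:///var/run/kms/socket.sock  # hypothetical socket path
          timeout: 3s
      # Fallback so Secrets written before encryption was enabled stay readable.
      - identity: {}
```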

Horizontal Pod Autoscaling

HPA automatically adjusts replica counts based on CPU, memory, or custom metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

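Set minReplicas to at least 2 for production so the app survives the loss of a single Pod or node, provided the replicas are scheduled on different nodes. CPU utilization is measured against each container's CPU request, so the target Deployment must set requests. HPA works best when combined with cluster autoscaling so that new nodes are added when extra Pods no longer fit on existing ones.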
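To make the scaling decision concrete, the Kubernetes documentation describes the core HPA calculation as a ratio of the observed metric to the target, rounded up. The worked numbers below are hypothetical: 4 running replicas averaging 90% CPU utilization against the 70% target from the manifest above.

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil

% Hypothetical example: 4 replicas at 90% average CPU, target 70%
\left\lceil 4 \times \frac{90}{70} \right\rceil
  = \left\lceil 5.14 \right\rceil
  = 6
```

The result is then clamped to the range between minReplicas and maxReplicas, so in this example the HPA would scale from 4 to 6 Pods.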

Resource Requests, Limits, and Quality of Service

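Requests are what the scheduler reserves for your Pod on a node, so the Pod is guaranteed at least that much. Limits cap what a container can use: exceeding a CPU limit causes throttling, while exceeding a memory limit gets the container OOM-killed. Misconfigured values lead to either wasted capacity or OOM kills: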

  • Set CPU requests close to observed typical steady-state usage, not peak usage. CPU is compressible, so brief throttling is preferable to reserving capacity that sits unused.
  • Set memory requests equal to limits for predictable behavior. Memory is incompressible: a container that exceeds its limit is OOMKilled, and Pods on an overcommitted node can be evicted under memory pressure.
  • Use the Burstable QoS class for most apps. Guaranteed (requests == limits for both CPU and memory in every container) is for latency-sensitive services.
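The guidelines above can be sketched as two container `resources` fragments. The specific values are illustrative assumptions; derive real numbers from your own usage metrics.

```yaml
# Fragment 1: Burstable QoS.
# Memory request equals the limit, CPU request sits below the limit,
# so the container can burst on CPU but has a fixed memory budget.
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
---
# Fragment 2: Guaranteed QoS.
# Requests equal limits for both CPU and memory. This must hold for
# every container in the Pod for the Pod to be classed as Guaranteed.
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

After a Pod is running, the class Kubernetes assigned is visible in its `status.qosClass` field.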

Run Containers at Scale with SoniNow

Kubernetes gives web developers a consistent deployment platform that works the same way on a laptop as it does in a multi-region production cluster. Let SoniNow help your team bridge the gap between Docker Compose and production Kubernetes.