Kubernetes Best Practices for 2019

January 14, 2019

Kubernetes is no longer new. After several years of production experience across the industry, clear patterns have emerged for successful deployments. Here are the practices that matter in 2019.

Resource Management

Always Set Resource Requests and Limits

Every container needs explicit resource definitions:

containers:
- name: api
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "1000m"

Why this matters:

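Requests determine scheduling: the scheduler places a pod only on a node with enough unreserved CPU and memory to cover its requests.

Limits prevent runaway containers: a container that exceeds its memory limit is OOM-killed, and CPU usage above the limit is throttled.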

Right-Size Based on Data

Don’t guess—measure:

# View actual resource usage
kubectl top pods

# Historical usage via Prometheus (PromQL)
container_memory_usage_bytes{pod="api-xxx"}
# The CPU metric is a cumulative counter, so take its rate to get cores in use
rate(container_cpu_usage_seconds_total{pod="api-xxx"}[5m])

Use Vertical Pod Autoscaler in recommendation mode (updateMode: "Off") to get sizing suggestions without having it change running pods.
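As a rough illustration of turning measurements into a request value, the sketch below takes memory usage samples (in MiB) and suggests a request at a chosen percentile plus headroom. The helper name, the 95th-percentile default, and the 20% headroom are assumptions made for this example; they are not rules from Kubernetes or from the Vertical Pod Autoscaler.

```python
def suggest_request_mib(samples_mib, percentile=95, headroom_percent=20):
    """Suggest a memory request from observed usage samples (hypothetical helper).

    Uses the nearest-rank percentile, then adds headroom. Both defaults are
    illustrative assumptions; tune them against your own workload.
    """
    if not samples_mib:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_mib)
    # Nearest-rank method: rank = ceil(percentile / 100 * n), in integer math.
    rank = max(1, -(-percentile * len(ordered) // 100))
    baseline = ordered[rank - 1]
    # Add headroom and round up to a whole MiB.
    return -(-baseline * (100 + headroom_percent) // 100)
```

Whether a short spike should drive the request depends on the percentile chosen: a high percentile covers rare peaks at the cost of reserving more capacity on the node.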

Quality of Service Classes

Kubernetes assigns QoS based on resource configuration:

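Guaranteed: every container sets CPU and memory requests equal to its limits

Burstable: at least one container sets a request or limit, but the pod does not meet the Guaranteed criteria (typically requests < limits)

BestEffort: no container sets any requests or limits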

Critical services should be Guaranteed or high-priority Burstable.
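To make the classification rules concrete, here is a simplified sketch of how a pod's QoS class follows from its containers' resource settings. The function name and the dict-based input are assumptions for illustration; the real classification happens inside Kubernetes and compares parsed quantities rather than strings.

```python
def qos_class(containers):
    """Classify a pod's QoS class from its containers' resources (simplified).

    `containers` is a list of dicts shaped like the `resources` block of a
    container: {"requests": {...}, "limits": {...}}. Quantities are compared
    as plain strings, so "1000m" and "1" would not match in this sketch.
    """
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request defaults to the limit.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

The resource block from the first example in this article (256Mi/100m requested, 512Mi/1000m limit) comes out as Burstable under these rules.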

Pod Design

One Process Per Container

Containers should do one thing:

# Good - separate containers
containers:
- name: api
  image: myapp/api
- name: metrics-exporter
  image: myapp/metrics

# Bad - multiple processes in one container
containers:
- name: everything
  command: ["./run-api-and-metrics.sh"]

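Benefits: each container gets its own resource limits, restarts independently when it fails, writes a separate log stream, and can be built and updated on its own schedule.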

Use Init Containers for Setup

Separate initialization from runtime:

initContainers:
- name: wait-for-db
  image: busybox
  command: ['sh', '-c', 'until nc -z db 5432; do sleep 1; done']

- name: run-migrations
  image: myapp/api
  command: ['./migrate']

containers:
- name: api
  image: myapp/api

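Init containers run one at a time, in the order listed, and each must exit successfully before the next one starts. The application containers start only after all of them have completed.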

Configure Health Checks

Every production pod needs probes:

containers:
- name: api
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 10
    failureThreshold: 3

  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 5

  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 30
    periodSeconds: 10

Liveness: Is the process healthy? Failure triggers container restart.

Readiness: Can it handle traffic? Failure removes from service endpoints.

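Startup (1.16+): For slow-starting apps, holds off liveness and readiness checks until the app has started. With failureThreshold: 30 and periodSeconds: 10 as above, the app has up to 300 seconds to start before the container is restarted.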
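On the application side, the probe paths need handlers that answer quickly and keep the two questions separate. The following is a minimal sketch using only the Python standard library; the `ready` flag, the `probe_response` helper, and the response bodies are assumptions for illustration, while the paths and port match the probe configuration above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading

# Set once startup work (cache warm-up, DB connection, ...) has finished.
ready = threading.Event()

def probe_response(path, is_ready):
    """Map a probe path to an (HTTP status, body) pair."""
    if path == "/healthz":
        # Liveness: report only whether the process itself is working.
        return 200, b"ok"
    if path == "/ready":
        # Readiness: 503 keeps the pod out of Service endpoints, no restart.
        return (200, b"ready") if is_ready else (503, b"not ready")
    return 404, b"not found"

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe_response(self.path, ready.is_set())
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the access log

def serve(port=8080):
    """Serve the probe endpoints until interrupted (blocks the caller)."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Keeping the liveness check free of downstream dependencies is a deliberate choice in this sketch: if /healthz failed whenever the database was unreachable, every replica would be restarted during a database outage.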

Graceful Shutdown

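Handle SIGTERM properly. The short preStop sleep below delays SIGTERM so that the pod's removal from service endpoints can propagate before the application begins shutting down: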

containers:
- name: api
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]

Application should:

  1. Stop accepting new requests
  2. Complete in-flight requests
  3. Close connections gracefully
  4. Exit cleanly

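Set terminationGracePeriodSeconds (default 30s) long enough to cover the preStop hook plus the time the application needs to drain. When the period expires, the container is killed with SIGKILL.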
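The shutdown steps listed above can be sketched in application code roughly as follows. `ShutdownCoordinator` and its method names are hypothetical, and a real server framework usually provides its own drain mechanism; the point is only to show the order of operations after SIGTERM arrives.

```python
import signal
import threading

class ShutdownCoordinator:
    """Tracks in-flight requests so the process can drain before exiting."""

    def __init__(self):
        self.accepting = True
        self._in_flight = 0
        self._idle = threading.Condition()

    def handle_sigterm(self, signum, frame):
        # Step 1: stop accepting new requests. Only a flag is set here,
        # because signal handlers should not block on locks.
        self.accepting = False

    def begin_request(self):
        """Return False once shutdown has started; the caller should reject."""
        with self._idle:
            if not self.accepting:
                return False
            self._in_flight += 1
            return True

    def end_request(self):
        with self._idle:
            self._in_flight -= 1
            self._idle.notify_all()

    def wait_for_drain(self, timeout):
        """Step 2: block until in-flight requests finish or the timeout expires."""
        with self._idle:
            return self._idle.wait_for(lambda: self._in_flight == 0, timeout)

def install(coordinator):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, coordinator.handle_sigterm)
```

After `wait_for_drain` returns, the process would close its connections and exit with status 0. The drain timeout should be shorter than terminationGracePeriodSeconds minus the preStop sleep, so the process exits on its own before it is killed.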

Configuration Management

Separate Config from Code

Use ConfigMaps for configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "3600"
---
containers:
- name: api
  envFrom:
  - configMapRef:
      name: api-config
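Inside the container, the ConfigMap keys arrive as ordinary environment variables, always as strings. The sketch below shows one way an application might read and validate them; the `load_config` name, the fallback defaults, and the set of accepted log levels are assumptions for illustration.

```python
import os

def load_config(env=os.environ):
    """Read settings injected from the api-config ConfigMap via envFrom."""
    log_level = env.get("LOG_LEVEL", "info").lower()
    if log_level not in {"debug", "info", "warning", "error"}:
        raise ValueError(f"unsupported LOG_LEVEL: {log_level!r}")
    # ConfigMap values are always strings, so numeric settings need parsing.
    cache_ttl = int(env.get("CACHE_TTL", "3600"))
    if cache_ttl < 0:
        raise ValueError("CACHE_TTL must not be negative")
    return {"log_level": log_level, "cache_ttl_seconds": cache_ttl}
```

Validating at startup means a bad ConfigMap value fails the pod immediately, where a readiness probe or rollout can catch it, instead of surfacing later at request time.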

Secrets as Volumes, Not Environment Variables

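Environment variables leak easily: they are inherited by child processes, readable through /proc/<pid>/environ, and often end up in debug dumps and crash reports. Mount secrets as files instead: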

# Better - mount as files
volumes:
- name: secrets
  secret:
    secretName: api-secrets
containers:
- name: api
  volumeMounts:
  - name: secrets
    mountPath: /etc/secrets
    readOnly: true

Each key in the Secret becomes a file under /etc/secrets/, which the application reads at startup.
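Reading a mounted Secret is plain file I/O. A minimal sketch, assuming the mount path from the example above; the `load_secrets` helper is hypothetical.

```python
from pathlib import Path

def load_secrets(directory="/etc/secrets"):
    """Return {key: value} for every secret file in a mounted Secret volume."""
    secrets = {}
    for entry in Path(directory).iterdir():
        # Skip the "..data" symlink and timestamped directories that the
        # kubelet uses to swap in updated secret data atomically.
        if entry.name.startswith("..") or not entry.is_file():
            continue
        # Drop a trailing newline only; other whitespace may be significant.
        secrets[entry.name] = entry.read_text().rstrip("\n")
    return secrets
```

Calling `load_secrets()` again later picks up rotated values, since the kubelet refreshes the files in a mounted Secret volume when the Secret changes.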

Use External Secrets for Production

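By default, Kubernetes Secrets are only base64-encoded, not encrypted, unless encryption at rest is configured for the cluster. An external secret manager can hold the source of truth and sync values into the cluster: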

# External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-secrets
spec:
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault
  target:
    name: api-secrets
  data:
  - secretKey: db-password
    remoteRef:
      key: secret/api/db-password

Options: HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault.

Networking

Use Network Policies

Default: all pods can talk to all pods. Change this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - port: 5432

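Start with default-deny, then add explicit allows. Note that the default-deny policy above also blocks egress, so the api pods still need an egress rule permitting traffic to the database and to cluster DNS before this example works end to end. Network policies are enforced only if the cluster's network plugin supports them.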

Service Naming Conventions

DNS is <service>.<namespace>.svc.cluster.local:

# Short form within same namespace
db.default.svc.cluster.local → db

# Cross-namespace needs at least <service>.<namespace>
monitoring.prometheus.svc.cluster.local → monitoring.prometheus

Establish naming conventions:

Deployment Strategies

Configure Pod Disruption Budgets

Protect availability during voluntary disruptions:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2  # Or maxUnavailable: 1
  selector:
    matchLabels:
      app: api

Without PDBs, node drains can take down entire services.

Use Deployment Strategies Appropriately

apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # Extra pods during update
      maxUnavailable: 25%  # Pods that can be unavailable

For zero-downtime with few replicas:

maxSurge: 1
maxUnavailable: 0
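The effect of percentage values depends on the replica count, because Kubernetes rounds maxSurge up and maxUnavailable down when converting percentages to pod counts. The helper below (a hypothetical name, written for this article) works through that arithmetic.

```python
def rollout_bounds(replicas, max_surge_percent=25, max_unavailable_percent=25):
    """Return (max total pods, min available pods) during a rolling update."""
    # maxSurge percentages round up; maxUnavailable percentages round down.
    surge = -(-replicas * max_surge_percent // 100)
    unavailable = replicas * max_unavailable_percent // 100
    return replicas + surge, replicas - unavailable
```

With 4 replicas and the 25% settings, one pod may be unavailable at a time, which is a quarter of serving capacity. Setting maxSurge: 1 and maxUnavailable: 0 explicitly removes that dependence on rounding for small deployments.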

Anti-Affinity for High Availability

Spread pods across nodes:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: api
        topologyKey: kubernetes.io/hostname

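Use requiredDuringSchedulingIgnoredDuringExecution for strict requirements, keeping in mind that pods stay Pending when no node satisfies the rule (for example, more replicas than nodes).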

Observability

Structured Logging

Logs should be JSON:

{"timestamp": "2019-01-14T10:30:00Z", "level": "info", "message": "Request processed", "request_id": "abc123", "duration_ms": 45}

Log collectors such as Fluentd or Fluent Bit typically attach pod metadata (namespace, pod name, labels) when they ship container logs, so don’t duplicate it in application logs.
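One way to produce log lines in that shape is a custom formatter for Python's standard logging module. This is a sketch: the `JsonFormatter` class and its fixed list of extra fields are assumptions chosen to match the sample line above.

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    # Optional per-request fields, passed through logging's `extra` argument.
    EXTRA_FIELDS = ("request_id", "duration_ms")

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)
```

To use it, attach the formatter to a `logging.StreamHandler()` writing to stdout and log with `logger.info("Request processed", extra={"request_id": "abc123", "duration_ms": 45})`.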

Prometheus Metrics

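Expose a /metrics endpoint. The prometheus.io/* annotations are a widely used convention rather than a built-in feature; they take effect only when the Prometheus scrape configuration is written to honor them: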

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"

Standard metrics:

Labels for Cardinality

Good labels enable useful queries:

metadata:
  labels:
    app: api
    version: v1.2.3
    environment: production
    team: platform

Use consistent labeling across all resources.

Security

Run as Non-Root

# Pod-level securityContext (fsGroup can only be set at pod level)
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

Read-Only Root Filesystem

# Container-level securityContext; /tmp stays writable via an emptyDir
securityContext:
  readOnlyRootFilesystem: true
volumeMounts:
- name: tmp
  mountPath: /tmp
volumes:
- name: tmp
  emptyDir: {}

Drop Capabilities

# Container-level securityContext
securityContext:
  capabilities:
    drop:
    - ALL
  allowPrivilegeEscalation: false

Use Pod Security Standards

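Enforce a security profile per namespace. The label below applies the restricted profile, the strictest of the three Pod Security Standards (privileged, baseline, restricted):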

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted

Key Takeaways

These practices aren’t optional for production Kubernetes. Apply them from the start.