Kubernetes is no longer new. After several years of production experience across the industry, clear patterns have emerged for successful deployments. Here are the practices that matter in 2019.
Resource Management
Always Set Resource Requests and Limits
Every container needs explicit resource definitions:
containers:
- name: api
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "1000m"
Why this matters:
Requests determine scheduling:
- Kubernetes places pods based on available resources
- Without requests, pods compete unfairly for resources
Limits prevent runaway containers:
- Memory limits trigger OOMKill when exceeded
- CPU limits throttle (consider omitting for latency-sensitive apps)
Right-Size Based on Data
Don’t guess—measure:
# View actual resource usage
kubectl top pods
# Historical usage via Prometheus
container_memory_usage_bytes{pod="api-xxx"}
container_cpu_usage_seconds_total{pod="api-xxx"}
Use Vertical Pod Autoscaler in recommendation mode to get sizing suggestions.
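A VPA in recommendation mode can be sketched like this (the API group/version varies with the VPA release installed in your cluster; updateMode: "Off" means it only records recommendations and never evicts pods):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"   # recommend only; read suggestions from status
```

Recommendations then appear under the object's status and can be copied into your Deployment's requests.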
Quality of Service Classes
Kubernetes assigns QoS based on resource configuration:
Guaranteed: requests == limits for all containers
- Last to be evicted
- Predictable performance
Burstable: requests < limits
- Evicted before Guaranteed
- Can burst when resources available
BestEffort: No requests or limits
- First to be evicted
- Avoid in production
Critical services should be Guaranteed or high-priority Burstable.
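The QoS assignment rules above can be sketched as a small classifier. This is a simplified illustration, not the kubelet's actual code, and it ignores per-resource defaulting of requests from limits:

```python
# Simplified sketch of how Kubernetes derives a pod's QoS class
# from its containers' requests and limits.

def qos_class(containers):
    """containers: list of dicts like
    {"requests": {"cpu": "100m", "memory": "256Mi"},
     "limits":   {"cpu": "100m", "memory": "256Mi"}}"""
    # BestEffort: no container sets any request or limit
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    # Guaranteed: every container sets cpu and memory limits,
    # and its requests (defaulting to limits) equal those limits
    guaranteed = all(
        c.get("limits")
        and set(c["limits"]) == {"cpu", "memory"}
        and c.get("requests", c["limits"]) == c["limits"]
        for c in containers
    )
    return "Guaranteed" if guaranteed else "Burstable"
```

Anything in between, such as requests below limits, lands in Burstable.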
Pod Design
One Process Per Container
Containers should do one thing:
# Good - separate containers
containers:
- name: api
  image: myapp/api
- name: metrics-exporter
  image: myapp/metrics

# Bad - multiple processes in one container
containers:
- name: everything
  command: ["./run-api-and-metrics.sh"]
Benefits:
- Independent scaling
- Clear resource attribution
- Easier debugging
- Proper lifecycle management
Use Init Containers for Setup
Separate initialization from runtime:
initContainers:
- name: wait-for-db
  image: busybox
  command: ['sh', '-c', 'until nc -z db 5432; do sleep 1; done']
- name: run-migrations
  image: myapp/api
  command: ['./migrate']
containers:
- name: api
  image: myapp/api
Init containers:
- Run sequentially, each to completion, before the main containers start
- Must succeed, or the kubelet restarts the pod per its restartPolicy
- Do affect scheduling: the pod's effective request for each resource is the larger of the highest single init container request and the sum of the app container requests
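Because init containers run one at a time, the scheduler's effective-request rule can be sketched numerically (hypothetical helper, one resource at a time, e.g. CPU in millicores):

```python
# Sketch of the pod effective-request rule: init containers run
# sequentially, so only the largest init request matters; app
# containers run concurrently, so their requests are summed.

def effective_request(init_requests, app_requests):
    """Both arguments are lists of numeric requests for one resource."""
    return max(max(init_requests, default=0), sum(app_requests))
```

So a heavy migration init container can dominate what the scheduler reserves, even if the steady-state app is small.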
Configure Health Checks
Every production pod needs probes:
containers:
- name: api
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 10
    failureThreshold: 3
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 5
  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 30
    periodSeconds: 10
Liveness: Is the process healthy? Failure triggers container restart.
Readiness: Can it handle traffic? Failure removes from service endpoints.
Startup (1.16+): For slow-starting apps; liveness and readiness checks are held off until the startup probe succeeds, preventing restarts during a long boot.
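The application side of these probes can be a trivial HTTP handler; a minimal stdlib sketch matching the paths and port assumed in the manifest above (illustrative, not a production server):

```python
# Minimal sketch of /healthz and /ready probe endpoints using only
# the Python standard library. Paths and port mirror the manifest above.
from http.server import BaseHTTPRequestHandler, HTTPServer

class ProbeHandler(BaseHTTPRequestHandler):
    ready = False  # flip to True once caches are warm, connections open, etc.

    def do_GET(self):
        if self.path == "/healthz":
            self._respond(200, b"ok")             # liveness: process is up
        elif self.path == "/ready":
            if ProbeHandler.ready:
                self._respond(200, b"ready")      # readiness: accept traffic
            else:
                self._respond(503, b"not ready")  # removed from endpoints
        else:
            self._respond(404, b"not found")

    def _respond(self, code, body):
        self.send_response(code)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep probe traffic out of the logs

# In an app: HTTPServer(("", 8080), ProbeHandler).serve_forever()
```

The key point: readiness reflects dependency state and can flap; liveness should only fail when a restart would actually help.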
Graceful Shutdown
Handle SIGTERM properly:
containers:
- name: api
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
Application should:
- Stop accepting new requests
- Complete in-flight requests
- Close connections gracefully
- Exit cleanly
Set terminationGracePeriodSeconds appropriately (default 30s).
Configuration Management
Separate Config from Code
Use ConfigMaps for configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "3600"
---
containers:
- name: api
  envFrom:
  - configMapRef:
      name: api-config
Secrets as Volumes, Not Environment Variables
Environment variables are visible in process listings:
# Better - mount as files
volumes:
- name: secrets
  secret:
    secretName: api-secrets
containers:
- name: api
  volumeMounts:
  - name: secrets
    mountPath: /etc/secrets
    readOnly: true
Applications read from /etc/secrets/ at startup.
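Reading the mounted files is a few lines; a sketch assuming the mount path from the manifest above (in Python, with each file name serving as the secret key):

```python
# Sketch: load mounted secret files into a dict at startup.
# /etc/secrets matches the volumeMount above; each file is one key.
from pathlib import Path

def load_secrets(secret_dir="/etc/secrets"):
    secrets = {}
    for path in Path(secret_dir).iterdir():
        # Kubernetes materializes keys via hidden ..data/..timestamp
        # entries and symlinks; skip the dot-dot entries.
        if path.is_file() and not path.name.startswith(".."):
            secrets[path.name] = path.read_text().strip()
    return secrets
```

A bonus over environment variables: because the files are symlinks into a versioned directory, Kubernetes can update them in place when the Secret changes, and apps that re-read them pick up rotations without a restart.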
Use External Secrets for Production
Kubernetes secrets are base64 encoded, not encrypted:
# External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-secrets
spec:
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault
  target:
    name: api-secrets
  data:
  - secretKey: db-password
    remoteRef:
      key: secret/api/db-password
Options: HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault.
Networking
Use Network Policies
Default: all pods can talk to all pods. Change this:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - port: 5432
Start with default-deny, add explicit allows.
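One common gotcha with default-deny egress: pods can no longer resolve DNS. A companion policy along these lines is usually needed (sketch; the kubernetes.io/metadata.name label is set automatically on 1.21+, on older clusters label kube-system yourself):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```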
Service Naming Conventions
DNS is <service>.<namespace>.svc.cluster.local:
# Within the same namespace, the short name resolves
db   # equivalent to db.default.svc.cluster.local
# Cross-namespace access needs at least <service>.<namespace>
prometheus.monitoring.svc.cluster.local
Establish naming conventions:
- <app>-<component> (api-worker, api-web)
- Environment in namespace, not service name
Deployment Strategies
Configure Pod Disruption Budgets
Protect availability during voluntary disruptions:
apiVersion: policy/v1beta1  # policy/v1 from Kubernetes 1.21 onward
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2  # Or maxUnavailable: 1
  selector:
    matchLabels:
      app: api
Without PDBs, node drains can take down entire services.
Use Deployment Strategies Appropriately
apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # Extra pods during update
      maxUnavailable: 25%  # Pods that can be unavailable
For zero-downtime with few replicas:
maxSurge: 1
maxUnavailable: 0
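Percentages are resolved against the replica count, with maxSurge rounded up and maxUnavailable rounded down, which is why explicit 1/0 matters at small replica counts. A quick sketch of the arithmetic (hypothetical helper):

```python
# Sketch of how Kubernetes resolves percentage-based rolling update
# settings: maxSurge rounds up, maxUnavailable rounds down.
import math

def rolling_update_bounds(replicas, max_surge_pct, max_unavailable_pct):
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return surge, unavailable
```

At 3 replicas, 25%/25% already resolves to surge 1, unavailable 0, but relying on rounding is fragile; stating 1/0 explicitly documents the intent.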
Anti-Affinity for High Availability
Spread pods across nodes:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: api
        topologyKey: kubernetes.io/hostname
Use requiredDuringSchedulingIgnoredDuringExecution for strict requirements.
Observability
Structured Logging
Logs should be JSON:
{"timestamp": "2019-01-14T10:30:00Z", "level": "info", "message": "Request processed", "request_id": "abc123", "duration_ms": 45}
Kubernetes adds pod metadata automatically. Don’t duplicate in application logs.
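Emitting this shape needs no logging framework; a stdlib sketch producing one JSON object per line (field names beyond timestamp/level/message are illustrative):

```python
# Sketch: structured JSON logging to stdout, one object per line,
# matching the shape shown above.
import json
import sys
from datetime import datetime, timezone

def log(level, message, **fields):
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": level,
        "message": message,
        **fields,
    }
    sys.stdout.write(json.dumps(record) + "\n")
```

One line per event keeps the log pipeline simple: kubelet captures stdout, and collectors parse each line as a complete JSON document.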
Prometheus Metrics
Expose /metrics endpoint:
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
Standard metrics:
- http_requests_total (counter)
- http_request_duration_seconds (histogram)
- process_* (runtime metrics)
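In practice a client library (such as prometheus_client) renders these for you; as a sketch of the underlying text exposition format itself (hypothetical helper, metric names from the list above):

```python
# Sketch of one line of the Prometheus text exposition format:
#   name{label="value",...} sample_value
# Labels are sorted for a stable, testable output.

def render_counter(name, value, labels=None):
    labels = labels or {}
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}" if labels else f"{name} {value}"
```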
Labels for Cardinality
Good labels enable useful queries:
metadata:
  labels:
    app: api
    version: v1.2.3
    environment: production
    team: platform
Use consistent labeling across all resources.
Security
Run as Non-Root
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000
Read-Only Root Filesystem
securityContext:
  readOnlyRootFilesystem: true
volumeMounts:
- name: tmp
  mountPath: /tmp
volumes:
- name: tmp
  emptyDir: {}
Drop Capabilities
securityContext:
  capabilities:
    drop:
    - ALL
  allowPrivilegeEscalation: false
Use Pod Security Standards
Enforce baseline security. On Kubernetes 1.22+ the built-in Pod Security admission controller reads namespace labels (earlier clusters used PodSecurityPolicy for the same purpose):
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
Key Takeaways
- Always set resource requests and limits; base them on measured usage
- Use init containers for setup, sidecars for cross-cutting concerns
- Configure all three probe types: liveness, readiness, startup
- Mount secrets as volumes, not environment variables
- Implement network policies with default-deny
- Use Pod Disruption Budgets for all production workloads
- Spread pods across nodes with anti-affinity rules
- Run as non-root with read-only root filesystem and dropped capabilities
- Use structured logging and expose Prometheus metrics
- Apply consistent labels for observability and management
These practices aren’t optional for production Kubernetes. Apply them from the start.