Serverless tutorials show simple functions: receive an HTTP request, query a database, return a response. Real applications are more complex: they have workflows, handle failures, manage state, process data at scale, and integrate with multiple services.
After building several production serverless applications, I've found these are the patterns that work.
API Backend Pattern
The most common serverless pattern: API Gateway + Lambda functions serving HTTP requests.
Client → API Gateway → Lambda → Database/Services → Response
Structure
One function per endpoint keeps functions focused and enables independent scaling. A monolithic “do everything” function loses serverless benefits.
/users GET → list-users-function
/users POST → create-user-function
/users/{id} GET → get-user-function
/users/{id} PUT → update-user-function
Shared code goes in layers or packages. Business logic, validation, database connections—share without duplicating.
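For illustration, a shared validation module can live in a layer (or a common package) and be imported by each endpoint's handler; the module and function names below are made up:

# shared/validation.py - packaged into a Lambda layer or shared package (illustrative)
def validate_user(payload):
    errors = []
    if not payload.get('email'):
        errors.append('email is required')
    if not payload.get('name'):
        errors.append('name is required')
    return errors

# In create-user-function:
#   from shared.validation import validate_user
#   errors = validate_user(json.loads(event['body']))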
Cold Starts
The challenge: cold starts add latency on the first invocation after an idle period. For user-facing APIs, 500ms-2s cold starts are noticeable.
Mitigations:
- Provisioned concurrency: Keep instances warm (costs money)
- Warmer functions: Scheduled invocation to prevent cooling (a sketch follows this list)
- Optimize function size: Smaller functions start faster
- Choose appropriate runtime: Some runtimes (Python, Node) start faster than others (Java, .NET)
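The warmer approach can be as simple as a scheduled rule that invokes the function with a recognizable payload, which the handler answers before doing any real work. A minimal sketch; the warmer key is our own convention, not a Lambda one:

def handler(event, context):
    # Answer scheduled warm-up pings immediately (the 'warmer' flag is an assumed payload shape)
    if isinstance(event, dict) and event.get('warmer'):
        return {'warmed': True}

    # ... normal request handling ...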
Connection Management
Database connections are expensive to establish, and a naive Lambda function can open a new connection on every invocation.
Solutions:
- Connection pooling outside handler: Initialize connections globally, reuse across invocations
- RDS Proxy: Connection pooling as a service
- DynamoDB: Connectionless, designed for serverless
# Connection outside handler - reused across invocations
import psycopg2

conn = psycopg2.connect(...)

def handler(event, context):
    # Reuse existing connection
    cursor = conn.cursor()
    ...
Event Processing Pattern
Serverless excels at event-driven processing. Events trigger functions that process and emit further events.
S3 Upload → Lambda → Process → Store results
SQS Message → Lambda → Transform → Write to database
DynamoDB Stream → Lambda → Sync to search index
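The S3 case is typical: the event carries the bucket and object key, and the handler fetches and processes the object. A minimal sketch (process_object is a placeholder for your own logic):

import boto3

s3 = boto3.client('s3')

def handler(event, context):
    for record in event['Records']:
        # Each record identifies the uploaded object
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        obj = s3.get_object(Bucket=bucket, Key=key)
        process_object(obj['Body'].read())  # placeholder processing step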
Fan-Out Pattern
Single event triggers multiple parallel processors:
              ┌→ Thumbnail Lambda → S3
Image Upload ─┼→ Metadata Lambda → DynamoDB
              └→ ML Analysis Lambda → Results DB
Each processor handles one concern and scales independently.
Choreography vs. Orchestration
Choreography: Services react to events independently. No central coordinator. Each service knows only about events it cares about.
Orchestration: Central coordinator (Step Functions) manages workflow. Explicit control flow, easier to understand complex processes.
For simple flows, choreography is lighter weight. For complex, multi-step processes with conditional logic, orchestration is clearer.
Step Functions Pattern
AWS Step Functions orchestrate complex workflows with state machines.
{
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:validate-order",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:check-inventory",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:process-payment",
      "Catch": [{
        "ErrorEquals": ["PaymentFailed"],
        "Next": "HandlePaymentFailure"
      }],
      "Next": "FulfillOrder"
    },
    ...
  }
}
Benefits:
- Visual workflow representation
- Built-in retry and error handling
- State management without custom code
- Long-running processes (up to 1 year)
Use cases:
- Order processing pipelines
- ETL workflows
- Approval processes
- Saga pattern implementation
Data Pipeline Pattern
Processing data at scale: ingest, transform, aggregate, store.
Kinesis → Lambda → Transform → DynamoDB/S3
↓
(Fan out to multiple consumers)
Stream Processing
Kinesis/Lambda integration:
import base64

def handler(event, context):
    for record in event['Records']:
        # Kinesis record data is base64-encoded
        payload = base64.b64decode(record['kinesis']['data'])
        process_record(payload)
Lambda polls Kinesis and processes batches of records, scaling automatically with the stream's shard count.
Considerations:
- Configure batch size and window for efficiency
- Handle partial batch failures (see the sketch after this list)
- Monitor iterator age (lag behind stream)
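Partial batch failures deserve special attention: with ReportBatchItemFailures enabled on the event source mapping, the handler can report only the records that failed so the rest of the batch isn't reprocessed. A sketch assuming that setting is enabled:

import base64

def handler(event, context):
    failures = []
    for record in event['Records']:
        try:
            payload = base64.b64decode(record['kinesis']['data'])
            process_record(payload)  # placeholder
        except Exception:
            # Report only the failed record; Lambda retries from this sequence number
            failures.append({'itemIdentifier': record['kinesis']['sequenceNumber']})
    return {'batchItemFailures': failures}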
Batch Processing
Large-scale batch jobs that don’t fit Lambda’s limits:
- S3 triggers + Step Functions: Orchestrate processing of large files
- AWS Batch: For long-running, compute-intensive jobs
- Lambda + SQS: Distribute work across many concurrent functions
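The Lambda + SQS option typically pairs a dispatcher, which splits a large job into messages, with worker functions consuming the queue. A minimal dispatcher sketch; QUEUE_URL and the work-item shape are assumptions:

import json
import boto3

sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/work-queue'  # illustrative

def dispatch(work_items):
    # SQS accepts at most 10 messages per send_message_batch call
    for i in range(0, len(work_items), 10):
        chunk = work_items[i:i + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[
                {'Id': str(i + j), 'MessageBody': json.dumps(item)}
                for j, item in enumerate(chunk)
            ],
        )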
Scheduled Tasks Pattern
Cron-like execution for periodic tasks:
CloudWatch Events → Lambda (scheduled)
↓
Cleanup, Reports, Sync
# Serverless Framework
functions:
  dailyReport:
    handler: reports.daily
    events:
      - schedule: cron(0 8 * * ? *)  # Daily at 8 AM UTC
Use cases:
- Database cleanup
- Report generation
- Data synchronization
- Health checks
Considerations:
- Lambda has a 15-minute timeout
- For longer jobs, trigger Step Functions or AWS Batch
- Handle missed executions (exactly-once isn’t guaranteed)
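One way to cope with at-least-once execution is to record each logical run (for example, the report date) with a conditional write, so a duplicate invocation becomes a no-op. A sketch assuming a DynamoDB table named job-runs with partition key run_id (both names are illustrative):

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client('dynamodb')

def already_ran(run_id):
    # The conditional put succeeds only the first time this run_id is recorded
    try:
        dynamodb.put_item(
            TableName='job-runs',
            Item={'run_id': {'S': run_id}},
            ConditionExpression='attribute_not_exists(run_id)',
        )
        return False
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return True
        raise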
Webhooks Pattern
Receiving webhooks from external services:
External Service → API Gateway → Lambda → Process → Respond 200
↓
Queue for async processing
Best practices:
- Respond quickly (external services have timeouts)
- Validate webhook signatures
- Queue events for async processing if needed
- Implement idempotency (webhooks may retry)
import json
import os

import boto3

sqs = boto3.client('sqs')
QUEUE_URL = os.environ['QUEUE_URL']

def handler(event, context):
    # Verify signature first
    if not verify_signature(event):
        return {'statusCode': 401}

    # Quick processing or queue for later
    body = json.loads(event['body'])
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps(body)
    )
    return {'statusCode': 200}
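The verify_signature call above is provider-specific; many services sign the raw request body with HMAC-SHA256 and send the digest in a header. A sketch along those lines, where the header name and secret variable are assumptions to check against your provider's docs:

import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ['WEBHOOK_SECRET']

def verify_signature(event):
    # Compare the provider-supplied signature against our own HMAC of the raw body
    supplied = event['headers'].get('X-Signature-SHA256', '')  # header name varies by provider
    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        event['body'].encode(),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(supplied, expected)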
Multi-Tenant Pattern
Serving multiple customers with isolation:
Tenant identification: From authentication token, request header, or subdomain.
Resource isolation options:
- Shared resources: All tenants use the same tables/functions with tenant ID filtering (simplest, least isolated; sketched after this list)
- Separate tables: Each tenant has own database tables
- Separate accounts: Full AWS account per tenant (maximum isolation, most complex)
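In the shared-resources model, the tenant ID from the request becomes part of every key and query, so one tenant can never read another's rows. A sketch assuming a REST API custom authorizer that injects tenant_id into the request context and a single DynamoDB table keyed by tenant (table and key names are illustrative):

import json

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('app-data')

def handler(event, context):
    # Tenant ID comes from the authorizer context, never from the request body
    tenant_id = event['requestContext']['authorizer']['tenant_id']
    result = table.query(
        KeyConditionExpression=Key('pk').eq(f'TENANT#{tenant_id}')
    )
    return {'statusCode': 200, 'body': json.dumps(result['Items'], default=str)}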
Scaling considerations:
- Lambda scales automatically per tenant load
- DynamoDB tables can have per-tenant capacity
- Consider tenant-specific rate limiting
Error Handling Patterns
Dead Letter Queues
Failed events go to a DLQ for investigation and retry. For asynchronous sources (S3, SNS, EventBridge) that means the function's dead-letter configuration; for SQS-triggered functions, failed messages flow to the DLQ attached to the source queue's redrive policy.
functions:
  processEvent:
    handler: process.handler
    # Dead-letter target for asynchronous invocations of this function
    onError: !GetAtt DeadLetterQueue.Arn
    events:
      - sqs:
          # Failures from this source land in the DLQ of the queue's redrive policy
          arn: !GetAtt Queue.Arn
Retry with Backoff
Lambda automatically retries failed asynchronous invocations (twice by default). For custom retry logic:
import json
import os

import boto3

sqs = boto3.client('sqs')
QUEUE_URL = os.environ['QUEUE_URL']
MAX_RETRIES = 5
MAX_DELAY = 900  # SQS DelaySeconds maximum is 15 minutes

def handler(event, context):
    retry_count = event.get('retry_count', 0)
    try:
        process(event)
    except TemporaryError:
        if retry_count < MAX_RETRIES:
            # Re-queue with exponential backoff
            delay = min(2 ** retry_count, MAX_DELAY)
            event['retry_count'] = retry_count + 1
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps(event),
                DelaySeconds=delay
            )
        else:
            # Give up and send to DLQ
            send_to_dlq(event)
Circuit Breaker
When downstream services fail, stop calling them:
# CircuitBreaker is an illustrative helper; a minimal version is sketched after this snippet
circuit_breaker = CircuitBreaker(
    failure_threshold=5,
    recovery_timeout=60
)

def handler(event, context):
    if circuit_breaker.is_open():
        # Downstream is failing - skip the call and degrade gracefully
        return fallback_response()
    try:
        result = call_downstream_service()
        circuit_breaker.record_success()
        return result
    except Exception:
        circuit_breaker.record_failure()
        raise
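A minimal in-memory version of that helper is sketched below. One caveat: module-level state survives only within a single warm container, so each concurrent Lambda instance keeps its own breaker; for a truly shared circuit you'd store the state in DynamoDB or ElastiCache.

import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        # After the recovery timeout, allow a trial call through (half-open)
        if time.time() - self.opened_at >= self.recovery_timeout:
            return False
        return True

    def record_success(self):
        self.failure_count = 0
        self.opened_at = None

    def record_failure(self):
        self.failure_count += 1
        if self.failure_count >= self.failure_threshold:
            self.opened_at = time.time()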
Key Takeaways
- Structure APIs as one function per endpoint with shared code in layers
- Handle cold starts with provisioned concurrency, warmers, or optimized function size
- Use Step Functions for complex workflows with branching and error handling
- Stream processing with Kinesis/Lambda scales automatically
- Respond quickly to webhooks; queue for async processing
- Implement robust error handling with DLQs, retries, and circuit breakers
- Choose patterns based on your specific needs; serverless isn’t one-size-fits-all