Serverless computing has matured significantly. AWS Lambda is now a proven production platform, and competitors like Azure Functions and Google Cloud Functions are viable options. But serverless has specific patterns that work well and anti-patterns that cause problems.
Here’s what we’ve learned from running serverless in production.
Patterns That Work
Event Processing
Serverless excels at event-driven workloads:
```python
# S3 trigger - process uploaded files
import boto3

s3 = boto3.client('s3')

def process_upload(event, context):
    # S3 put events carry the bucket and key in the Records list
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    # Fetch and process the uploaded file
    obj = s3.get_object(Bucket=bucket, Key=key)
    process(obj['Body'].read())
```
Good fits:
- File processing (upload → transform → store)
- Message queue processing
- Webhook handlers
- IoT data ingestion
- Log processing
Why it works:
- Natural event-driven model
- Scale automatically with event volume
- Pay only for processing time
- No idle resources
Scheduled Tasks
Cron jobs without managing servers:
```yaml
# CloudWatch Events trigger
functions:
  dailyReport:
    handler: reports.daily
    events:
      - schedule: cron(0 9 * * ? *)  # 9 AM UTC daily
```
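A minimal handler to pair with the dailyReport schedule might look like the following sketch. The report shape is illustrative; a real handler would write the report to S3 or send it by email rather than just returning it:

```python
import datetime

def daily(event, context=None):
    # Invoked once a day by the cron schedule; the event is the
    # EventBridge scheduled-event envelope and can usually be ignored
    today = datetime.date.today().isoformat()
    report = {'date': today, 'status': 'generated'}
    # A real handler would persist or send the report here
    return report
```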
Good fits:
- Report generation
- Data cleanup
- Batch processing
- Sync jobs
- Health checks
API Backends with Variable Load
APIs with unpredictable traffic:
```yaml
functions:
  api:
    handler: api.handler
    events:
      - http:
          path: /users
          method: get
      - http:
          path: /users/{id}
          method: get
```
Good fits:
- Internal tools with sporadic usage
- Startups with unpredictable growth
- APIs with spiky traffic patterns
- Prototypes and MVPs
Glue Logic
Small pieces connecting systems:
```python
# DynamoDB stream → Elasticsearch sync
from boto3.dynamodb.types import TypeDeserializer

deserializer = TypeDeserializer()

def sync_to_elasticsearch(event, context):
    for record in event['Records']:
        if record['eventName'] in ('INSERT', 'MODIFY'):
            # Stream images arrive in DynamoDB attribute format; deserialize first
            image = record['dynamodb']['NewImage']
            doc = {k: deserializer.deserialize(v) for k, v in image.items()}
            es.index(index='products', id=doc['id'], body=doc)
        elif record['eventName'] == 'REMOVE':
            doc_id = deserializer.deserialize(record['dynamodb']['Keys']['id'])
            es.delete(index='products', id=doc_id)
```
Good fits:
- Data sync between systems
- Notification fanout
- Format transformation
- Integration webhooks
Anti-Patterns to Avoid
Long-Running Processes
Lambda has execution time limits (15 minutes max):
```python
# Bad - might time out
def process_large_dataset(event, context):
    for record in get_all_records():  # millions of records
        process(record)
```
Solutions:
- Step Functions for orchestration
- SQS for work distribution
- EC2/ECS for long-running tasks
- Break into smaller chunks
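As a sketch of the chunking approach, assuming a hypothetical paginated data source, a handler can process one page per invocation and enqueue a continuation message carrying a cursor. An in-memory deque stands in for SQS here; production code would use boto3 to send and receive messages:

```python
from collections import deque

BATCH_SIZE = 3
queue = deque()   # stand-in for an SQS queue in this sketch
processed = []

def get_records_page(cursor, limit):
    # Hypothetical paginated source: returns (records, next_cursor)
    data = list(range(10))  # stand-in dataset
    start = cursor or 0
    page = data[start:start + limit]
    next_cursor = start + limit if start + limit < len(data) else None
    return page, next_cursor

def handler(event, context=None):
    records, next_cursor = get_records_page(event.get('cursor'), BATCH_SIZE)
    processed.extend(records)
    if next_cursor is not None:
        # Hand remaining work to a fresh invocation instead of risking a timeout
        queue.append({'cursor': next_cursor})

# Each dequeued message triggers a fresh, short-lived invocation
queue.append({})
while queue:
    handler(queue.popleft())
```

Each invocation stays well under the time limit regardless of total dataset size.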
Monolithic Functions
Single function doing everything:
```python
# Bad - monolithic handler
def handler(event, context):
    if event['path'] == '/users':
        if event['method'] == 'GET':
            return get_users()
        elif event['method'] == 'POST':
            return create_user()
    elif event['path'] == '/orders':
        ...  # hundreds more lines
```
Better:
- One function per operation
- Organized by domain
- Shared code in layers
VPC for Everything
VPC adds cold start latency (can add seconds):
```yaml
# Only use VPC when necessary
functions:
  publicApi:
    handler: public.handler
    # No VPC - faster cold starts
  databaseAccess:
    handler: private.handler
    vpc:
      securityGroupIds:
        - sg-123
      subnetIds:
        - subnet-456
```
Only use VPC when accessing private resources. For public APIs and services, avoid it.
Ignoring Cold Starts
Cold starts affect latency:
- Cold start: 500ms–5s (depends on runtime, VPC, etc.)
- Warm invocation: 10–50ms
Mitigation strategies:
```yaml
# Provisioned concurrency
functions:
  api:
    handler: api.handler
    provisionedConcurrency: 10

# Or keep containers warm with scheduled pings
functions:
  warmer:
    handler: warmer.handler
    events:
      - schedule: rate(5 minutes)
```
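On the function side, the target handler can detect a scheduled ping and return early so warm-up invocations skip the real work. This sketch assumes the ping arrives as a standard scheduled event (whose source field is 'aws.events'); the process function is a stand-in for the real work:

```python
def process(event):
    # Stand-in for the function's real work
    return {'handled': event.get('action')}

def handler(event, context=None):
    # Scheduled EventBridge pings carry source == 'aws.events';
    # return early so warm-up invocations don't run real work
    if event.get('source') == 'aws.events':
        return {'warmed': True}
    return process(event)
```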
Synchronous Call Chains
Lambda calling Lambda calling Lambda:
```python
# Bad - synchronous chain
def order_handler(event, context):
    user = lambda_client.invoke(FunctionName='get-user')
    inventory = lambda_client.invoke(FunctionName='check-inventory')
    payment = lambda_client.invoke(FunctionName='process-payment')
    # Each hop adds latency and a failure point
```
Better patterns:
- Async via SNS/SQS
- Step Functions for workflows
- Single function for simple logic
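A sketch of the async alternative: the order handler validates the request, emits a single event, and returns immediately, letting subscribers do the rest. The publish function here stands in for boto3's sns.publish; the topic and field names are illustrative:

```python
import json

published = []  # stand-in for an SNS topic; subscribers handle the rest

def publish(topic, message):
    published.append((topic, json.dumps(message)))

def order_handler(event, context=None):
    # Accept the order and emit one event instead of synchronously
    # invoking get-user, check-inventory and process-payment
    order = {'order_id': event['order_id'], 'customer_id': event['customer_id']}
    publish('order-created', order)
    return {'statusCode': 202, 'body': json.dumps(order)}
```

The handler's latency no longer depends on downstream functions, and a downstream failure becomes a retryable message rather than a failed request.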
Stateful Functions
Lambda functions are stateless:
```python
# Bad - state won't reliably persist
connection = None

def handler(event, context):
    global connection
    if connection is None:
        connection = create_connection()  # container may not be reused
```
Reality:
- Execution context may be reused (warm start)
- But you can’t rely on it
- Don’t store state that must persist
Use external state (DynamoDB, ElastiCache, S3) for anything important.
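One common shape for externalized state is an idempotency record keyed by request ID. A dict models the external store in this sketch; production code would use a DynamoDB conditional write (or ElastiCache) so the record survives across containers:

```python
STORE = {}  # models DynamoDB/ElastiCache; a real impl uses a conditional put

def handler(event, context=None):
    key = event['request_id']
    if key in STORE:
        # Duplicate delivery: return the recorded result, do no new work
        return {'status': 'duplicate', 'result': STORE[key]}
    result = {'payload': event['payload']}
    STORE[key] = result  # persist before acknowledging
    return {'status': 'ok', 'result': result}
```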
Cost Patterns
Understand the Pricing Model
Lambda charges two ways:
- Per request ($0.20 per million requests)
- Per GB-second of compute ($0.0000166667 per GB-second)

Cost = requests × $0.20/million + duration (seconds) × memory (GB) × $0.0000166667/GB-second
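The formula can be wrapped in a quick estimator (prices as listed above; free tier and data transfer ignored):

```python
REQUEST_PRICE = 0.20 / 1_000_000     # $ per request
GB_SECOND_PRICE = 0.0000166667       # $ per GB-second

def monthly_cost(requests, avg_duration_ms, memory_mb):
    # Cost = requests × request price + GB-seconds × compute price
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return requests * REQUEST_PRICE + gb_seconds * GB_SECOND_PRICE

# 30M requests/month at 500ms and 512MB ≈ $131
print(round(monthly_cost(30_000_000, 500, 512), 2))
```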
Right-Size Memory
More memory = faster execution = might be cheaper:
```
128MB  × 1000ms = 0.125 GB-seconds ≈ $0.0000020833
512MB  × 300ms  = 0.15 GB-seconds  ≈ $0.0000025
1024MB × 150ms  = 0.15 GB-seconds  ≈ $0.0000025
```
Test different memory sizes. Sometimes more memory is more cost-effective.
High-Volume Can Be Expensive
At high volumes, serverless can cost more than traditional:
```
1 million requests/day × 500ms × 512MB
  = 250,000 GB-seconds/day
  ≈ $4.17/day
  ≈ $125/month (Lambda compute only)

vs. t3.medium: ~$30/month (always on)
```
Calculate breakeven. High-volume, consistent load often cheaper on containers.
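A rough breakeven check, using the per-invocation cost implied by the pricing model above (the $30/month instance price is the comparison figure from this section):

```python
GB_SECOND_PRICE = 0.0000166667   # $ per GB-second
REQUEST_PRICE = 0.20 / 1_000_000  # $ per request

def cost_per_invocation(duration_ms, memory_mb):
    return REQUEST_PRICE + (duration_ms / 1000) * (memory_mb / 1024) * GB_SECOND_PRICE

def breakeven_daily_requests(instance_monthly, duration_ms, memory_mb, days=30):
    # Daily volume above which the always-on instance is cheaper
    return instance_monthly / (cost_per_invocation(duration_ms, memory_mb) * days)

# 500ms at 512MB vs a ~$30/month t3.medium: roughly 229k requests/day
print(int(breakeven_daily_requests(30.0, 500, 512)))
```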
Operational Patterns
Structured Logging
CloudWatch Logs work best with structured logs:
```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    logger.info(json.dumps({
        'request_id': context.aws_request_id,
        'event': 'order_created',
        'order_id': order.id,
        'customer_id': customer.id,
        'total': order.total,
    }))
```
Then query with CloudWatch Insights:
```
fields @timestamp, order_id, total
| filter event = 'order_created'
| stats sum(total) by bin(1h)
```
Correlation IDs
Track requests across functions:
```python
import uuid

def handler(event, context):
    correlation_id = event.get('correlation_id') or str(uuid.uuid4())
    # Include the ID in every log line
    log = logger.bind(correlation_id=correlation_id)
    # Propagate to downstream calls
    invoke_downstream(correlation_id=correlation_id)
```
Error Handling
Lambda retries on failure (for async invocations):
```python
def handler(event, context):
    try:
        process(event)
    except RetryableError:
        raise  # Lambda will retry
    except PermanentError as e:
        # Don't retry; send to the DLQ and return success to stop retries
        send_to_dlq(event, e)
        return
```
Configure dead-letter queues for failed invocations:
```yaml
functions:
  processor:
    handler: process.handler
    onError: arn:aws:sqs:region:account:dlq
```
Deployment Strategies
```yaml
# Canary deployment
functions:
  api:
    handler: api.handler
    deploymentSettings:
      type: Canary10Percent5Minutes
```
Or use aliases and weighted routing:
```bash
# Shift 10% of traffic to version 2
aws lambda update-alias --function-name my-function --name prod \
  --routing-config 'AdditionalVersionWeights={"2"=0.1}'
```
When Not to Use Serverless
Latency-Sensitive Applications
- Cold starts add latency
- Provisioned concurrency helps but adds cost
- Consider containers if sub-100ms P99 required
Long-Running Workloads
- 15-minute limit is hard
- Step Functions add complexity
- Containers or EC2 better for batch processing
Heavy Compute
- Lambda CPU scales with memory
- Maximum 10GB memory
- GPU not available
- Use EC2/ECS for compute-intensive work
Predictable High Load
- If you know you need 1000 concurrent always
- Servers/containers likely cheaper
- Serverless shines for variable, unpredictable load
Key Takeaways
- Serverless excels at event processing, scheduled tasks, APIs with variable load, and glue logic
- Avoid long-running processes, monolithic functions, and synchronous Lambda-to-Lambda chains
- VPC adds cold start latency; avoid when not needed
- Right-size memory—more memory can be cheaper if it reduces duration
- High-volume, consistent workloads may be cheaper on containers
- Use structured logging and correlation IDs for observability
- Handle errors appropriately; use DLQs for async failures
- Don’t use serverless for latency-sensitive, long-running, or compute-intensive workloads
Serverless is a powerful tool for the right problems. Understanding patterns and anti-patterns helps you use it effectively.