Serverless tutorials show simple functions: receive an HTTP request, query a database, return a response. Real applications are more complex: they have workflows, handle failures, manage state, process data at scale, and integrate with multiple services.
After building several production serverless applications, I've found these are the patterns that work.
API Backend Pattern
The most common serverless pattern: API Gateway + Lambda functions serving HTTP requests.
Client → API Gateway → Lambda → Database/Services → Response
Structure
One function per endpoint keeps functions focused and enables independent scaling. A monolithic “do everything” function loses serverless benefits.
/users GET → list-users-function
/users POST → create-user-function
/users/{id} GET → get-user-function
/users/{id} PUT → update-user-function
Shared code goes in layers or packages. Business logic, validation, database connections—share without duplicating.
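For illustration, a shared validation module can live in a layer (or a common package) and be imported by each endpoint's handler; the module and function names below are made up:

# shared/validation.py - packaged into a Lambda layer or shared package (illustrative)
def validate_user(payload):
    errors = []
    if not payload.get('email'):
        errors.append('email is required')
    if not payload.get('name'):
        errors.append('name is required')
    return errors

# In create-user-function:
#   from shared.validation import validate_user
#   errors = validate_user(json.loads(event['body']))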
Cold Starts
The challenge: cold starts add latency on the first invocation after an idle period. For user-facing APIs, 500ms-2s cold starts are noticeable.
Mitigations:
- Provisioned concurrency: Keep instances warm (costs money)
- Warmer functions: Scheduled invocation to prevent cooling (a sketch follows this list)
- Optimize function size: Smaller functions start faster
- Choose appropriate runtime: Some runtimes (Python, Node) start faster than others (Java, .NET)
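The warmer approach can be as simple as a scheduled rule that invokes the function with a recognizable payload, which the handler answers before doing any real work. A minimal sketch; the warmer key is our own convention, not a Lambda one:

def handler(event, context):
    # Answer scheduled warm-up pings immediately (the 'warmer' flag is an assumed payload shape)
    if isinstance(event, dict) and event.get('warmer'):
        return {'warmed': True}

    # ... normal request handling ...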
Connection Management
Database connections are expensive to establish, and a naive Lambda function can open a new connection on every invocation.
Solutions:
- Connection pooling outside handler: Initialize connections globally, reuse across invocations
- RDS Proxy: Connection pooling as a service
- DynamoDB: Connectionless, designed for serverless
# Connection outside handler - reused across invocations
import psycopg2

conn = psycopg2.connect(...)

def handler(event, context):
    # Reuse existing connection
    cursor = conn.cursor()
    ...
Event Processing Pattern
Serverless excels at event-driven processing. Events trigger functions that process and emit further events.
S3 Upload → Lambda → Process → Store results
SQS Message → Lambda → Transform → Write to database
DynamoDB Stream → Lambda → Sync to search index
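The S3 case is typical: the event carries the bucket and object key, and the handler fetches and processes the object. A minimal sketch (process_object is a placeholder for your own logic):

import boto3

s3 = boto3.client('s3')

def handler(event, context):
    for record in event['Records']:
        # Each record identifies the uploaded object
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        obj = s3.get_object(Bucket=bucket, Key=key)
        process_object(obj['Body'].read())  # placeholder processing step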
Fan-Out Pattern
Single event triggers multiple parallel processors:
              ┌→ Thumbnail Lambda → S3
Image Upload ─┼→ Metadata Lambda → DynamoDB
              └→ ML Analysis Lambda → Results DB
Each processor handles one concern and scales independently.
Choreography vs. Orchestration
Choreography: Services react to events independently. No central coordinator. Each service knows only about events it cares about.
Orchestration: Central coordinator (Step Functions) manages workflow. Explicit control flow, easier to understand complex processes.
For simple flows, choreography is lighter weight. For complex, multi-step processes with conditional logic, orchestration is clearer.
Step Functions Pattern
AWS Step Functions orchestrate complex workflows with state machines.
{
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:validate-order",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:check-inventory",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:process-payment",
      "Catch": [{
        "ErrorEquals": ["PaymentFailed"],
        "Next": "HandlePaymentFailure"
      }],
      "Next": "FulfillOrder"
    },
    ...
  }
}
Benefits:
- Visual workflow representation
- Built-in retry and error handling
- State management without custom code
- Long-running processes (up to 1 year)
Use cases:
- Order processing pipelines
- ETL workflows
- Approval processes
- Saga pattern implementation
Data Pipeline Pattern
Processing data at scale: ingest, transform, aggregate, store.
Kinesis → Lambda → Transform → DynamoDB/S3
↓
(Fan out to multiple consumers)
Stream Processing
Kinesis/Lambda integration:
import base64

def handler(event, context):
    for record in event['Records']:
        # Kinesis record data is base64-encoded
        payload = base64.b64decode(record['kinesis']['data'])
        process_record(payload)
Lambda polls Kinesis and processes batches of records, scaling automatically with the stream's shard count.
Considerations:
- Configure batch size and window for efficiency
- Handle partial batch failures (see the sketch after this list)
- Monitor iterator age (lag behind stream)
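Partial batch failures deserve special attention: with ReportBatchItemFailures enabled on the event source mapping, the handler can report only the records that failed so the rest of the batch isn't reprocessed. A sketch assuming that setting is enabled:

import base64

def handler(event, context):
    failures = []
    for record in event['Records']:
        try:
            payload = base64.b64decode(record['kinesis']['data'])
            process_record(payload)  # placeholder
        except Exception:
            # Report only the failed record; Lambda retries from this sequence number
            failures.append({'itemIdentifier': record['kinesis']['sequenceNumber']})
    return {'batchItemFailures': failures}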
Batch Processing
Large-scale batch jobs that don’t fit Lambda’s limits:
- S3 triggers + Step Functions: Orchestrate processing of large files
- AWS Batch: For long-running, compute-intensive jobs
- Lambda + SQS: Distribute work across many concurrent functions
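The Lambda + SQS option typically pairs a dispatcher, which splits a large job into messages, with worker functions consuming the queue. A minimal dispatcher sketch; QUEUE_URL and the work-item shape are assumptions:

import json
import boto3

sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/work-queue'  # illustrative

def dispatch(work_items):
    # SQS accepts at most 10 messages per send_message_batch call
    for i in range(0, len(work_items), 10):
        chunk = work_items[i:i + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[
                {'Id': str(i + j), 'MessageBody': json.dumps(item)}
                for j, item in enumerate(chunk)
            ],
        )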
Scheduled Tasks Pattern
Cron-like execution for periodic tasks:
CloudWatch Events → Lambda (scheduled)
↓
Cleanup, Reports, Sync
# Serverless Framework
functions:
  dailyReport:
    handler: reports.daily
    events:
      - schedule: cron(0 8 * * ? *)  # Daily at 8 AM UTC
Use cases:
- Database cleanup
- Report generation
- Data synchronization
- Health checks
Considerations:
- Lambda has a 15-minute timeout
- For longer jobs, trigger Step Functions or AWS Batch
- Handle missed executions (exactly-once isn’t guaranteed)
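One way to cope with at-least-once execution is to record each logical run (for example, the report date) with a conditional write, so a duplicate invocation becomes a no-op. A sketch assuming a DynamoDB table named job-runs with partition key run_id (both names are illustrative):

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client('dynamodb')

def already_ran(run_id):
    # The conditional put succeeds only the first time this run_id is recorded
    try:
        dynamodb.put_item(
            TableName='job-runs',
            Item={'run_id': {'S': run_id}},
            ConditionExpression='attribute_not_exists(run_id)',
        )
        return False
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return True
        raise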
Webhooks Pattern
Receiving webhooks from external services:
External Service → API Gateway → Lambda → Process → Respond 200
↓
Queue for async processing
Best practices:
- Respond quickly (external services have timeouts)
- Validate webhook signatures
- Queue events for async processing if needed
- Implement idempotency (webhooks may retry)
import json
import os

import boto3

sqs = boto3.client('sqs')
QUEUE_URL = os.environ['QUEUE_URL']

def handler(event, context):
    # Verify signature first
    if not verify_signature(event):
        return {'statusCode': 401}

    # Quick processing or queue for later
    body = json.loads(event['body'])
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps(body)
    )
    return {'statusCode': 200}
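The verify_signature call above is provider-specific; many services sign the raw request body with HMAC-SHA256 and send the digest in a header. A sketch along those lines, where the header name and secret variable are assumptions to check against your provider's docs:

import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ['WEBHOOK_SECRET']

def verify_signature(event):
    # Compare the provider-supplied signature against our own HMAC of the raw body
    supplied = event['headers'].get('X-Signature-SHA256', '')  # header name varies by provider
    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        event['body'].encode(),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(supplied, expected)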
Multi-Tenant Pattern
Serving multiple customers with isolation:
Tenant identification: From authentication token, request header, or subdomain.
Resource isolation options:
- Shared resources: All tenants use the same tables/functions with tenant ID filtering (simplest, least isolated; sketched after this list)
- Separate tables: Each tenant has own database tables
- Separate accounts: Full AWS account per tenant (maximum isolation, most complex)
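In the shared-resources model, the tenant ID from the request becomes part of every key and query, so one tenant can never read another's rows. A sketch assuming a REST API custom authorizer that injects tenant_id into the request context and a single DynamoDB table keyed by tenant (table and key names are illustrative):

import json

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('app-data')

def handler(event, context):
    # Tenant ID comes from the authorizer context, never from the request body
    tenant_id = event['requestContext']['authorizer']['tenant_id']
    result = table.query(
        KeyConditionExpression=Key('pk').eq(f'TENANT#{tenant_id}')
    )
    return {'statusCode': 200, 'body': json.dumps(result['Items'], default=str)}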
Scaling considerations:
- Lambda scales automatically per tenant load
- DynamoDB tables can have per-tenant capacity
- Consider tenant-specific rate limiting
Error Handling Patterns
Dead Letter Queues
Failed events go to a DLQ for investigation and retry. For asynchronous sources (S3, SNS, EventBridge) that means the function's dead-letter configuration; for SQS-triggered functions, failed messages flow to the DLQ attached to the source queue's redrive policy.
functions:
  processEvent:
    handler: process.handler
    # Dead-letter target for asynchronous invocations of this function
    onError: !GetAtt DeadLetterQueue.Arn
    events:
      - sqs:
          # Failures from this source land in the DLQ of the queue's redrive policy
          arn: !GetAtt Queue.Arn
Retry with Backoff
Lambda automatically retries failed asynchronous invocations (twice by default). For custom retry logic:
import json
import os

import boto3

sqs = boto3.client('sqs')
QUEUE_URL = os.environ['QUEUE_URL']
MAX_RETRIES = 5
MAX_DELAY = 900  # SQS DelaySeconds maximum is 15 minutes

def handler(event, context):
    retry_count = event.get('retry_count', 0)
    try:
        process(event)
    except TemporaryError:
        if retry_count < MAX_RETRIES:
            # Re-queue with exponential backoff
            delay = min(2 ** retry_count, MAX_DELAY)
            event['retry_count'] = retry_count + 1
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps(event),
                DelaySeconds=delay
            )
        else:
            # Give up and send to DLQ
            send_to_dlq(event)
Circuit Breaker
When downstream services fail, stop calling them:
# CircuitBreaker is an illustrative helper; a minimal version is sketched after this snippet
circuit_breaker = CircuitBreaker(
    failure_threshold=5,
    recovery_timeout=60
)

def handler(event, context):
    if circuit_breaker.is_open():
        # Downstream is failing - skip the call and degrade gracefully
        return fallback_response()
    try:
        result = call_downstream_service()
        circuit_breaker.record_success()
        return result
    except Exception:
        circuit_breaker.record_failure()
        raise
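A minimal in-memory version of that helper is sketched below. One caveat: module-level state survives only within a single warm container, so each concurrent Lambda instance keeps its own breaker; for a truly shared circuit you'd store the state in DynamoDB or ElastiCache.

import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        # After the recovery timeout, allow a trial call through (half-open)
        if time.time() - self.opened_at >= self.recovery_timeout:
            return False
        return True

    def record_success(self):
        self.failure_count = 0
        self.opened_at = None

    def record_failure(self):
        self.failure_count += 1
        if self.failure_count >= self.failure_threshold:
            self.opened_at = time.time()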
Key Takeaways
- Structure APIs as one function per endpoint with shared code in layers
- Handle cold starts with provisioned concurrency, warmers, or optimized function size
- Use Step Functions for complex workflows with branching and error handling
- Stream processing with Kinesis/Lambda scales automatically
- Respond quickly to webhooks; queue for async processing
- Implement robust error handling with DLQs, retries, and circuit breakers
- Choose patterns based on your specific needs; serverless isn’t one-size-fits-all