AI and Systems Architecture

Architecture is the set of decisions that become expensive to change later. In AI systems, those decisions usually sit around model access, retrieval, evaluation, data freshness, cost control, and ownership boundaries.

This hub connects AI architecture with older systems lessons: keep interfaces small, make failure modes explicit, and avoid distributing complexity before the team can operate it.

Start Here

AI-Native Architecture Patterns 2026 is the current overview of gateways, retrieval, evaluation, and graceful degradation.
Why Most Enterprise AI Architecture Fails in Year One explains why brittle demos fail when they become production systems.
Your AI Infrastructure Is Not Special maps AI infrastructure back to ordinary production patterns: gateways, caching, budgets, and circuit breakers.

Architecture Questions That Matter

Before adding another model, service, or queue, answer these:

Where is the stable interface between product code and model behavior?
How is context assembled, filtered, and refreshed?
Which validation path catches bad output before users rely on it?
What happens when the preferred model is slow, unavailable, expensive, or wrong?
Who owns incidents caused by model, retrieval, or data drift?
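One way to make the first, third, and fourth questions concrete is a single narrow call that product code depends on, with validation and fallback behind it. The sketch below is a minimal illustration, not a prescribed design: ModelClient, ModelUnavailable, Completion, complete_with_fallback, and FakeClient are all hypothetical names invented for this example, and a real client would wrap an actual provider SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol, Sequence


class ModelUnavailable(Exception):
    """Raised by a client when its model is slow, down, or over budget."""


class ModelClient(Protocol):
    """Hypothetical stable interface: the only model surface product code sees."""

    name: str

    def complete(self, prompt: str, timeout_s: float) -> str: ...


@dataclass
class Completion:
    text: Optional[str]       # None means every model failed; the caller degrades
    served_by: Optional[str]  # which client produced the accepted answer
    errors: List[str]         # why earlier clients were skipped


def complete_with_fallback(
    clients: Sequence[ModelClient],
    prompt: str,
    validate: Callable[[str], bool],
    timeout_s: float = 2.0,
) -> Completion:
    """Try clients in preference order; accept the first output that validates."""
    errors: List[str] = []
    for client in clients:
        try:
            text = client.complete(prompt, timeout_s)
        except ModelUnavailable as exc:
            errors.append(f"{client.name}: {exc}")
            continue
        if validate(text):
            return Completion(text, client.name, errors)
        errors.append(f"{client.name}: failed validation")
    return Completion(None, None, errors)


@dataclass
class FakeClient:
    """Test double standing in for a provider SDK (assumption for the demo)."""

    name: str
    reply: Optional[str]  # None simulates an outage or timeout

    def complete(self, prompt: str, timeout_s: float) -> str:
        if self.reply is None:
            raise ModelUnavailable("timed out")
        return self.reply


if __name__ == "__main__":
    chain = [FakeClient("primary", None), FakeClient("small-local", "42")]
    print(complete_with_fallback(chain, "question", validate=lambda t: bool(t.strip())))
```

The design choice worth noting is that the error list travels with the result, so whoever owns the incident can see which model was skipped and why without reading provider logs.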

Supporting Patterns

Retrieval and context:

Agents and tools:

Older systems tradeoffs:

Failure Modes

Letting prompts become hidden architecture.
Hard-coding provider behavior three layers deep.
Treating retrieval as a static index instead of a living data pipeline.
Running AI features without cost attribution, evaluation, or rollback paths.
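Cost attribution, the first gap named in the last failure mode, can start very small. The sketch below assumes a hypothetical CostLedger that records spend per feature and answers one question: is this feature over its budget? The class, its method names, and the per-1k-token prices used in the usage note are illustrative assumptions, not real provider pricing.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CostLedger:
    """Hypothetical per-feature spend tracker for model calls."""

    budgets_usd: Dict[str, float]  # feature name -> budget for the current period
    spent_usd: Dict[str, float] = field(default_factory=lambda: defaultdict(float))

    def record(
        self,
        feature: str,
        input_tokens: int,
        output_tokens: int,
        usd_per_1k_in: float,
        usd_per_1k_out: float,
    ) -> float:
        """Attribute one call's cost to a feature and return that cost."""
        cost = (
            input_tokens / 1000 * usd_per_1k_in
            + output_tokens / 1000 * usd_per_1k_out
        )
        self.spent_usd[feature] += cost
        return cost

    def over_budget(self, feature: str) -> bool:
        """True when spend has reached the budget.

        A feature with no configured budget is treated as over budget, so
        unattributed spend is blocked rather than silently allowed.
        """
        budget = self.budgets_usd.get(feature)
        return budget is None or self.spent_usd[feature] >= budget
```

A caller would check over_budget("search") before each model call and switch to a cheaper model or a degraded response when it returns True, for example after ledger.record("search", 2000, 1000, 0.25, 0.5) against a 1.00 USD budget.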

References

68 entries tagged “AI and Systems Architecture”

How to Run an AI Incident Review That Changes Architecture, Not Slides June 2, 2026 · 2 min Incident reviews should produce architecture deltas and control updates, not narrative theater. reliability ai governance

Build the System the Model Cannot Break May 14, 2026 · 12 min A manifesto for building AI-native organizations. Twelve tenets across strategy, architecture, economics, and people — and the only test that matters in year two. manifesto ai strategy

The 2026 AI Build vs. Buy Calculus (It’s Just Operational Cost) April 30, 2026 · 3 min By mid-2026, AI build vs buy has nothing to do with novelty. It is a ruthless mathematical calculation of telemetry, context freshness, and infrastructure lock-in. build-vs-buy ai architecture

Why Most Enterprise AI Architecture Fails in Year One April 21, 2026 · 3 min In 2026, enterprise AI isn't failing because models are bad. It is failing because organizations are building brittle demos instead of bounded, operable systems. architecture ai reliability

Sovereign Systems: Building for a World Where Data Privacy Is Non-Optional April 6, 2026 · 6 min Privacy is an architecture constraint, not a feature toggle. Teams that build sovereignty into their systems early avoid painful retrofits and close enterprise deals faster. privacy security data-residency

AI-Native Architecture Patterns 2026: Production Guide January 26, 2026 · 7 min Production AI architecture patterns for gateways, retrieval, evaluation, fallbacks, cost control, and ownership. architecture ai patterns

Agent Orchestration: Four Patterns, Honest Tradeoffs May 12, 2025 · 5 min Multi-agent systems aren't magic. They're distributed systems with all the usual coordination headaches. Here are the four patterns I've seen work, and when each one falls apart. agents orchestration ai

MCP in Practice: Building Tool Servers in Go March 17, 2025 · 7 min Model Context Protocol promises to standardize how AI talks to tools. I built an MCP server in Go to see if the promise holds up. Here's what I found. mcp ai golang

Your AI Infrastructure Is Not Special December 9, 2024 · 4 min AI infrastructure at scale is just infrastructure. The same boring patterns -- gateways, caching, circuit breakers, budget enforcement -- solve the same boring problems. ai infrastructure scale

Agent Patterns That Survive Production October 28, 2024 · 7 min Single-prompt agents break on real tasks. Plan-execute-replan, orchestrated specialists, structured memory, and explicit recovery -- in Go -- are what actually works. agents ai go

The Best Model Is the Smallest One That Works August 5, 2024 · 3 min Everyone reaches for GPT-4 by default. Most production tasks don't need it. Small models are faster, cheaper, and often better when the task is well-defined. small-models llm ai

Stop Stuffing Your Context Window July 22, 2024 · 4 min Bigger context windows aren't an excuse to stop thinking about what goes into them. Most teams are paying for irrelevant tokens and wondering why quality degrades. context-window llm ai

Why I Run Multiple Models in Production March 18, 2024 · 4 min Betting on a single model provider is like having a single database with no failover. Here is why multi-model is the only sane production strategy. ai architecture llm

Architecting AI-Native Applications (Without the Delusion) February 5, 2024 · 7 min The architecture of an AI-native app is fundamentally different from bolting a model onto a CRUD app. Here is how I structure them -- with code, layers, and hard-won opinions. architecture ai design

OpenAI DevDay Happened and I Have Opinions November 27, 2023 · 4 min OpenAI DevDay was not just a product launch. It was a platform play that changes the build-vs-buy calculus for every team shipping AI features. openai ai devday

Agent Architecture Patterns That Actually Work in Production September 18, 2023 · 6 min Most agent demos are impressive. Most agent production systems are not. Here is what separates the two. ai agents llm

RAG Patterns That Actually Work in Production April 17, 2023 · 8 min RAG is the default architecture for grounding LLMs in private data. Here are the patterns that survive real traffic, with Go examples from production systems. rag ai llm

My First Week Building with GPT-4 March 6, 2023 · 4 min GPT-4 landed and everything changed. What I learned in the first week of building with it, and the architecture decisions that followed. ai gpt-4 openai

LLM Integration Patterns That Actually Survive Production January 23, 2023 · 6 min Practical patterns for integrating LLMs into real applications -- prompt management, structured outputs, caching, fallbacks, and tool use -- with Go examples. ai llm go

Monorepo vs. Polyrepo: A Practical Decision Guide October 31, 2022 · 4 min Monorepo or polyrepo depends on coupling, team shape, and your appetite for build tooling. Here is how to decide without getting religious about it. architecture monorepo git

When to Go Async (And When to Resist the Urge) July 25, 2022 · 5 min Async patterns solve real problems -- bursty traffic, slow dependencies, decoupled teams. But the complexity tax is real. Lessons from building event-driven systems at Decloud. async architecture message-queues

Rate Limiting: The Boring Feature That Saves You at 3 AM June 27, 2022 · 4 min Rate limiting algorithms, implementation tradeoffs, and practical lessons from building limiters for high-traffic APIs at a real-time messaging company. rate-limiting api backend

Distributed Systems Patterns I Keep Reaching For May 30, 2022 · 6 min The patterns that actually survive production across failure handling, consistency, messaging, coordination, and scaling. distributed-systems architecture patterns

You Probably Don't Need a Service Mesh April 4, 2022 · 5 min Service meshes solve real problems at real scale. But most teams adopt them before the problems exist. Here's how to decide honestly. service-mesh istio linkerd

API Versioning: Pick One and Stop Overthinking It March 7, 2022 · 4 min API versioning is a maintenance commitment, not a design exercise. URL paths win for public APIs, headers for internal ones. The real discipline is not versioning -- it's avoiding breaking changes in the first place. api versioning rest

The AWS us-east-1 Outage Was Predictable. Your Architecture Was Not Ready. December 20, 2021 · 4 min December 7 reminded everyone that us-east-1 is a single point of failure for half the internet. Again. I am annoyed. aws outage reliability

Event Sourcing in Practice: What I Learned Building Financial Event Pipelines October 25, 2021 · 7 min Event sourcing is powerful but expensive to get wrong. Here's what actually works, with Go code, drawn from building event pipelines at the fintech startup. event-sourcing architecture cqrs

Most 'Technical Debt' Is Just Decisions You Disagree With Now September 20, 2021 · 4 min The term 'technical debt' has become meaningless. Everything inconvenient is debt. Here's what it actually is, when it matters, and why most teams handle it wrong. technical-debt engineering-leadership architecture

Zero Trust Architecture: What It Actually Looks Like August 23, 2021 · 6 min Zero trust from two perspectives: my NATO background in defense systems and work at a major telecom. The architecture patterns, the implementation path, and what most companies get wrong. zero-trust security architecture

Most Teams Should Just Use Postgres July 12, 2021 · 3 min Serverless databases are solving problems most teams don't have. Here's why Postgres with a connection pooler is still the right answer. serverless databases postgresql

API Gateway Patterns That Actually Work May 31, 2021 · 5 min Edge gateways, BFFs, and service mesh ingress -- what I've learned running them at Decloud and at large telecoms. api-gateway microservices architecture

Multi-Cloud Is Mostly a Marketing Strategy April 5, 2021 · 4 min Multi-cloud sounds great in vendor pitches. In practice, it doubles your operational burden for benefits most teams will never need. multi-cloud cloud architecture

API Gateways: Build, Buy, or Regret October 5, 2020 · 6 min I've built a custom Go gateway, run Kong in prod, evaluated Envoy, and used managed cloud gateways. Here's what I actually recommend after doing all of them wrong at least once. api-gateway go kong

GraphQL Federation Is Probably Not For You August 17, 2020 · 4 min Most teams adopting GraphQL federation don't need it. A frank take on when it makes sense, when REST is fine, and why conference talks are a terrible basis for architecture decisions. graphql federation api

Event-Driven Architecture: What I Got Wrong and What Survived July 6, 2020 · 10 min Lessons from building event-driven systems at the fintech startup and Decloud. What actually works, what silently corrupts your data, and Go patterns for handling events without losing your mind. architecture events golang

Serverless vs Containers: Where the Math Stops Working June 22, 2020 · 5 min Serverless is great until it isn't. A comparison of serverless and containers at different traffic scales, with actual numbers on where the economics flip. serverless containers architecture

I Tried Every API Versioning Strategy. Here's the One I Actually Use. February 3, 2020 · 5 min After dealing with versioning messes at multiple companies, I landed on URL path versioning for anything public. Here's why the alternatives didn't survive contact with reality. api versioning rest

Database Replication Patterns That Actually Matter January 20, 2020 · 8 min A practical breakdown of replication modes, topologies, and the tradeoffs between consistency, availability, and not losing your users' data at 3am. databases replication postgresql

Your Cloud Bill Is a Design Document December 2, 2019 · 6 min Cloud cost management isn't a finance problem. It's an architecture problem disguised as a spreadsheet. Here's how to treat your AWS bill like the engineering signal it actually is. finops cloud cost-optimization

Most Edge Computing Projects Are Premature Optimization November 18, 2019 · 3 min Edge computing is real, but most teams adopting it don't have an edge problem. They have an architecture problem they're solving with geography. edge-computing architecture distributed-systems

Message Queues: The Patterns Nobody Tells You About Until 3 AM September 9, 2019 · 8 min Queues look simple on a whiteboard. Then you deploy them. Here are the messaging patterns I've learned the hard way across three startups, with Go code and real failure stories. messaging architecture rabbitmq

Data Mesh Is an Org Chart Fix, Not a Tech One July 29, 2019 · 3 min Most data problems are ownership problems. Data mesh gets that right. But adopting it as an architecture diagram exercise misses the point entirely. data architecture data-mesh

Your Monolith Is Probably Fine July 1, 2019 · 5 min Most teams shouldn't be migrating to microservices. Here's how to tell if you actually should, and how to do it without wrecking your delivery for eighteen months. microservices architecture monolith

You Probably Don't Need Multi-Region June 17, 2019 · 5 min Multi-region architecture is a strategic decision most teams make too early. Here's when it actually pays off, the patterns that work, and why data is the part that will ruin your week. architecture multi-region distributed-systems

Design for Failure or It Will Design Your Weekend May 6, 2019 · 3 min Failure is not an edge case. It is the default state you temporarily hold off with good engineering. A few hard-won rules for building systems that bend instead of shatter. reliability architecture distributed-systems

Async Job Processing: Patterns That Saved Us at a Fintech Startup December 17, 2018 · 7 min Hard-won patterns for reliable background job processing -- queues, retries, idempotency, and the failures that taught me to care about all three. backend architecture async

API Rate Limiting: What Actually Works October 15, 2018 · 7 min Algorithms, headers, and deployment patterns for rate limiting APIs -- drawn from building financial data services at the fintech startup. api rate-limiting backend

What Building Distributed Systems at a Fintech Startup Taught Me About Failure September 17, 2018 · 6 min Hard-won lessons from designing distributed systems that survive real-world failures -- timeouts, retries, bulkheads, and the operational habits that actually keep things running. distributed-systems reliability architecture

Serverless: What Works, What Doesn't, and What Will Bite You September 3, 2018 · 6 min Real patterns and antipatterns from running serverless at the fintech startup. Where Lambda shines, where it hurts, and how to tell the difference before it's too late. serverless aws-lambda architecture

Database Sharding: You Probably Don't Need It Yet August 6, 2018 · 8 min Most teams shard too early. Here's how we thought about it at the fintech startup, when it actually makes sense, and the SQL-level decisions that matter most. database postgresql sharding

Securing Microservices: What Actually Works July 23, 2018 · 7 min You split the monolith. Now every service-to-service call is an attack surface. Here's how I think about identity, authorization, encryption, and secrets management in distributed systems. security microservices authentication

Event Sourcing in Practice: What I Got Right and Wrong March 19, 2018 · 7 min Lessons from building event-sourced systems at the fintech startup -- the patterns that held up, the modeling mistakes that bit us, and the operational realities nobody warns you about. architecture event-sourcing cqrs

Zero Trust Is Not a Product. Here's How We Actually Built It. February 19, 2018 · 5 min Perimeter security is dead. At the fintech startup, I ripped out the castle-and-moat model and replaced it with zero trust — identity-first, micro-segmented, no implicit trust anywhere. Here's what that actually looked like. security architecture zero-trust

Stop Trying to Fix All Your Tech Debt December 18, 2017 · 4 min A two-number scoring system for tech debt that tells you what to fix now, what to schedule, and what to quietly accept. technical-debt engineering prioritization

Multi-Region Architecture: What I Wish Someone Had Told Me October 2, 2017 · 6 min We serve financial data to users across Europe at the fintech startup. Here's what I've learned about going multi-region -- the patterns that work, the ones that burn you, and when you should even bother. architecture distributed-systems cloud

Serverless Patterns That Actually Work in Production June 5, 2017 · 2 min Most serverless tutorials teach you the wrong thing. Here's what matters when you're running it for real. serverless aws-lambda architecture

API Versioning: What Actually Works and What Doesn't May 29, 2017 · 4 min We tried multiple API versioning approaches at the fintech startup. URL path versioning won. Here's why, plus how to handle deprecation without burning your consumers. api versioning rest

How I Build Data Pipelines That Actually Survive Production April 24, 2017 · 6 min Every pipeline I've built at the fintech startup broke at some point. Here's the design approach that made them recoverable instead of catastrophic. data-engineering etl pipelines

Why We Went Event-Driven (and What Nearly Broke) April 10, 2017 · 5 min Lessons from building event-driven systems at the fintech startup and Dropbyke -- what worked, what broke, and why I'd do it again. architecture event-driven microservices

GraphQL vs REST: Pick the Boring One February 6, 2017 · 4 min Everyone wants to debate GraphQL vs REST like it's a religion. It's not. One reduces round trips, the other is dead simple to cache. Here's how I actually decide. graphql rest api

Why We Chose Go for Our Backend Services November 28, 2016 · 5 min How Go became the default backend language at Dropbyke and a fintech startup, what it replaced, and the honest tradeoffs we accepted along the way. golang go backend

The Economics of State: Why Scaling Up Beats Sharding (Until It Doesn't) November 14, 2016 · 8 min A production-grounded case for exhausting single-server headroom with pooling, replicas, and partitioning before taking on sharding complexity. postgresql databases scaling

Building Resilient Systems: Lessons from Production Failures July 18, 2016 · 7 min Production incidents show where architecture bends and where it breaks. These lessons focus on designing for failure, limiting blast radius, and making recovery routine. reliability resilience architecture

API Design Principles That Stand the Test of Time May 9, 2016 · 5 min Lessons from building the fintech startup's financial data API: the REST conventions that actually matter, the ones that don't, and why consistency beats cleverness every time. api rest design

Postgres vs MySQL in 2016: A Practical Comparison April 12, 2016 · 5 min A grounded look at PostgreSQL and MySQL as of April 2016, focusing on integrity, query power, and operational tradeoffs rather than benchmark hype. postgresql mysql databases

AWS Lambda: When Serverless Makes Sense (And When It Doesn't) March 28, 2016 · 4 min Lambda is a sharp tool for specific jobs. The problem is everyone wants to use it for everything. serverless aws lambda

The True Cost of Technical Debt February 22, 2016 · 3 min A pragmatic look at technical debt in 2016: what it is, how it shows up, how to measure it, and how to make a business case for paying it down without stalling delivery. technical-debt engineering leadership

Why Microservices Aren't Always the Answer January 15, 2016 · 5 min Most teams adopt microservices too early and pay for complexity they don't need yet. A well-structured monolith is faster, simpler, and keeps your options open. architecture microservices monolith