AI Agent Architecture Patterns

September 18, 2023

AI agents extend LLMs beyond text generation—they can use tools, access APIs, and execute multi-step workflows. The vision of autonomous AI agents is compelling, but building reliable agents requires careful architecture.

Here are the core patterns for building them reliably.

What Are AI Agents?

Agent vs. Simple LLM Call

simple_llm:
  flow: Prompt → LLM → Response
  capability: Generate text based on input
  limitation: No access to external data/actions

ai_agent:
  flow: Task → Reason → Act → Observe → Repeat
  capability: Use tools, access data, take actions
  key_feature: Iterative reasoning and tool use

The ReAct Pattern

react_pattern:
  name: Reasoning + Acting
  loop:
    1. Thought: Reason about what to do next
    2. Action: Choose and execute a tool
    3. Observation: See the result
    4. Repeat: Until task complete

  example:
    task: "What's the weather in the city where Apple is headquartered?"
    thought_1: "I need to find where Apple is headquartered"
    action_1: search("Apple headquarters location")
    observation_1: "Apple is headquartered in Cupertino, California"
    thought_2: "Now I need the weather in Cupertino"
    action_2: get_weather("Cupertino, CA")
    observation_2: "72°F, sunny"
    final: "The weather in Cupertino, where Apple is headquartered, is 72°F and sunny"
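
Under the hood, each iteration is driven by a single prompt that lists the available tools and the trace so far, then asks the model for the next thought and action. A minimal sketch of that prompt for the example above (the exact template wording varies by implementation):

You are an agent with access to these tools:
- search: Search the web. Input: search query string
- get_weather: Get current weather. Input: city name

Respond with a Thought, an Action, and an Action Input.
When you know the answer, use Action: finish.

Task: What's the weather in the city where Apple is headquartered?
Thought: I need to find where Apple is headquartered
Action: search
Action Input: Apple headquarters location
Observation: Apple is headquartered in Cupertino, California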

Core Architecture

Agent Components

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    name: str
    description: str
    function: Callable

@dataclass
class AgentStep:
    thought: str
    action: str
    action_input: Any
    observation: str

class Agent:
    def __init__(self, llm, tools: list[Tool], max_iterations=10):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}
        self.max_iterations = max_iterations

    def run(self, task: str) -> str:
        steps = []

        for _ in range(self.max_iterations):
            # Get next action from LLM (prompt construction sketched below)
            prompt = self._build_prompt(task, steps)
            response = self.llm.generate(prompt)

            # Parse thought/action/action input out of the response
            parsed = self._parse_response(response)

            if parsed.action == "finish":
                return parsed.action_input  # Final answer

            # Execute tool, folding failures into the observation so the
            # agent can see the error and recover instead of crashing
            tool = self.tools.get(parsed.action)
            if not tool:
                observation = f"Unknown tool: {parsed.action}"
            else:
                try:
                    observation = tool.function(parsed.action_input)
                except Exception as e:
                    observation = f"Error: {e}"

            steps.append(AgentStep(
                thought=parsed.thought,
                action=parsed.action,
                action_input=parsed.action_input,
                observation=str(observation)
            ))

        return "Max iterations reached without completing task"

Tool Definition

# Example tools
from datetime import datetime

def search(query: str) -> str:
    """Search the web for information."""
    # search_api is a placeholder for your search client of choice
    results = search_api.search(query)
    return "\n".join(r.snippet for r in results[:3])

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # Warning: eval runs arbitrary code; sandbox or restrict it before
    # exposing it to model-generated input
    try:
        return str(eval(expression))
    except Exception:
        return "Invalid expression"

def get_current_time() -> str:
    """Get the current date and time."""
    return datetime.now().isoformat()

tools = [
    Tool(
        name="search",
        description="Search the web. Input: search query string",
        function=search
    ),
    Tool(
        name="calculate",
        description="Calculate math expressions. Input: expression like '2+2'",
        function=calculate
    ),
    Tool(
        name="get_time",
        description="Get current date/time. No input needed",
        function=get_current_time
    )
]
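
Wiring it together (llm stands in for whatever chat client you use; a generate(prompt) method is assumed, matching the Agent class above):

agent = Agent(llm=llm, tools=tools)
print(agent.run("What's the weather in the city where Apple is headquartered?"))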

Planning Patterns

Plan-Then-Execute

@dataclass
class Plan:
    steps: list[str]  # plain-text step descriptions

class PlanAndExecuteAgent:
    def __init__(self, planner_llm, executor_llm, tools):
        self.planner = planner_llm
        self.executor = executor_llm
        self.tools = tools

    def run(self, task: str) -> str:
        # Step 1: Create plan
        plan = self._create_plan(task)

        # Step 2: Execute each step, passing earlier results forward
        results = []
        for step in plan.steps:
            result = self._execute_step(step, results)
            results.append(result)

        # Step 3: Synthesize final answer
        return self._synthesize(task, results)

    def _create_plan(self, task: str) -> Plan:
        prompt = f"""Create a step-by-step plan to complete this task.
Available tools: {self._format_tools()}

Task: {task}

Plan (numbered steps):"""

        response = self.planner.generate(prompt)
        return self._parse_plan(response)  # e.g. split numbered lines into steps
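
The executor side is often just a scoped ReAct agent run once per step. A sketch of _execute_step under that assumption, reusing the Agent class from earlier (the prompt framing is illustrative):

# inside PlanAndExecuteAgent:
def _execute_step(self, step: str, prior_results: list[str]) -> str:
    # Give the executor the step plus the results of earlier steps
    context = "\n".join(
        f"Step {i + 1} result: {r}" for i, r in enumerate(prior_results)
    )
    prompt = f"{context}\n\nCurrent step: {step}" if context else step
    return Agent(self.executor, self.tools).run(prompt)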

Hierarchical Agents

hierarchical_pattern:
  manager_agent:
    role: Coordinate and delegate
    capabilities:
      - Break down complex tasks
      - Assign to specialist agents
      - Synthesize results

  specialist_agents:
    research_agent:
      tools: [search, read_document]
      focus: Information gathering

    code_agent:
      tools: [write_code, run_code]
      focus: Programming tasks

    data_agent:
      tools: [query_database, analyze_data]
      focus: Data operations
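
One way to realize this in code is a manager that asks the LLM to decompose the task and route each sub-task to a specialist by name. A minimal sketch; the routing prompt format and the generate method are illustrative assumptions:

class ManagerAgent:
    def __init__(self, llm, specialists: dict):
        # specialists maps a name like "research_agent" to an Agent
        self.llm = llm
        self.specialists = specialists

    def run(self, task: str) -> str:
        # Ask the LLM to break the task down and pick a specialist per sub-task
        prompt = (
            f"Break this task into sub-tasks. For each, pick one of: "
            f"{', '.join(self.specialists)}.\n"
            f"Format each line as: <agent_name> | <sub-task>\n\nTask: {task}"
        )
        plan = self.llm.generate(prompt)

        results = []
        for line in plan.splitlines():
            if "|" not in line:
                continue
            name, subtask = (part.strip() for part in line.split("|", 1))
            if name in self.specialists:
                results.append(self.specialists[name].run(subtask))

        # Synthesize the specialist outputs into one answer
        summary = f"Task: {task}\nResults:\n" + "\n".join(results)
        return self.llm.generate(f"{summary}\n\nWrite the final answer:")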

Memory Patterns

Conversation Memory

class ConversationMemory:
    def __init__(self, max_turns=10):
        self.history = []
        self.max_turns = max_turns

    def add(self, role: str, content: str):
        self.history.append({"role": role, "content": content})
        # A turn is a user + assistant pair, so keep 2 * max_turns messages
        if len(self.history) > self.max_turns * 2:
            self.history = self.history[-self.max_turns * 2:]

    def get_context(self) -> str:
        return "\n".join([
            f"{msg['role']}: {msg['content']}"
            for msg in self.history
        ])
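
Usage is straightforward; the rolling context gets prepended to each new prompt:

memory = ConversationMemory(max_turns=10)
memory.add("user", "What's the capital of France?")
memory.add("assistant", "Paris.")
prompt = f"{memory.get_context()}\nuser: And its population?"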

Long-Term Memory with RAG

from datetime import datetime

class LongTermMemory:
    def __init__(self, vector_store):
        # vector_store is assumed to expose upsert() and query();
        # embed() below stands in for your embedding function
        self.vector_store = vector_store

    def store(self, content: str, metadata: dict):
        embedding = embed(content)
        self.vector_store.upsert({
            "content": content,
            "embedding": embedding,
            "metadata": metadata,
            "timestamp": datetime.now().isoformat()
        })

    def recall(self, query: str, k: int = 5) -> list:
        embedding = embed(query)
        results = self.vector_store.query(embedding, top_k=k)
        return [r["content"] for r in results]

class MemoryEnhancedAgent:
    def __init__(self, agent, memory):
        self.agent = agent
        self.memory = memory

    def run(self, task: str) -> str:
        # Recall relevant memories
        relevant = self.memory.recall(task)
        context = "\n".join(relevant)

        # Enhance task with context
        enhanced_task = f"""
Context from previous interactions:
{context}

Current task: {task}
"""

        result = self.agent.run(enhanced_task)

        # Store new memory
        self.memory.store(
            f"Task: {task}\nResult: {result}",
            {"type": "task_completion"}
        )

        return result
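
Composed with the pieces above (vector_store stands in for whatever store your stack provides; the task string is illustrative):

agent = MemoryEnhancedAgent(
    agent=Agent(llm, tools),
    memory=LongTermMemory(vector_store),
)
agent.run("Summarize what we decided about the launch date")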

Safety and Control

Guardrails

class ActionBlocked(Exception):
    """Raised when a guardrail rejects an action mid-run."""

class GuardedAgent:
    def __init__(self, agent, guardrails):
        # Each guardrail exposes pre_check, action_check, and post_check,
        # all returning an object with .allowed and .reason
        self.agent = agent
        self.guardrails = guardrails

    def run(self, task: str) -> str:
        # Pre-execution checks
        for guardrail in self.guardrails:
            check = guardrail.pre_check(task)
            if not check.allowed:
                return f"Task blocked: {check.reason}"

        # Run with monitoring (assumes the wrapped agent supports an
        # on_action hook that fires before each tool call)
        result = self.agent.run_with_hooks(
            task,
            on_action=self._check_action
        )

        # Post-execution checks
        for guardrail in self.guardrails:
            check = guardrail.post_check(result)
            if not check.allowed:
                return f"Result blocked: {check.reason}"

        return result

    def _check_action(self, action, action_input):
        """Check each action before execution."""
        for guardrail in self.guardrails:
            check = guardrail.action_check(action, action_input)
            if not check.allowed:
                raise ActionBlocked(check.reason)
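
A concrete guardrail under the interface assumed above: a tool blocklist. The Check dataclass and the blocked tool names are illustrative:

from dataclasses import dataclass

@dataclass
class Check:
    allowed: bool
    reason: str = ""

class ToolBlocklistGuardrail:
    """Blocks a fixed set of tools; allows everything else."""

    def __init__(self, blocked_tools: set[str]):
        self.blocked_tools = blocked_tools

    def pre_check(self, task: str) -> Check:
        return Check(allowed=True)

    def action_check(self, action: str, action_input) -> Check:
        if action in self.blocked_tools:
            return Check(False, f"tool '{action}' is not allowed")
        return Check(allowed=True)

    def post_check(self, result: str) -> Check:
        return Check(allowed=True)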

Human-in-the-Loop

class HumanInLoopAgent:
    def __init__(self, agent, approval_fn, high_risk_actions):
        # agent is assumed to expose execute_action and current_context
        self.agent = agent
        self.approval_fn = approval_fn
        self.high_risk_actions = high_risk_actions

    def run(self, task: str) -> str:
        # Wrap the agent's action execution with an approval gate
        original_execute = self.agent.execute_action

        def guarded_execute(action, action_input):
            if action in self.high_risk_actions:
                # Request human approval
                approved = self.approval_fn(
                    action=action,
                    action_input=action_input,
                    context=self.agent.current_context
                )
                if not approved:
                    return "Action cancelled by user"

            return original_execute(action, action_input)

        self.agent.execute_action = guarded_execute
        try:
            return self.agent.run(task)
        finally:
            # Restore the original so the patch doesn't leak across runs
            self.agent.execute_action = original_execute
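
The simplest approval_fn is a terminal prompt; in production it is more likely a Slack message or a ticket queue. The high-risk tool names here are illustrative:

def cli_approval(action, action_input, context) -> bool:
    """Ask a human operator to confirm a high-risk action."""
    print(f"Agent wants to run: {action}({action_input!r})")
    return input("Approve? [y/N] ").strip().lower() == "y"

agent = HumanInLoopAgent(
    agent=base_agent,
    approval_fn=cli_approval,
    high_risk_actions={"send_email", "delete_file"},
)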

Evaluation

Agent Metrics

agent_metrics:
  task_completion:
    - Success rate
    - Partial completion rate
    - Failure types

  efficiency:
    - Steps to completion
    - Tool call count
    - Unnecessary actions

  quality:
    - Answer accuracy
    - Hallucination rate
    - Tool use appropriateness

  safety:
    - Guardrail triggers
    - Error recovery
    - Boundary respect
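
A minimal harness for the completion and efficiency metrics, assuming a labeled set of (task, expected answer) pairs and an agent that exposes its step trace (last_steps is an assumed attribute):

def evaluate(agent, test_cases: list[tuple[str, str]]) -> dict:
    successes = 0
    total_steps = 0
    for task, expected in test_cases:
        result = agent.run(task)
        # Substring match is a crude accuracy proxy; an LLM judge or
        # human review is more reliable for open-ended answers
        if expected.lower() in result.lower():
            successes += 1
        total_steps += len(agent.last_steps)
    return {
        "success_rate": successes / len(test_cases),
        "avg_steps_per_task": total_steps / len(test_cases),
    }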

Key Takeaways

Agents are the frontier of LLM applications: a reasoning loop plus tools gets you surprisingly far, but reliability comes from deliberate planning, memory, guardrails, and measurement. Build them carefully.