Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Agent Architecture Basics

The Perception-Reasoning-Action Loop

Every agent operates on a fundamental cycle that mirrors how humans approach tasks:

┌─────────────┐
│  PERCEIVE   │ ← Gather information about current state
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   REASON    │ ← Decide what to do next
└──────┬──────┘
       │
       ▼
┌─────────────┐
│    ACT      │ ← Execute the chosen action
└──────┬──────┘
       │
       └──────→ (back to PERCEIVE)

Perceive

The agent observes its environment:

  • User input and instructions
  • Tool outputs and results
  • Current state and context
  • Available resources

Reason

The agent decides on the next action:

  • Analyze the current situation
  • Consider available options
  • Plan the next step
  • Evaluate potential outcomes

Act

The agent executes its decision:

  • Call a tool or function
  • Generate a response
  • Update internal state
  • Request more information

Memory Systems

Agents need memory to maintain context and learn from experience. There are two primary types:

Short-Term Memory (Working Memory)

Holds information for the current task:

  • Conversation history: Recent messages and responses
  • Intermediate results: Outputs from previous steps
  • Current plan: What the agent is trying to accomplish
  • Execution state: Where the agent is in the workflow

Implementation: Typically stored in the LLM’s context window

Limitations:

  • Fixed size (token limits)
  • Cleared when task completes
  • Can become cluttered

Long-Term Memory (Persistent Memory)

Retains information across sessions:

  • Facts and knowledge: Learned information about the user or domain
  • Past interactions: Historical conversations
  • Successful strategies: What worked before
  • User preferences: Personalization data

Implementation:

  • Vector databases (semantic search)
  • Traditional databases (structured data)
  • File systems (documents, logs)

Key Operations:

  • Store: Save important information
  • Retrieve: Find relevant past information
  • Update: Modify existing memories
  • Forget: Remove outdated information

Planning and Goal-Oriented Behavior

Agents don’t just react—they plan ahead to achieve goals efficiently.

Goal Decomposition

Breaking complex goals into manageable sub-goals:

Goal: "Research and summarize recent AI papers"
  ├─ Sub-goal 1: Search for relevant papers
  ├─ Sub-goal 2: Read and extract key points
  ├─ Sub-goal 3: Synthesize findings
  └─ Sub-goal 4: Format summary

Planning Strategies

Reactive Planning: Decide next step based on current state

  • Simple and fast
  • Good for straightforward tasks
  • Limited lookahead

Proactive Planning: Create full plan upfront, then execute

  • Better for complex tasks
  • Can optimize entire workflow
  • May need replanning if things change

Hybrid Planning: Plan a few steps ahead, adapt as needed

  • Balances flexibility and efficiency
  • Most common in practice

Plan Representation

Plans can be represented as:

  • Linear sequences: Step 1 → Step 2 → Step 3
  • Trees: Branching based on conditions
  • Graphs: Complex dependencies between steps
  • Natural language: Human-readable descriptions

Multi-Step Task Execution

Agents excel at tasks requiring multiple actions:

Execution Patterns

Sequential Execution

Step 1 → Step 2 → Step 3 → Done

Each step depends on the previous one.

Parallel Execution

Step 1a ─┐
Step 1b ─┼→ Combine → Done
Step 1c ─┘

Independent steps run simultaneously.

Conditional Execution

Step 1 → Decision
         ├─ If A → Step 2a → Done
         └─ If B → Step 2b → Done

Path depends on intermediate results.

Iterative Execution

Step 1 → Step 2 → Check
         ↑         │
         └─────────┘ (repeat if needed)

Loop until condition is met.

Error Handling

Robust agents handle failures gracefully:

  1. Detect: Recognize when something went wrong
  2. Diagnose: Understand the cause
  3. Recover: Try alternative approaches
  4. Escalate: Ask for help if stuck

Progress Tracking

Agents monitor their progress:

  • Checkpoints: Mark completed sub-goals
  • State management: Track what’s been done
  • Backtracking: Undo steps if needed
  • Resumption: Continue after interruption

Core Components of Agent Architecture

1. Controller (Brain)

The central decision-making component:

  • Interprets user goals
  • Manages the reasoning loop
  • Coordinates other components
  • Handles control flow

2. Memory Manager

Manages information storage and retrieval:

  • Maintains conversation context
  • Stores and retrieves long-term memories
  • Decides what to remember/forget
  • Optimizes memory usage

3. Tool Interface

Connects agent to external capabilities:

  • Defines available tools
  • Handles tool invocation
  • Parses tool outputs
  • Manages tool errors

4. Planner

Develops strategies for achieving goals:

  • Decomposes complex tasks
  • Generates action sequences
  • Optimizes execution order
  • Adapts plans based on results

5. Executor

Carries out planned actions:

  • Invokes tools with correct parameters
  • Monitors execution
  • Collects results
  • Reports status

Putting It Together

A complete agent architecture integrates these components:

User Input
    ↓
┌─────────────────────────────────┐
│         CONTROLLER              │
│  (Orchestrates everything)      │
└────┬────────────────────────┬───┘
     │                        │
     ▼                        ▼
┌─────────┐              ┌─────────┐
│ MEMORY  │←────────────→│ PLANNER │
└─────────┘              └────┬────┘
     ↑                        │
     │                        ▼
     │                   ┌─────────┐
     └──────────────────│EXECUTOR │
                        └────┬────┘
                             │
                             ▼
                        ┌─────────┐
                        │  TOOLS  │
                        └─────────┘
                             ↓
                         Results

Design Principles

When architecting agents, follow these principles:

  1. Modularity: Separate concerns into distinct components
  2. Observability: Make agent reasoning transparent
  3. Flexibility: Allow easy addition of new tools and capabilities
  4. Robustness: Handle errors and edge cases gracefully
  5. Efficiency: Minimize unnecessary steps and API calls
  6. Safety: Validate inputs and outputs, respect boundaries

Next Steps

Now that you understand the basic architecture, we’ll explore how LLMs power these components in the next section on LLM Fundamentals for Agents.