
The Anatomy of an Agent

Modern AI agents often appear mysterious. You give them a goal such as: “Research the impact of quantum computing on cryptography and produce a report.”

And somehow the system searches the web, reads and evaluates sources, reasons about what it finds, and returns a finished report.

But internally, the architecture of an agent is surprisingly structured.

A useful mental model is this:

Component        Analogy
---------        -------
LLM              CPU
Agent Runtime    Operating System
Tools            System Calls
Memory           RAM + Storage
Planner          Scheduler
Environment      External World

Understanding this architecture is essential if you want to:

  - Debug agents when they fail
  - Optimize their speed and cost
  - Build custom agents of your own


The Core Insight

The most important conceptual leap is this: an LLM by itself is not an agent. An agent is a system built around an LLM.

In other words:

Agent = LLM + Runtime + Tools + Memory + Environment

The LLM provides reasoning ability.

The runtime provides control and execution.


The LLM as a CPU

Large Language Models function as the reasoning processor of an agent.

Just as a CPU executes machine instructions, an LLM executes reasoning instructions encoded in text.

What the LLM Actually Does

At every step of an agent loop, the LLM performs several tasks:

  1. Interpret the current state
  2. Reason about the goal
  3. Choose the next action
  4. Produce structured output

The output may include:

  - A thought (the model's reasoning)
  - An action (which tool to invoke)
  - Arguments for that tool
  - Or a final answer

For example, the LLM might produce structured output like:

{
  "thought": "I should search for recent research papers",
  "action": "web_search",
  "arguments": {
    "query": "post quantum cryptography NIST progress"
  }
}

This output becomes an instruction for the runtime.
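A runtime parses this JSON and dispatches it to the matching tool. A minimal sketch, assuming a hypothetical tool registry and a mock web_search function:

```python
import json

# Hypothetical tool: in a real runtime this would hit a search API.
def web_search(query: str) -> str:
    return f"[mock] results for: {query}"

# Registry mapping action names to callables.
TOOLS = {"web_search": web_search}

def dispatch(llm_output: str) -> str:
    """Parse the LLM's structured output and execute the chosen action."""
    action = json.loads(llm_output)
    tool = TOOLS[action["action"]]
    return tool(**action["arguments"])

result = dispatch(
    '{"thought": "I should search for recent research papers", '
    '"action": "web_search", '
    '"arguments": {"query": "post quantum cryptography NIST progress"}}'
)
```

The tool's return value becomes the next observation fed back to the model.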


The Limits of the LLM

Despite their intelligence, LLMs have important limitations.

They cannot:

  - Browse the web
  - Execute code
  - Read or write files
  - Remember anything beyond the context window

They only generate tokens.

Therefore something else must execute the real work.

That component is the agent runtime.


The Agent Runtime as an Operating System

If the LLM is the CPU, then the agent runtime is the operating system.

It orchestrates everything.

Responsibilities of the runtime include:

  - State management
  - Tool execution
  - Context construction
  - Safety and guardrails

A typical runtime loop looks like this:

while not done:
    observation = environment.get_state()
    reasoning = LLM(prompt + observation)
    action = parse(reasoning)
    result = execute(action)
    environment.update(result)

This simple loop is the heartbeat of every autonomous agent.


A Minimal Agent Loop

Let us examine a minimal example.

Code Examples

from typing import Any, Callable

from ollama import chat
from pydantic import BaseModel

MODEL = "qwen3.5:9b"
MAX_STEPS = 6


class Action(BaseModel):
    type: str  # "tool" | "final"
    name: str | None = None
    args: dict[str, Any] = {}
    answer: str | None = None


class Agent:
    def __init__(self, tools: dict[str, Callable[..., str]]):
        self.tools = tools
        self.history: list[dict[str, str]] = []

    def build_messages(self, observation: str) -> list[dict[str, str]]:
        system_prompt = """
You are an agent runtime planner.
Return ONLY JSON with schema:
{
  "type": "tool" | "final",
  "name": string | null,
  "args": object,
  "answer": string | null
}
Use "tool" when you need external data. Use "final" only when done.
"""
        return [
            {"role": "system", "content": system_prompt},
            *self.history,
            {"role": "user", "content": observation},
        ]

    def step(self, observation: str) -> tuple[str, bool]:
        # Ask the model for the next action as structured JSON.
        response = chat(
            model=MODEL,
            messages=self.build_messages(observation),
            options={"temperature": 0.2},
        )
        raw = response.message.content
        action = Action.model_validate_json(raw)
        self.history.append({"role": "assistant", "content": raw})

        if action.type == "final":
            return action.answer or "No answer provided.", True

        if action.type == "tool":
            if not action.name or action.name not in self.tools:
                tool_result = f"Tool '{action.name}' not available."
            else:
                try:
                    tool_result = self.tools[action.name](**action.args)
                except Exception as exc:
                    tool_result = f"Tool error: {exc}"
            self.history.append({"role": "tool", "content": tool_result})
            return tool_result, False

        return f"Unknown action type: {action.type}", False

    def run(self, goal: str) -> str:
        observation = goal
        for step_no in range(1, MAX_STEPS + 1):
            observation, done = self.step(observation)
            print(f"[step {step_no}/{MAX_STEPS}] completed")
            if done:
                return observation
        return "Stopped: reached max iterations without final answer."


def web_search(query: str) -> str:
    return f"[mock] top search results for: {query}"


agent = Agent(tools={"web_search": web_search})
final_answer = agent.run("Research the current state of MCP and summarize key updates.")
print(final_answer)

Even this tiny program already contains the essential ingredients of an agent.


Agent Architecture Overview

Modern agent systems typically include several layers.

            +--------------------+
            |     User Goal      |
            +---------+----------+
                      |
                      v
           +----------------------+
           |    Agent Runtime     |
           |    (Control Loop)    |
           +----------+-----------+
                      |
     +----------------+----------------+
     |                                 |
     v                                 v
+---------------+             +----------------+
|      LLM      |             |     Tools      |
|   Reasoning   |             |  APIs / Code   |
+-------+-------+             +-------+--------+
        |                             |
        v                             v
+---------------+             +----------------+
|  Working Mem  |             |  Environment   |
|  Scratchpad   |             |  External Data |
+---------------+             +----------------+

In the next few articles we will examine each component in detail.


Responsibilities of the Agent Runtime

A production agent runtime performs many complex tasks.

1 — State Management

Agents maintain internal state such as:

  - The original goal
  - Conversation and action history
  - Intermediate results

Without state, agents cannot perform multi-step tasks.
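A state container can be as simple as a dataclass; the field names here are illustrative, not a fixed schema:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentState:
    goal: str                                                     # the user's objective
    history: list[dict[str, str]] = field(default_factory=list)   # message log
    scratch: dict[str, Any] = field(default_factory=dict)         # intermediate results
    step: int = 0                                                 # loop iterations so far

state = AgentState(goal="Summarize recent MCP updates")
state.history.append({"role": "user", "content": state.goal})
state.step += 1
```

Every iteration of the control loop reads from and writes back to this object.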


2 — Tool Execution

The runtime executes external capabilities such as:

  - Web search
  - Code execution
  - File and database access
  - Calls to external APIs

These are similar to system calls in operating systems.
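Like an OS handling a faulting system call, the runtime wraps each tool so a failure becomes an observation rather than a crash. A sketch with illustrative tool names:

```python
from typing import Callable

def run_tool(tools: dict[str, Callable[..., str]], name: str, args: dict) -> str:
    """Execute a registered tool, converting failures into error strings."""
    if name not in tools:
        return f"Tool '{name}' not available."
    try:
        return tools[name](**args)
    except Exception as exc:
        return f"Tool error: {exc}"

# Hypothetical registry with a single mock tool.
tools = {"read_file": lambda path: f"[mock] contents of {path}"}

ok = run_tool(tools, "read_file", {"path": "notes.txt"})
missing = run_tool(tools, "fetch_url", {"url": "https://example.com"})
```

Both results are fed back to the LLM, which can then decide to retry or change course.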


3 — Context Construction

The runtime decides what information the LLM receives.

This may include:

  - The system prompt and goal
  - Recent conversation history
  - Tool results
  - Retrieved memories or documents

Because context windows are limited, this step is critical.
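A common tactic is to keep only the most recent messages that fit a budget. This sketch measures characters for simplicity; a real runtime would count tokens:

```python
def build_context(system: str, history: list[dict[str, str]], budget: int) -> list[dict[str, str]]:
    """Keep the newest messages whose total length fits the budget."""
    kept: list[dict[str, str]] = []
    used = 0
    for msg in reversed(history):          # walk from newest to oldest
        used += len(msg["content"])
        if used > budget:
            break                          # oldest messages are dropped first
        kept.append(msg)
    kept.reverse()                         # restore chronological order
    return [{"role": "system", "content": system}, *kept]

history = [
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 30},
    {"role": "user", "content": "c" * 30},
]
messages = build_context("You are a planner.", history, budget=70)
```

Here the oldest 50-character message is dropped because it would exceed the 70-character budget.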


4 — Safety and Guardrails

The runtime enforces constraints such as:

  - Maximum iterations and timeouts
  - Tool permissions and allowlists
  - Budget limits on tokens and API spend
  - Validation of outputs before execution

Without guardrails, autonomous agents can become dangerous or unstable.


The Agent Execution Cycle

Putting everything together, the lifecycle of an agent looks like this:

  1. User Goal
  2. Perception
  3. Reasoning (LLM)
  4. Planning
  5. Tool Execution
  6. Observation
  7. Reflection
  8. Termination or Next Step

This loop is often called the Agent Control Loop.

We introduced the conceptual version in Module 1:

observe → reason → plan → act → reflect

Now we see how it is implemented inside a runtime.


Why This Architecture Matters

Understanding the internal architecture of agents has several practical benefits.

1 — Debugging Agents

Most agent failures occur in the runtime layer, not the LLM.

Examples include:

  - Malformed or unparsable LLM output
  - Context overflow
  - Tool failures that are never surfaced
  - Loops that never terminate


2 — Performance Optimization

Agent performance depends heavily on:

  - How much context is sent per call
  - How many LLM calls each task requires
  - Tool latency and retries

Optimizing the runtime can dramatically improve speed and cost.
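One cheap runtime optimization is memoizing deterministic tool calls so a repeated query does not pay network latency twice. A sketch using functools.lru_cache with a mock tool:

```python
from functools import lru_cache

calls = 0  # counts how many times the tool actually executes

@lru_cache(maxsize=128)
def web_search(query: str) -> str:
    global calls
    calls += 1
    return f"[mock] results for: {query}"

first = web_search("post quantum cryptography")
second = web_search("post quantum cryptography")  # served from the cache
```

This only works for tools that are pure functions of their arguments; caching a tool whose result changes over time would feed the agent stale observations.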


3 — Building Custom Agents

Many frameworks exist, for example:

  - LangChain
  - LlamaIndex
  - AutoGen
  - CrewAI

But all of them implement the same underlying architecture.

Once you understand the internals, you can:

  - Debug any framework's behavior
  - Build a custom runtime when frameworks fall short
  - Evaluate new tools on their architectural merits


Looking Ahead

In this article we introduced the two central components of an agent:

  - The LLM, the reasoning engine (the CPU)
  - The agent runtime, the control layer (the operating system)

In the upcoming articles we will explore the remaining components of agent architecture.

Next we will examine the Perception Layer, which transforms raw inputs such as:

  - Text and documents
  - Images
  - Audio

into machine-readable representations that agents can reason about.

This is where embeddings, parsing pipelines, and multimodal models enter the architecture.

→ Continue to 2.2 — The Perception Layer