
The Cognitive Architecture of Agents

Once you understand what an agent is, the next question becomes more interesting: what is happening inside the agent itself?

Why does an autonomous system feel different from a chatbot? Why can some systems pursue goals, use tools, adapt to changing situations, and recover from failure while others simply respond one turn at a time?

The answer lies in the agent’s cognitive architecture: the internal structure that helps it keep track of the situation, decide what to do next, use tools, and improve based on feedback.

In this article, we will build a compact mental model for that architecture and then make it concrete with a small working example.


What Makes an Agent Cognitive?

An agent is not just a model that replies to a prompt once. It is a system that can work toward a goal over multiple steps.

What makes that possible is its internal structure.

A useful way to think about it is this: a cognitive agent maintains a working view of the world. That includes things like the goal it is pursuing, the observations it has gathered so far, the results of recent actions, and what still remains to be done.

This active working view is often called the agent’s world model. In LLM-based systems, part of it may live directly in the current context window, while other useful information may be retrieved from memory or external tools.
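To make this working view concrete, it can be sketched as a small state object. This is a hypothetical sketch, not a standard API; the class and field names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class WorldModel:
    """A minimal working view of a task (illustrative sketch)."""
    goal: str                                               # what the agent is trying to achieve
    observations: list[str] = field(default_factory=list)   # evidence gathered so far
    done: bool = False                                      # whether a stopping condition fired

    def observe(self, fact: str) -> None:
        # Fold a new observation into the working view
        self.observations.append(fact)

state = WorldModel(goal="Find the secret number.")
state.observe("Is the secret number greater than 10? yes")
print(len(state.observations))  # 1
```

In an LLM-based agent, serializing an object like this into the prompt is one simple way to keep the world model inside the context window.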

This is what makes an agent feel different from a chatbot. A chatbot mostly reacts to the current message. An agent, by contrast, can track progress, adjust its behavior, and continue working until the task is complete.

To do that, an agent usually needs a few core ingredients:

  1. a working view of the task (state, and often some form of memory)
  2. a way to reason about what it knows and what is still missing
  3. a way to plan and choose the next step
  4. tools for acting on the outside world
  5. feedback, so it can reflect on results and adjust its behavior

Not every agent needs a heavy implementation of all of these, but some version of them is what gives agentic systems their sense of autonomy.

One more detail matters: a loop must eventually stop. A good agent does not spin forever. It stops when the goal is reached, when a limit is hit, or when it decides that it needs escalation or human help.


The Cognitive Loop

These ideas come together inside a repeating loop:

observe → reason → plan → act → reflect

This loop is the core pattern behind many real agent systems.
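In code, the loop is just that sequence run until a stopping condition fires. Here is a bare, runnable skeleton with a toy task standing in for a real environment; every function name is a placeholder for whatever your system actually does at that phase:

```python
# Toy phase implementations: the "task" is simply to gather three observations.
def observe(state):
    return {"count": len(state["observations"]), "goal": state["goal"]}

def reason(view):
    return {"missing": 3 - view["count"]}  # how much evidence is still needed

def plan(assessment):
    return "finish" if assessment["missing"] <= 0 else "gather"

def act(action):
    return "done" if action == "finish" else "one more observation"

def reflect(state, result):
    # Fold the result of the action back into the working state
    if result == "done":
        state["answer"] = f"completed: {state['goal']}"
    else:
        state["observations"].append(result)

def run_agent(goal: str, max_steps: int = 8) -> str:
    state = {"goal": goal, "observations": [], "answer": None}
    for step in range(max_steps):
        view = observe(state)        # observe: gather the current task state
        assessment = reason(view)    # reason: what is known, what is missing
        action = plan(assessment)    # plan: choose the next step
        result = act(action)         # act: execute it
        reflect(state, result)       # reflect: update the working view
        if state["answer"] is not None:
            return state["answer"]   # stopping condition: goal reached
    return "stopped: step limit reached"

print(run_agent("count to three"))  # completed: count to three
```

The rest of this section walks through each phase, and the larger example below replaces these toy functions with an LLM and real tools.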

Observe

The agent gathers the current state of the task. That may include the user’s request, previous observations, recent tool outputs, retrieved memory, or information from the environment.

Reason

The agent interprets what it currently knows. It asks: What is already clear? What is missing? What matters most right now?

Plan

Based on that reasoning, the agent chooses the next step. It may decide to call a tool, gather more evidence, make a guess, or stop and return a final answer.

Act

The agent executes the chosen step. This is the point where it interacts with the outside world by calling a tool, querying a system, or producing a response.

Reflect

After acting, the agent looks at the result and updates its working view of the task. Did the action help? Is the goal now complete? Should it continue, revise its approach, or stop?

Reflection is what turns a loop into an adaptive process rather than blind repetition.

To make this pattern visible, it helps to place the agent in a tiny environment where it cannot solve the task in one shot. That is exactly what the following example does.


Implementing the Loop in Code

In the code example below, we place the agent inside a small game world with a hidden secret number.

The agent does not know the answer directly. Instead, it has access to a few tools that reveal partial clues, such as whether the number is greater than some value, whether it is divisible by another value, and whether a final guess is correct.

This creates a simple but useful environment for demonstrating agent behavior.

import json

import ollama

# -----------------------------
# Hidden game world
# -----------------------------
SECRET_NUMBER = 12

# -----------------------------
# Tools
# -----------------------------
def greater_than(n: int) -> str:
    """Reveal whether the secret number is greater than n."""
    return f"Is the secret number greater than {n}? {'yes' if SECRET_NUMBER > n else 'no'}"

def is_divisible_by(n: int) -> str:
    """Reveal whether the secret number is divisible by n."""
    if n == 0:
        return "Error: division by zero is not allowed."
    return f"Is the secret number divisible by {n}? {'yes' if SECRET_NUMBER % n == 0 else 'no'}"

def guess_number(n: int) -> str:
    """Check a final guess against the secret number."""
    if SECRET_NUMBER == n:
        return f"Correct! The secret number is {n}."
    return f"Incorrect guess: {n} is not the secret number."

# -----------------------------
# Agent loop (Ollama version)
# -----------------------------
def run_hidden_number_agent(goal: str, max_iterations: int = 8) -> str:
    model = "qwen3.5:9b"
    system_prompt = """
You are an autonomous puzzle-solving agent.
A secret integer exists between 1 and 20.
Your job is to discover it using tools strategically.
Rules:
- Do NOT guess too early
- Use previous observations
- Think step by step
- Call ONE tool per step
Available tools:
1. greater_than(n)
2. is_divisible_by(n)
3. guess_number(n)
To call a tool, respond ONLY in JSON:
{
  "tool": "tool_name",
  "arguments": { "n": number }
}
After receiving tool results, continue reasoning.
"""
    history = []
    messages = [
        {"role": "system", "content": system_prompt},
    ]
    current_prompt = (
        f"Goal: {goal}\n"
        "The secret number is between 1 and 20.\n"
        "Start by choosing the best first tool call."
    )
    for iteration in range(1, max_iterations + 1):
        print(f"\n=== Iteration {iteration} ===")
        print("Prompt to agent:\n", current_prompt)
        messages.append({"role": "user", "content": current_prompt})
        response = ollama.chat(
            model=model,
            messages=messages,
        )
        content = response["message"]["content"]
        print("\nAgent response:\n", content)
        messages.append({"role": "assistant", "content": content})
        # -----------------------------
        # Try parsing a tool call
        # -----------------------------
        tool_called = False
        try:
            parsed = json.loads(content)
            tool = parsed.get("tool")
            n = parsed.get("arguments", {}).get("n")
            if tool and n is not None:
                tool_called = True
                if tool == "greater_than":
                    result = greater_than(int(n))
                elif tool == "is_divisible_by":
                    result = is_divisible_by(int(n))
                elif tool == "guess_number":
                    result = guess_number(int(n))
                else:
                    result = "Error: unknown tool"
                print("\n📦 Tool result:", result)
                history.append(f"Iteration {iteration}: {result}")
                # Stop if solved
                if "Correct! The secret number is" in result:
                    return result
        except json.JSONDecodeError:
            pass
        if not tool_called:
            # The model didn't produce a valid tool call → treat its text
            # as a plain observation (it may contain a final answer)
            history.append(f"Iteration {iteration}: {content}")
            if "Correct! The secret number is" in content:
                return content
        # Build the next prompt from everything observed so far
        observation_block = "\n".join(history)
        current_prompt = (
            f"Goal: {goal}\n"
            f"Iteration: {iteration}/{max_iterations}\n"
            "Observations so far:\n"
            f"{observation_block}\n\n"
            "Choose the best next tool call.\n"
            "If confident, use guess_number."
        )
    return "Maximum iterations reached before solving."

# -----------------------------
# Main
# -----------------------------
if __name__ == "__main__":
    goal = "Find the secret number."
    answer = run_hidden_number_agent(goal)
    print("\nFinal answer:\n")
    print(answer)

What the Code Is Actually Doing

This example turns the cognitive loop into something visible.

The agent starts with a goal: find the secret number. But it cannot access that answer directly. It can only interact with the environment through tools.

Each tool reveals only a partial clue:

  1. greater_than(n) tells the agent whether the secret number is greater than n
  2. is_divisible_by(n) tells it whether the number is divisible by n
  3. guess_number(n) tells it whether a final guess is correct

That means the agent cannot solve the task in one shot. It has to gather evidence over multiple iterations.

At a high level, the loop looks like this:

  1. the agent begins with the current goal and the observations collected so far
  2. the model decides what tool to call next
  3. the tool returns a result from the environment
  4. that result is added back into the agent’s observations
  5. the next iteration starts with a richer view of the world

In other words, the code is implementing the cognitive loop directly:

  1. observe: read the goal and the accumulated observation history
  2. reason and plan: the model interprets the evidence and decides which tool to call next
  3. act: the chosen tool runs against the hidden environment
  4. reflect: the result is folded back into the observation history before the next iteration

The implementation keeps this compact and easy to read, while still making the moving parts visible: the tool functions, the session messages, the observation history, and the repeating loop structure.

This is still a minimal teaching example, not a production-grade agent. Real systems may add richer memory, stronger planning, retries, safety controls, and more structured state management. But the underlying pattern is already here.
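As one small taste of those production concerns, tool calls are often wrapped in a retry with backoff. This is an illustrative sketch, not part of the example above; the function name and policy are assumptions:

```python
import time

def call_with_retries(tool, *args, attempts: int = 3, delay: float = 0.5):
    """Call a tool, retrying on failure with linear backoff (illustrative sketch)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return tool(*args)
        except Exception as exc:  # in practice, catch specific tool errors
            last_error = exc
            time.sleep(delay * attempt)  # wait a little longer each retry
    raise RuntimeError(f"tool failed after {attempts} attempts") from last_error

# Usage with a flaky tool that fails twice before succeeding:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(call_with_retries(flaky, delay=0.01))  # ok
```

In a real agent, a wrapper like this would sit between the plan step and the act step, so that transient failures do not terminate the whole loop.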

That is the main takeaway: many modern agent systems are, underneath the abstractions, variations of this same loop.


Wrapping Up

A cognitive agent is not defined by a fancy prompt. It is defined by its internal loop.

It maintains a working model of the task, updates that model through observations, chooses actions, uses tools, and keeps going until it reaches a stopping condition.

That is what gives agents their sense of autonomy.

Once you understand that loop, the rest of agent engineering starts to make more sense. Different frameworks may package it in different ways, but the core idea remains remarkably similar:

observe → reason → plan → act → reflect

Next up: The Inference-Time Compute Revolution