
The Cognitive Architecture of Agents

Once you understand what an agent is, the next question becomes more interesting: what is happening inside the agent itself?

Why does an autonomous system feel different from a chatbot? Why can some systems pursue goals, use tools, adapt to changing situations, and recover from failure while others simply respond one turn at a time?

The answer lies in the agent’s cognitive architecture: the internal structure that helps it keep track of the situation, decide what to do next, use tools, and improve based on feedback.

In this article, we will build a compact mental model for that architecture and then make it concrete with a small working example.


What Makes an Agent Cognitive?

An agent is not just a model that replies to a prompt once. It is a system that can work toward a goal over multiple steps.

What makes that possible is its internal structure.

A useful way to think about it is this: a cognitive agent maintains a working view of the world. That includes things like the goal it is pursuing, the observations it has gathered so far, the results of recent actions, and what still remains to be done.

This active working view is often called the agent’s world model. In LLM-based systems, part of it may live directly in the current context window, while other useful information may be retrieved from memory or external tools.
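To make this working view concrete, it can be sketched as a small state object. This is a hypothetical sketch, not a standard API; the class and field names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class WorldModel:
    """A minimal working view of a task (illustrative sketch)."""
    goal: str                                               # what the agent is trying to achieve
    observations: list[str] = field(default_factory=list)   # evidence gathered so far
    done: bool = False                                      # whether a stopping condition fired

    def observe(self, fact: str) -> None:
        # Fold a new observation into the working view
        self.observations.append(fact)

state = WorldModel(goal="Find the secret number.")
state.observe("Is the secret number greater than 10? yes")
print(len(state.observations))  # 1
```

In an LLM-based agent, serializing an object like this into the prompt is one simple way to keep the world model inside the context window.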

This is what makes an agent feel different from a chatbot. A chatbot mostly reacts to the current message. An agent, by contrast, can track progress, adjust its behavior, and continue working until the task is complete.

To do that, an agent usually needs a few core ingredients:

  1. a working view of the task (state, and often some form of memory)
  2. a way to reason about what it knows and what is still missing
  3. a way to plan and choose the next step
  4. tools for acting on the outside world
  5. feedback, so it can reflect on results and adjust its behavior

Not every agent needs a heavy implementation of all of these, but some version of them is what gives agentic systems their sense of autonomy.

One more detail matters: a loop must eventually stop. A good agent does not spin forever. It stops when the goal is reached, when a limit is hit, or when it decides that it needs escalation or human help.


The Cognitive Loop

These ideas come together inside a repeating loop:

observe → reason → plan → act → reflect

This loop is the core pattern behind many real agent systems.
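In code, the loop is just that sequence run until a stopping condition fires. Here is a bare, runnable skeleton with a toy task standing in for a real environment; every function name is a placeholder for whatever your system actually does at that phase:

```python
# Toy phase implementations: the "task" is simply to gather three observations.
def observe(state):
    return {"count": len(state["observations"]), "goal": state["goal"]}

def reason(view):
    return {"missing": 3 - view["count"]}  # how much evidence is still needed

def plan(assessment):
    return "finish" if assessment["missing"] <= 0 else "gather"

def act(action):
    return "done" if action == "finish" else "one more observation"

def reflect(state, result):
    # Fold the result of the action back into the working state
    if result == "done":
        state["answer"] = f"completed: {state['goal']}"
    else:
        state["observations"].append(result)

def run_agent(goal: str, max_steps: int = 8) -> str:
    state = {"goal": goal, "observations": [], "answer": None}
    for step in range(max_steps):
        view = observe(state)        # observe: gather the current task state
        assessment = reason(view)    # reason: what is known, what is missing
        action = plan(assessment)    # plan: choose the next step
        result = act(action)         # act: execute it
        reflect(state, result)       # reflect: update the working view
        if state["answer"] is not None:
            return state["answer"]   # stopping condition: goal reached
    return "stopped: step limit reached"

print(run_agent("count to three"))  # completed: count to three
```

The rest of this section walks through each phase, and the larger example below replaces these toy functions with an LLM and real tools.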

Observe

The agent gathers the current state of the task. That may include the user’s request, previous observations, recent tool outputs, retrieved memory, or information from the environment.

Reason

The agent interprets what it currently knows. It asks: What is already clear? What is missing? What matters most right now?

Plan

Based on that reasoning, the agent chooses the next step. It may decide to call a tool, gather more evidence, make a guess, or stop and return a final answer.

Act

The agent executes the chosen step. This is the point where it interacts with the outside world by calling a tool, querying a system, or producing a response.

Reflect

After acting, the agent looks at the result and updates its working view of the task. Did the action help? Is the goal now complete? Should it continue, revise its approach, or stop?

Reflection is what turns a loop into an adaptive process rather than blind repetition.

To make this pattern visible, it helps to place the agent in a tiny environment where it cannot solve the task in one shot. That is exactly what the following example does.


Implementing the Loop in Code

In the code example below, we place the agent inside a small game world with a hidden secret number.

The agent does not know the answer directly. Instead, it has access to a few tools that reveal partial clues, such as whether the number is greater than some value, whether it is divisible by another value, and whether a final guess is correct.

This creates a simple but useful environment for demonstrating agent behavior.

import json

import ollama

# -----------------------------
# Hidden game world
# -----------------------------
SECRET_NUMBER = 12

# -----------------------------
# Tools
# -----------------------------
def greater_than(n: int) -> str:
    """Reveal whether the secret number is greater than n."""
    return f"Is the secret number greater than {n}? {'yes' if SECRET_NUMBER > n else 'no'}"

def is_divisible_by(n: int) -> str:
    """Reveal whether the secret number is divisible by n."""
    if n == 0:
        return "Error: division by zero is not allowed."
    return f"Is the secret number divisible by {n}? {'yes' if SECRET_NUMBER % n == 0 else 'no'}"

def guess_number(n: int) -> str:
    """Check a final guess against the secret number."""
    if SECRET_NUMBER == n:
        return f"Correct! The secret number is {n}."
    return f"Incorrect guess: {n} is not the secret number."

# -----------------------------
# Agent loop (Ollama version)
# -----------------------------
def run_hidden_number_agent(goal: str, max_iterations: int = 8) -> str:
    model = "qwen3.5:9b"
    system_prompt = """
You are an autonomous puzzle-solving agent.
A secret integer exists between 1 and 20.
Your job is to discover it using tools strategically.
Rules:
- Do NOT guess too early
- Use previous observations
- Think step by step
- Call ONE tool per step
Available tools:
1. greater_than(n)
2. is_divisible_by(n)
3. guess_number(n)
To call a tool, respond ONLY in JSON:
{
  "tool": "tool_name",
  "arguments": { "n": number }
}
After receiving tool results, continue reasoning.
"""
    history = []
    messages = [
        {"role": "system", "content": system_prompt},
    ]
    current_prompt = (
        f"Goal: {goal}\n"
        "The secret number is between 1 and 20.\n"
        "Start by choosing the best first tool call."
    )
    for iteration in range(1, max_iterations + 1):
        print(f"\n=== Iteration {iteration} ===")
        print("Prompt to agent:\n", current_prompt)
        messages.append({"role": "user", "content": current_prompt})
        response = ollama.chat(
            model=model,
            messages=messages,
        )
        content = response["message"]["content"]
        print("\nAgent response:\n", content)
        messages.append({"role": "assistant", "content": content})
        # -----------------------------
        # Try parsing a tool call
        # -----------------------------
        tool_called = False
        try:
            parsed = json.loads(content)
            tool = parsed.get("tool")
            n = parsed.get("arguments", {}).get("n")
            if tool and n is not None:
                tool_called = True
                if tool == "greater_than":
                    result = greater_than(int(n))
                elif tool == "is_divisible_by":
                    result = is_divisible_by(int(n))
                elif tool == "guess_number":
                    result = guess_number(int(n))
                else:
                    result = "Error: unknown tool"
                print("\n📦 Tool result:", result)
                history.append(f"Iteration {iteration}: {result}")
                # Stop if solved
                if "Correct! The secret number is" in result:
                    return result
        except json.JSONDecodeError:
            pass
        if not tool_called:
            # The model didn't produce a valid tool call → treat its text
            # as a plain observation (it may contain a final answer)
            history.append(f"Iteration {iteration}: {content}")
            if "Correct! The secret number is" in content:
                return content
        # Build the next prompt from everything observed so far
        observation_block = "\n".join(history)
        current_prompt = (
            f"Goal: {goal}\n"
            f"Iteration: {iteration}/{max_iterations}\n"
            "Observations so far:\n"
            f"{observation_block}\n\n"
            "Choose the best next tool call.\n"
            "If confident, use guess_number."
        )
    return "Maximum iterations reached before solving."

# -----------------------------
# Main
# -----------------------------
if __name__ == "__main__":
    goal = "Find the secret number."
    answer = run_hidden_number_agent(goal)
    print("\nFinal answer:\n")
    print(answer)

What the Code Is Actually Doing

This example turns the cognitive loop into something visible.

The agent starts with a goal: find the secret number. But it cannot access that answer directly. It can only interact with the environment through tools.

Each tool reveals only a partial clue:

  1. greater_than(n) tells the agent whether the secret number is greater than n
  2. is_divisible_by(n) tells it whether the number is divisible by n
  3. guess_number(n) tells it whether a final guess is correct

That means the agent cannot solve the task in one shot. It has to gather evidence over multiple iterations.

At a high level, the loop looks like this:

  1. the agent begins with the current goal and the observations collected so far
  2. the model decides what tool to call next
  3. the tool returns a result from the environment
  4. that result is added back into the agent’s observations
  5. the next iteration starts with a richer view of the world

In other words, the code is implementing the cognitive loop directly:

  1. observe: read the goal and the accumulated observation history
  2. reason and plan: the model interprets the evidence and decides which tool to call next
  3. act: the chosen tool runs against the hidden environment
  4. reflect: the result is folded back into the observation history before the next iteration

The implementation keeps this compact and easy to read, while still making the moving parts visible: the tool functions, the session messages, the observation history, and the repeating loop structure.

This is still a minimal teaching example, not a production-grade agent. Real systems may add richer memory, stronger planning, retries, safety controls, and more structured state management. But the underlying pattern is already here.
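As one small taste of those production concerns, tool calls are often wrapped in a retry with backoff. This is an illustrative sketch, not part of the example above; the function name and policy are assumptions:

```python
import time

def call_with_retries(tool, *args, attempts: int = 3, delay: float = 0.5):
    """Call a tool, retrying on failure with linear backoff (illustrative sketch)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return tool(*args)
        except Exception as exc:  # in practice, catch specific tool errors
            last_error = exc
            time.sleep(delay * attempt)  # wait a little longer each retry
    raise RuntimeError(f"tool failed after {attempts} attempts") from last_error

# Usage with a flaky tool that fails twice before succeeding:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(call_with_retries(flaky, delay=0.01))  # ok
```

In a real agent, a wrapper like this would sit between the plan step and the act step, so that transient failures do not terminate the whole loop.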

That is the main takeaway: many modern agent systems are, underneath the abstractions, variations of this same loop.


Wrapping Up

A cognitive agent is not defined by a fancy prompt. It is defined by its internal loop.

It maintains a working model of the task, updates that model through observations, chooses actions, uses tools, and keeps going until it reaches a stopping condition.

That is what gives agents their sense of autonomy.

Once you understand that loop, the rest of agent engineering starts to make more sense. Different frameworks may package it in different ways, but the core idea remains remarkably similar:

observe → reason → plan → act → reflect

Next up: The Inference-Time Compute Revolution