What is an Agent?
Updated on 15th March 2026
An agent is a system that uses an LLM to pursue a goal through a loop of observing, deciding, acting, and adjusting.
Unlike a normal chatbot, an agent does not just reply once. It can decide what to do next, use tools, inspect results, and continue until it either completes the task or hits a stopping condition.
A useful mental model:
Agent = LLM + Tools + State + Loop
That is the core idea behind agentic AI.
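As a rough sketch in code, the formula looks like this (the class and method names here are illustrative, not from any particular framework):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    llm: Callable[[list], str]     # LLM: maps the conversation so far to a reply
    tools: dict[str, Callable]     # Tools: named functions the agent may call
    state: list = field(default_factory=list)  # State: accumulated messages and results

    def run(self, goal: str, max_steps: int = 5) -> str:
        self.state.append(goal)
        for _ in range(max_steps):          # Loop: repeat until done or budget spent
            action = self.llm(self.state)
            if action in self.tools:
                # act on the world, then observe the result
                self.state.append(self.tools[action]())
            else:
                return action               # no tool requested: final answer
        return self.state[-1]
```

Everything that follows in this article is an elaboration of these four ingredients.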
In this article, we will answer five simple questions:
- What is an agent?
- What makes it different from a pipeline?
- What does the agent loop look like?
- What does the smallest useful agent look like in code?
- When should you avoid building an agent?
What Is an Agent?
A system becomes an agent when it can do four things:
- **Observe the current state.** This could be a user message, a file, a tool result, a database row, or an API response.
- **Decide what to do next.** The next step is chosen at runtime, based on the situation.
- **Act on the world.** It may call a tool, run code, query data, search the web, or return an answer.
- **Repeat until done.** Instead of one input and one output, it can work through multiple steps.
The key idea is runtime decision-making.
A lot of software uses LLMs. That alone does not make it agentic. If the path is fixed in advance, it is still a pipeline.
Agent vs Pipeline
This is the distinction that matters most in practice.
Suppose the task is:
Search the web, collect useful information, and produce a summary.
Deterministic Pipeline
```
[User Query] → [Search] → [Collect Pages] → [Summarize] → [Output]
```

This is not an agent.
Why? Because the execution path was fixed by the developer in advance. The system cannot decide to search again, judge whether the results are weak, or switch to a different approach.
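In code, a fixed pipeline is just function composition: every step and its order is decided before the program runs. A minimal sketch (the helper functions are stand-ins for real search and summarization):

```python
def search(query: str) -> list[str]:
    # stand-in for a real web search
    return [f"page about {query}"]

def collect(pages: list[str]) -> str:
    # stand-in for fetching and concatenating page content
    return " ".join(pages)

def summarize(text: str) -> str:
    # stand-in for an LLM summarization call
    return f"Summary: {text[:40]}"

def pipeline(query: str) -> str:
    # the path is hard-coded: search, then collect, then summarize, exactly once
    return summarize(collect(search(query)))
```

No matter what the intermediate results look like, the same three steps run in the same order, exactly once.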
Agent Loop
```
[User Query] → [Model decides next step]
   • Need more info?    → Search
   • Results weak?      → Search again differently
   • Enough evidence?   → Summarize
   • Goal complete?     → Stop
```

This is an agent.
The difference is not that one uses an LLM and the other does not. The difference is that in the second case, the model is acting as a decision engine inside a loop.
The Core Agent Loop
Most agents, no matter what framework they use, follow some version of this pattern:
```
observe → reason → plan → act → reflect
```

Observe
Read the current state: the user request, prior results, memory, or tool outputs.
Reason
Interpret the situation. What is known? What is missing? What matters right now?
Plan
Choose the next step. Should the system use a tool, ask for more information, try again, or stop?
Act
Execute that step.
Reflect
Look at the result, update state, and decide whether to continue.
This loop is what makes an agent adaptive instead of purely reactive.
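Stripped to its essentials, the five phases collapse into a short loop. This is a sketch, assuming a model that returns structured decisions; the names are illustrative, not any framework's API:

```python
def agent_loop(goal, llm, tools, max_steps=10):
    history = [goal]                          # observe: the current state
    for _ in range(max_steps):
        decision = llm(history)               # reason + plan: model picks the next step
        if decision["action"] == "stop":      # goal complete
            return decision["answer"]
        tool = tools[decision["action"]]      # act: run the chosen tool
        result = tool(**decision.get("args", {}))
        history.append(result)                # reflect: fold the result back into state
    return history[-1]                        # safety stop: step budget exhausted
```

The `max_steps` cap is the stopping condition mentioned earlier: without it, a confused model could loop forever.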
The Minimal Agent Loop in Code
If your environment is not set up yet, please follow the instructions here and then come back.
Let’s now look at a tiny example.
Problem: Count the number of files in the current directory, and tell me one interesting mathematical fact about that number multiplied by a random prime number.
For this exercise, a Gemini-based code snippet is provided in the sample code GitHub repo, so that if you are using Gemini you can get started. In the future we will provide only Ollama-based sample code, unless there is a specific reason, such as a feature being unavailable in Ollama.
```python
import json
from pathlib import Path

import ollama

# -----------------------------
# Tool: Count regular files
# -----------------------------
def count_files(directory: str = ".") -> str:
    try:
        path = Path(directory)

        if not path.exists():
            return f"Error: directory '{directory}' does not exist."

        if not path.is_dir():
            return f"Error: '{directory}' is not a directory."

        count = sum(
            1
            for item in path.iterdir()
            if item.is_file() and not item.name.startswith(".")
        )

        return str(count)

    except Exception as e:
        return f"Error counting files: {e}"

# -----------------------------
# Agent loop (Ollama version)
# -----------------------------
def run_agent(user_query: str):
    model = "qwen3.5:9b"

    # 🔥 This replaces Gemini's tools=[count_files]
    system_prompt = """You are a helpful assistant.

You have access to a tool:

Tool: count_files(directory: string)
Description: Count number of regular non-hidden files in a directory.

When the user asks about file counts:
- DO NOT guess
- ALWAYS call the tool

To call the tool, respond ONLY with JSON:

{
  "tool": "count_files",
  "arguments": { "directory": "." }
}

After receiving the tool result, continue normally."""

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]

    while True:
        response = ollama.chat(
            model=model,
            messages=messages,
        )

        content = response["message"]["content"]
        print("Model:", content, "\n")

        # Try to parse a tool call
        try:
            parsed = json.loads(content)

            if parsed.get("tool") == "count_files":
                directory = parsed.get("arguments", {}).get("directory", ".")

                result = count_files(directory)
                print("📦 Tool Result:", result, "\n")

                # Add the assistant's tool call to the history
                messages.append({"role": "assistant", "content": content})

                # Feed the result back (like Gemini does internally)
                messages.append({
                    "role": "user",
                    "content": f"Tool result: {result}. Now answer the original question.",
                })

                continue

        except json.JSONDecodeError:
            pass

        # Final response
        return content

# -----------------------------
# Main
# -----------------------------
if __name__ == "__main__":
    query = (
        "Count the number of files in the current folder, and tell me one "
        "interesting mathematical fact about that number multiplied by a "
        "random prime number."
    )
    answer = run_agent(query)

    print("\nFinal answer:\n")
    print(answer)
```

The same agent, written in Rust with the ollama-rs crate:

```rust
use ollama_rs::{
    generation::chat::{request::ChatMessageRequest, ChatMessage},
    Ollama,
};
use serde_json::{json, Value};
use std::fs;
use std::path::Path;

// -----------------------------
// Tool: Count Files in Directory
// -----------------------------
pub struct CountFiles;

impl CountFiles {
    pub async fn execute(directory: &Value) -> Value {
        let directory_str = directory.as_str().unwrap_or(".");
        let path = Path::new(directory_str);

        if !path.exists() {
            return json!({
                "error": format!("directory '{}' does not exist", directory_str)
            });
        }

        let count = match fs::read_dir(path) {
            Ok(entries) => entries
                .filter_map(|entry| entry.ok())
                .filter(|entry| {
                    let name = entry.file_name();
                    let name = name.to_string_lossy();
                    entry.path().is_file() && !name.starts_with('.')
                })
                .count(),
            Err(e) => {
                return json!({
                    "error": format!("failed to read directory: {}", e)
                })
            }
        };

        json!({ "directory": directory_str, "count": count })
    }
}

// -----------------------------
// Main Agent Loop
// -----------------------------
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ollama = Ollama::default();
    let model = "qwen3.5:9b".to_string();

    let system_prompt = r#"You are an AI agent.

You have access to a tool:

Tool: count_files
Description: Count files in a directory

When needed, respond with a JSON tool call:

{
  "tool": "count_files",
  "arguments": { "directory": "." }
}

Otherwise, respond normally."#;

    let user_prompt = "Count the number of files in the current folder and tell me a math fact about the number multiplied by a random prime number.";

    println!("User: {}\n", user_prompt);

    let mut messages = vec![
        ChatMessage::system(system_prompt.to_string()),
        ChatMessage::user(user_prompt.to_string()),
    ];

    loop {
        let request = ChatMessageRequest::new(model.clone(), messages.clone());
        let res = ollama.send_chat_messages(request).await?;

        let response_message = res.message;
        let content = response_message.content.clone();

        println!("Model: {}\n", content);

        // Try parsing a tool call
        let parsed: Result<Value, _> = serde_json::from_str(&content);

        if let Ok(json) = parsed {
            if json.get("tool") == Some(&Value::String("count_files".to_string())) {
                let args = &json["arguments"];

                let result = CountFiles::execute(args).await;

                println!("📦 Tool Result: {}\n", result);

                // Feed the result back
                messages.push(ChatMessage::assistant(content));
                messages.push(ChatMessage::user(format!(
                    "Tool result: {}. Now complete the task.",
                    result
                )));

                continue;
            }
        }

        // No tool call → final answer
        println!("✅ Final Answer: {}\n", content);
        break;
    }

    Ok(())
}
```

This may look like a lot at first, but don’t worry — we will unpack each concept properly in upcoming articles.
What This Example Shows
This is the smallest useful shape of an agent.
The user asks for something the model cannot know from text alone: the actual number of files in a directory.
So the system does not guess. It uses a tool.
That gives us a tiny but real agent flow:
```
User request → model decides to use tool → tool returns result → model answers
```

The important separation is this:
- the tool performs the real operation
- the model decides when to use it and how to explain the result
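One common way to keep that separation explicit is a tool registry: plain, testable functions on one side, and a dispatch step that only a well-formed model output can trigger. A sketch, with illustrative names and a stand-in tool:

```python
import json

# Tools do the real work: ordinary functions, testable without any model.
TOOLS = {
    "count_files": lambda directory=".": "3",   # stand-in for the real tool
}

def dispatch(model_output: str):
    """Run a tool only if the model emitted a well-formed JSON tool call."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return None                 # not a tool call: treat as a final answer
    fn = TOOLS.get(call.get("tool"))
    if fn is None:
        return None                 # unknown tool name: ignore it
    return fn(**call.get("arguments", {}))
```

The model never touches the filesystem directly; it can only ask, and the dispatcher decides whether the request is valid.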
The Python version hides most of the plumbing. The Rust version exposes more of the machinery. But the pattern is the same in both.
This tiny loop is the foundation of much larger agent systems.
When Not to Build an Agent
This part matters just as much as the definition.
Agents are powerful, but they are also slower, costlier, and less predictable than fixed pipelines. A lot of tasks do not need autonomy at all.
Use a pipeline when:
- the steps are known in advance
- the task is short and predictable
- reliability matters more than flexibility
Use an agent when:
- the next step depends on intermediate results
- the system may need retries or tool selection
- the path cannot be fully known beforehand
A few examples:
| Use Case | Better Fit | Why |
|---|---|---|
| Extract text from PDF and summarize | Pipeline | Fixed flow |
| Classify a support ticket | Pipeline | Single bounded task |
| Generate a report from known tables | Pipeline | Predictable steps |
| Research assistant with repeated searches | Agent | Search path is unknown upfront |
| Debug code with repeated test runs | Agent | Requires iteration and feedback |
| Multi-step workflow with branching outcomes | Agent | Next action depends on results |
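The pattern behind these examples can be written down as a toy triage function. This is purely illustrative, not a real decision procedure:

```python
def suggest_execution_model(
    steps_known_in_advance: bool,
    needs_iteration_or_retries: bool,
    needs_runtime_tool_choice: bool,
) -> str:
    """Crude triage: prefer the simpler model unless autonomy is genuinely needed."""
    if needs_iteration_or_retries or needs_runtime_tool_choice or not steps_known_in_advance:
        return "agent"
    return "pipeline"
```

Applied to the table: "extract and summarize a PDF" has known steps and no retries, so it triages to a pipeline; "debug code with repeated test runs" needs iteration, so it triages to an agent.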
A good rule of thumb:
Do not add autonomy unless the task genuinely needs runtime decision-making.
That one idea saves a lot of unnecessary complexity.
Wrapping Up
An agent is not just an LLM with a fancy prompt.
It is a system that:
- observes state
- decides what to do next
- acts using tools or outputs
- loops until the goal is reached or it stops
That also means:
- not every LLM app is an agent
- pipelines and agents are different execution models
- the loop is the real heart of agentic systems
- sometimes the smartest move is not to build an agent at all
If you understand that distinction clearly, you already have a strong foundation.
→ Next up: The Cognitive Architecture of Agents