The Observation Processor
When an agent executes a tool, it receives raw output — JSON from APIs, HTML pages, database rows, log files, or search results. These outputs are often verbose, noisy, and contain far more information than the agent needs.
Feeding raw tool results directly into the LLM causes serious problems:
- Rapidly consumes context window tokens
- Introduces irrelevant noise that harms reasoning
- Makes multi-step planning unstable and expensive
The Observation Processor solves this by transforming raw tool outputs into clean, concise, and structured observations that are optimized for the agent’s reasoning loop.
The Role of Observations in the Agent Loop
A typical agent follows this cycle:
```
Observe → Reason → Plan → Act → Observe
```

The Observation Processor powers the Observe step. It takes the Execution Engine’s raw output and produces a high-quality observation that is added to the agent’s working memory.
```
Raw Tool Output
        ↓
Observation Processor (Parse → Filter → Summarize → Structure)
        ↓
Structured Observation
        ↓
Working Memory → Next Reasoning Step
```

High-quality observations are essential for reliable multi-step reasoning.
Raw Output vs Structured Observation
Raw Tool Output (Example)
```json
{
  "weather": {
    "location": "Tokyo",
    "temperature": 26.4,
    "humidity": 78,
    "pressure": 1008,
    "wind_speed": 12.3,
    "wind_direction": "NE",
    "sunrise": "04:52",
    "sunset": "18:37",
    "condition": "partly cloudy"
  }
}
```

Processed Observation
```
Observation: Current weather in Tokyo is 26°C with 78% humidity and partly cloudy conditions.
```

The processed version is dramatically shorter, focused, and reasoning-friendly.
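The savings are easy to quantify. A rough sketch, assuming the common ~4-characters-per-token heuristic (the exact ratio depends on the tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English/JSON text.
    return max(1, len(text) // 4)

raw = (
    '{"weather": {"location": "Tokyo", "temperature": 26.4, "humidity": 78, '
    '"pressure": 1008, "wind_speed": 12.3, "wind_direction": "NE", '
    '"sunrise": "04:52", "sunset": "18:37", "condition": "partly cloudy"}}'
)
processed = (
    "Observation: Current weather in Tokyo is 26°C with 78% humidity "
    "and partly cloudy conditions."
)

savings = 1 - estimate_tokens(processed) / estimate_tokens(raw)
print(f"Estimated token savings: {savings:.0%}")
```

Even on this small payload the processed observation costs roughly half the tokens; on paginated API responses or scraped pages the ratio is far larger.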
Core Responsibilities
The Observation Processor typically performs four key operations, often in combination:
- Parsing — Extract relevant fields from structured formats (JSON, XML, etc.)
- Filtering — Remove noise, metadata, logs, and irrelevant details
- Summarization / Compression — Reduce large outputs while preserving key information
- Structuring — Convert the result into a consistent, machine-readable format
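Taken together, the four operations form a small pipeline. A minimal sketch with hypothetical helper names (real processors are tool-specific):

```python
import json

def parse(raw: str) -> dict:
    # Parsing: turn the raw JSON string into a dict.
    return json.loads(raw)

def filter_fields(data: dict, keep: set) -> dict:
    # Filtering: drop everything except the fields we care about.
    return {k: v for k, v in data.items() if k in keep}

def summarize(fields: dict) -> str:
    # Summarization: collapse the kept fields into one short sentence.
    return "Observation: " + ", ".join(f"{k} is {v}" for k, v in fields.items()) + "."

def structure(fields: dict, obs_type: str) -> dict:
    # Structuring: wrap the result in a consistent, machine-readable envelope.
    return {"type": obs_type, "summary": summarize(fields), **fields}

raw = '{"location": "Tokyo", "temperature": 26.4, "humidity": 78, "pressure": 1008}'
obs = structure(filter_fields(parse(raw), {"location", "temperature"}), "weather_report")
print(obs["summary"])
```

The stages are deliberately composable: for a different tool you swap the parser and the keep-set while the envelope stays the same.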
Parsing Structured Outputs
Many tools already return structured data. The processor extracts only what matters.
```python
def process_weather(data: dict) -> str:
    weather = data.get("weather", {})
    return (
        f"Observation: Current temperature in {weather.get('location')} "
        f"is {weather.get('temperature')}°C with {weather.get('humidity')}% humidity."
    )
```

```rust
use serde_json::Value;

fn process_weather(data: &Value) -> String {
    let weather = &data["weather"];
    format!(
        "Observation: Current temperature in {} is {}°C with {}% humidity.",
        weather["location"].as_str().unwrap_or("unknown"),
        weather["temperature"].as_f64().unwrap_or(0.0),
        weather["humidity"].as_i64().unwrap_or(0)
    )
}
```

Summarization and Compression
For large outputs (search results, long documents, web pages), simple extraction is not enough — the processor must summarize.
Modern systems often use a smaller/faster LLM (or the same model with a tight prompt) for summarization.
Example prompt:
```
Summarize the following tool output in 1-2 sentences,
focusing only on information relevant to the current goal: "{goal}"
```

```python
def summarize_observation(raw_output: str, goal: str, llm) -> str:
    prompt = f"""
    Summarize this tool output in one concise paragraph.
    Focus only on details relevant to: {goal}

    Tool output:
    {raw_output}
    """
    return llm.generate(prompt)
```

```rust
fn summarize_observation(raw_output: &str, goal: &str, llm: &LLMClient) -> String {
    let prompt = format!(
        "Summarize this tool output in one concise paragraph.\n\
         Focus only on details relevant to: {}\n\nTool output:\n{}",
        goal, raw_output
    );
    llm.generate(&prompt)
}
```

Pro tip: Use token-aware compression and consider hierarchical summarization for very large outputs.
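Hierarchical summarization can be sketched as recursive chunk-and-merge. The chunker below uses a rough ~4-characters-per-token heuristic, and `summarize` is any single-call summarizer — for example a wrapper around the LLM call above:

```python
def chunk_by_tokens(text: str, max_tokens: int = 1000) -> list[str]:
    # Rough heuristic: ~4 characters per token; a real system would use the
    # model's tokenizer to count exactly.
    max_chars = max_tokens * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def hierarchical_summarize(raw_output: str, goal: str, summarize, max_tokens: int = 1000) -> str:
    """`summarize(text, goal)` is any single-call summarizer that shrinks its input."""
    # Base case: small enough to summarize in one call.
    if len(raw_output) <= max_tokens * 4:
        return summarize(raw_output, goal)
    # Recursive case: summarize each chunk, then summarize the joined summaries.
    partials = [summarize(c, goal) for c in chunk_by_tokens(raw_output, max_tokens)]
    return hierarchical_summarize("\n".join(partials), goal, summarize, max_tokens)
```

This handles outputs of arbitrary size in bounded-context calls, at the cost of some information loss at each level of the hierarchy.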
Filtering Noise
Tools often return debugging info, HTTP headers, metadata, or error traces alongside useful data. The processor must strip this away.
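For key-based noise, a simple deny-list is often enough. A sketch with hypothetical field names:

```python
# Hypothetical raw database-tool result: useful rows mixed with transport metadata.
raw_result = {
    "status": 200,
    "headers": {"x-request-id": "abc123", "content-type": "application/json"},
    "debug": "query took 12ms on shard 3",
    "rows": [
        {"user": "alice", "role": "admin"},
        {"user": "bob", "role": "viewer"},
        {"user": "carol", "role": "viewer"},
    ],
}

NOISE_KEYS = {"status", "headers", "debug", "trace", "meta"}

def filter_noise(result: dict) -> dict:
    # Keep only keys that carry task-relevant data.
    return {k: v for k, v in result.items() if k not in NOISE_KEYS}

rows = filter_noise(raw_result)["rows"]
admins = [r["user"] for r in rows if r["role"] == "admin"]
print(f"Observation: Found {len(rows)} matching records. "
      f"User '{admins[0]}' has admin privileges.")
```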
Example filtered observation:
```
Observation: Found 3 matching records. User 'alice' has admin privileges.
```

Structured Observations (Recommended)
Many production systems go beyond plain text and use structured observations:
```json
{
  "type": "weather_report",
  "location": "Tokyo",
  "temperature_c": 26,
  "humidity": 78,
  "condition": "partly_cloudy",
  "source": "weather_api"
}
```

Benefits:
- More reliable state tracking
- Easier to query or validate in working memory
- Reduces hallucination risk
- Enables better reflection in later stages
You can use Pydantic (Python) or Serde + typed structs (Rust) to enforce observation schemas.
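As a dependency-free sketch of the same idea, a validated dataclass enforces the schema above; a Pydantic model replaces the manual checks with declarative validators:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class WeatherObservation:
    location: str
    temperature_c: int
    humidity: int
    condition: str
    source: str
    type: str = "weather_report"

    def __post_init__(self):
        # Minimal range check; Pydantic would express this as a field validator.
        if not 0 <= self.humidity <= 100:
            raise ValueError(f"humidity out of range: {self.humidity}")

obs = WeatherObservation("Tokyo", 26, 78, "partly_cloudy", "weather_api")
print(asdict(obs))
```

Because construction fails loudly on malformed data, schema violations surface at the tool boundary instead of silently corrupting working memory.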
Integrating Observations into the Agent Loop
Processed observations are stored in working memory and become context for the next reasoning step.
Example trace:
Thought: I need current market trends for AI GPUs.
Action: web_search(query="AI GPU market 2025")
Observation: Market expected to grow 40% annually through 2028, driven by data center expansion.
Thought: Compare leading vendors...

This clean observe–reason–act cycle enables stable, long-horizon reasoning.
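The storage side can be sketched as a capped working memory that renders recent entries into the next prompt. Names are illustrative; real systems typically add token budgeting and relevance ranking on top:

```python
class WorkingMemory:
    """Keeps the most recent trace entries, evicting the oldest past a cap."""

    def __init__(self, max_items: int = 20):
        self.max_items = max_items
        self.entries: list[str] = []

    def add(self, entry: str) -> None:
        self.entries.append(entry)
        # Evict the oldest entries once the cap is exceeded.
        self.entries = self.entries[-self.max_items:]

    def render(self) -> str:
        # Rendered into the context for the next reasoning step.
        return "\n".join(self.entries)

memory = WorkingMemory(max_items=3)
memory.add("Thought: I need current market trends for AI GPUs.")
memory.add('Action: web_search(query="AI GPU market 2025")')
memory.add("Observation: Market expected to grow 40% annually through 2028.")
memory.add("Thought: Compare leading vendors...")
print(memory.render())  # the oldest entry has been evicted; only the last 3 remain
```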
Why the Observation Processor Matters
Without proper observation processing, even powerful Tool Managers and Execution Engines fail because the LLM gets overwhelmed by noise. A good Observation Processor is what turns raw tool results into actionable intelligence.
Looking Ahead
The Observation Processor acts as the agent’s “interpreter”, converting noisy machine outputs into clear reasoning inputs.
→ Continue to 2.8 — Reflection and Termination: How agents evaluate progress, detect errors, and decide when to stop.