Agent Loop

Every Heartbit agent runs a ReAct (Reason + Act) loop: the LLM generates a response, optionally calls tools, receives the results, and repeats until the task is complete or a limit is reached.

```
User message
     |
     v
+--------+     +------------+     +--------------+
|  LLM   | --> | Tool Calls | --> | Tool Results |
+--------+     +------------+     +--------------+
     ^                                    |
     +------------------------------------+
            (repeat until done)
```

Each iteration is a turn. On each turn:

  1. The full conversation history (system prompt + messages) is sent to the LLM.
  2. The LLM responds with text, tool calls, or both.
  3. If tool calls are present, they execute in parallel via tokio::JoinSet.
  4. Tool results are appended to the conversation as a new message.
  5. The loop continues with the next LLM call.

The loop ends when the LLM produces a response with no tool calls (stop reason: EndTurn), the turn limit is reached (MaxTurns), or the token budget is exhausted (MaxTokens).

AgentRunner<P> is the core type that implements the agent loop. It is generic over a provider P: LlmProvider.

```rust
use std::sync::Arc;
use heartbit::{
    AgentRunner, AnthropicProvider, BoxedProvider, RetryingProvider,
};

let provider = Arc::new(BoxedProvider::new(
    RetryingProvider::with_defaults(
        AnthropicProvider::new(api_key, "claude-sonnet-4-20250514"),
    ),
));

let mut agent = AgentRunner::builder(provider)
    .system_prompt("You are a helpful assistant.")
    .on_text(Arc::new(|text| print!("{text}")))
    .build()?;

let output = agent.execute("Analyze the Rust ecosystem").await?;

println!(
    "\nTokens: {} in / {} out",
    output.tokens_used.input_tokens,
    output.tokens_used.output_tokens
);
```

AgentRunner::builder(provider) returns an AgentRunnerBuilder with these options:

| Method | Description |
| --- | --- |
| `.system_prompt(s)` | Set the system prompt |
| `.tools(vec)` | Register `Vec<Arc<dyn Tool>>` |
| `.max_turns(n)` | Maximum turns before stopping (default: 10) |
| `.max_tokens(n)` | Max tokens per LLM response |
| `.max_total_tokens(n)` | Total token budget across all turns |
| `.guardrails(vec)` | Attach guardrails to the loop |
| `.memory(m)` | Enable the memory system |
| `.context_strategy(s)` | Set the context management strategy |
| `.on_text(cb)` | Streaming text callback |
| `.on_event(cb)` | Structured event callback |
| `.on_approval(cb)` | Human-in-the-loop approval callback |
| `.on_input(cb)` | Multi-turn input callback (for chat mode) |
| `.structured_schema(s)` | Force structured JSON output |

Every execute() call returns an AgentOutput:

```rust
pub struct AgentOutput {
    pub result: String,                  // Final text response
    pub tool_calls_made: usize,          // Total tool invocations
    pub tokens_used: TokenUsage,         // Input/output/cache token counts
    pub structured: Option<Value>,       // Structured output (if schema set)
    pub estimated_cost_usd: Option<f64>, // Estimated USD cost (known models)
}
```

The loop ends when the LLM returns StopReason::EndTurn (natural completion), StopReason::MaxTokens (token limit), or when max_turns is exceeded (returns Error::MaxTurnsExceeded).

TokenUsage accumulates across all turns in a run:

```rust
pub struct TokenUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub cache_creation_input_tokens: u32,
    pub cache_read_input_tokens: u32,
    pub reasoning_tokens: u32,
}
```

Cost estimation is available via estimate_cost(model, usage) which returns USD cost for known Claude models, accounting for cache read/write rates.

The on_text callback receives text deltas as they arrive from the LLM’s SSE stream. This provides real-time output without waiting for the full response:

```rust
let mut agent = AgentRunner::builder(provider)
    .on_text(Arc::new(|text| print!("{text}")))
    .build()?;
```

The on_event callback receives structured AgentEvent variants at key points in the loop:

  • RunStarted / RunCompleted / RunFailed — lifecycle boundaries
  • TurnStarted / LlmResponse — per-turn progress
  • ToolCallStarted / ToolCallCompleted — tool execution tracking
  • ApprovalRequested / ApprovalDecision — HITL interactions
  • ContextSummarized — context management actions
  • GuardrailDenied — guardrail interventions

Use --verbose in the CLI to emit events as JSON to stderr.