# Agent Loop
Every Heartbit agent runs a ReAct (Reason-Act) loop: the LLM generates a response, optionally calls tools, receives results, and repeats until the task is complete or a limit is reached.
## The Cycle

```
User message
     |
     v
+--------+     +------------+     +--------------+
|  LLM   | --> | Tool Calls | --> | Tool Results |
+--------+     +------------+     +--------------+
     ^                                    |
     +------------------------------------+
                (repeat until done)
```

Each iteration is a *turn*. On each turn:
- The full conversation history (system prompt + messages) is sent to the LLM.
- The LLM responds with text, tool calls, or both.
- If tool calls are present, they execute in parallel via `tokio::JoinSet`.
- Tool results are appended to the conversation as a new message.
- The loop continues with the next LLM call.
The loop ends when the LLM produces a response with no tool calls (stop reason: `EndTurn`), the turn limit is reached (`MaxTurns`), or the token budget is exhausted (`MaxTokens`).
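The interplay of these three stop conditions can be sketched as a simplified, self-contained loop. The types below are illustrative stand-ins, not heartbit's actual API:

```rust
// Simplified sketch of the agent loop's stop conditions.
// `FakeLlm`, `run_loop`, and this `StopReason` are stand-ins for illustration.

#[derive(Debug, PartialEq)]
enum StopReason {
    EndTurn,   // LLM responded with no tool calls
    MaxTurns,  // turn limit reached
    MaxTokens, // token budget exhausted
}

struct FakeLlm {
    tool_call_turns: u32, // how many turns will request tools before finishing
    tokens_per_turn: u32,
}

fn run_loop(llm: &FakeLlm, max_turns: u32, max_total_tokens: u32) -> StopReason {
    let mut tokens_used = 0;
    for turn in 0..max_turns {
        // Each turn sends the full history to the LLM and spends tokens.
        tokens_used += llm.tokens_per_turn;
        if tokens_used >= max_total_tokens {
            return StopReason::MaxTokens;
        }
        // A response with no tool calls ends the run naturally.
        if turn >= llm.tool_call_turns {
            return StopReason::EndTurn;
        }
        // Otherwise: execute tools, append results, continue.
    }
    StopReason::MaxTurns
}

fn main() {
    // Finishes naturally after 2 tool-calling turns.
    let fast = FakeLlm { tool_call_turns: 2, tokens_per_turn: 10 };
    assert_eq!(run_loop(&fast, 10, 1_000), StopReason::EndTurn);

    // Never stops calling tools, so the turn limit fires.
    let looping = FakeLlm { tool_call_turns: 99, tokens_per_turn: 10 };
    assert_eq!(run_loop(&looping, 10, 1_000), StopReason::MaxTurns);

    // Token budget is exhausted before either other condition.
    assert_eq!(run_loop(&looping, 10, 25), StopReason::MaxTokens);
    println!("ok");
}
```

Note the precedence in the sketch: the token budget is checked before the tool-call decision, so a run can end with `MaxTokens` even on a turn that would otherwise have completed.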
## AgentRunner

`AgentRunner<P>` is the core type that implements the agent loop. It is generic over a provider `P: LlmProvider`.
```rust
use std::sync::Arc;

use heartbit::{
    AnthropicProvider, BoxedProvider, RetryingProvider, AgentRunner,
};

let provider = Arc::new(BoxedProvider::new(
    RetryingProvider::with_defaults(
        AnthropicProvider::new(api_key, "claude-sonnet-4-20250514"),
    ),
));

let mut agent = AgentRunner::builder(provider)
    .system_prompt("You are a helpful assistant.")
    .on_text(Arc::new(|text| print!("{text}")))
    .build()?;

let output = agent.execute("Analyze the Rust ecosystem").await?;
println!(
    "\nTokens: {} in / {} out",
    output.tokens_used.input_tokens,
    output.tokens_used.output_tokens,
);
```

## Builder Configuration
`AgentRunner::builder(provider)` returns an `AgentRunnerBuilder` with these options:
| Method | Description |
|---|---|
| `.system_prompt(s)` | Set the system prompt |
| `.tools(vec)` | Register `Vec<Arc<dyn Tool>>` |
| `.max_turns(n)` | Maximum turns before stopping (default: 10) |
| `.max_tokens(n)` | Maximum tokens per LLM response |
| `.max_total_tokens(n)` | Total token budget across all turns |
| `.guardrails(vec)` | Attach guardrails to the loop |
| `.memory(m)` | Enable the memory system |
| `.context_strategy(s)` | Set the context management strategy |
| `.on_text(cb)` | Streaming text callback |
| `.on_event(cb)` | Structured event callback |
| `.on_approval(cb)` | Human-in-the-loop approval callback |
| `.on_input(cb)` | Multi-turn input callback (for chat mode) |
| `.structured_schema(s)` | Force structured JSON output |
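These options all follow the consuming-builder pattern common in Rust: each method takes `self` by value, sets one field, and returns `self`, so options chain fluently before `build()` validates the result. A minimal stand-in (not heartbit's actual types) might look like:

```rust
// Illustrative stand-in for the builder shape; `Builder` and `AgentConfig`
// are hypothetical, trimmed to three of the options from the table.

struct AgentConfig {
    system_prompt: String,
    max_turns: u32,
    max_total_tokens: Option<u32>,
}

struct Builder {
    system_prompt: String,
    max_turns: u32, // default: 10, matching the table above
    max_total_tokens: Option<u32>,
}

impl Builder {
    fn new() -> Self {
        Self { system_prompt: String::new(), max_turns: 10, max_total_tokens: None }
    }

    fn system_prompt(mut self, s: &str) -> Self {
        self.system_prompt = s.to_string();
        self
    }

    fn max_turns(mut self, n: u32) -> Self {
        self.max_turns = n;
        self
    }

    fn max_total_tokens(mut self, n: u32) -> Self {
        self.max_total_tokens = Some(n);
        self
    }

    fn build(self) -> Result<AgentConfig, String> {
        // `build()` is the one fallible step: validate before constructing.
        if self.max_turns == 0 {
            return Err("max_turns must be at least 1".into());
        }
        Ok(AgentConfig {
            system_prompt: self.system_prompt,
            max_turns: self.max_turns,
            max_total_tokens: self.max_total_tokens,
        })
    }
}

fn main() {
    let cfg = Builder::new()
        .system_prompt("You are a helpful assistant.")
        .max_turns(5)
        .max_total_tokens(50_000)
        .build()
        .unwrap();
    assert_eq!(cfg.max_turns, 5);
    assert_eq!(cfg.max_total_tokens, Some(50_000));
    println!("ok");
}
```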
## AgentOutput

Every `execute()` call returns an `AgentOutput`:

```rust
pub struct AgentOutput {
    pub result: String,                  // Final text response
    pub tool_calls_made: usize,          // Total tool invocations
    pub tokens_used: TokenUsage,         // Input/output/cache token counts
    pub structured: Option<Value>,       // Structured output (if schema set)
    pub estimated_cost_usd: Option<f64>, // Estimated USD cost (known models)
}
```

The loop ends when the LLM returns `StopReason::EndTurn` (natural completion) or `StopReason::MaxTokens` (token limit), or when `max_turns` is exceeded (which returns `Error::MaxTurnsExceeded`).
## Token Tracking

`TokenUsage` accumulates across all turns in a run:

```rust
pub struct TokenUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub cache_creation_input_tokens: u32,
    pub cache_read_input_tokens: u32,
    pub reasoning_tokens: u32,
}
```

Cost estimation is available via `estimate_cost(model, usage)`, which returns the USD cost for known Claude models, accounting for cache read/write rates.
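The accounting behind such an estimate can be sketched with per-million-token rates. The rates and the `estimate_cost_usd` helper below are hypothetical placeholders for illustration, not heartbit's pricing table:

```rust
// Sketch of per-token cost accounting in the spirit of estimate_cost.
// All rates here are hypothetical placeholders, not real model pricing.

struct TokenUsage {
    input_tokens: u32,
    output_tokens: u32,
    cache_creation_input_tokens: u32,
    cache_read_input_tokens: u32,
}

fn estimate_cost_usd(u: &TokenUsage) -> f64 {
    // Hypothetical USD rates per million tokens:
    let input = 3.0;
    let output = 15.0;
    let cache_write = 3.75; // cache creation is typically billed above the input rate
    let cache_read = 0.30;  // cache reads are typically billed well below it
    (u.input_tokens as f64 * input
        + u.output_tokens as f64 * output
        + u.cache_creation_input_tokens as f64 * cache_write
        + u.cache_read_input_tokens as f64 * cache_read)
        / 1_000_000.0
}

fn main() {
    let usage = TokenUsage {
        input_tokens: 10_000,
        output_tokens: 2_000,
        cache_creation_input_tokens: 0,
        cache_read_input_tokens: 50_000,
    };
    // 10k * $3 + 2k * $15 + 50k * $0.30, all per million tokens = $0.075.
    let cost = estimate_cost_usd(&usage);
    assert!((cost - 0.075).abs() < 1e-9);
    println!("estimated cost: ${cost:.4}");
}
```

The example also shows why cache hits matter for cost: the 50k cached input tokens contribute only $0.015 here, a fraction of what they would cost at the full input rate.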
## Streaming

The `on_text` callback receives text deltas as they arrive from the LLM's SSE stream. This provides real-time output without waiting for the full response:

```rust
let mut agent = AgentRunner::builder(provider)
    .on_text(Arc::new(|text| print!("{text}")))
    .build()?;
```

## Agent Events
The `on_event` callback receives structured `AgentEvent` variants at key points in the loop:

- `RunStarted` / `RunCompleted` / `RunFailed` — lifecycle boundaries
- `TurnStarted` / `LlmResponse` — per-turn progress
- `ToolCallStarted` / `ToolCallCompleted` — tool execution tracking
- `ApprovalRequested` / `ApprovalDecision` — HITL interactions
- `ContextSummarized` — context management actions
- `GuardrailDenied` — guardrail interventions
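An event handler typically dispatches on the variant. The enum below is a trimmed stand-in for illustration (the real `AgentEvent` has more variants and richer payloads, and the `web_search` tool name is made up):

```rust
// Trimmed, hypothetical stand-in for AgentEvent; not heartbit's definition.
enum AgentEvent {
    RunStarted,
    TurnStarted { turn: u32 },
    ToolCallStarted { tool: String },
    ToolCallCompleted { tool: String },
    RunCompleted,
}

// Render each event as a one-line human-readable description.
fn describe(event: &AgentEvent) -> String {
    match event {
        AgentEvent::RunStarted => "run started".to_string(),
        AgentEvent::TurnStarted { turn } => format!("turn {turn} started"),
        AgentEvent::ToolCallStarted { tool } => format!("tool `{tool}` started"),
        AgentEvent::ToolCallCompleted { tool } => format!("tool `{tool}` completed"),
        AgentEvent::RunCompleted => "run completed".to_string(),
    }
}

fn main() {
    let events = [
        AgentEvent::RunStarted,
        AgentEvent::TurnStarted { turn: 1 },
        AgentEvent::ToolCallStarted { tool: "web_search".into() },
        AgentEvent::ToolCallCompleted { tool: "web_search".into() },
        AgentEvent::RunCompleted,
    ];
    for e in &events {
        // Log to stderr so event output stays separate from the streamed text.
        eprintln!("{}", describe(e));
    }
    assert_eq!(describe(&AgentEvent::TurnStarted { turn: 1 }), "turn 1 started");
}
```

Logging events to stderr keeps them out of the agent's streamed stdout text, which is the same separation the CLI uses.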
Use `--verbose` in the CLI to emit events as JSON to stderr.