Observability

Heartbit emits structured events throughout agent execution, giving you full visibility into LLM calls, tool usage, orchestration decisions, and safety guardrails.

AgentEvent System

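Every significant action in the agent loop emits an AgentEvent, a tagged enum serialized as JSON with snake_case type discriminators. Most events carry an agent field that identifies the emitting agent in multi-agent runs; the exceptions are task_routed and the sensor events, as the field lists below show.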

Lifecycle Events

Event	Fields	Description
`run_started`	`agent`, `task`	Agent loop begins
`turn_started`	`agent`, `turn`, `max_turns`	New reasoning turn
`run_completed`	`agent`, `total_usage`, `tool_calls_made`	Successful completion
`run_failed`	`agent`, `error`, `partial_usage`	Failure with partial token usage

LLM Events

Event	Fields	Description
`llm_response`	`agent`, `turn`, `usage`, `stop_reason`, `tool_call_count`, `text`, `latency_ms`, `model`, `time_to_first_token_ms`	LLM response with TTFT tracking
`retry_attempt`	`agent`, `attempt`, `max_retries`, `delay_ms`, `error_class`	Retry scheduled; emitted before sleeping for `delay_ms`
`model_escalated`	`agent`, `from_tier`, `to_tier`, `reason`	Cascade escalation between model tiers

The llm_response event reports wall-clock latency in latency_ms and time to first token (TTFT) in time_to_first_token_ms. TTFT is measured only for streaming responses and is 0 for non-streaming ones. The model field is present when using cascading providers.

Tool Events

Event	Fields	Description
`tool_call_started`	`agent`, `tool_name`, `tool_call_id`, `input`	Tool execution begins
`tool_call_completed`	`agent`, `tool_name`, `tool_call_id`, `is_error`, `duration_ms`, `output`	Tool execution finished

Approval Events

Event	Fields	Description
`approval_requested`	`agent`, `turn`, `tool_names`	Human-in-the-loop prompt sent
`approval_decision`	`agent`, `turn`, `approved`	Human approval response received

Orchestration Events

Event	Fields	Description
`sub_agents_dispatched`	`agent`, `agents`	Sub-agents dispatched by orchestrator
`sub_agent_completed`	`agent`, `success`, `usage`	Sub-agent finished
`agent_spawned`	`agent`, `spawned_name`, `tools`, `task`	Dynamic agent created at runtime
`task_routed`	`decision`, `reason`, `selected_agent`, `complexity_score`, `escalated`	Routing decision by complexity analyzer

Safety Events

Event	Fields	Description
`guardrail_denied`	`agent`, `hook`, `reason`, `tool_name`	Guardrail blocked an operation
`guardrail_warned`	`agent`, `hook`, `reason`, `tool_name`	Guardrail warned but allowed
`budget_exceeded`	`agent`, `used`, `limit`, `partial_usage`	Token budget exceeded

The hook field indicates which guardrail hook triggered: "post_llm", "pre_tool", or "post_tool". The tool_name field is present for the tool-level hooks ("pre_tool" and "post_tool").

Context Events

Event	Fields	Description
`context_summarized`	`agent`, `turn`, `usage`	Context compacted at threshold
`auto_compaction_triggered`	`agent`, `turn`, `success`, `usage`	Overflow recovery attempt
`doom_loop_detected`	`agent`, `turn`, `consecutive_count`, `tool_names`	Stuck loop detected
`session_pruned`	`agent`, `turn`, `tool_results_pruned`, `bytes_saved`, `tool_results_total`	Old tool results pruned

Sensor Events

Event	Fields	Description
`sensor_event_processed`	`sensor_name`, `decision`, `priority`, `story_id`	Sensor triage decision
`story_updated`	`story_id`, `subject`, `event_count`, `priority`	Story correlation update

OnEvent Callback

Wire event handling via the builder:

use std::sync::Arc;

use heartbit::AgentRunner;
use heartbit::agent::AgentEvent;

let runner = AgentRunner::builder(provider)
    .name("researcher")
    .system_prompt("You are a researcher.")
    // Print each event to stderr as one JSON line; skip events that fail to serialize.
    .on_event(Arc::new(|event: AgentEvent| {
        if let Ok(json) = serde_json::to_string(&event) {
            eprintln!("{json}");
        }
    }))
    .build()?;

The callback type is dyn Fn(AgentEvent) + Send + Sync, passed to the builder wrapped in an Arc. Keep handlers fast: a slow handler blocks the agent loop.
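
One way to keep a handler fast is to have it only enqueue the event and let a background thread do the slow work (writing to disk, shipping to a log service). The sketch below uses only the standard library. Event is a hypothetical stand-in for heartbit::agent::AgentEvent, and spawn_event_sink is an illustrative helper, not part of Heartbit.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for `heartbit::agent::AgentEvent`.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub kind: String,
    pub agent: String,
}

/// Same shape as the callback described above: `Arc<dyn Fn(..) + Send + Sync>`.
pub type EventCallback = Arc<dyn Fn(Event) + Send + Sync>;

/// Spawns a worker thread that runs `handle` for every event, and returns a
/// callback that only pushes the event onto a channel.
pub fn spawn_event_sink<F>(mut handle: F) -> (EventCallback, JoinHandle<()>)
where
    F: FnMut(Event) + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<Event>();

    // The worker drains the channel until every sender has been dropped.
    let worker = thread::spawn(move || {
        for event in rx {
            handle(event);
        }
    });

    // The Mutex keeps the closure `Sync` regardless of toolchain version.
    let tx = Mutex::new(tx);
    let callback: EventCallback = Arc::new(move |event: Event| {
        if let Ok(tx) = tx.lock() {
            // A send error only means the worker has exited; drop the event.
            let _ = tx.send(event);
        }
    });

    (callback, worker)
}
```

Dropping every clone of the callback closes the channel, after which the worker finishes the queued events and exits. The channel here is unbounded, so a consumer that falls far behind will grow memory; a bounded channel that drops events when full is an alternative trade-off.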

Payload Truncation

Event payloads (LLM text, tool input/output) are truncated at 64 KiB (EVENT_MAX_PAYLOAD_BYTES = 65536). Truncated strings end with a suffix like [truncated: 1234 bytes omitted]. Truncation respects UTF-8 character boundaries, so a multi-byte character is never split.
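
The truncation rule can be illustrated with a short standard-library sketch. This is not Heartbit's implementation: truncate_payload is a hypothetical function, and it assumes the limit applies to the retained content (the suffix is added on top) with no separator before the suffix.

```rust
/// Limit named in the text above.
pub const EVENT_MAX_PAYLOAD_BYTES: usize = 65536;

/// Keeps at most `max_bytes` bytes of `payload`, cutting only on a UTF-8
/// character boundary, and appends a note with the number of omitted bytes.
pub fn truncate_payload(payload: &str, max_bytes: usize) -> String {
    if payload.len() <= max_bytes {
        return payload.to_string();
    }

    // Walk back from the limit until the cut lands on a character boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut cut = max_bytes;
    while !payload.is_char_boundary(cut) {
        cut -= 1;
    }

    let omitted = payload.len() - cut;
    format!("{}[truncated: {} bytes omitted]", &payload[..cut], omitted)
}
```

Slicing a &str at an arbitrary byte index panics if the index falls inside a multi-byte character, which is why the cut point is moved back to a boundary first.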

CLI Integration

Use --verbose or -v to emit events as JSON to stderr:

heartbit run --config heartbit.toml --verbose "Analyze this codebase"

Each event is a single JSON line, suitable for piping to jq or log aggregators.

Observability Modes

Control verbosity via the HEARTBIT_OBSERVABILITY environment variable, the [telemetry] config section, or the builder API. When more than one source is set, the highest-priority one wins:

1. HEARTBIT_OBSERVABILITY env var (highest)
2. [telemetry] observability_mode in config TOML
3. AgentRunnerBuilder::observability_mode() / OrchestratorBuilder::observability_mode()
4. Default: production

Mode	Span Data	Metrics	Payloads
`production`	Names + durations only	No	No
`analysis`	Names + durations	Tokens, latencies, costs, stop reasons	No
`debug`	Names + durations	Tokens, latencies, costs, stop reasons	Full (truncated to 4KB)

HEARTBIT_OBSERVABILITY=debug heartbit run --config heartbit.toml "debug this"

Or in config:

[telemetry]
observability_mode = "analysis"
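
The precedence order listed earlier in this section can be sketched as a small resolver. This is an illustration, not Heartbit's code: the ObservabilityMode enum and resolve_mode function are hypothetical, and two behaviors are assumptions of the sketch, namely that values are matched case-insensitively and that an unrecognized value falls through to the next source.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObservabilityMode {
    Production,
    Analysis,
    Debug,
}

impl ObservabilityMode {
    /// Parses a mode name; returns None for unrecognized input (assumed behavior).
    pub fn parse(s: &str) -> Option<Self> {
        match s.trim().to_ascii_lowercase().as_str() {
            "production" => Some(Self::Production),
            "analysis" => Some(Self::Analysis),
            "debug" => Some(Self::Debug),
            _ => None,
        }
    }
}

/// Resolves the effective mode: env var, then config, then builder, then default.
pub fn resolve_mode(
    env_value: Option<&str>,
    config_value: Option<&str>,
    builder_value: Option<ObservabilityMode>,
) -> ObservabilityMode {
    env_value
        .and_then(ObservabilityMode::parse)
        .or_else(|| config_value.and_then(ObservabilityMode::parse))
        .or(builder_value)
        .unwrap_or(ObservabilityMode::Production)
}
```

In a real program the first argument could come from std::env::var("HEARTBIT_OBSERVABILITY").ok().as_deref(), so an unset variable simply defers to the config file.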

OpenTelemetry Integration

Heartbit supports optional OTLP export for distributed tracing. Configure via the [telemetry] section:

[telemetry]
enabled = true
endpoint = "http://localhost:4317"
service_name = "heartbit-agent"
observability_mode = "analysis"

Span attributes follow the OpenTelemetry GenAI Semantic Conventions (v1.38.0), so OTel-compatible backends (Jaeger, Grafana Tempo, Honeycomb) render agent traces with standard attribute names.

The init_tracing_from_config() function wires telemetry for all CLI commands (run, chat, serve).