
# Configuration Reference

Heartbit reads its configuration from a TOML file. Pass one with `--config heartbit.toml`, or place `heartbit.toml` in the working directory.

```toml
[provider]
name = "anthropic"        # or "openrouter"
model = "claude-sonnet-4-20250514"
prompt_caching = true     # Anthropic only; default false

[provider.retry]          # optional: retry transient failures
max_retries = 3
base_delay_ms = 500
max_delay_ms = 30000
```

Transient failures (HTTP status 429, 500, 502, 503, or 529, or a network error) are retried with exponential backoff: with the defaults above, delays start at `base_delay_ms` (500 ms) and roughly double per attempt, capped at `max_delay_ms` (30 s).

```toml
[provider.cascade]        # optional: try cheaper models first
enabled = true

[[provider.cascade.tiers]]
model = "anthropic/claude-3.5-haiku"   # cheapest tier tried first

[provider.cascade.gate]
type = "heuristic"               # escalate if response is low-quality
min_output_tokens = 10           # escalate on very short responses
accept_tool_calls = false        # escalate if cheap model wants to use tools
escalate_on_max_tokens = false   # escalate on max_tokens stop reason
```

The cascading provider tries the cheapest model first and escalates to more expensive tiers when the heuristic gate rejects the response.
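The `tiers` array can list multiple models in escalation order. A sketch with two tiers (the second model ID is illustrative, not taken from this reference):

```toml
[provider.cascade]
enabled = true

[[provider.cascade.tiers]]
model = "anthropic/claude-3.5-haiku"   # cheapest, tried first

[[provider.cascade.tiers]]
model = "anthropic/claude-sonnet-4"    # tried if the gate rejects the first response

[provider.cascade.gate]
type = "heuristic"
min_output_tokens = 10
```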

```toml
[orchestrator]
max_turns = 10
max_tokens = 4096
run_timeout_seconds = 300    # wall-clock deadline for the entire run
routing = "auto"             # "auto", "always_orchestrate", or "single_agent"
dispatch_mode = "parallel"   # "parallel" or "sequential" (sub-agent dispatch)
reasoning_effort = "high"    # "high", "medium", "low", or "none"
tool_profile = "standard"    # "conversational", "standard", or "full"
```
| Field | Default | Description |
|---|---|---|
| `max_turns` | `10` | Maximum reasoning turns for the orchestrator |
| `max_tokens` | `4096` | Maximum tokens per LLM response |
| `run_timeout_seconds` | *(unset)* | Wall-clock deadline for the entire run |
| `routing` | `"auto"` | `auto` selects single-agent when only one agent is defined; `always_orchestrate` forces the orchestrator; `single_agent` skips orchestration |
| `dispatch_mode` | `"parallel"` | How sub-agents are dispatched: `parallel` (concurrent via `tokio::JoinSet`) or `sequential` |
| `reasoning_effort` | *(unset)* | Controls extended thinking: `high`, `medium`, `low`, or `none` |
| `tool_profile` | *(unset)* | Pre-filters tools before each turn: `conversational` (minimal), `standard` (default set), `full` (all tools) |
```toml
[[agents]]
name = "researcher"
description = "Research specialist"
system_prompt = "You are a research specialist."
mcp_servers = ["http://localhost:8000/mcp"]

# All optional:
max_turns = 20                # override orchestrator default
max_tokens = 16384
tool_timeout_seconds = 60
max_tool_output_bytes = 16384
run_timeout_seconds = 120     # per-agent wall-clock deadline
summarize_threshold = 80000
reasoning_effort = "medium"   # per-agent override
tool_profile = "full"         # per-agent override
context_strategy = { type = "sliding_window", max_tokens = 100000 }
# context_strategy = { type = "summarize", threshold = 80000 }
# context_strategy = { type = "unlimited" }

[agents.session_prune]        # optional: trim old tool results before LLM calls
keep_recent_n = 2             # keep N most recent message pairs at full fidelity
pruned_tool_result_max_bytes = 200   # truncate older tool results to this size
preserve_task = true          # keep the first user message (task) intact
```
`mcp_servers` entries can be plain URLs or tables with an authentication header:

```toml
# Simple URL
mcp_servers = ["http://localhost:8000/mcp"]

# With authentication header
mcp_servers = [{ url = "http://localhost:8000/mcp", auth_header = "Bearer tok_xxx" }]
```
```toml
[agents.provider]
name = "anthropic"
model = "claude-opus-4-20250514"
prompt_caching = true
```

Each sub-agent can use a different LLM model by specifying its own [agents.provider] section.

```toml
[agents.response_schema]
type = "object"

[agents.response_schema.properties.score]
type = "number"

[agents.response_schema.properties.summary]
type = "string"
```

When `response_schema` is set, a synthetic `__respond__` tool is injected and the agent produces structured JSON via the tool call. For the schema above, the tool-call arguments would look something like `{"score": 0.87, "summary": "..."}` (values illustrative).

| Strategy | Description |
|---|---|
| `unlimited` | No trimming (default) |
| `sliding_window` | Keep system prompt + recent messages within the `max_tokens` budget |
| `summarize` | LLM-generated summary when context exceeds `threshold` |
```toml
[memory]
type = "in_memory"             # or: type = "postgres", database_url = "..."

[memory.embedding]             # optional: enables hybrid retrieval (BM25 + vector)
provider = "local"             # "openai", "local", or "none" (default)
model = "all-MiniLM-L6-v2"     # model name (provider-specific)
cache_dir = "/tmp/fastembed"   # local provider only: model cache directory
# api_key_env = "OPENAI_API_KEY"   # openai provider only
```

Memory backends:

| Backend | Description |
|---|---|
| `in_memory` | In-process memory store (lost on restart) |
| `postgres` | PostgreSQL with pgvector for persistent memory |
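A minimal persistent setup might look like this (the connection string is illustrative):

```toml
[memory]
type = "postgres"
database_url = "postgres://heartbit:secret@localhost:5432/heartbit"
```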

Embedding providers enable hybrid retrieval (BM25 keyword scoring + vector cosine similarity fused via RRF):

| Provider | Description |
|---|---|
| `none` | BM25 only (default) |
| `local` | Offline ONNX embeddings via fastembed (requires the `local-embedding` feature) |
| `openai` | OpenAI embeddings API |
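Reciprocal rank fusion combines the BM25 and vector rankings by summing reciprocal ranks. The standard formulation is sketched below; the smoothing constant $k$ is an assumption here (60 is a common default), and Heartbit's exact choice is not documented in this section:

$$
\mathrm{RRF}(d) = \sum_{r \,\in\, \{\mathrm{BM25},\ \mathrm{vector}\}} \frac{1}{k + \mathrm{rank}_r(d)}
$$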
```toml
[knowledge]
chunk_size = 1000     # max bytes per chunk (default: 1000)
chunk_overlap = 200   # overlap bytes between chunks (default: 200)

[[knowledge.sources]]
type = "file"
path = "README.md"

[[knowledge.sources]]
type = "glob"
pattern = "docs/**/*.md"

[[knowledge.sources]]
type = "url"
url = "https://docs.example.com/api"
```

Knowledge base sources are loaded at startup. Files are split into overlapping chunks using paragraph-aware splitting: with the defaults, a 2,400-byte document would yield chunks covering roughly bytes 0–1000, 800–1800, and 1600–2400, with exact boundaries adjusted to paragraph breaks.

```toml
[restate]
endpoint = "http://localhost:9070"
```

Restate endpoint for durable execution. Requires the `restate` feature flag.

```toml
[daemon]
bind = "127.0.0.1:3000"    # HTTP API bind address
max_concurrent_tasks = 4   # bounded concurrency

[daemon.auth]
bearer_tokens = ["$YOUR_API_KEY"]   # static API keys (multiple for rotation)
jwks_url = "https://idp.example.com/.well-known/jwks.json"   # JWT/JWKS auth
issuer = "https://idp.example.com"   # optional: validate iss claim
audience = "heartbit-daemon"         # optional: validate aud claim
# user_id_claim = "sub"      # JWT claim for user ID (default: "sub")
# tenant_id_claim = "tid"    # JWT claim for tenant ID (default: "tid")
# roles_claim = "roles"      # JWT claim for roles (default: "roles")
```

Supports both static bearer tokens and JWT/JWKS authentication. Multiple bearer tokens can be configured for key rotation.

```toml
[daemon.kafka]
brokers = "localhost:9092"
consumer_group = "heartbit-daemon"   # default
commands_topic = "heartbit.commands"
events_topic = "heartbit.events"

[[daemon.schedules]]
name = "daily-review"
cron = "0 0 9 * * *"   # 6-field cron (sec min hr dom mon dow)
task = "Review yesterday's work"
```

Uses 6-field cron expressions: second minute hour day-of-month month day-of-week.
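For reference, two more 6-field expressions (schedule names and tasks here are illustrative):

```toml
[[daemon.schedules]]
name = "quarter-hourly-sync"
cron = "0 */15 * * * *"   # on second 0, every 15 minutes
task = "Sync pending items"

[[daemon.schedules]]
name = "nightly-digest"
cron = "0 30 23 * * *"    # 23:30:00 every day
task = "Summarize today's events"
```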

```toml
[telemetry]
otlp_endpoint = "http://localhost:4317"
service_name = "heartbit"
```

OpenTelemetry tracing via OTLP exporter. Traces are exported to the configured endpoint.

A minimal end-to-end configuration with two sub-agents:

```toml
[provider]
name = "anthropic"
model = "claude-sonnet-4-20250514"

[[agents]]
name = "researcher"
description = "Research specialist"
system_prompt = "You are a research specialist."

[[agents]]
name = "writer"
description = "Writing specialist"
system_prompt = "You are a writing specialist."
```