Providers

Heartbit communicates with LLMs through the LlmProvider trait. Providers are composable — you wrap a base provider with retry logic, cascading, or other behaviors.

The LlmProvider trait defines two methods:

  • complete() — send a completion request and receive the full response
  • stream_complete() — send a completion request and receive an SSE stream of deltas

Both accept messages, tools, system prompt, and configuration (max tokens, temperature, tool choice).

Anthropic

Direct integration with the Anthropic API. Supports Claude model families with native SSE streaming.

use heartbit::AnthropicProvider;
let provider = AnthropicProvider::new(api_key, "claude-sonnet-4-20250514");

Enable prompt caching to reduce costs on repeated conversations:

let provider = AnthropicProvider::new(api_key, "claude-sonnet-4-20250514")
    .with_prompt_caching();

Prompt caching places three cache breakpoints: on the system prompt, on the last tool definition, and on the second-to-last user message.
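To make the third breakpoint concrete, here is an illustrative helper that locates the second-to-last user message in a list of message roles. The function name and approach are hypothetical, not Heartbit's actual implementation:

```rust
/// Illustrative sketch: find the index of the second-to-last user message,
/// one of the three cache breakpoints described above.
fn second_to_last_user(roles: &[&str]) -> Option<usize> {
    roles
        .iter()
        .enumerate()
        .rev() // walk from the newest message backwards
        .filter(|(_, role)| **role == "user")
        .nth(1) // skip the last user message, take the one before it
        .map(|(index, _)| index)
}
```

With fewer than two user messages in the conversation, no such breakpoint exists and the helper returns `None`.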

OpenRouter

Routes requests through OpenRouter, supporting a wide range of models. Translates between OpenAI-format SSE and Heartbit's internal format.

use heartbit::OpenRouterProvider;
let provider = OpenRouterProvider::new(api_key, "anthropic/claude-sonnet-4-20250514");
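The SSE side of that translation comes down to pulling JSON payloads out of `data:` lines and recognizing the `[DONE]` sentinel. A minimal sketch of such parsing (not the actual OpenRouterProvider code):

```rust
/// Extract the JSON payload from an OpenAI-style SSE line.
/// Returns None for non-data lines (e.g. comments) and for the
/// "[DONE]" end-of-stream sentinel. Illustrative only.
fn sse_payload(line: &str) -> Option<&str> {
    let payload = line.strip_prefix("data: ")?;
    if payload == "[DONE]" {
        None
    } else {
        Some(payload)
    }
}
```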

Composition

Providers compose by wrapping. The typical stack looks like:

CascadingProvider (optional)
-> RetryingProvider
-> AnthropicProvider / OpenRouterProvider

RetryingProvider

Wraps any provider with exponential backoff retry on transient failures:

use heartbit::RetryingProvider;
let provider = RetryingProvider::with_defaults(
    AnthropicProvider::new(api_key, "claude-sonnet-4-20250514"),
);

Retries on HTTP 429 (rate limit), 500, 502, 503, and 529, as well as on network errors.
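As a sketch of that policy (illustrative; the real RetryingProvider's internals may differ, and jitter is omitted for brevity):

```rust
/// Whether an HTTP status is worth retrying, per the list above.
fn is_retryable(status: u16) -> bool {
    matches!(status, 429 | 500 | 502 | 503 | 529)
}

/// Exponential backoff: base delay doubled on each attempt,
/// with the shift capped to avoid overflow on large attempt counts.
fn backoff_ms(base_delay_ms: u64, attempt: u32) -> u64 {
    base_delay_ms.saturating_mul(1u64 << attempt.min(16))
}
```

With the default `base_delay_ms = 1000` from the configuration below, attempts wait roughly 1 s, 2 s, 4 s, and so on.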

CascadingProvider

Tries cheaper models first and escalates to more expensive ones when a confidence gate rejects the response:

use heartbit::CascadingProvider;

Configured via TOML:

[provider.cascade]
enabled = true
tiers = [
  { model = "claude-haiku-4-5-20251001" },
  { model = "claude-sonnet-4-20250514" },
]

The ConfidenceGate trait evaluates responses. The built-in HeuristicGate checks for:

  • Refusal patterns in the response
  • Minimum token thresholds
  • Tool call acceptance
  • MaxTokens stop reason (response may be truncated)

Non-final tiers use complete() even for streaming requests to avoid streaming tokens that get discarded on escalation.
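A minimal sketch of a heuristic check along the lines listed above (the function name, refusal patterns, and threshold are hypothetical, not HeuristicGate's actual rules; tool-call acceptance is omitted for brevity):

```rust
/// Hypothetical confidence check mirroring the criteria above:
/// reject refusals, too-short answers, and truncated (max-tokens) stops.
fn accept(text: &str, token_count: usize, hit_max_tokens: bool) -> bool {
    const MIN_TOKENS: usize = 8; // hypothetical threshold
    let refused = ["I can't", "I cannot", "I'm unable to"]
        .iter()
        .any(|pattern| text.contains(pattern));
    !refused && token_count >= MIN_TOKENS && !hit_max_tokens
}
```

When the gate returns false, the cascade escalates the same request to the next, more capable tier.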

BoxedProvider

BoxedProvider provides object-safe wrapping for providers. Since LlmProvider uses async methods (RPITIT), it can't be used as a trait object directly. BoxedProvider bridges this gap:

use std::sync::Arc;
use heartbit::BoxedProvider;

let provider = Arc::new(BoxedProvider::new(
    RetryingProvider::with_defaults(
        AnthropicProvider::new(api_key, "claude-sonnet-4-20250514"),
    ),
));

This is the standard way to pass providers to AgentRunner and Orchestrator.

Tool choice

Control how the LLM selects tools:

Variant        Behavior
Auto           LLM decides whether to use tools (default)
Any            LLM must use at least one tool
Tool { name }  LLM must use a specific tool
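These variants could map onto a Rust enum along the following lines (a sketch; Heartbit's exact type definition and wire format may differ):

```rust
/// Sketch of a tool-choice enum matching the variants above.
#[derive(Debug, PartialEq)]
enum ToolChoice {
    Auto,
    Any,
    Tool { name: String },
}

/// How each variant might serialize for the provider API
/// (the "tool:" prefix here is purely illustrative).
fn wire_name(choice: &ToolChoice) -> String {
    match choice {
        ToolChoice::Auto => "auto".to_string(),
        ToolChoice::Any => "any".to_string(),
        ToolChoice::Tool { name } => format!("tool:{name}"),
    }
}
```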
Configuration

[provider]
name = "anthropic" # or "openrouter"
model = "claude-sonnet-4-20250514"
api_key_env = "ANTHROPIC_API_KEY" # env var for API key
max_tokens = 16384
prompt_caching = true

[provider.retry]
max_retries = 3
base_delay_ms = 1000

[provider.cascade]
enabled = true
tiers = [
{ model = "claude-haiku-4-5-20251001" },
{ model = "claude-sonnet-4-20250514" },
]