Guardrails
Guardrails intercept the agent loop at key points to enforce safety policies, content restrictions, and tool usage rules. They run in the standalone execution path.
The Guardrail Trait
The Guardrail trait provides four async hooks:
| Hook | When it runs | What it receives |
|---|---|---|
| pre_llm | Before each LLM call | &mut CompletionRequest (mutable) |
| post_llm | After each LLM response | &CompletionResponse |
| pre_tool | Before each tool execution | &ToolCall (name and input) |
| post_tool | After each tool execution | &ToolCall and &mut ToolOutput |
post_llm and pre_tool return a GuardAction:
- Allow: continue execution
- Deny { reason }: block the action with a reason message
- Warn { reason }: allow but emit AgentEvent::GuardrailWarned
pre_llm returns Result<(), Error> rather than a GuardAction: it can mutate the request, but it cannot return Deny or Warn.
post_tool returns Result<(), Error> — it can mutate the output directly but cannot deny.
Multiple guardrails are evaluated in order. First Deny wins — if any guardrail denies, the action is blocked.
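The ordering rule can be modelled in a few lines. The following is a minimal sketch, not the crate's implementation: `GuardAction` is redefined locally as a simplified stand-in, `combine` is a hypothetical helper, and stopping at the first Deny without consulting later guardrails is an assumption about how "first Deny wins" is applied.

```rust
/// Simplified stand-in for the crate's `GuardAction` (assumption: the real
/// type may carry more data or derive different traits).
#[derive(Debug, Clone, PartialEq)]
pub enum GuardAction {
    Allow,
    Deny { reason: String },
    Warn { reason: String },
}

/// Hypothetical helper: folds the verdicts of several guardrails, taken in
/// registration order, into one outcome plus the warnings seen on the way.
pub fn combine(
    verdicts: impl IntoIterator<Item = GuardAction>,
) -> (GuardAction, Vec<String>) {
    let mut warnings = Vec::new();
    for verdict in verdicts {
        match verdict {
            // First Deny wins: return immediately and ignore later verdicts
            // (assumed short-circuit behaviour).
            GuardAction::Deny { reason } => {
                return (GuardAction::Deny { reason }, warnings);
            }
            // A warning does not block; record it and keep going.
            GuardAction::Warn { reason } => warnings.push(reason),
            GuardAction::Allow => {}
        }
    }
    (GuardAction::Allow, warnings)
}
```

Passing Allow, Warn, Deny, Deny in that order yields the first Deny together with the one warning collected before it; the second Deny is never reached.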
When post_llm denies a response, a synthetic assistant placeholder is inserted before the denial feedback to maintain alternating message roles.
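To see why the placeholder is needed, consider a history that must alternate between user and assistant turns. The sketch below is illustrative only: `Role`, `Message`, and `push_denial` are hypothetical names, the placeholder and feedback wording is invented, and it assumes the denied response is not stored and the denial feedback is delivered as a user-role message.

```rust
/// Simplified message types for illustration (hypothetical, not the crate's own).
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub text: String,
}

/// Hypothetical helper: records a `post_llm` denial in the history.
/// The denied response is dropped; a placeholder takes its assistant slot so
/// that the user-role feedback does not directly follow another user message.
pub fn push_denial(history: &mut Vec<Message>, reason: &str) {
    history.push(Message {
        role: Role::Assistant,
        // Placeholder wording is an assumption.
        text: "[response withheld by guardrail]".to_string(),
    });
    history.push(Message {
        role: Role::User,
        // Feedback wording is an assumption.
        text: format!("Your previous response was blocked: {reason}"),
    });
}
```

Without the placeholder, the feedback would sit directly after the preceding user message, which providers that require strict role alternation may reject.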
Registering Guardrails

    use std::sync::Arc;

    let agent = AgentRunner::builder(provider)
        .guardrails(vec![
            Arc::new(content_fence),
            Arc::new(tool_policy),
        ])
        .build()?;

Built-in Guardrails
ContentFenceGuardrail
Blocks LLM responses containing forbidden content patterns. Configure with regex patterns or keyword lists.
InjectionClassifierGuardrail
Detects prompt injection attempts in user input and tool outputs. Helps protect against indirect prompt injection via tool results.
PiiGuardrail
Identifies and blocks personally identifiable information (PII) from appearing in agent responses.
ToolPolicyGuardrail
Enforces per-tool access policies. Define which tools are allowed, denied, or require approval:

    use heartbit::{GuardAction, ToolPolicyGuardrail, ToolRule};

    let policy = ToolPolicyGuardrail::new(
        vec![
            ToolRule {
                tool_pattern: "bash".into(),
                action: GuardAction::Deny { reason: "Blocked".into() },
                input_constraints: vec![],
            },
            ToolRule {
                tool_pattern: "read".into(),
                action: GuardAction::Allow,
                input_constraints: vec![],
            },
        ],
        GuardAction::Allow, // default action for unmatched tools
    );

Rules are evaluated in order, and the first match wins. If no rule matches, the default_action is used.
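The first-match-wins lookup can be sketched as follows. This is a simplified model under stated assumptions, not the crate's code: the types are local stand-ins, `resolve` is a hypothetical function, `tool_pattern` is compared by exact string equality (the real pattern syntax may differ), and `input_constraints` is left out entirely.

```rust
/// Simplified stand-in for the crate's `GuardAction`.
#[derive(Debug, Clone, PartialEq)]
pub enum GuardAction {
    Allow,
    Deny { reason: String },
    Warn { reason: String },
}

/// Simplified stand-in for `ToolRule`, without `input_constraints`.
pub struct ToolRule {
    pub tool_pattern: String,
    pub action: GuardAction,
}

/// Hypothetical lookup: returns the action of the first rule whose pattern
/// matches the tool name, or the default action when no rule matches.
pub fn resolve(
    rules: &[ToolRule],
    default_action: &GuardAction,
    tool_name: &str,
) -> GuardAction {
    rules
        .iter()
        // Assumption: exact match; the real matcher may support wildcards.
        .find(|rule| rule.tool_pattern == tool_name)
        .map(|rule| rule.action.clone())
        .unwrap_or_else(|| default_action.clone())
}
```

Because the search stops at the first hit, a broad rule placed early shadows any more specific rule for the same tool that comes later, so specific rules belong at the top of the list.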
LlmJudgeGuardrail
Uses a separate (typically cheaper) LLM to evaluate agent responses against safety criteria:

    use std::time::Duration;

    use heartbit::LlmJudgeGuardrail;

    let judge = LlmJudgeGuardrail::builder(judge_provider)
        .criterion("Response must not contain harmful instructions")
        .criterion("Response must stay on topic")
        .timeout(Duration::from_secs(10))
        .evaluate_tool_inputs(true) // also check tool inputs
        .build()?;

The judge returns a verdict:
- VERDICT: SAFE (allow the response)
- VERDICT: UNSAFE: reason (deny with explanation)
- VERDICT: WARN: reason (allow but flag)
The judge fails open: if the judge times out or errors, the response is allowed. This keeps a judge outage from blocking the agent, but it also means responses go unchecked while the judge is unavailable.
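The verdict format and the fail-open rule can be illustrated with a small parser. This is a sketch under assumptions rather than the crate's parser: `parse_verdict` and `judge_outcome` are hypothetical names, `GuardAction` is a local stand-in, the judge call is modelled as a plain `Result<String, String>`, and treating an unrecognized reply as Allow is an assumption (the text above only specifies timeouts and errors).

```rust
/// Simplified stand-in for the crate's `GuardAction`.
#[derive(Debug, Clone, PartialEq)]
pub enum GuardAction {
    Allow,
    Deny { reason: String },
    Warn { reason: String },
}

/// Hypothetical parser: finds the first line starting with `VERDICT:` and
/// maps it to an action. Returns `None` when no verdict is recognized.
pub fn parse_verdict(reply: &str) -> Option<GuardAction> {
    let rest = reply
        .lines()
        .find_map(|line| line.trim().strip_prefix("VERDICT:"))?
        .trim();
    if rest == "SAFE" {
        return Some(GuardAction::Allow);
    }
    if let Some(reason) = rest.strip_prefix("UNSAFE:") {
        return Some(GuardAction::Deny { reason: reason.trim().to_string() });
    }
    if let Some(reason) = rest.strip_prefix("WARN:") {
        return Some(GuardAction::Warn { reason: reason.trim().to_string() });
    }
    None
}

/// Fail-open wrapper: a judge error (timeout, provider failure) yields Allow.
pub fn judge_outcome(reply: Result<String, String>) -> GuardAction {
    match reply {
        // Assumption: a reply without a recognizable verdict also fails open.
        Ok(text) => parse_verdict(&text).unwrap_or(GuardAction::Allow),
        Err(_) => GuardAction::Allow,
    }
}
```

A deployment that prefers to fail closed would map the `Err` arm to a Deny instead, trading availability for stricter enforcement.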
Builder options:
| Method | Description |
|---|---|
| .criterion(s) | Add a safety criterion |
| .criteria(vec) | Add multiple criteria |
| .timeout(d) | Judge evaluation timeout |
| .evaluate_tool_inputs(bool) | Also run pre_tool hook |
| .max_judge_tokens(n) | Token limit for judge response |
| .system_prompt(s) | Custom system prompt for judge |
SensorSecurityGuardrail
Requires the sensor feature flag.
Validates data from sensor pipeline inputs, checking for malicious payloads in RSS feeds, webhooks, and other external sources.
ConditionalGuardrail
Wraps another guardrail with a condition function. The inner guardrail only runs when the condition evaluates to true.
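The wrapping pattern can be sketched with plain closures. This is a synchronous, simplified model, not the crate's API: `Conditional`, `ToolCall`, and `GuardAction` are local stand-ins, only a `pre_tool`-style check is shown, and both the condition's argument (the tool call) and the choice to return Allow when the condition is false are assumptions.

```rust
/// Simplified stand-in for the crate's `GuardAction`.
#[derive(Debug, Clone, PartialEq)]
pub enum GuardAction {
    Allow,
    Deny { reason: String },
    Warn { reason: String },
}

/// Simplified stand-in for a tool call (name and input only).
pub struct ToolCall {
    pub name: String,
    pub input: String,
}

/// Hypothetical wrapper: runs `inner` only when `condition` returns true.
pub struct Conditional<C, G> {
    condition: C,
    inner: G,
}

impl<C, G> Conditional<C, G>
where
    C: Fn(&ToolCall) -> bool,
    G: Fn(&ToolCall) -> GuardAction,
{
    pub fn new(condition: C, inner: G) -> Self {
        Self { condition, inner }
    }

    pub fn pre_tool(&self, call: &ToolCall) -> GuardAction {
        if (self.condition)(call) {
            (self.inner)(call)
        } else {
            // Assumption: a skipped guardrail raises no objection.
            GuardAction::Allow
        }
    }
}
```

For example, a condition of `|call: &ToolCall| call.name == "bash"` confines an expensive or strict inner check to shell commands and leaves every other tool untouched.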
GuardrailChain
Combines multiple guardrails into a single unit. Useful for grouping related guardrails that should be applied together.