Overview

Guardrail settings live under `guardrail:` in `~/.defenseclaw/config.yaml` and map to `internal/config/config.go::GuardrailConfig`. This page lists only fields present in that struct or defaults registered in `internal/config/defaults.go`.

Source-backed block

guardrail:
  enabled: false
  mode: observe
  scanner_mode: both
  host: ""
  port: 4000
  stream_buffer_bytes: 1024
  block_message: ""
  rule_pack_dir: ~/.defenseclaw/policies/guardrail/default
  detection_strategy: regex_judge
  detection_strategy_prompt: ""
  detection_strategy_completion: regex_only
  detection_strategy_tool_call: ""
  judge_sweep: true
  retain_judge_bodies: true
  allow_unknown_llm_domains: false
  judge:
    enabled: false
    injection: true
    pii: true
    pii_prompt: true
    pii_completion: true
    tool_injection: true
    timeout: 30.0
    adjudication_timeout: 5.0

Key reference

Key	Default	Meaning
`guardrail.enabled`	`false`	Enables the proxy listener.
`guardrail.mode`	`observe`	`observe` records would-be blocks as alerts without enforcing them; `action` enforces blocks.
`guardrail.scanner_mode`	`both`	`local`, `remote`, or `both`. Selects which scanners participate: local rules, Cisco AI Defense, or both.
`guardrail.host`	`""`	Host override. Empty uses the effective host chosen by the proxy.
`guardrail.port`	`4000`	Guardrail proxy port.
`guardrail.stream_buffer_bytes`	`1024`	Initial buffer threshold, in bytes, for streamed text in `action` mode.
`guardrail.block_message`	`""`	Optional custom message for blocked prompt/completion responses.
`guardrail.rule_pack_dir`	`~/.defenseclaw/policies/guardrail/default`	Rule-pack directory. Missing files fall back to embedded defaults.
`guardrail.detection_strategy`	`regex_judge`	Global strategy fallback.
`guardrail.detection_strategy_prompt`	`""`	Prompt-specific override; empty inherits global.
`guardrail.detection_strategy_completion`	`regex_only`	Completion-specific override.
`guardrail.detection_strategy_tool_call`	`""`	Tool-call-specific override; empty inherits global.
`guardrail.judge_sweep`	`true`	Under `regex_judge`, content with no regex signal can still be sent for full judge classification.
`guardrail.retain_judge_bodies`	`true`	Retains raw judge responses locally; set to `false` to opt out. Copies forwarded to sinks are redacted.
`guardrail.allow_unknown_llm_domains`	`false`	When `true`, allows LLM-shaped traffic to pass through to unknown provider domains.
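
As an illustration of how the keys above combine, here is a minimal sketch of an enforcing configuration. It uses only keys from the table; the `block_message` text is a made-up example, and the comments restate the table rather than describe verified runtime behavior.

```yaml
guardrail:
  enabled: true                  # start the proxy listener
  mode: action                   # enforce blocks instead of only recording alerts
  scanner_mode: local            # one of: local, remote, both
  port: 4000
  block_message: "This request was blocked by policy."  # illustrative text only
  detection_strategy: regex_judge
  detection_strategy_prompt: ""              # empty inherits detection_strategy
  detection_strategy_completion: regex_only  # completion-specific override
  detection_strategy_tool_call: ""           # empty inherits detection_strategy
```

With these values, prompts and tool calls would fall back to the global `regex_judge` strategy while completions use `regex_only`, assuming the inheritance rule stated in the table.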

Judge keys

Key	Default	Meaning
`guardrail.judge.enabled`	`false`	Constructs an `LLMJudge` only when enabled and model/key resolution succeeds.
`guardrail.judge.injection`	`true`	Enables prompt-injection judge checks in the prompt direction.
`guardrail.judge.pii`	`true`	Enables PII judge checks.
`guardrail.judge.pii_prompt`	`true`	Allows the PII judge in the prompt direction.
`guardrail.judge.pii_completion`	`true`	Allows the PII judge in the completion direction.
`guardrail.judge.tool_injection`	`true`	Enables the tool-injection judge path.
`guardrail.judge.timeout`	`30.0`	Seconds for full judge classification.
`guardrail.judge.adjudication_timeout`	`5.0`	Seconds for `regex_judge` adjudication calls.
`guardrail.judge.llm.*`	inherited	Per-judge override layered on top of top-level `llm:` settings.
`guardrail.judge.fallbacks`	empty	Provider fallbacks passed into judge LLM calls.
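
The judge keys can be sketched together as follows. This is an assumed example, not a recommended setup: it enables the judge and turns off the PII judge for completions while leaving the other checks at their defaults. Per the table, the judge is still constructed only if model/key resolution succeeds.

```yaml
guardrail:
  judge:
    enabled: true
    injection: true
    pii: true
    pii_prompt: true
    pii_completion: false     # assumption: limits the PII judge to the prompt direction
    tool_injection: true
    timeout: 30.0             # seconds, full judge classification
    adjudication_timeout: 5.0 # seconds, regex_judge adjudication calls
```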

LLM credential resolution

Guardrail and judge provider settings use `LLMConfig`. Prefer `api_key_env`, which defaults to `DEFENSECLAW_LLM_KEY` at the top level. An inline `api_key` is still honored as a fallback, but `Load` warns about plaintext secrets.
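
A minimal sketch of environment-based credentials, assuming the per-judge `llm:` block accepts the same `api_key_env` field as the top-level `llm:` block. `JUDGE_LLM_KEY` is a hypothetical variable name chosen for illustration.

```yaml
llm:
  api_key_env: DEFENSECLAW_LLM_KEY   # documented top-level default

guardrail:
  judge:
    enabled: true
    llm:
      api_key_env: JUDGE_LLM_KEY     # hypothetical per-judge override
```

Keeping the key in an environment variable rather than an inline `api_key` avoids the plaintext-secret warning described above.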

Recognized model prefixes include `openai`, `anthropic`, `azure`, `gemini`, `vertex_ai`, `bedrock`, `groq`, `mistral`, `cohere`, `ollama`, `vllm`, `deepseek`, `xai`, `fireworks_ai`, `perplexity`, `huggingface`, `replicate`, `openrouter`, `together_ai`, and `cerebras`.

Runtime update boundary

The API/runtime path updates only `mode`, `scanner_mode`, and `block_message` through `guardrail_runtime.json`, which the proxy reloads in `applyRuntime`. All other keys are read at process startup in the code inspected here.
