defenseclaw setup guardrail
The central command. Routes LLM traffic through the Go guardrail proxy, configures observe vs action mode, picks the connector, scanner, rule pack, judge, and HITL behaviour, then restarts the gateway.
defenseclaw setup guardrail is the operator command. It picks the connector, picks the mode, points the scanners at the right backend, optionally enables the LLM judge, and configures human-in-the-loop. Every other setup verb in DefenseClaw is a thin wrapper around this one.
Run it interactively the first time: the wizard explains each choice and picks safe defaults. Once you are happy with the configuration, re-run with explicit flags (or use --non-interactive) for unattended setups and CI.
Watch the interactive flow
The animation below replays a real defenseclaw setup guardrail session: connector pick, integration mode, observe vs action, hook fail-mode, scanner backend, judge, advanced options, and the final summary.
Same setup, no prompts
Every choice the wizard makes has a flag, with three documented exceptions (listed below). The CI-friendly equivalent of the demo above is one command:
```shell
defenseclaw setup guardrail \
  --non-interactive \
  --connector claudecode \
  --mode observe \
  --scanner-mode local \
  --detection-strategy regex_only \
  --restart
```

Pass --non-interactive (or --accept-defaults) to skip every prompt; missing flags fall back to the same defaults the wizard would have offered. The judge stays off until you pass --judge-model (or pick a strategy that uses it); --detection-strategy regex_only here just makes that explicit.
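In a pipeline it can help to assemble this command from environment variables and preview it before it runs. The following is a minimal sketch, not part of DefenseClaw: the variable names DC_CONNECTOR, DC_MODE, DC_SCANNER_MODE, and DC_DRY_RUN and the helper functions are hypothetical, while the flags are the ones documented on this page.

```shell
# Sketch: build the non-interactive setup command from CI variables.
# DC_* variable names and both helper functions are hypothetical.

build_setup_args() {
  local connector="${DC_CONNECTOR:-claudecode}"
  local mode="${DC_MODE:-observe}"
  local scanner="${DC_SCANNER_MODE:-local}"

  # Only the two documented modes are accepted.
  case "$mode" in
    observe|action) ;;
    *) echo "unsupported mode: $mode" >&2; return 1 ;;
  esac

  # Store the arguments in an array so values are never re-split by the shell.
  SETUP_ARGS=(setup guardrail --non-interactive
    --connector "$connector"
    --mode "$mode"
    --scanner-mode "$scanner"
    --detection-strategy regex_only
    --restart)
}

run_setup() {
  build_setup_args || return 1
  if [ "${DC_DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default in this sketch): print the command only.
    echo "defenseclaw ${SETUP_ARGS[*]}"
  else
    defenseclaw "${SETUP_ARGS[@]}"
  fi
}
```

Calling run_setup with no variables set prints the command shown above; setting DC_DRY_RUN=0 executes it instead.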
Prompt → flag mapping
Each row corresponds to one prompt in the animation. Default shows what pressing Enter selects; Flag is the CI-shaped equivalent. Rows tagged interactive-only have no direct flag — the Note explains the workaround.
| Prompt (default) | Flag equivalent | Note |
|---|---|---|
| Which agent framework? (default: previous selection or auto-detected) | --connector / --agent | Accepts claudecode, codex, cursor, windsurf, geminicli, copilot, hermes, openclaw, zeptoclaw. |
| Select integration mode — Claude Code / Codex only (default: 2 = full guardrail) | interactive-only | Use setup claude-code or setup codex for the observability-only branch from a script. |
| Enable guardrail? (default: Y) | implicit when --connector is set | Pass --disable to roll the guardrail back. |
| Select mode (default: observe) | --mode observe\|action | observe logs; action enforces. |
| Select hook fail mode (default: open) | interactive-only here | Change later with defenseclaw guardrail fail-mode <open\|closed>. Only asked on first setup or when --mode changes. |
| Human approval for risky actions? (action mode only, default: current) | --human-approval / --no-human-approval | Skipped entirely in observe mode. |
| Approval minimum severity (if HITL on, default: high) | --hilt-min-severity high\|medium\|low\|critical | Click renders the choices lowercase; case-insensitive matching, so HIGH also works. |
| Select scanner engine (default: local) | --scanner-mode local\|remote\|both | Wizard never picks both; pass the flag to run the union. |
| API endpoint / key env / timeout (remote scanner only) | --cisco-endpoint, --cisco-api-key-env, --cisco-timeout-ms | Defaults inherit from existing aid.* config. |
| Enable LLM judge? (default: current) | implicit when --judge-model is set | Pass --detection-strategy regex_only to force the judge off; the bare guardrail command has no --no-judge toggle. |
| Select judge strategy (if judge on, default: regex_judge) | --detection-strategy regex_only\|regex_judge\|judge_first | — |
| Inherit unified LLM key for judge? (default: Y) | --judge-api-key-env | Override with a different env var to use a separate key. |
| Configure judge fallback models? (default: N) | interactive-only | Edit guardrail.judge.fallback_models in ~/.defenseclaw/config.yaml to script. |
| Configure advanced options? (default: N) | — | Gates the next three prompts. |
| Guardrail proxy port (default: 4000) | --port | — |
| Custom block message? (action mode only) | --block-message | — |
| Disable redaction? (default: current) | --disable-redaction / --enable-redaction | Only disable inside trusted, single-tenant environments. |
Three genuinely interactive-only choices on this command: integration mode (Claude Code / Codex split), hook fail mode (use defenseclaw guardrail fail-mode afterward), and judge fallback models (edit config.yaml). Everything else round-trips through flags.
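A scripted setup can cover the hook fail mode by running the follow-up command right after setup. The sketch below assumes only the two commands named on this page; the function name apply_guardrail_setup and its parameters are hypothetical, and the CLI name is a parameter so the sequence can be rehearsed with a stub. Judge fallback models are not handled here because they require editing config.yaml.

```shell
# Sketch: non-interactive setup followed by the fail-mode command.
# apply_guardrail_setup and its arguments are hypothetical helpers.
apply_guardrail_setup() {
  local cli="${1:-defenseclaw}"   # command to invoke (stub-friendly)
  local fail_mode="${2:-open}"    # open is the wizard default

  # Only the two documented values are accepted.
  case "$fail_mode" in
    open|closed) ;;
    *) echo "fail mode must be open or closed" >&2; return 2 ;;
  esac

  # Step 1: the setup itself, with no prompts.
  "$cli" setup guardrail --non-interactive \
    --connector claudecode --mode observe --restart || return 1

  # Step 2: the choice the wizard would have asked about.
  "$cli" guardrail fail-mode "$fail_mode"
}
```

For example, `apply_guardrail_setup defenseclaw closed` runs both steps in order and stops if setup fails.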
Two modes you have to choose between
observe
Log findings to the audit DB and sinks. Block nothing. Run this for at least a week before promoting.
action
Block on configured severities. CRITICAL always blocks. HIGH blocks unless HITL is on.
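The rules for the two modes can be summarised as a small decision function. This is a toy model of the behaviour described above, not DefenseClaw's implementation; the function name and the verdict labels (log, block, ask, per-config) are invented for illustration, and it assumes the approval threshold is left at its default of high.

```shell
# Toy model of the mode rules; not the real gateway logic.
# Usage: guardrail_verdict <observe|action> <SEVERITY> [on|off]
guardrail_verdict() {
  local mode="$1" severity="$2" hitl="${3:-off}"

  # observe never blocks: findings are only logged.
  if [ "$mode" = "observe" ]; then
    echo "log"
    return 0
  fi

  case "$severity" in
    CRITICAL)
      echo "block" ;;              # always blocks, even with HITL on
    HIGH)
      if [ "$hitl" = "on" ]; then
        echo "ask"                 # paused for operator approval
      else
        echo "block"
      fi ;;
    *)
      echo "per-config" ;;         # depends on the configured severities
  esac
}
```

For instance, `guardrail_verdict action HIGH on` yields `ask`, while the same finding with HITL off yields `block`.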
Connector resolution
DefenseClaw resolves the connector in this order; the first source that provides a value wins:
1. --connector flag (operator intent always wins)
2. Existing guardrail.connector if you have run setup before
3. <data_dir>/picked_connector hint written by scripts/install.sh --connector ...
4. Filesystem auto-detection (only in interactive mode)
5. Fallback to openclaw
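The precedence order can be expressed as a short function. This is an illustrative sketch rather than the actual resolver: the function name and argument layout are hypothetical, and it assumes the picked_connector hint file holds the connector name on its first line, which this page does not specify.

```shell
# Sketch of the connector precedence; not DefenseClaw's resolver.
# Arguments (an empty string means "not available"):
#   $1 value passed to --connector
#   $2 existing guardrail.connector from a previous setup
#   $3 path to the picked_connector hint file
#   $4 connector found by filesystem auto-detection
#   $5 "yes" when running interactively
resolve_connector() {
  local flag="$1" configured="$2" hint_file="$3" detected="$4"
  local interactive="${5:-no}"

  if [ -n "$flag" ]; then echo "$flag"; return 0; fi
  if [ -n "$configured" ]; then echo "$configured"; return 0; fi

  # Assumption: the hint file stores the connector name on line 1.
  if [ -n "$hint_file" ] && [ -s "$hint_file" ]; then
    head -n 1 "$hint_file"
    return 0
  fi

  # Auto-detection only counts in interactive mode.
  if [ "$interactive" = "yes" ] && [ -n "$detected" ]; then
    echo "$detected"
    return 0
  fi

  echo "openclaw"
}
```

Note that an auto-detected connector is ignored in a non-interactive run, so an unattended first setup with no flag, config, or hint lands on openclaw.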
Non-interactive recipes by mode
Observe:

```shell
defenseclaw setup guardrail \
  --non-interactive \
  --connector claudecode \
  --mode observe \
  --scanner-mode local \
  --restart
```

The audit DB fills up with every prompt and tool call. Nothing blocks. Open defenseclaw tui for the live audit panel, or tail -f ~/.defenseclaw/gateway.jsonl | jq for a scripted view.
Action:

```shell
defenseclaw setup guardrail \
  --non-interactive \
  --connector claudecode \
  --mode action \
  --scanner-mode local \
  --rule-pack default \
  --restart
```

CRITICAL findings block immediately. HIGH findings block. Operators see the block reason in the agent UI; the audit log captures the verdict.
Action with human approval:

```shell
defenseclaw setup guardrail \
  --non-interactive \
  --connector claudecode \
  --mode action \
  --rule-pack strict \
  --human-approval \
  --hilt-min-severity high \
  --restart
```

HIGH findings pause for operator approval. CRITICAL still blocks unconditionally. On Claude Code, the approval prompt surfaces as a native PreToolUse ask; on Codex, it downgrades to a TUI prompt. See the HITL page for the full matrix.
Action with LLM judge:

```shell
# Replace the placeholder with your real key; the quotes keep the shell
# from treating < and > as redirections.
export DEFENSECLAW_LLM_KEY="<your-key>"
defenseclaw setup guardrail \
  --non-interactive \
  --connector claudecode \
  --mode action \
  --detection-strategy regex_judge \
  --judge-model anthropic/claude-sonnet-4-20250514 \
  --judge-api-key-env DEFENSECLAW_LLM_KEY \
  --restart
```

Regex still runs first (cheap and offline). The judge adjudicates anything regex flagged as ambiguous. Detection strategy judge_first flips the order, which is useful when regex is too noisy.
What setup writes
A hash-checked backup is stored before edits; teardown restores or surgically removes only DefenseClaw-owned entries. See the per-connector pages for the exact files mutated for each agent.
Verify it worked
```shell
defenseclaw doctor
defenseclaw status
defenseclaw alerts --limit 25
```

doctor prints the full health report. status shows the resolved connector + enforcement flags (it has no flags of its own; it always reports the active connector from config and /health). alerts lists recent decisions as a table; pass --show <n> to expand a specific row. For a live stream, open defenseclaw tui (interactive) or tail -f ~/.defenseclaw/gateway.jsonl | jq (scripted).
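For unattended runs, the first two checks can be wrapped in a function that reports which one failed. This sketch assumes doctor and status exit non-zero when something is wrong, which this page does not state; the function name verify_setup is hypothetical, and the CLI name is a parameter so the wrapper can be exercised with a stub.

```shell
# Sketch: run the read-only checks and summarise the result.
# Assumption: each command signals failure through its exit status.
verify_setup() {
  local cli="${1:-defenseclaw}"
  local check failed=0

  for check in doctor status; do
    if "$cli" "$check" >/dev/null 2>&1; then
      echo "ok: $check"
    else
      echo "FAILED: $check"
      failed=1
    fi
  done

  # Non-zero return lets a CI job stop on the first unhealthy setup.
  return "$failed"
}
```

A CI step could then call `verify_setup || exit 1` right after setup finishes.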
Interactive vs non-interactive — every command
DefenseClaw is interactive-first. When you run a setup or init verb at a terminal, you get a wizard with sane defaults and inline help. The same commands accept flags for unattended runs.
The matrix below is the source of truth. ✓ means "supported"; ✗ means "not supported on this command — see Notes for the workaround". A blank cell in the Notes column means there's no asymmetry worth flagging.
| Command | Interactive wizard | Non-interactive flags | Notes |
|---|---|---|---|
| defenseclaw init | ✓ default on a TTY | ✓ --non-interactive --yes + per-knob flags | Identical to the wizard mode of quickstart once choices are made; both call bootstrap.run_first_run(). |
| defenseclaw quickstart | ✗ never prompts | ✓ always | Zero-prompt by design. There is no wizard variant; use init for that. |
| defenseclaw setup guardrail | ✓ default | ✓ --non-interactive + flags | Three prompts have no flag; see the callout above. |
| defenseclaw setup claude-code | ✓ confirm prompt | ✓ --yes | Light wrapper that only configures observability. Full guardrail still goes through setup guardrail. |
| defenseclaw setup codex | ✓ confirm prompt | ✓ --yes | Same shape as setup claude-code. |
| defenseclaw setup cursor / setup windsurf / setup geminicli / setup copilot / setup hermes | ✓ confirm prompt | ✓ --yes | Generated wrappers: confirmation only, then delegate. |
| defenseclaw setup openclaw / setup zeptoclaw | ✓ confirm prompt | ✓ --yes | Confirm is skippable; the full guardrail wizard is not available here, so pass setup guardrail flags after the confirm. |
| defenseclaw setup splunk | ✓ wizard when no mode flag | ✓ --non-interactive + --logs / --enterprise / --o11y | |
| defenseclaw setup local-observability | ✗ no prompts | ✓ flags only | Compose-style up/down/logs subcommands; never asks. |
| defenseclaw setup mcp-scanner | ✓ wizard | ✓ --non-interactive | |
| defenseclaw setup skill-scanner | ✓ wizard | ✓ --non-interactive | |
| defenseclaw registry add / edit / remove / sync | ✓ prompts for missing args | ✓ --non-interactive + required flags | sync (not refresh) fetches + scans + promotes. Module is built dual-path on purpose. |
| defenseclaw doctor | conditional: only with --fix | ✓ --fix --yes | Default run is read-only and never prompts. |
| defenseclaw agent discover | ✗ no prompts | ✓ flags / --json | Read-only inventory command. |
| defenseclaw aibom scan | ✗ no prompts | ✓ flags only | Read-only. |
| defenseclaw policy ... | ✗ no prompts | ✓ flags only | Pure CLI for repeatable policy authoring. |
| defenseclaw tui | ✓ full TUI | — | Interactive only: Bubbletea dashboard with audit, alerts, logs, and inventory panels. |
| defenseclaw alerts | ✗ no prompts | ✓ --limit, --show <n>, acknowledge / dismiss subcommands | Snapshot view of recent alerts; not a live tail. |
| defenseclaw-gateway audit export (Go binary) | ✗ no prompts | ✓ --output, --limit, --include-activity | JSONL export of audit_events from the SQLite DB. |
| defenseclaw-gateway (sidecar daemon) | ✗ no prompts | ✓ flags only | Long-running gateway: --config, --port, --log-level. Started for you by setup guardrail --restart. |
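Two interactive-only corners have no command-line equivalent at all: the integration-mode submenu inside setup guardrail (Claude Code / Codex), and the judge fallback-models prompt. The hook fail mode is also prompt-only inside setup guardrail, but defenseclaw guardrail fail-mode sets it from a script. Everything else has a flag.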
The non-interactive-only group is quickstart, setup local-observability, policy, agent discover, aibom scan, defenseclaw-gateway audit export, and gateway start — all by design (CI-shaped, read-only, or daemon).
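A wrapper script that runs both at a terminal and in CI can choose between the wizard and the flag path by checking whether stdin is a terminal. The sketch below uses the standard `[ -t 0 ]` test; the helper name setup_flags_for_context is hypothetical, and whether DefenseClaw performs a similar check on its own is not covered on this page.

```shell
# Sketch: emit --non-interactive only when no terminal is attached.
# setup_flags_for_context is a hypothetical helper name.
setup_flags_for_context() {
  if [ -t 0 ]; then
    # stdin is a terminal: let the wizard prompt as usual.
    echo ""
  else
    # No terminal (CI, cron, pipes): skip every prompt.
    echo "--non-interactive"
  fi
}

# Example use (left commented out so the sketch has no side effects):
# defenseclaw setup guardrail $(setup_flags_for_context) \
#   --connector claudecode --mode observe --restart
```

The unquoted command substitution in the example is deliberate: when the helper prints nothing, no empty argument is passed to the CLI.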
Common follow-ups
Setup
Every defenseclaw setup verb in one place — from the central guardrail wizard to the auxiliary commands that wire keys, webhooks, registries, observability, and per-connector hooks.
Quick setup aliases
defenseclaw setup codex, setup claude-code, setup cursor, setup copilot, setup hermes, setup geminicli, setup windsurf, setup openclaw, setup zeptoclaw — one command per agent, no questions asked.