defenseclaw setup guardrail
The central command. Routes LLM traffic through the Go guardrail proxy, configures observe vs action mode, picks the connector, scanner, rule pack, judge, and HITL behaviour, then restarts the gateway.
defenseclaw setup guardrail is the operator command. It picks the connector, picks the mode, points the scanners at the right backend, optionally enables the LLM judge, and configures human-in-the-loop. Every other setup verb in DefenseClaw is a thin wrapper around this one.
Run it interactively the first time: the wizard explains each choice and picks safe defaults. Once you are happy with the configuration, re-run with explicit flags (or use --non-interactive) for unattended setups and CI.
Watch the interactive flow
The animation below replays a real defenseclaw setup guardrail session — connector pick, integration mode, observe vs action, hook fail-mode, scanner backend, judge, advanced options, and the final summary. Hover or press Pause to study a frame; press Restart to replay.
Same setup, no prompts
Every choice the wizard makes has a flag (with a couple of documented exceptions, below). The CI-friendly equivalent of the demo above is one command:
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode observe \
--scanner-mode local \
--detection-strategy regex_only \
--restart
Pass --non-interactive (or --accept-defaults) to skip every prompt; missing flags fall back to the same defaults the wizard would have offered. The judge stays off until you pass --judge-model (or pick a strategy that uses it); --detection-strategy regex_only here just makes that explicit.
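Because missing flags fall back to defaults, a CI job can build the invocation from whichever settings it actually has. The following is a minimal sketch of that idea; the DC_* variable names and the guardrail_setup_cmd helper are hypothetical conventions for this script and are not read by DefenseClaw itself.

```shell
#!/bin/sh
# Sketch: build a non-interactive `defenseclaw setup guardrail` command line
# from optional environment variables. The DC_* names are hypothetical
# conventions for this script; DefenseClaw does not read them.
guardrail_setup_cmd() (
  set -- defenseclaw setup guardrail --non-interactive
  # Append a flag only when its variable is set and non-empty, so unset
  # knobs fall back to the defaults the wizard would have offered.
  [ -n "${DC_CONNECTOR:-}" ] && set -- "$@" --connector "$DC_CONNECTOR"
  [ -n "${DC_MODE:-}" ] && set -- "$@" --mode "$DC_MODE"
  [ -n "${DC_SCANNER_MODE:-}" ] && set -- "$@" --scanner-mode "$DC_SCANNER_MODE"
  # DC_RESTART defaults to 1; set it to anything else to leave --restart off.
  [ "${DC_RESTART:-1}" = "1" ] && set -- "$@" --restart
  # Print the command instead of running it, so it can be reviewed or logged.
  printf '%s\n' "$*"
)
```

Printing rather than executing keeps the sketch safe to try: with DC_CONNECTOR=claudecode and DC_MODE=observe set, the output includes --connector and --mode but no --scanner-mode.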
Don't hand-roll the flags
The Command generator builds a non-interactive defenseclaw setup guardrail invocation for any connector with all the knobs below — mode, scanner, judge, HITL, advanced — and surfaces validation warnings inline.
Prompt → flag mapping
Each row corresponds to one prompt in the animation. Default shows what pressing Enter selects; Flag is the CI-shaped equivalent. Rows tagged interactive-only have no direct flag — the Note explains the workaround.
| Prompt (default) | Flag equivalent | Note |
|---|---|---|
| Which agent framework? (default: previous selection or auto-detected) | --connector / --agent | Explicitly selects the connector being configured. Accepts claudecode, codex, cursor, windsurf, geminicli, copilot, hermes, openhands, antigravity, opencode, omnigent, openclaw, zeptoclaw. |
| Direct-to-upstream quick setup (default: observe) | defenseclaw setup <connector> --mode observe\|action | Claude Code, Codex, Cursor, Windsurf, Gemini CLI, Copilot CLI, Hermes, OpenHands, Antigravity, OpenCode, and OmniGent all have one-line hook or policy setup aliases. |
| Enable guardrail? (default: Y) | implicit when --connector is set | Pass --disable to roll the guardrail back. |
| Select mode (default: observe) | --mode observe\|action | observe logs; action enforces. |
| Select hook fail mode (default: current config; closed on a fresh install) | interactive-only here | Change later with defenseclaw guardrail fail-mode <open\|closed>. Only asked on first setup or when --mode changes. |
| Human approval for risky actions? (action mode only, default: current) | --human-approval / --no-human-approval | Skipped entirely in observe mode. |
| Approval minimum severity (if HITL on, default: high) | --hilt-min-severity high\|medium\|low\|critical | Click renders the choices lowercase; matching is case-insensitive, so HIGH also works. |
| Select scanner engine (default: local) | --scanner-mode local\|remote\|both | The wizard never picks both; pass the flag to run the union. |
| API endpoint / key env / timeout (remote scanner only) | --cisco-endpoint, --cisco-api-key-env, --cisco-timeout-ms | Defaults inherit from existing aid.* config. |
| Enable LLM judge? (default: current) | implicit when --judge-model is set | Pass --detection-strategy regex_only to force the judge off; the bare guardrail command has no --no-judge toggle. |
| Select judge strategy (if judge on, default: regex_judge) | --detection-strategy regex_only\|regex_judge\|judge_first | — |
| Inherit unified LLM key for judge? (default: Y) | --judge-api-key-env, or --inherit-llm to copy the connector's agent-side LLM block wholesale | Override with a different env var to use a separate key, or --inherit-from guardrail.judge to start from the previous judge config. |
| Who consumes the unified LLM key? (proxy connectors only) | --llm-role judge_only\|judge_and_agent | Defaults to judge_only for hook-based connectors (Claude Code, Codex, ...) and judge_and_agent for proxy connectors (OpenClaw, ZeptoClaw). See Unified LLM key → Hook-based vs proxy-based connectors. |
| Judge provider / region / instance (judge on, advanced) | --judge-provider, --judge-region, --judge-instance-name | Use --judge-provider bedrock\|vertex_ai\|azure\|anthropic\|openai\|... and the matching regional flags below to point the judge at a different backend than the agent LLM. --judge-instance-name binds to a ~/.defenseclaw/custom-providers.json overlay entry. |
| Configure judge fallback models? (default: N) | interactive-only | Edit guardrail.judge.fallback_models in ~/.defenseclaw/config.yaml to script. |
| Configure advanced options? (default: N) | — | Gates the next three prompts. |
| Guardrail proxy port (default: 4000) | --port | — |
| Custom block message? (action mode only) | --block-message | — |
| Disable redaction? (default: current) | --disable-redaction / --enable-redaction | Only disable inside trusted, single-tenant environments. |
On a multi-connector install, --connector <name> scopes this setup run to that connector. When --block-message is supplied without --connector, setup treats it as broad operator intent: it updates the shared block message and reconciles active connector overrides so one connector does not keep stale block text.
Two genuinely interactive-only choices on this command: hook fail mode (use defenseclaw guardrail fail-mode afterward), and judge fallback models (edit config.yaml). Everything else round-trips through flags.
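For unattended runs, the fail-mode prompt can be covered by chaining the follow-up command after setup. This is a minimal sketch using only the two commands named on this page; the setup_with_fail_mode wrapper and its argument order are illustrative, and whether the gateway needs another restart after the fail mode changes is not stated here, so treat that as something to verify.

```shell
#!/bin/sh
# Sketch: non-interactive setup followed by the follow-up command that
# stands in for the hook fail-mode prompt. The wrapper name and argument
# order are illustrative only.
# usage: setup_with_fail_mode <connector> <observe|action> <open|closed>
setup_with_fail_mode() {
  # Run setup first; only set the fail mode if setup succeeded.
  defenseclaw setup guardrail --non-interactive \
    --connector "$1" --mode "$2" --restart &&
  defenseclaw guardrail fail-mode "$3"
}
```

A call such as setup_with_fail_mode claudecode action closed runs both steps in order and stops if setup fails.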
Two modes you have to choose between
observe
Log findings to the audit DB and sinks. Block nothing. Run this for at least a week before promoting.
action
Apply the selected policy thresholds. In the default balanced profile, CRITICAL blocks, HIGH alerts or confirms with HITL, and MEDIUM alerts.
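The difference between the two modes can be summarised as a small decision table. The sketch below only mirrors the outcomes described on this page for the default balanced profile, assuming the HITL minimum severity is left at its default of high; the verdict function and its output labels are hypothetical and are not DefenseClaw's implementation.

```shell
#!/bin/sh
# Sketch of the balanced-profile outcomes described on this page. It mirrors
# documented behaviour for illustration; it is not DefenseClaw's code.
# usage: verdict <observe|action> <severity> [on|off]   (third arg: HITL)
verdict() (
  mode=$1 severity=$2 hitl=${3:-off}
  if [ "$mode" = "observe" ]; then
    # observe mode records the finding and blocks nothing
    echo log
  else
    case "$severity" in
      CRITICAL) echo block ;;   # blocks regardless of HITL
      HIGH)
        # with HITL on (minimum severity high), HIGH becomes a confirmation
        if [ "$hitl" = "on" ]; then echo confirm; else echo alert; fi ;;
      MEDIUM) echo alert ;;
      *) echo allow ;;          # LOW findings are allowed
    esac
  fi
)
```

For example, verdict action HIGH on yields confirm, while the same finding in observe mode is only logged.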
Connector resolution
When you omit --connector, DefenseClaw resolves it in this order:
- --connector flag (operator intent always wins)
- Existing guardrail.connector if you have run setup before
- <data_dir>/picked_connector hint written by scripts/install.sh --connector ...
- Filesystem auto-detection (only in interactive mode)
- Fallback to openclaw
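The resolution order above can be read as a first-match-wins chain. The sketch below illustrates that reading; resolve_connector is a hypothetical helper, the hint file is assumed to hold the connector name on its first line, and the auto-detection result is passed in as a plain argument rather than computed.

```shell
#!/bin/sh
# Sketch of the documented resolution order; not DefenseClaw's implementation.
# usage: resolve_connector <flag> <configured> <hint_file> <detected>
# Pass an empty string for any source that is absent. <detected> stands in
# for filesystem auto-detection, which only applies in interactive mode.
resolve_connector() (
  flag=$1 configured=$2 hint_file=$3 detected=$4
  if [ -n "$flag" ]; then
    echo "$flag"              # explicit --connector flag
  elif [ -n "$configured" ]; then
    echo "$configured"        # existing guardrail.connector
  elif [ -n "$hint_file" ] && [ -s "$hint_file" ]; then
    head -n 1 "$hint_file"    # picked_connector hint (assumed one-line file)
  elif [ -n "$detected" ]; then
    echo "$detected"          # auto-detection result
  else
    echo openclaw             # final fallback
  fi
)
```

Calling resolve_connector "" "" "" "" with every source empty falls through to openclaw, matching the last step of the list.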
Tabs by mode (non-interactive recipes)
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode observe \
--scanner-mode local \
--restart
The audit DB fills up with every prompt and tool call. Nothing blocks. Open defenseclaw tui for the live audit panel, or tail -f ~/.defenseclaw/gateway.jsonl | jq for a scripted view.
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode action \
--scanner-mode local \
--rule-pack default \
--restart
With the default balanced profile, CRITICAL findings block immediately, HIGH and MEDIUM findings alert, and LOW findings allow. Operators see the block or alert context supported by their connector; the audit log captures every verdict.
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode action \
--rule-pack default \
--human-approval \
--hilt-min-severity high \
--restart
HIGH findings are eligible for confirmation. CRITICAL still blocks unconditionally. On Claude Code, PreToolUse can surface a native ask; on Codex, confirm falls back to an alert/system message with raw_action preserved and does not create a TUI approval. See the HITL page for the full matrix.
export DEFENSECLAW_LLM_KEY='replace-with-your-key'
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode action \
--detection-strategy regex_judge \
--judge-model anthropic/claude-sonnet-4-20250514 \
--judge-api-key-env DEFENSECLAW_LLM_KEY \
--restart
Regex still runs first (cheap and offline). The judge adjudicates anything regex flagged as ambiguous. Detection strategy judge_first flips the order, which is useful when regex is too noisy.
defenseclaw setup guardrail \
--non-interactive \
--connector claudecode \
--mode action \
--detection-strategy regex_judge \
--judge-provider bedrock \
--judge-model us.anthropic.claude-sonnet-4-6 \
--judge-bedrock-region us-east-1 \
--judge-bedrock-auth-mode iam_credentials \
--judge-bedrock-access-key-env AWS_ACCESS_KEY_ID \
--judge-bedrock-secret-key-env AWS_SECRET_ACCESS_KEY \
--judge-bedrock-inference-profile us. \
--restart
Routes the judge through AWS Bedrock instead of a SaaS endpoint. boto3 ships in the base install, so no extra pip install is needed. Swap --judge-provider vertex_ai + --judge-vertex-{project-id,region,auth-mode,service-account-json-env} for GCP Vertex, or --judge-provider azure + --judge-azure-{endpoint,api-version,auth-mode,deployment-alias} for Azure OpenAI. For self-signed lab endpoints, add --judge-tls-ca-cert-file /etc/ssl/lab-root.pem. See Unified LLM key → Regional providers for the full matrix.
If the same Bedrock / Vertex / Azure posture is already configured on a custom-provider overlay entry (region, auth mode, deployment aliases, TLS — see Bedrock / Vertex AI / Azure on a custom instance), the judge can inherit it with a single --judge-instance-name <name> instead of repeating every flag. Role-level judge flags still win field-by-field, so a shared overlay can supply auth credentials while --judge-bedrock-region pins a different region per environment.
Every flag
What setup writes
A hash-checked backup is stored before edits; teardown restores or surgically removes only DefenseClaw-owned entries. See the per-connector pages for the exact files mutated for each agent.
Verify it worked
defenseclaw doctor
defenseclaw status
defenseclaw alerts --limit 25
doctor prints the full health report. status shows enforcement flags plus a per-connector block for every active connector (it has no flags of its own; it always reports the full Agents roster from config and /health, identical layout whether one or N connectors are wired). alerts lists recent decisions as a table; pass --connector <name> to filter by connector attribution and --show <n> to expand a specific row. For a live stream, open defenseclaw tui (interactive) or tail -f ~/.defenseclaw/gateway.jsonl | jq (scripted).
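In a pipeline, the three checks can be chained so the job stops at the first one that fails. This is a minimal sketch; verify_guardrail is a hypothetical wrapper, and it assumes each command exits non-zero when something is wrong, which this page does not state and which is worth confirming before gating a deployment on it.

```shell
#!/bin/sh
# Sketch: run the three post-setup checks in sequence and stop at the first
# failure. Assumption: each command exits non-zero on a problem.
# usage: verify_guardrail [alert_limit]   (defaults to 25)
verify_guardrail() {
  defenseclaw doctor &&
  defenseclaw status &&
  defenseclaw alerts --limit "${1:-25}"
}
```

The function returns the exit status of the last command that ran, so verify_guardrail || exit 1 is enough to fail a CI step.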
Interactive vs non-interactive — every command
DefenseClaw is interactive-first. When you run a setup or init verb at a terminal, you get a wizard with sane defaults and inline help. The same commands accept flags for unattended runs.
The matrix below is the source of truth. ✓ means "supported"; ✗ means "not supported on this command — see Notes for the workaround". A blank cell in the Notes column means there's no asymmetry worth flagging.
| Command | Interactive wizard | Non-interactive flags | Notes |
|---|---|---|---|
| defenseclaw init | ✓ default on a TTY | ✓ --non-interactive --yes + per-knob flags | Identical to the wizard mode of quickstart once choices are made; both call bootstrap.run_first_run(). |
| defenseclaw quickstart | ✗ never prompts | ✓ always | Zero-prompt by design. There is no wizard variant; use init for that. |
| defenseclaw setup guardrail | ✓ default | ✓ --non-interactive + flags | Two prompts have no flag; see callout above. |
| defenseclaw setup claude-code | ✓ confirm prompt | ✓ --yes, --mode, --restart/--no-restart, policy flags | Direct-to-upstream wrapper; observe by default, action returns supported lifecycle verdicts. |
| defenseclaw setup codex | ✓ confirm prompt | ✓ --yes, --mode, --restart/--no-restart, policy flags | Same shape as setup claude-code. |
| defenseclaw setup cursor / setup windsurf / setup geminicli / setup copilot / setup openhands / setup antigravity / setup hermes / setup opencode / setup omnigent | ✓ confirm prompt | ✓ --yes, --mode, --restart/--no-restart, connector policy flags | Generated wrappers; observe by default, action uses the connector's native hook or policy decision surface. |
| defenseclaw setup openclaw / setup zeptoclaw | ✓ confirm prompt | ✓ --yes | Confirm is skippable; the full guardrail wizard is not available here. Pass setup guardrail flags after the confirm. |
| defenseclaw setup splunk | ✓ wizard when no mode flag | ✓ --non-interactive + --logs / --enterprise / --o11y | |
| defenseclaw setup local-observability | ✗ no prompts | ✓ flags only | Compose-style up/down/logs subcommands; never asks. |
| defenseclaw setup mcp-scanner | ✓ wizard | ✓ --non-interactive | |
| defenseclaw setup skill-scanner | ✓ wizard | ✓ --non-interactive | |
| defenseclaw registry add / edit / remove / sync | ✓ prompts for missing args | ✓ --non-interactive + required flags | sync (not refresh) fetches + scans + promotes. Module is built dual-path on purpose. |
| defenseclaw doctor | conditional: only with --fix | ✓ --fix --yes | Default run is read-only and never prompts. |
| defenseclaw agent discover | ✗ no prompts | ✓ flags / --json | Read-only inventory command. |
| defenseclaw aibom scan | ✗ no prompts | ✓ flags only | Read-only. |
| defenseclaw policy ... | ✗ no prompts | ✓ flags only | Pure CLI for repeatable policy authoring. |
| defenseclaw tui | ✓ full TUI | — | Interactive only: Textual dashboard with audit, alerts, logs, inventory, and setup panels. |
| defenseclaw alerts | ✗ no prompts | ✓ --limit, --show <n>, acknowledge / dismiss subcommands | Snapshot view of recent alerts; not a live tail. |
| defenseclaw-gateway audit export (Go binary) | ✗ no prompts | ✓ --output, --limit, --include-activity | JSONL export of audit_events from the SQLite DB. |
| defenseclaw-gateway (sidecar daemon) | ✗ no prompts | ✓ flags only | Long-running gateway: --config, --port, --log-level. Started for you by setup guardrail --restart. |
The two interactive-only corners worth knowing about: the integration-mode submenu inside setup guardrail (Claude Code / Codex), and the judge fallback-models prompt. Everything else has a flag.
The non-interactive-only group is quickstart, setup local-observability, policy, agent discover, aibom scan, defenseclaw-gateway audit export, and gateway start — all by design (CI-shaped, read-only, or daemon).
Common follow-ups
Quick aliases
defenseclaw setup claude-code, setup codex, setup cursor, ...
Multi-connector
One gateway enforcing N hook connectors via guardrail.connectors. Pick 'Add' at the prompt.
Changing connectors
Use setup aliases to add, reconfigure, or remove connector wiring.
Disabling
--disable rolls everything back, including agent-side hook entries.
HITL
Per-connector native ask events and non-pausing fallbacks.
Add vs Replace: keeping more than one connector wired
When you run a hook setup (defenseclaw setup codex, setup claude-code, …) and a different connector is already wired, DefenseClaw asks whether to Add or Replace:
- Add: keep the existing connector(s) and layer the new one in. The gateway now enforces guardrail policy for both, each under its own guardrail.connectors.<name> block, and claw.mode flips to multi. This is the multi-connector path.
- Replace: tear down the previous connector's wiring and make the new one the single active connector (documented in Changing connectors).
Proxy connectors (OpenClaw, ZeptoClaw) can't be Add peers — they own the traffic plane, so only one runs at a time.