Claude Code

The Claude Code connector wires the PreToolUse, PostToolUse, UserPromptSubmit, Stop, and PermissionRequest hooks plus the native OTel exporter. Native ask is supported on PreToolUse for human-in-the-loop (HITL) approval.

By default, the Claude Code connector wires DefenseClaw into Anthropic's documented hook surfaces without inserting a proxy in the data path. Until enforcement is enabled, Claude Code talks directly to its native upstream and DefenseClaw inspects via hooks and native OTel.

Setup

defenseclaw setup claude-code

The shortcut alias. Pins claw.mode=claudecode, wires hooks + native OTel, leaves enforcement off.

defenseclaw setup guardrail --connector claudecode --mode action --human-approval --restart

Flips guardrail.claudecode_enforcement_enabled=true so the proxy listener binds and Claude Code's ANTHROPIC_BASE_URL is pointed at the local DefenseClaw proxy (e.g. http://127.0.0.1:4000/c/claudecode). The proxy then forwards inspected, allowed traffic to the real upstream provider via the gateway's Bifrost-backed router. Hooks now consult the gateway for an allow/block decision on every tool call.
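One way to confirm the redirect took effect is to check where ANTHROPIC_BASE_URL points. The following is a minimal Python sketch, assuming the proxy listens on a loopback address at the default port 4000. The helper name points_at_local_proxy is illustrative and not part of DefenseClaw; pass it whichever environment mapping actually carries the variable in your setup.

```python
import os
from typing import Mapping
from urllib.parse import urlparse

LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def points_at_local_proxy(env: Mapping[str, str], port: int = 4000) -> bool:
    """Return True if ANTHROPIC_BASE_URL targets a loopback host on the given port."""
    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if not base_url:
        return False  # unset: Claude Code talks to its native upstream
    parsed = urlparse(base_url)
    return parsed.hostname in LOCAL_HOSTS and parsed.port == port


if __name__ == "__main__":
    # Checks the current process environment; the result depends on your shell.
    print(points_at_local_proxy(os.environ))
```

If you override the port with --port, pass the same value as the port argument.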

What this command sets vs. leaves at defaults

The four flags above explicitly set: connector, mode, HITL, and post-setup restart. Every other knob falls back to the values DefenseClaw ships with — schema-defined in internal/config/config.go and documented on the Defaults page.

Knob                   Value when omitted                                              Flag to override
---------------------  --------------------------------------------------------------  ---------------------------------------------------------------
Scanner backend        local (bundled regex packs, zero key)                           --scanner-mode local|remote|both
Rule pack              unset → built-in baseline (no overlay)                          --rule-pack default|strict|permissive
LLM judge              off (regex-only triage)                                         --judge-model <model> plus --judge-api-key-env
Detection strategy     regex_judge if judge is on, else regex-only                     --detection-strategy regex_only|regex_judge|judge_first
HITL minimum severity  HIGH (when --human-approval is on; stored uppercase in config)  --hilt-min-severity low|medium|high|critical (case-insensitive)
Hook fail-mode         open (allow on guardrail-side failure)                          defenseclaw guardrail fail-mode <open|closed> (no flag)
Proxy port             4000                                                            --port <int>
Block message          empty (uses built-in copy)                                      --block-message "<text>"
Redaction              enabled                                                         --disable-redaction (trusted single-tenant only)
Verify after setup     on                                                              --no-verify

See the full flag reference for the complete table or run defenseclaw setup guardrail --help.

Common variations — pick the recipe that fits your phase

defenseclaw setup guardrail \
  --connector claudecode \
  --mode observe \
  --rule-pack permissive \
  --restart

Nothing blocks. Every prompt and tool call lands in ~/.defenseclaw/gateway.jsonl. Run this for at least a week before promoting — see Defaults → tuning by risk tolerance.
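During the observation period it helps to summarize what the log contains before tuning. The schema of gateway.jsonl is not described here, so the sketch below only assumes one JSON object per line and counts records by a field you name. The field "severity" in the example is a guess, not a documented key.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_field(lines: Iterable[str], field: str) -> Counter:
    """Count JSONL records by the value of one top-level field."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        if isinstance(record, dict):
            counts[str(record.get(field, "<missing>"))] += 1
        else:
            counts["<unparseable>"] += 1
    return counts


if __name__ == "__main__":
    # Inline sample; "severity" is a hypothetical field name.
    sample = ['{"severity": "HIGH"}', '{"severity": "LOW"}', '{"severity": "HIGH"}']
    for value, n in count_by_field(sample, "severity").most_common():
        print(value, n)
```

To run it against the real log, pass an open file handle for ~/.defenseclaw/gateway.jsonl (expanded with os.path.expanduser) in place of the sample list.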

export DEFENSECLAW_LLM_KEY=<your-key>

defenseclaw setup guardrail \
  --connector claudecode \
  --mode action \
  --human-approval \
  --hilt-min-severity high \
  --detection-strategy regex_judge \
  --judge-model anthropic/claude-sonnet-4-20250514 \
  --judge-api-key-env DEFENSECLAW_LLM_KEY \
  --restart

Adds the LLM judge as a second pass on regex-flagged prompts. Costs a few cents per turn; cuts false positives meaningfully on semantic jailbreaks regex misses.

defenseclaw policy activate strict
defenseclaw setup guardrail \
  --connector claudecode \
  --mode action \
  --human-approval \
  --hilt-min-severity low \
  --rule-pack strict \
  --restart

Block ≥ MEDIUM, no allow-list bypass, HITL on every LOW+ event. Pair with the OpenShell sandbox profile and an MCP allow-list for full lockdown.

Decision aids — should I turn this on?

Not sure what to pick? Run defenseclaw setup guardrail (no flags) — the interactive wizard walks you through every choice with safe defaults pre-selected and inline help. Once you're happy with the setup, the Prompt → flag mapping table gives you the CI-shaped command for the same configuration.

Files DefenseClaw will modify

settings.json (hooks block + OTEL_* env vars + CLAUDE_CODE_ENABLE_TELEMETRY)

DefenseClaw stores a hash-checked backup of settings.json before edits. Teardown restores it byte-for-byte; if the file drifted, only DefenseClaw-owned entries are surgically removed.

Hook capabilities

Block events

  • PreToolUse
  • PostToolUse
  • UserPromptSubmit
  • Stop
  • PermissionRequest

Native ask events

  • PreToolUse

Claude Code is one of the few connectors that supports native PreToolUse ask. HITL approvals surface inside the agent UI itself, so the operator never has to leave Claude Code to decide.
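As an illustration of native ask, the sketch below builds the JSON a PreToolUse hook can print to request operator confirmation. The output shape (hookSpecificOutput with permissionDecision set to "allow", "deny", or "ask") follows Claude Code's hook documentation as understood here; the severity-to-decision mapping and the function name are assumptions, not DefenseClaw's actual logic.

```python
import json
from typing import Optional

SEVERITIES = ("LOW", "MEDIUM", "HIGH", "CRITICAL")


def pre_tool_use_output(
    finding_severity: Optional[str],
    hitl_min_severity: str = "HIGH",
) -> dict:
    """Build PreToolUse hook output: 'ask' at or above the HITL threshold, else 'allow'."""
    decision = "allow"
    reason = "no finding at or above the HITL threshold"
    if finding_severity is not None:
        rank = SEVERITIES.index(finding_severity.upper())
        if rank >= SEVERITIES.index(hitl_min_severity.upper()):
            decision = "ask"
            reason = f"{finding_severity.upper()} finding needs operator approval"
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": decision,
            "permissionDecisionReason": reason,
        }
    }


if __name__ == "__main__":
    # A hook would print this JSON to stdout for Claude Code to read.
    print(json.dumps(pre_tool_use_output("critical"), indent=2))
```

Because the decision is "ask" rather than "deny", the call pauses in the agent UI until the operator approves or rejects it.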

Telemetry channels at boot

[Diagram: Claude Code feeds defenseclaw-gateway over two channels: (1) the PreToolUse / PostToolUse / UserPromptSubmit / Stop / PermissionRequest hooks, and (2) the native OTel exporter (env-driven).]

Two telemetry channels run in observability mode; the proxy data path activates only when claudecode_enforcement_enabled is set.

When to enable enforcement

The enforcement flag (guardrail.claudecode_enforcement_enabled) gates the proxy-redirect path: pointing Claude Code's ANTHROPIC_BASE_URL at the local proxy (which then forwards upstream via Bifrost), reserved-id stripping, and the subprocess sandbox. Until you flip it on, observability runs end-to-end via hooks and OTel without any data-path modification.

Flip it on when:

  • The audit log has at least a week of data and you've tuned the rule pack.
  • HITL is on for HIGH so risky calls pause for review instead of blocking outright.
  • Operators have access to either the agent UI (for native PreToolUse ask) or the DefenseClaw TUI.

Disable

defenseclaw setup guardrail --disable