OpenClaw
The reference proxy connector. DefenseClaw ships a TypeScript plugin that wires OpenClaw's fetch interceptor and before_tool_call hook directly into the gateway.
OpenClaw is the connector DefenseClaw was designed against. It ships a first-party TypeScript plugin (extensions/defenseclaw/) that hooks into fetch interception and OpenClaw's before_tool_call lifecycle so every prompt, response, and tool call lands in the DefenseClaw gateway.
Setup
defenseclaw setup openclaw --mode observe --restart
defenseclaw setup openclaw --mode action --human-approval --rule-pack strict --restart

setup openclaw is an alias for defenseclaw setup guardrail --connector openclaw: it pins claw.mode=openclaw and inherits every guardrail flag. Unlike Claude Code / Codex, the proxy is always in the data path here. There is no observability-only branch, only --mode observe (log without blocking) vs --mode action (enforce).
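The difference between the two modes can be sketched as a small function. This is an illustration only: the type and function names are hypothetical and are not taken from the DefenseClaw plugin. It assumes that both modes record every event and that only action mode lets a block verdict stop traffic.

```typescript
// Hypothetical names for illustration; not DefenseClaw's actual API.
type Mode = "observe" | "action";
type Verdict = "allow" | "block";

interface Outcome {
  forwarded: boolean; // whether the request continues to the model or tool
  logged: boolean;    // whether the event is recorded
}

// The proxy sees traffic in both modes; only "action" enforces a block verdict.
function applyMode(mode: Mode, verdict: Verdict): Outcome {
  const enforce = mode === "action";
  return {
    forwarded: !(enforce && verdict === "block"),
    logged: true, // assumption: every event is logged regardless of mode
  };
}
```

Under these assumptions, applyMode("observe", "block") still forwards the request, while applyMode("action", "block") does not.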
What this command sets vs. leaves at defaults
The flags above explicitly set: connector, mode, optional HITL, and optional rule pack. Every other knob falls back to the values DefenseClaw ships with — schema-defined in internal/config/config.go and documented on the Defaults page.
| Knob | Value when omitted | Flag to override |
|---|---|---|
| Scanner backend | local (bundled regex packs, zero key) | --scanner-mode local\|remote\|both |
| Rule pack | unset → built-in baseline (no overlay) | --rule-pack default\|strict\|permissive |
| LLM judge | off (regex-only triage) | --judge-model <model> plus --judge-api-key-env |
| Detection strategy | regex_judge if judge is on, else regex-only | --detection-strategy regex_only\|regex_judge\|judge_first |
| HITL | off (no operator approval prompts) | --human-approval plus --hilt-min-severity ... |
| HITL minimum severity | HIGH (when --human-approval is on; stored uppercase in config) | --hilt-min-severity low\|medium\|high\|critical (case-insensitive) |
| Hook fail-mode | open (allow on guardrail-side failure) | defenseclaw guardrail fail-mode <open\|closed> (no flag) |
| Proxy port | 4000 | --port <int> |
| Block message | empty (uses built-in copy) | --block-message "<text>" |
| Redaction | enabled | --disable-redaction (trusted single-tenant only) |
| Verify after setup | on | --no-verify |
See the full flag reference for the complete table or run defenseclaw setup guardrail --help.
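As a rough mental model of the table above, omitted knobs can be thought of as a defaults object that explicit flags override. The sketch below is not the real schema (that lives in internal/config/config.go). The option names are invented for illustration, and only the default values come from the table.

```typescript
// Illustrative option names; only the default values are taken from the table.
type DetectionStrategy = "regex_only" | "regex_judge" | "judge_first";
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface GuardrailOptions {
  scannerMode: "local" | "remote" | "both";
  rulePack: "default" | "strict" | "permissive" | null; // null = built-in baseline
  judgeModel: string | null;                            // null = judge off
  detectionStrategy: DetectionStrategy;
  humanApproval: boolean;
  hitlMinSeverity: Severity; // only meaningful when humanApproval is true
  failMode: "open" | "closed";
  port: number;
  blockMessage: string;      // empty = built-in copy
  redaction: boolean;
  verify: boolean;
}

function resolveOptions(flags: Partial<GuardrailOptions>): GuardrailOptions {
  const judgeOn = (flags.judgeModel ?? null) !== null;
  const defaults: GuardrailOptions = {
    scannerMode: "local",
    rulePack: null,
    judgeModel: null,
    // The strategy default depends on whether a judge model was supplied.
    detectionStrategy: judgeOn ? "regex_judge" : "regex_only",
    humanApproval: false,
    hitlMinSeverity: "HIGH",
    failMode: "open",
    port: 4000,
    blockMessage: "",
    redaction: true,
    verify: true,
  };
  // Explicit flags win over defaults.
  return { ...defaults, ...flags };
}
```

For example, resolveOptions({}) yields port 4000 with regex-only detection, while passing a judgeModel switches the default strategy to regex_judge unless detectionStrategy is also set explicitly.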
Common variations — pick the recipe that fits your phase
defenseclaw setup openclaw \
--mode observe \
--rule-pack permissive \
--restart

The proxy is in the data path but nothing blocks. Every prompt, response, and tool call lands in ~/.defenseclaw/gateway.jsonl. Run this for at least a week before promoting to --mode action; see Defaults → tuning by risk tolerance.
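During the observe phase it helps to summarize what the log contains before promoting. The helper below is a generic JSONL tally and nothing more. This page does not document the gateway.jsonl schema, so the field name is a parameter and the "severity" field in the usage note is a hypothetical example.

```typescript
// Counts JSONL records by the value of one top-level field.
// The field name is caller-supplied because the log schema is not documented here.
function tallyJsonl(text: string, key: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. the trailing newline
    const record: unknown = JSON.parse(line);
    if (typeof record !== "object" || record === null) continue;
    const raw = (record as Record<string, unknown>)[key];
    const value = raw === undefined ? "(missing)" : String(raw);
    counts.set(value, (counts.get(value) ?? 0) + 1);
  }
  return counts;
}
```

Given the file contents as a string, tallyJsonl(contents, "severity") would return a count per value, assuming records carry a field with that name.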
defenseclaw setup openclaw \
--mode action \
--human-approval \
--hilt-min-severity high \
--restart

HIGH findings pause for operator approval; CRITICAL still blocks unconditionally. OpenClaw is one of the two connectors with a native approval surface (via the bundled DefenseClaw plugin), so chat-origin approvals reach the agent UI directly. See the HITL page for the per-connector matrix.
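The threshold behaviour described above can be sketched as follows. This is a simplified model, not the gateway's actual logic: it assumes findings below the threshold are allowed, and it ignores whatever block thresholds the active rule pack adds. All names are hypothetical.

```typescript
const SEVERITIES = ["LOW", "MEDIUM", "HIGH", "CRITICAL"] as const;
type Severity = (typeof SEVERITIES)[number];
type Decision = "allow" | "pause" | "block";

// The flag value is case-insensitive and stored uppercase.
function parseSeverity(input: string): Severity {
  const upper = input.toUpperCase();
  const match = SEVERITIES.find((s) => s === upper);
  if (match === undefined) throw new Error(`unknown severity: ${input}`);
  return match;
}

function decide(finding: Severity, minSeverity: Severity): Decision {
  // CRITICAL blocks unconditionally, regardless of the approval threshold.
  if (finding === "CRITICAL") return "block";
  const rank = (s: Severity): number => SEVERITIES.indexOf(s);
  // Assumption: anything below the threshold passes without a prompt.
  return rank(finding) >= rank(minSeverity) ? "pause" : "allow";
}
```

With a threshold of parseSeverity("high"), a HIGH finding pauses for approval, a MEDIUM finding passes, and a CRITICAL finding blocks.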
export DEFENSECLAW_LLM_KEY=<your-key>
defenseclaw setup openclaw \
--mode action \
--human-approval \
--hilt-min-severity high \
--detection-strategy regex_judge \
--judge-model anthropic/claude-sonnet-4-20250514 \
--judge-api-key-env DEFENSECLAW_LLM_KEY \
--restart

Adds the LLM judge as a second pass on regex-flagged prompts. Costs a few cents per turn; cuts false positives meaningfully on semantic jailbreaks regex misses.
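A minimal sketch of the second-pass idea behind regex_judge, assuming the judge is consulted only when a regex pattern has already matched. The judge is stubbed as a synchronous callback to keep the example short (a real model call would be asynchronous), and the pattern and function names are hypothetical.

```typescript
// Stub type for the judge: returns true when it confirms the regex finding.
type Judge = (prompt: string) => boolean;

function regexJudge(prompt: string, patterns: RegExp[], judge: Judge): "clean" | "flagged" {
  // First pass: cheap regex triage. Patterns should not use the "g" flag,
  // because RegExp.prototype.test is stateful for global patterns.
  const regexHit = patterns.some((p) => p.test(prompt));
  if (!regexHit) return "clean"; // judge is never called, so no model cost
  // Second pass: the judge can overturn a regex false positive.
  return judge(prompt) ? "flagged" : "clean";
}
```

Under this model the per-turn cost is only paid on prompts the regex pass has already flagged.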
defenseclaw policy activate strict
defenseclaw setup openclaw \
--mode action \
--human-approval \
--hilt-min-severity low \
--rule-pack strict \
--restart

Blocks findings at MEDIUM and above, permits no allow-list bypass, and raises a HITL prompt on every event at LOW or above. Pair with the OpenShell sandbox profile and an MCP allow-list for full lockdown.
Decision aids — should I turn this on?
Human-in-the-loop (HITL)
When --human-approval is worth it. OpenClaw approvals reach chat-origin sessions via the bundled plugin, not just the TUI.
Mode + judge recipes
Side-by-side bash for observe / action / action+HITL / action+judge — copy-paste ready.
Defaults & rule packs
What permissive / default / strict actually ship, and which one matches your risk tolerance.
Interactive wizard
Animated terminal demo of the prompt-by-prompt setup flow — the safest path the first time.
Not sure what to pick? Run defenseclaw setup guardrail (no flags) — the interactive wizard walks you through every choice with safe defaults pre-selected and inline help. The Prompt → flag mapping table gives you the CI-shaped command for the same configuration.
Files DefenseClaw will modify
A hash-checked backup of openclaw.json is stored before edits; teardown restores or surgically removes only DefenseClaw-owned entries.
What the plugin does
- 01 User → OpenClaw agent: prompt
- 02 OpenClaw agent → DefenseClaw plugin: before fetch (LLM request)
- 03 DefenseClaw plugin → defenseclaw-gateway: POST /v1/inspect
- 04 defenseclaw-gateway → DefenseClaw plugin: allow / block / pause
- 05 DefenseClaw plugin → OpenClaw agent: forward (or reject)
- 06 OpenClaw agent → DefenseClaw plugin: before_tool_call(name, args)
- 07 DefenseClaw plugin → defenseclaw-gateway: POST /v1/tool/inspect
- 08 defenseclaw-gateway → DefenseClaw plugin: verdict
- 09 DefenseClaw plugin → OpenClaw agent: allow / block / pause
- 10 OpenClaw agent → User: response
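The two round trips in the sequence above can be sketched as follows. Only the endpoint paths and the allow / block / pause verdicts come from this page. The request and response shapes, the injected transport, and the function names are all hypothetical, and applying the configured fail-mode when the gateway is unreachable is an assumption about how the plugin behaves.

```typescript
type Verdict = "allow" | "block" | "pause";
type FailMode = "open" | "closed";

// Hypothetical payload shape; the real /v1/inspect schema is not shown on this page.
interface InspectRequest {
  kind: "prompt" | "tool_call";
  payload: unknown;
}

// The HTTP call is injected so the sketch stays independent of any client library.
type Transport = (path: string, body: InspectRequest) => Promise<{ verdict: Verdict }>;

// Assumption: fail-open allows and fail-closed blocks when the gateway cannot answer.
function onGatewayFailure(failMode: FailMode): Verdict {
  return failMode === "open" ? "allow" : "block";
}

async function inspect(
  transport: Transport,
  path: string,
  body: InspectRequest,
  failMode: FailMode,
): Promise<Verdict> {
  try {
    return (await transport(path, body)).verdict;
  } catch {
    return onGatewayFailure(failMode);
  }
}

// Steps 02 to 05: inspect the outgoing LLM request before it is forwarded.
function inspectPrompt(t: Transport, prompt: string, failMode: FailMode): Promise<Verdict> {
  return inspect(t, "/v1/inspect", { kind: "prompt", payload: prompt }, failMode);
}

// Steps 06 to 09: inspect a tool call before the agent executes it.
function inspectToolCall(
  t: Transport,
  name: string,
  args: unknown,
  failMode: FailMode,
): Promise<Verdict> {
  return inspect(t, "/v1/tool/inspect", { kind: "tool_call", payload: { name, args } }, failMode);
}
```

Injecting the transport keeps the verdict handling testable without a running gateway; a real plugin would pass a function that POSTs to the gateway on the configured port.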
Hook capabilities
Block events
- before_tool_call
- fetch_request
- fetch_response
Native ask events
- before_tool_call
OpenClaw is one of the two connectors that supports DefenseClaw approval prompts for tool actions. Approvals reach chat-origin sessions via the bundled plugin instead of living only in the native approval queue.
Subprocess policy
sandbox — see SANDBOX.md in the source repo for the full openshell-sandbox setup. The connector wires DefenseClaw into the sandbox's syscall and filesystem policy.
Disable
defenseclaw setup guardrail --disable

Restores ~/.openclaw/openclaw.json from the backup, removes the plugin entries, and stops the proxy.