OpenClaw

The reference proxy connector. DefenseClaw ships a TypeScript plugin that wires OpenClaw's fetch interceptor and before_tool_call hook directly into the gateway.

OpenClaw is the connector DefenseClaw was designed against. It ships a first-party TypeScript plugin (extensions/defenseclaw/) that hooks into fetch interception and OpenClaw's before_tool_call lifecycle so every prompt, response, and tool call lands in the DefenseClaw gateway.

Full OpenClaw end-to-end: prompt arrives → gateway inspects → policy decides → HITL approves → tool runs → audit row written.

Setup

defenseclaw setup openclaw --mode observe --restart
defenseclaw setup openclaw --mode action --human-approval --rule-pack strict --restart

setup openclaw is an alias for defenseclaw setup guardrail --connector openclaw: it pins claw.mode=openclaw and inherits every guardrail flag. Unlike Claude Code / Codex, the proxy is always in the data path here. There is no observability-only branch, only --mode observe (log without blocking) vs. --mode action (enforce).
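The difference between the two modes can be pictured as a gate applied after the gateway has reached a verdict. The following is a minimal sketch, not DefenseClaw's implementation: the `Mode`, `Verdict`, and `Outcome` types and the `enforce` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical types for illustration; not taken from the DefenseClaw source.
type Mode = "observe" | "action";
type Verdict = "allow" | "block" | "pause";

interface Outcome {
  logged: Verdict;  // what the audit trail records
  applied: Verdict; // what the agent actually experiences
}

// In observe mode the verdict is recorded but never enforced;
// in action mode the verdict is applied unchanged.
function enforce(mode: Mode, verdict: Verdict): Outcome {
  const applied: Verdict = mode === "observe" ? "allow" : verdict;
  return { logged: verdict, applied };
}

const observed = enforce("observe", "block"); // logged as block, traffic still allowed
const enforced = enforce("action", "block");  // logged as block, traffic blocked
console.log(observed, enforced);
```

Either way the proxy sees the traffic; only the `applied` side changes between modes.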

What this command sets vs. leaves at defaults

The flags above explicitly set: connector, mode, optional HITL, and optional rule pack. Every other knob falls back to the values DefenseClaw ships with — schema-defined in internal/config/config.go and documented on the Defaults page.

Knob                   Value when omitted                                              Flag to override
---------------------  --------------------------------------------------------------  ---------------------------------------------------------------
Scanner backend        local (bundled regex packs, zero key)                           --scanner-mode local|remote|both
Rule pack              unset → built-in baseline (no overlay)                          --rule-pack default|strict|permissive
LLM judge              off (regex-only triage)                                         --judge-model <model> plus --judge-api-key-env
Detection strategy     regex_judge if judge is on, else regex-only                     --detection-strategy regex_only|regex_judge|judge_first
HITL                   off (no operator approval prompts)                              --human-approval plus --hilt-min-severity ...
HITL minimum severity  HIGH (when --human-approval is on; stored uppercase in config)  --hilt-min-severity low|medium|high|critical (case-insensitive)
Hook fail-mode         open (allow on guardrail-side failure)                          defenseclaw guardrail fail-mode <open|closed> (no flag)
Proxy port             4000                                                            --port <int>
Block message          empty (uses built-in copy)                                      --block-message "<text>"
Redaction              enabled                                                         --disable-redaction (trusted single-tenant only)
Verify after setup     on                                                              --no-verify

See the full flag reference for the complete table or run defenseclaw setup guardrail --help.
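To make the fallback behaviour concrete, here is a hedged sketch of how explicit flags could layer over the shipped defaults. Only the default values come from the table above. The key names, the `GuardrailSettings` shape, and the helper functions are assumptions made for illustration and do not mirror the schema in internal/config/config.go.

```typescript
// Key names and object shape are hypothetical; only the default values
// are taken from the defaults table.
type SeverityLevel = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
type Strategy = "regex_only" | "regex_judge" | "judge_first";

interface GuardrailSettings {
  scannerMode: "local" | "remote" | "both";
  rulePack: "default" | "strict" | "permissive" | null; // null = built-in baseline
  judgeModel: string | null;                            // null = judge off
  detectionStrategy: Strategy | null;                   // null = derive from judge
  humanApproval: boolean;
  hitlMinSeverity: SeverityLevel;
  failMode: "open" | "closed";
  port: number;
  blockMessage: string;                                 // "" = built-in copy
  redaction: boolean;
  verifyAfterSetup: boolean;
}

const SHIPPED_DEFAULTS: GuardrailSettings = {
  scannerMode: "local",
  rulePack: null,
  judgeModel: null,
  detectionStrategy: null,
  humanApproval: false,
  hitlMinSeverity: "HIGH",
  failMode: "open",
  port: 4000,
  blockMessage: "",
  redaction: true,
  verifyAfterSetup: true,
};

// Severity input is case-insensitive but stored uppercase.
function parseSeverity(input: string): SeverityLevel {
  const upper = input.toUpperCase();
  if (upper === "LOW" || upper === "MEDIUM" || upper === "HIGH" || upper === "CRITICAL") {
    return upper;
  }
  throw new Error(`unknown severity: ${input}`);
}

// Explicit overrides win; every other knob keeps its shipped default.
function resolveSettings(overrides: Partial<GuardrailSettings>): GuardrailSettings {
  return { ...SHIPPED_DEFAULTS, ...overrides };
}

// When no strategy is set, it follows from whether a judge is configured.
function effectiveStrategy(s: GuardrailSettings): Strategy {
  if (s.detectionStrategy !== null) return s.detectionStrategy;
  return s.judgeModel !== null ? "regex_judge" : "regex_only";
}
```

For example, `resolveSettings({ humanApproval: true, hitlMinSeverity: parseSeverity("high") })` changes two knobs and leaves the port at 4000 and the fail-mode at open.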

Common variations — pick the recipe that fits your phase

defenseclaw setup openclaw \
  --mode observe \
  --rule-pack permissive \
  --restart

The proxy is in the data path but nothing blocks. Every prompt, response, and tool call lands in ~/.defenseclaw/gateway.jsonl. Run this for at least a week before promoting — see Defaults → tuning by risk tolerance.
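During the observe phase it helps to summarise what the log has collected before deciding to promote. The sketch below counts events by severity from JSONL text. It assumes each line is a JSON object with a string `severity` field; that field name is a guess for illustration, so check it against the actual contents of gateway.jsonl before relying on it.

```typescript
// Counts JSONL events per severity. The "severity" field name is an
// assumption about the log schema, not something confirmed by the docs.
function countBySeverity(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines
    const event: unknown = JSON.parse(line);
    let severity = "UNKNOWN";
    if (typeof event === "object" && event !== null) {
      const value = (event as Record<string, unknown>)["severity"];
      if (typeof value === "string") severity = value;
    }
    counts[severity] = (counts[severity] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample lines, only to show the call shape.
const sampleLog = '{"severity":"HIGH"}\n{"severity":"LOW"}\n{"severity":"HIGH"}';
console.log(countBySeverity(sampleLog));
```

In practice the input would be the contents of the log file read as UTF-8 text. A week of counts like these gives a rough idea of how often each severity would have triggered a block or approval prompt in action mode.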

defenseclaw setup openclaw \
  --mode action \
  --human-approval \
  --hilt-min-severity high \
  --restart

HIGH findings pause for operator approval; CRITICAL still blocks unconditionally. OpenClaw is one of the two connectors with a native approval surface (via the bundled DefenseClaw plugin), so chat-origin approvals reach the agent UI directly. See the HITL page for the per-connector matrix.
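The routing described here can be sketched as a small function. This is a simplified model under stated assumptions: it covers only the HITL threshold, it treats findings below the threshold as allowed, and it ignores any block threshold a rule pack may add. The names `routeFinding`, `Severity`, and `HitlDecision` are hypothetical.

```typescript
// Simplified model of HITL routing; names are illustrative only.
const SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];
type HitlDecision = "allow" | "pause" | "block";

function routeFinding(severity: Severity, hitlMin: Severity): HitlDecision {
  // CRITICAL blocks unconditionally, regardless of the approval threshold.
  if (severity === "CRITICAL") return "block";
  // At or above the threshold: pause for operator approval.
  const atOrAbove = SEVERITY_ORDER.indexOf(severity) >= SEVERITY_ORDER.indexOf(hitlMin);
  // Assumption: anything below the threshold passes through.
  return atOrAbove ? "pause" : "allow";
}

console.log(routeFinding("HIGH", "HIGH"));     // pause
console.log(routeFinding("CRITICAL", "HIGH")); // block
console.log(routeFinding("MEDIUM", "HIGH"));   // allow (under the assumption above)
```

Lowering the threshold widens the set of findings that pause for approval without changing how CRITICAL is handled.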

export DEFENSECLAW_LLM_KEY=<your-key>

defenseclaw setup openclaw \
  --mode action \
  --human-approval \
  --hilt-min-severity high \
  --detection-strategy regex_judge \
  --judge-model anthropic/claude-sonnet-4-20250514 \
  --judge-api-key-env DEFENSECLAW_LLM_KEY \
  --restart

Adds the LLM judge as a second pass on regex-flagged prompts. Costs a few cents per turn; cuts false positives meaningfully on semantic jailbreaks regex misses.
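The two-stage idea behind regex_judge can be sketched as follows. The judge is modelled as an injected async function so the example stays self-contained. Its signature, the `Triage` result shape, and the function name `regexThenJudge` are assumptions for illustration, not DefenseClaw's internals.

```typescript
// Hypothetical judge signature: resolves to true when the model
// considers the prompt malicious.
type JudgeFn = (prompt: string) => Promise<boolean>;

interface Triage {
  flagged: boolean;     // final outcome after both stages
  judgeCalled: boolean; // whether the (paid) second pass ran
}

// Stage 1: cheap regex screen. Stage 2: judge only what the regex flagged.
// Patterns should not use the `g` flag, because RegExp.test() is stateful with it.
async function regexThenJudge(
  prompt: string,
  patterns: RegExp[],
  judge: JudgeFn,
): Promise<Triage> {
  const regexHit = patterns.some((p) => p.test(prompt));
  if (!regexHit) return { flagged: false, judgeCalled: false };
  const confirmed = await judge(prompt);
  return { flagged: confirmed, judgeCalled: true };
}
```

Because the judge only runs on prompts the regex pass has already flagged, clean traffic incurs no model cost, and a regex hit that the judge rejects is dropped as a false positive.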

defenseclaw policy activate strict
defenseclaw setup openclaw \
  --mode action \
  --human-approval \
  --hilt-min-severity low \
  --rule-pack strict \
  --restart

Block ≥ MEDIUM, no allow-list bypass, HITL on every LOW+ event. Pair with the OpenShell sandbox profile and an MCP allow-list for full lockdown.

Decision aids — should I turn this on?

Not sure what to pick? Run defenseclaw setup guardrail (no flags) — the interactive wizard walks you through every choice with safe defaults pre-selected and inline help. The Prompt → flag mapping table gives you the CI-shaped command for the same configuration.

Files DefenseClaw will modify

openclaw.json (plugin allow / load entries)

A hash-checked backup of openclaw.json is stored before edits; teardown restores or surgically removes only DefenseClaw-owned entries.
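Surgical removal, as opposed to restoring the whole file, could look roughly like the sketch below. The `PluginEntries` shape (string arrays named `allow` and `load`) and the substring match on `defenseclaw` are assumptions made for illustration. The real openclaw.json schema and ownership check may differ.

```typescript
// Hypothetical shape for the plugin section of openclaw.json.
interface PluginEntries {
  allow: string[];
  load: string[];
}

// Assumed marker for DefenseClaw-owned entries.
const OWNED_MARKER = "defenseclaw";

// Returns a copy with DefenseClaw-owned entries removed; every other
// entry, including ones the user added after setup, is preserved.
function removeOwnedEntries(entries: PluginEntries): PluginEntries {
  const keep = (entry: string): boolean => !entry.includes(OWNED_MARKER);
  return {
    allow: entries.allow.filter(keep),
    load: entries.load.filter(keep),
  };
}
```

The point of this path is that edits made to openclaw.json after setup survive teardown, which a plain restore from backup would overwrite.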

What the plugin does

  1. User → OpenClaw agent: prompt
  2. OpenClaw agent → DefenseClaw plugin: before fetch (LLM request)
  3. DefenseClaw plugin → defenseclaw-gateway: POST /v1/inspect
  4. defenseclaw-gateway → DefenseClaw plugin: allow / block / pause
  5. DefenseClaw plugin → OpenClaw agent: forward (or reject)
  6. OpenClaw agent → DefenseClaw plugin: before_tool_call(name, args)
  7. DefenseClaw plugin → defenseclaw-gateway: POST /v1/tool/inspect
  8. defenseclaw-gateway → DefenseClaw plugin: verdict
  9. DefenseClaw plugin → OpenClaw agent: allow / block / pause
  10. OpenClaw agent → User: response

Three interception points (fetch_request, fetch_response, and before_tool_call) let DefenseClaw inspect every LLM request, LLM response, and tool call in OpenClaw's lifecycle.
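The request and tool-call legs of this flow can be sketched with an injected transport so the example runs without a gateway. The endpoint paths come from the sequence above. Everything else is assumed for illustration: the `PostFn` transport, the request body shapes, and the choice to apply the hook fail-mode to both legs.

```typescript
type GatewayVerdict = "allow" | "block" | "pause";
type FailMode = "open" | "closed";

// Hypothetical transport: posts `body` to `path` on the gateway and
// resolves to its verdict. A real plugin would use HTTP here.
type PostFn = (path: string, body: unknown) => Promise<GatewayVerdict>;

// Ask the gateway; if the call itself fails, fall back on the fail-mode:
// open = allow on guardrail-side failure, closed = block.
async function inspectVia(
  post: PostFn,
  path: string,
  body: unknown,
  failMode: FailMode,
): Promise<GatewayVerdict> {
  try {
    return await post(path, body);
  } catch {
    return failMode === "open" ? "allow" : "block";
  }
}

// Before fetch (LLM request).
function guardLlmRequest(post: PostFn, payload: unknown, failMode: FailMode): Promise<GatewayVerdict> {
  return inspectVia(post, "/v1/inspect", payload, failMode);
}

// before_tool_call(name, args).
function guardToolCall(
  post: PostFn,
  name: string,
  args: unknown,
  failMode: FailMode,
): Promise<GatewayVerdict> {
  return inspectVia(post, "/v1/tool/inspect", { name, args }, failMode);
}
```

With the default fail-mode of open, a gateway outage lets traffic through rather than stalling the agent. Switching to closed trades availability for strictness.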

Hook capabilities

Block events

  • before_tool_call
  • fetch_request
  • fetch_response

Native ask events

  • before_tool_call

OpenClaw is one of the two connectors that support DefenseClaw approval prompts for tool actions. Approvals reach chat-origin sessions via the bundled plugin instead of living only in the native approval queue.

Subprocess policy

sandbox — see SANDBOX.md in the source repo for the full openshell-sandbox setup. The connector wires DefenseClaw into the sandbox's syscall and filesystem policy.

Disable

defenseclaw setup guardrail --disable

Restores ~/.openclaw/openclaw.json from the backup, removes the plugin entries, and stops the proxy.