Codex

The Codex connector wires DefenseClaw into Codex's documented config.toml hooks (UserPromptSubmit, PreToolUse, PermissionRequest, PostToolUse, Stop, SessionStart), its native OpenTelemetry (OTel) exporter, and the notify bridge for agent-turn-complete events.

Setup

defenseclaw setup codex

Pins claw.mode=codex, wires hooks + OTel + notify bridge, leaves enforcement off.

defenseclaw setup guardrail --connector codex --mode action --restart

Flips guardrail.codex_enforcement_enabled=true so the proxy listener binds and Codex's openai_base_url (in ~/.codex/config.toml) is pointed at the local DefenseClaw proxy (e.g. http://127.0.0.1:4000/c/codex). The proxy then forwards inspected, allowed traffic to the real upstream provider via the gateway's Bifrost-backed router.
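
To confirm the redirect took effect, it can help to check where openai_base_url points. The sketch below is a hypothetical helper, not a DefenseClaw command. It assumes the key appears on a single line of the form openai_base_url = "http://127.0.0.1:<port>/..." and that the proxy listens on 127.0.0.1; adjust the pattern if your config uses localhost or another layout.

```shell
# Hypothetical helper (not shipped with DefenseClaw): succeeds when the given
# Codex config points openai_base_url at a proxy on 127.0.0.1:<port>.
# Default path and port are taken from the text above; the line layout is an assumption.
codex_base_url_is_local() {
  config_file="${1:-$HOME/.codex/config.toml}"
  port="${2:-4000}"
  # Matches a line such as: openai_base_url = "http://127.0.0.1:4000/c/codex"
  grep -Eq "^[[:space:]]*openai_base_url[[:space:]]*=[[:space:]]*\"http://127\.0\.0\.1:${port}/" "$config_file"
}
```

Calling codex_base_url_is_local && echo "redirect active" after setup gives a quick yes/no answer; pass a different file or port as the first and second argument.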

What this command sets vs. leaves at defaults

The three flags above explicitly set the connector, the mode, and the post-setup restart. Every other knob falls back to the values DefenseClaw ships with, which are schema-defined in internal/config/config.go and documented on the Defaults page.

Knob                Value when omitted                           Flag to override
Scanner backend     local (bundled regex packs, zero key)        --scanner-mode local|remote|both
Rule pack           unset → built-in baseline (no overlay)       --rule-pack default|strict|permissive
LLM judge           off (regex-only triage)                      --judge-model <model> plus --judge-api-key-env
Detection strategy  regex_judge if judge is on, else regex-only  --detection-strategy regex_only|regex_judge|judge_first
HITL                off (no operator approval prompts)           --human-approval plus --hilt-min-severity ...
Hook fail-mode      open (allow on guardrail-side failure)       defenseclaw guardrail fail-mode <open|closed> (no flag)
Proxy port          4000                                         --port <int>
Block message       empty (uses built-in copy)                   --block-message "<text>"
Redaction           enabled                                      --disable-redaction (trusted single-tenant only)
Verify after setup  on                                           --no-verify

See the full flag reference for the complete table or run defenseclaw setup guardrail --help.

Common variations — pick the recipe that fits your phase

defenseclaw setup guardrail \
  --connector codex \
  --mode observe \
  --rule-pack permissive \
  --restart

Nothing blocks. Every prompt and tool call lands in ~/.defenseclaw/gateway.jsonl. Run this for at least a week before promoting — see Defaults → tuning by risk tolerance.
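
During the observe phase, a rough sense of log volume helps you judge whether there is enough signal to tune against. The helper below is a hypothetical sketch, not part of DefenseClaw. It relies only on the JSONL convention of one record per line and makes no assumptions about field names, which are not documented on this page.

```shell
# Hypothetical helper (not shipped with DefenseClaw): print the number of
# records in a JSONL log by counting its non-empty lines.
# The default path is the one named in the text above.
gateway_event_count() {
  log_file="${1:-$HOME/.defenseclaw/gateway.jsonl}"
  # grep -c . counts lines that contain at least one character.
  grep -c . "$log_file"
}
```

Run gateway_event_count on its own for the default log, and use tail -n 5 ~/.defenseclaw/gateway.jsonl to eyeball the most recent records.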

defenseclaw setup guardrail \
  --connector codex \
  --mode action \
  --human-approval \
  --hilt-min-severity high \
  --restart

HIGH findings pause for operator approval; CRITICAL still blocks unconditionally. Codex has no native ask surface, so HITL approvals downgrade to confirm verdicts in the DefenseClaw TUI / audit log — make sure operators are reachable there. See the HITL page for the full per-connector matrix.

export DEFENSECLAW_LLM_KEY=<your-key>

defenseclaw setup guardrail \
  --connector codex \
  --mode action \
  --human-approval \
  --hilt-min-severity high \
  --detection-strategy regex_judge \
  --judge-model anthropic/claude-sonnet-4-20250514 \
  --judge-api-key-env DEFENSECLAW_LLM_KEY \
  --restart

Adds the LLM judge as a second pass on regex-flagged prompts. Costs a few cents per turn; cuts false positives meaningfully on semantic jailbreaks regex misses.

defenseclaw policy activate strict
defenseclaw setup guardrail \
  --connector codex \
  --mode action \
  --human-approval \
  --hilt-min-severity low \
  --rule-pack strict \
  --restart

Block ≥ MEDIUM, no allow-list bypass, HITL on every LOW+ event. Pair with the OpenShell sandbox profile and an MCP allow-list for full lockdown.

Decision aids — should I turn this on?

Not sure what to pick? Run defenseclaw setup guardrail (no flags) — the interactive wizard walks you through every choice with safe defaults pre-selected and inline help. The Prompt → flag mapping table gives you the CI-shaped command for the same configuration.

Files DefenseClaw will modify

config.toml ([hooks], [otel], [notify] blocks)

The [hooks], [otel], and [notify] blocks are owned by DefenseClaw; everything else in config.toml is preserved verbatim.
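
If you want to see exactly what setup changes, one option is to snapshot config.toml before running it and diff afterwards. The helpers below are a hypothetical sketch, not DefenseClaw commands. The ~/.codex/config.toml default and the .pre-defenseclaw suffix are assumptions chosen for illustration.

```shell
# Hypothetical helpers (not shipped with DefenseClaw).

# Copy the Codex config next to itself before running setup.
snapshot_codex_config() {
  config_file="${1:-$HOME/.codex/config.toml}"
  cp -p "$config_file" "${config_file}.pre-defenseclaw"
}

# Show a unified diff between the snapshot and the current config.
# diff exits with status 1 when the files differ, which is expected after setup.
diff_codex_config() {
  config_file="${1:-$HOME/.codex/config.toml}"
  diff -u "${config_file}.pre-defenseclaw" "$config_file"
}
```

Run snapshot_codex_config, then defenseclaw setup codex, then diff_codex_config. Any changed lines outside the [hooks], [otel], and [notify] blocks would be worth a closer look.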

Hook capabilities

Block events

  • UserPromptSubmit
  • PreToolUse
  • PermissionRequest
  • PostToolUse
  • Stop
  • SessionStart

Native ask events

None. Codex has no native ask surface here, so confirm verdicts are downgraded with raw_action preserved; operators can review the original action in the TUI / OpenClaw plugin.

Telemetry channels at boot

Codex sends telemetry to defenseclaw-gateway over three channels:

  • Hooks: UserPromptSubmit / PreToolUse / PermissionRequest / PostToolUse / Stop / SessionStart
  • Native OTel exporter
  • Notify bridge: agent-turn-complete

Three independent channels make Codex one of the most thoroughly inspected agents, even before enforcement.

When to enable enforcement

guardrail.codex_enforcement_enabled gates the proxy-redirect path: pointing openai_base_url at the local proxy (which then proxies upstream via Bifrost), the reserved-id strip, and the subprocess sandbox. Observability runs end-to-end without it. Flip it on when:

  • The Codex audit log has stable signal (tune your rule pack first).
  • You have a working HITL escalation path — Codex falls back to confirm verdicts so operators need to be reachable.

Disable

defenseclaw setup guardrail --disable