Fail modes

DefenseClaw uses the words "fail open" and "fail closed" for three different knobs: the response-layer hook fail mode, transport-layer strict availability, and a per-shell override. They are not the same knob. This page disambiguates them and tells you which one to flip for which problem.

TL;DR

| Knob | Layer | Default | Where it lives |
| --- | --- | --- | --- |
| `guardrail.hook_fail_mode` | Response (4xx, malformed JSON, missing `action`) | `open` | `~/.defenseclaw/config.yaml` |
| `DEFENSECLAW_STRICT_AVAILABILITY=1` | Transport (gateway unreachable / 5xx) | unset = fail-open | env var on the agent shell |
| `DEFENSECLAW_FAIL_MODE` | Response (per-shell override) | unset = use `hook_fail_mode` | env var on the agent shell |

If you remember nothing else: leave `hook_fail_mode=open` (the default), set `DEFENSECLAW_STRICT_AVAILABILITY=1` only when you would rather take the agent offline than miss a policy decision, and avoid the per-shell `DEFENSECLAW_FAIL_MODE` outside of one-shot debugging.

The decision tree

Two orthogonal knobs gate the verdict path: `DEFENSECLAW_STRICT_AVAILABILITY` decides what to do when the gateway is unreachable or returns a 5xx; `guardrail.hook_fail_mode` (with the optional `DEFENSECLAW_FAIL_MODE` override) decides what to do when the gateway answers with something unusable, such as a 4xx, malformed JSON, or a missing `action` field.

The split between transport and response matters because the threat models are different:

- Transport failures mean the gateway is down. Bricking the developer's tool because of a sidecar restart is worse UX than a brief observability gap, so the default is to allow. `DEFENSECLAW_STRICT_AVAILABILITY=1` is the dedicated escape hatch for sites that prefer agent downtime to a missed inspection during a real outage.
- Response failures mean the gateway is reachable but its answer is unusable. That is likely a misconfiguration the operator should hear about loudly, and setting the fail mode to `closed` makes it impossible to ignore.
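The two-knob split can be sketched as a small decision function. This is an illustrative model of the behaviour described on this page, not the hook scripts' actual code; the names `Failure` and `hook_verdict` are hypothetical.

```python
from enum import Enum


class Failure(Enum):
    TRANSPORT = "transport"  # gateway unreachable, network error, or 5xx
    RESPONSE = "response"    # 4xx, malformed JSON, or missing `action`


def hook_verdict(failure: Failure, strict_availability: bool, fail_mode: str) -> str:
    """Return "allow" or "block" for a failed gateway call (illustrative only)."""
    if fail_mode not in ("open", "closed"):
        raise ValueError(f"unknown fail mode: {fail_mode!r}")
    if failure is Failure.TRANSPORT:
        # Transport failures consult only the strict-availability knob.
        return "block" if strict_availability else "allow"
    # Response failures consult only the (possibly overridden) fail mode.
    return "block" if fail_mode == "closed" else "allow"
```

Note that `hook_verdict(Failure.TRANSPORT, False, "closed")` still allows: a closed fail mode alone does not block on an outage.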

Knob 1: `guardrail.hook_fail_mode` (response-layer, recommended)

Persisted in `~/.defenseclaw/config.yaml` under `guardrail.hook_fail_mode`. Two values:

- `open` (default, recommended): when the gateway answers with a 4xx, malformed JSON, or a missing `action` field, the hook allows the tool/prompt with a stderr warning and an entry in `~/.defenseclaw/logs/hook-failures.jsonl`.
- `closed`: the same response-layer failures block the tool/prompt (exit 2).
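To make the boundary between the two layers concrete, here is a rough Python sketch that sorts a gateway reply into the categories this page uses. The function name `classify_gateway_reply` and the convention that `status` is `None` when no HTTP reply arrived are assumptions for illustration; the real hooks are shell scripts and may classify edge cases (for example 3xx replies) differently.

```python
import json
from typing import Optional


def classify_gateway_reply(status: Optional[int], body: str) -> str:
    """Return "ok", "transport", or "response" for one gateway call.

    `status` is None when no HTTP reply was received at all.
    """
    if status is None or status >= 500:
        return "transport"  # unreachable, network error, or 5xx
    if 400 <= status < 500:
        return "response"   # 4xx
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "response"   # malformed JSON
    if not isinstance(payload, dict) or "action" not in payload:
        return "response"   # missing `action` field
    return "ok"
```

Only the `"response"` outcome is governed by `guardrail.hook_fail_mode`; the `"transport"` outcome is governed by `DEFENSECLAW_STRICT_AVAILABILITY`.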

Flip it via the dedicated CLI:

defenseclaw guardrail fail-mode                  # show current
defenseclaw guardrail fail-mode open
defenseclaw guardrail fail-mode closed --yes     # CI / TUI form

Flipping the knob regenerates every hook script (`codex-hook`, `claude-code-hook`, `inspect-*`) so the `FAIL_MODE` template variable picks up the new value, and restarts the gateway by default. `defenseclaw setup guardrail` and `defenseclaw init` / `quickstart` also accept `--fail-mode <open|closed>` for non-interactive flows.

Knob 2: `DEFENSECLAW_STRICT_AVAILABILITY` (transport-layer)

Set in the agent's environment (the shell that launches Claude Code, Codex, OpenClaw, etc.):

export DEFENSECLAW_STRICT_AVAILABILITY=1

When set to a truthy value (1 / true / yes):

- Transport failures (gateway unreachable / network error / 5xx) block the tool with exit 2.
- The stderr message reads `defenseclaw: gateway unreachable, blocking <subject> (DEFENSECLAW_STRICT_AVAILABILITY=1): <reason>`, so on-call engineers can grep for the cause.

When unset (the default), transport failures always allow with a stderr warning. The hook fail mode does NOT apply to transport failures — the two knobs are intentionally orthogonal.
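A minimal Python sketch of the truthy check and the stderr message format described above. The function names are hypothetical, and ignoring letter case and surrounding whitespace is an assumption; the actual check lives in `_hardening.sh` and may be stricter.

```python
import os
from typing import Mapping, Optional


def strict_availability_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when DEFENSECLAW_STRICT_AVAILABILITY holds 1, true, or yes."""
    env = os.environ if env is None else env
    raw = env.get("DEFENSECLAW_STRICT_AVAILABILITY", "")
    # Assumption: whitespace and letter case are ignored.
    return raw.strip().lower() in {"1", "true", "yes"}


def unreachable_message(subject: str, reason: str) -> str:
    """Build the stderr line in the format quoted on this page."""
    return (
        "defenseclaw: gateway unreachable, blocking "
        f"{subject} (DEFENSECLAW_STRICT_AVAILABILITY=1): {reason}"
    )
```

Passing an explicit mapping instead of reading `os.environ` keeps the check easy to exercise in tests without mutating the process environment.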

Knob 3: `DEFENSECLAW_FAIL_MODE` (per-shell override)

Set in the agent's environment to override the response-layer template default for that single shell:

DEFENSECLAW_FAIL_MODE=closed claude

The hook script reads `${DEFENSECLAW_FAIL_MODE:-{{.FailMode}}}`, so an env var wins over the value baked into the template at setup time. Useful for:

- One-off "I want this shell to be paranoid" sessions during incident response.
- A/B testing fail-closed behaviour without restarting the gateway.

Avoid making this a long-lived setting — the persisted guardrail.hook_fail_mode is the right place for cluster-wide policy.
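The precedence rule can be modelled in a few lines of Python. This mirrors the semantics of the shell's `${VAR:-default}` expansion, where an empty variable counts the same as an unset one; `effective_fail_mode` is a hypothetical helper, not part of DefenseClaw.

```python
from typing import Mapping


def effective_fail_mode(env: Mapping[str, str], template_default: str) -> str:
    """Resolve the fail mode the way `${DEFENSECLAW_FAIL_MODE:-default}` would."""
    override = env.get("DEFENSECLAW_FAIL_MODE", "")
    # `${VAR:-default}` falls back when VAR is unset OR set to the empty string.
    return override if override else template_default
```

One consequence of the `:-` form: exporting `DEFENSECLAW_FAIL_MODE=` (empty) does not clear the policy, it simply falls back to the persisted value.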

The recommendation matrix

| Deployment | `hook_fail_mode` | `DEFENSECLAW_STRICT_AVAILABILITY` |
| --- | --- | --- |
| Developer laptop | `open` | unset |
| CI / build agent | `open` | unset |
| Single-tenant prod (best-effort) | `open` | unset |
| Single-tenant prod (strict) | `closed` | unset |
| Regulated multi-tenant (no missed inspections) | `closed` | `1` |

The "regulated multi-tenant" row is the only one where you should ever set both. It produces an agent that refuses to run when anything in the DefenseClaw pipeline is wrong — including a transient sidecar restart. Make sure your monitoring is set up to page someone the moment the sidecar is down.

Foot-gun: `claude_code.fail_mode` and `codex.fail_mode` (legacy, NOT consumed by hook scripts)

The `AgentHookConfig.FailMode` field in `internal/config/config.go` (yaml: `claude_code.fail_mode`, `codex.fail_mode`, etc.) is a per-connector POLICY-LAYER hint that downstream connector glue can read to pick a posture. It is NOT consumed by the generated hook scripts. The legacy default `closed` is preserved for backward compatibility with installs that wrote it before `hook_fail_mode` existed.

The TUI and config editor still let you change these fields; setting them does nothing for the runtime response-layer behaviour. Operators who want to change response-layer hook behaviour must edit `guardrail.hook_fail_mode` (or run `defenseclaw guardrail fail-mode`), NOT the per-connector field.

The disambiguation lives in the source:

internal/config/config.go::AgentHookConfig (lines ~670-721) — "NOT consumed by the generated hook scripts" comment.
internal/config/config.go::GuardrailConfig.HookFailMode (lines ~1042-1097) — "this is NOT the same field as AgentHookConfig.FailMode" comment.

The TUI editor labels these legacy fields explicitly so operators know to edit the canonical knob instead.
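One way to catch this foot-gun in review is a small lint over the parsed config. The sketch below assumes the YAML has already been loaded into a plain dict and that the keys are laid out as `guardrail.hook_fail_mode`, `claude_code.fail_mode`, and `codex.fail_mode`; the function name and the warning wording are hypothetical, and DefenseClaw does not necessarily ship such a check.

```python
from typing import Any, List, Mapping

# Connector keys named in the text; the real config may define more.
LEGACY_CONNECTORS = ("claude_code", "codex")


def legacy_fail_mode_warnings(config: Mapping[str, Any]) -> List[str]:
    """Flag legacy per-connector fail_mode values that differ from the canonical knob."""
    guardrail = config.get("guardrail") or {}
    canonical = guardrail.get("hook_fail_mode", "open")  # documented default
    warnings = []
    for connector in LEGACY_CONNECTORS:
        legacy = (config.get(connector) or {}).get("fail_mode")
        if legacy is not None and legacy != canonical:
            warnings.append(
                f"{connector}.fail_mode={legacy} is not read by hook scripts; "
                f"guardrail.hook_fail_mode={canonical} is in effect"
            )
    return warnings
```

A mismatch is not an error in itself, since the legacy field is ignored on the response-layer path, but it is a strong hint that someone edited the wrong knob.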

What each layer reads at runtime

| Component | Reads |
| --- | --- |
| Generated hook scripts (`codex-hook.sh`, `claude-code-hook.sh`, `inspect-*.sh`) | `${DEFENSECLAW_FAIL_MODE:-{{.FailMode}}}` (template `FailMode` is the persisted `guardrail.hook_fail_mode`) plus `DEFENSECLAW_STRICT_AVAILABILITY` for the transport-layer check (via `_hardening.sh::defenseclaw_should_fail_closed_on_unreachable`). |
| Go gateway | `guardrail.hook_fail_mode` for hook regeneration only. The `AgentHookConfig.FailMode` legacy field is read but not enforced anywhere on the response-layer path. |
| Python CLI | Reads both fields when rendering / validating config; defers to `guardrail.hook_fail_mode` for behaviour. |

Reference

internal/config/config.go::GuardrailConfig.HookFailMode — the canonical doc on response-layer fail mode.
internal/config/config.go::AgentHookConfig — the legacy per-connector field with the foot-gun callout.
internal/gateway/connector/hooks/_hardening.sh — defenseclaw_should_fail_closed_on_unreachable and defenseclaw_emit_unreachable_stderr for the transport-layer behaviour.
cli/defenseclaw/commands/cmd_guardrail.py::fail_mode_cmd — the defenseclaw guardrail fail-mode CLI.
Reference → Env vars — every env var grouped by category.