Defaults

What every fresh DefenseClaw install ships with — three OPA policies (permissive / default / strict), three matching guardrail rule packs, the operator-config defaults, and how to pick the combination that fits your team's risk tolerance.

DefenseClaw ships with opinionated defaults that are immediately useful without tuning and stay out of your way until you opt into stricter behaviour. This page documents what actually ships, grounded in the policy YAMLs in policies/, the schema in internal/config/config.go, and what defenseclaw setup guardrail actually writes.

The two layers you can swap

DefenseClaw separates admission policy from runtime guardrail rules. They ship as matching triples but are independent knobs.

Admission policy (OPA)

What happens when a skill / MCP / plugin gets installed or executed. Lives in policies/<name>.yaml, activated by defenseclaw policy activate <name>.

Guardrail rule pack

The regex patterns, LLM-judge prompts, and suppressions that gate prompts and completions in flight. Lives in policies/guardrail/<name>/, pointed at by guardrail.rule_pack_dir.

Both layers ship in three named profiles — default, strict, permissive — and you flip them independently:

defenseclaw policy activate default                                # OPA layer
defenseclaw setup guardrail --rule-pack default                    # Guardrail layer

defenseclaw policy activate strict does not change guardrail.rule_pack_dir, and vice versa. If you want strict everywhere, flip both. See docs/GUARDRAIL_RULE_PACKS.md for the rationale.
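
Because the two layers are independent, it is easy to flip one and forget the other. Below is a minimal sketch of a wrapper that keeps them in step, assuming a POSIX shell. `set_profile` is a hypothetical helper name, not a DefenseClaw command; it only chains the two commands shown above and accepts only the three shipped profile names.

```bash
# Hypothetical helper: switch the OPA policy and the guardrail rule pack together.
# It uses only the two commands documented above; the function name is illustrative.
set_profile() {
  profile="$1"
  case "$profile" in
    permissive|default|strict) ;;
    *) echo "unknown profile: $profile" >&2; return 2 ;;
  esac
  # OPA layer first; stop if activation fails.
  defenseclaw policy activate "$profile" || return 1
  # Guardrail layer.
  defenseclaw setup guardrail --rule-pack "$profile"
}
```

Calling `set_profile strict` would then run both commands in order. If you maintain a custom policies/<name>.yaml, widen the `case` list accordingly.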

OPA admission policy — what each profile ships

The OPA policy file (policies/<name>.yaml) drives admission decisions: what happens to a finding by severity, whether the allow-list lets first-party assets bypass scanning, and what threshold a Cisco AI Defense verdict has to clear before it blocks.

Knob	`permissive`	`default`	`strict`
`admission.allow_list_bypass_scan`	`true`	`true`	`false`
`skill_actions.critical`	quarantine + disable + block	quarantine + disable + block	quarantine + disable + block
`skill_actions.high`	none + enable + none	quarantine + disable + block	quarantine + disable + block
`skill_actions.medium`	none + enable + none	none + enable + none	quarantine + disable + block
`scanner_overrides`	empty	MCP LOW/MEDIUM and plugin MEDIUM/HIGH overrides	MCP LOW/MEDIUM and plugin MEDIUM/HIGH overrides
`guardrail.block_threshold`	`4` (CRITICAL)	`4` (CRITICAL)	`2` (MEDIUM)
`guardrail.alert_threshold`	`3` (HIGH)	`2` (MEDIUM)	`1` (LOW)
`guardrail.cisco_trust_level`	`advisory`	`full`	`full`
`guardrail.hilt.enabled`	`false` (key omitted)	`false`	`false` (key omitted)
`guardrail.hilt.min_severity`	`HIGH`	`HIGH`	`HIGH`

Severity ranks follow the Rego convention in policies/rego/guardrail.rego: 1 = LOW, 2 = MEDIUM, 3 = HIGH, 4 = CRITICAL. With cisco_trust_level: advisory, even Cisco AI Defense's own verdicts are surfaced but never escalated to a block.
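
To make the threshold rows concrete, here is a small illustrative model, assuming a finding blocks when its rank is at or above `block_threshold` and alerts when it is at or above `alert_threshold`. It is a sketch of the table, not the shipped Rego: the function names and the `allow` label are hypothetical, and it ignores `cisco_trust_level` and observe mode.

```bash
# Illustrative model of the threshold rows in the table; not the shipped Rego.
severity_rank() {
  case "$1" in
    LOW) echo 1 ;;
    MEDIUM) echo 2 ;;
    HIGH) echo 3 ;;
    CRITICAL) echo 4 ;;
    *) return 1 ;;
  esac
}

# guardrail_outcome SEVERITY BLOCK_THRESHOLD ALERT_THRESHOLD
guardrail_outcome() {
  rank="$(severity_rank "$1")" || return 1
  if [ "$rank" -ge "$2" ]; then
    echo block
  elif [ "$rank" -ge "$3" ]; then
    echo alert
  else
    echo allow
  fi
}

guardrail_outcome HIGH 4 2     # default thresholds: prints alert
guardrail_outcome HIGH 2 1     # strict thresholds: prints block
guardrail_outcome MEDIUM 4 3   # permissive thresholds: prints allow
```

The same HIGH finding alerts under the default thresholds and blocks under strict, which is the practical difference between the two columns.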

These shipped values are deliberately conservative about blocking. We'd rather you opt into stricter behaviour than have an upgrade silently start blocking your traffic.

Guardrail rule pack — what each profile ships

The rule pack directory (policies/guardrail/<name>/) holds the regex YAMLs, judge prompts, sensitive-tool definitions, and suppressions the in-flight scanner consumes.

Pack	`rules/` files	`judge/` prompts	`suppressions.yaml`	`sensitive-tools.yaml`
`permissive`	c2, cognitive, commands, enterprise-data, local-patterns, secrets, sensitive-paths, trust-exploit	pii and tool-injection (higher judge thresholds); injection ships disabled	broadest; additionally suppresses all IP findings and selected file-inspection PII	same six sensitive tool definitions as the other packs
`default`	same eight families; balanced variants where profiles differ	injection, pii, tool-injection	private/loopback IPs, platform IDs, expected system metadata	same six sensitive tool definitions as the other packs
`strict`	same eight families; stricter variants where profiles differ	injection, pii, tool-injection (lower judge thresholds)	minimal structural suppressions; no tool suppressions	same six sensitive tool definitions as the other packs

All three packs share the same severity rubric and the same signal_strength output schema — only the per-category thresholds and suppression scope differ between packs. Switching the rule pack does not enable the LLM judge — that's a separate guardrail.judge.enabled toggle in your operator config (default: false). Flipping the rule pack only changes which prompt YAMLs the judge will run if you've enabled it.

What `setup guardrail` actually writes

After defenseclaw init, an explicit defenseclaw setup guardrail --connector openclaw --mode action --rule-pack default --non-interactive produces the relevant configuration below (unrelated generated fields omitted):

~/.defenseclaw/config.yaml (relevant fields after setup)

claw:
  mode: openclaw

guardrail:
  enabled: true
  mode: action
  rule_pack_dir: /Users/<you>/.defenseclaw/policies/guardrail/default
  hook_fail_mode: closed
  judge:
    enabled: false                    # opt in via --judge-model
  hilt:
    enabled: false                    # opt in via --human-approval
    min_severity: HIGH

privacy:
  disable_redaction: false

audit_sinks: []                        # no external audit sink; local SQLite + JSONL still write
webhooks: []                           # add via `defenseclaw setup webhook add ...`

claude_code:
  enabled: false                       # toggled when you pick claude-code in setup guardrail
codex:
  enabled: false

Three things to notice that contradict folklore:

The LLM judge is OFF by default. It only flips on if you pass --judge-model to setup guardrail or answer "yes" to the interactive judge prompt. The schema default is guardrail.judge.enabled = false in internal/config/config.go. Keeping it off keeps cost predictable; flip it on once you have a DEFENSECLAW_LLM_KEY configured.
HILT is OFF by default. The shipped severity floor is HIGH, but enabled: false means it never prompts. --human-approval flips it on; --hilt-min-severity adjusts the floor.
The built-in local stores still write. audit_sinks: [] means no external audit-event destination. SQLite audit_events powers defenseclaw alerts, the TUI, and audit export; ~/.defenseclaw/gateway.jsonl remains the live structured runtime log. Wire external sinks via setup splunk or setup local-observability.
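
To confirm these three defaults on your own machine, a rough sketch like the following can pull the relevant keys out of the file. It assumes the two-space YAML layout shown above and is a line-oriented parse, not a YAML parser; `guardrail_summary` is a hypothetical helper, and `defenseclaw config show` remains the authoritative view.

```bash
# Hypothetical helper: print rule_pack_dir, judge.enabled and hilt.enabled
# from a config file laid out like the example above (two-space indentation).
guardrail_summary() {
  awk '
    /^[^ #]/             { top = $1 }       # remember the current top-level key
    top != "guardrail:"  { next }           # ignore every other section
    /^  [a-z_]+:/        { sub_key = $1 }   # second-level key under guardrail
    /^  rule_pack_dir:/  { print "rule_pack_dir=" $2 }
    /^    enabled:/ && sub_key == "judge:" { print "judge.enabled=" $2 }
    /^    enabled:/ && sub_key == "hilt:"  { print "hilt.enabled=" $2 }
  ' "${1:-$HOME/.defenseclaw/config.yaml}"
}
```

Run as `guardrail_summary` (or pass another path as the first argument). On a config like the one above it should report the default rule-pack directory with both `judge.enabled` and `hilt.enabled` set to `false`.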

Tuning by risk tolerance

You usually don't need a custom policy or rule pack — just a few knob changes.

"I'm in pilot, just observe"

defenseclaw policy activate permissive
defenseclaw setup guardrail \
  --connector openclaw \
  --mode observe \
  --rule-pack permissive \
  --restart \
  --non-interactive

Observe mode prevents enforcement even though the permissive policy's action-mode block threshold remains CRITICAL. Everything still flows to the audit log and JSONL so you can review what would have happened. Recommended for the first week of any rollout.

"Move fast, stop only the obvious harm"

defenseclaw policy activate default
defenseclaw setup guardrail \
  --connector openclaw \
  --mode action \
  --rule-pack default \
  --human-approval \
  --hilt-min-severity high \
  --restart \
  --non-interactive

Default rules: CRITICAL findings block, and HIGH findings can prompt for approval on OpenClaw's native ask surface. Most engineering teams in the early-to-middle phase of a rollout land here.

"Regulated workload, lock it down"

defenseclaw policy activate strict
defenseclaw setup guardrail \
  --connector openclaw \
  --mode action \
  --rule-pack strict \
  --detection-strategy regex_judge \
  --judge-model openai/gpt-4o-mini \
  --restart \
  --non-interactive

Strict policy (block ≥ MEDIUM, no allow-list bypass), strict rule pack (stricter profile variants and minimal suppressions), and the LLM judge enabled. Because the strict block threshold runs before HILT, MEDIUM-and-higher findings block rather than prompt. Combine this with the bundled OpenShell sandbox profile and a reviewed first-party allow-list.
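
The ordering described here (block threshold first, human approval second) can be sketched as a tiny decision function. This is an illustrative model under that stated ordering, not the gateway's code; `hilt_or_block` and the `no-prompt` label are hypothetical, and ranks use the 1 = LOW to 4 = CRITICAL convention.

```bash
# Illustrative ordering only: the block threshold is checked before HILT.
# hilt_or_block RANK BLOCK_THRESHOLD HILT_ENABLED HILT_MIN_RANK
hilt_or_block() {
  if [ "$1" -ge "$2" ]; then
    echo block                 # at or above the block threshold: HILT is never reached
  elif [ "$3" = true ] && [ "$1" -ge "$4" ]; then
    echo prompt                # below the block threshold, at or above the HILT floor
  else
    echo no-prompt
  fi
}

hilt_or_block 3 4 true 3   # default policy with --human-approval, HIGH finding: prints prompt
hilt_or_block 3 2 true 3   # strict policy, HIGH finding: prints block
```

Under the strict threshold of 2, every finding that could reach the HIGH approval floor has already been blocked, so enabling human approval there would have no visible effect in this model.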

"I trust the scanner, raise its bar specifically"

The asset-class behavior is independent of the rule pack. Edit the active policy YAML directly:

```yaml title="policies/default.yaml override (apply with defenseclaw policy activate default)"
scanner_overrides:
  mcp:
    medium:                # was none/enable/none
      file: quarantine
      runtime: disable
      install: block
```

Then re-activate so OPA picks up the change:

```bash
defenseclaw policy activate default
```

What `defenseclaw init` doesn't change

A few defaults are intentionally fixed unless you edit ~/.defenseclaw/config.yaml directly:

Knob	Default	Why fixed
`~/.defenseclaw/gateway.jsonl` (JSONL fallback path)	always written	Reliability fallback — the gateway must always have a writable place to log when external sinks fail
`guardrail.hook_fail_mode`	`closed` on new installs	Malformed/unauthorized hook responses fail closed; upgrades from the pre-change default are migrated to `open` for compatibility
`guardrail.judge.timeout`	`30s`	Hot-path latency budget for the judge
`guardrail.judge.adjudication_timeout`	`5s`	Per-prompt adjudication budget
`guardrail.detection_strategy`	`regex_judge`	Tested baseline — regex first, judge for medium+ findings
Bifrost retry policy	`3 attempts, exp backoff`	Tested LLM-routing baseline

If you need to change any of these, edit ~/.defenseclaw/config.yaml directly, then run defenseclaw config validate to confirm the file still matches the schema.
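
A minimal sketch of an edit-then-validate loop for these knobs, assuming a POSIX shell. `edit_config_checked` is a hypothetical helper, and it assumes `defenseclaw config validate` exits non-zero when the file does not match the schema, which this page does not confirm.

```bash
# Hypothetical helper: edit the operator config, keeping the edit only if it validates.
# Assumption: `defenseclaw config validate` returns a non-zero exit status on failure.
edit_config_checked() {
  cfg="$HOME/.defenseclaw/config.yaml"
  cp "$cfg" "$cfg.bak" || return 1   # keep a copy to roll back to
  "${EDITOR:-vi}" "$cfg"             # make the manual change
  if defenseclaw config validate; then
    rm -f "$cfg.bak"
  else
    cp "$cfg.bak" "$cfg"             # restore the last known-good file
    echo "config validate failed; restored previous config" >&2
    return 1
  fi
}
```

Rolling back on failure is a design choice of this sketch; you may prefer to leave the broken file in place and fix it by hand.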

Per-connector overrides (`guardrail.connectors`)

When you run more than one hook connector from a single gateway, override guardrail policy per connector under guardrail.connectors.<name> in ~/.defenseclaw/config.yaml. Every field is optional and inherits the global guardrail.* value when unset, so a connector block only carries what differs:

guardrail:
  mode: action                   # global default
  hook_fail_mode: closed

  connectors:
    claudecode:
      mode: action               # enforce for Claude Code
    codex:
      mode: observe              # softer for Codex than for Claude Code
      hook_fail_mode: open       # explicit softer override for Codex

claw.mode flips to multi automatically once more than one connector is active. Manage these blocks with defenseclaw setup <connector> (choosing Add) and the defenseclaw guardrail ... --connector X command group — see Setup → Multi-connector and Reference → Configuration. The OPA admission policy is still global — there's no per-connector policy override surface yet.
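
The inheritance rule for the example above can be modelled in one line. This is an illustrative sketch of "unset inherits the global value", not the gateway's actual resolver; `effective_value` is a hypothetical name, and an empty string stands in for an unset field.

```bash
# Illustrative model of the inheritance rule only; not the gateway's resolver.
# effective_value CONNECTOR_VALUE GLOBAL_VALUE
# An unset (empty) connector field falls back to the global guardrail.* value.
effective_value() {
  printf '%s\n' "${1:-$2}"
}

effective_value ""      closed   # claudecode.hook_fail_mode is unset: prints closed
effective_value observe action   # codex.mode overrides the global: prints observe
effective_value open    closed   # codex.hook_fail_mode overrides the global: prints open
```

In the example config, that means Claude Code fails closed and enforces, while Codex observes and fails open.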

Legacy top-level connector blocks are deprecated

Older installs used top-level claude_code: / codex: blocks (the AgentHookConfig fields) for per-connector overrides:

claude_code:
  enabled: true
  mode: action
  fail_mode: open                # LEGACY hint, NOT consumed by hooks; see Reference → Fail modes

These are still parsed for backward compatibility, but fail_mode here does nothing at runtime (see Reference → Fail modes). Prefer guardrail.connectors.<name> for new configuration — it's the surface the per-connector CLI writes and the gateway resolves at request time.

Inspect the active defaults

defenseclaw config show                  # rendered ~/.defenseclaw/config.yaml (secrets masked)
defenseclaw policy list                  # all policies on disk + which is active
defenseclaw policy show default          # normalized summary of one named policy

config show always renders the resolved configuration — base + env-var overlay — so you can see the effective values without spelunking. Use --reveal to also show resolved secret values (still masked in the output for safety).

policy show <name> prints a normalized summary of the named file (default, strict, permissive, or any custom policies/<name>.yaml you've added). It does not dump the source YAML or individual guardrail rules. To find a rule by id, search the configured rule-pack directory directly:

grep -rn "rule_id_you_care_about" "$(awk '/rule_pack_dir/ {print $2}' ~/.defenseclaw/config.yaml)"

Reset to defaults

There's no --reset flag. Two real paths exist:

Soft reset (most common) — just re-run setup with the defaults you want. setup guardrail overwrites the relevant guardrail.* keys idempotently:

defenseclaw setup guardrail --rule-pack default --no-human-approval
defenseclaw policy activate default

Hard reset (start from zero) — defenseclaw uninstall archives ~/.defenseclaw/ to a timestamped backup, so you can roll back:

defenseclaw uninstall
defenseclaw init
defenseclaw setup guardrail