Stop Claude Code from deleting a critical path

Wire DefenseClaw into Claude Code, observe for a week, then safely verify the CMD-RM-RF rule against a disposable path.

"I want to observe Claude Code and block recursive force deletion against critical system paths."

That sentence is the original DefenseClaw user story. Here is the entire flow, end to end.

Outcome

By the end of this story, you'll have:

Claude Code wired to DefenseClaw via native hooks + OTel.
A week of audit data in ~/.defenseclaw/audit.db proving what your agent actually does.
An action-mode policy that blocks recursive force deletion under protected root prefixes with a CRITICAL finding.
An audit row showing the block, with the rule reference for the next reviewer.

Wire Claude Code (observability only)

defenseclaw setup claude-code

This is the observability-only alias. Claude Code still talks directly to its native upstream. DefenseClaw inspects activity through the PreToolUse, PostToolUse, UserPromptSubmit, Stop, and PermissionRequest hooks plus the native OTel exporter.

Watch the audit log fill up

Open the live dashboard:

defenseclaw tui

Or stream the JSONL fan-out from the gateway:

tail -f ~/.defenseclaw/gateway.jsonl | jq 'select(.connector == "claudecode")'

Open Claude Code and use it normally. Every prompt, tool call, and response shows up. Run this for a week — you want a real picture of your agent's behaviour before the first block decision.
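Before promoting to action mode, it can help to boil the week of events down to a per-severity count. The sketch below assumes each line of gateway.jsonl is one JSON object carrying the connector and severity fields used in the filters on this page. summarize_severity is a hypothetical helper name, not a DefenseClaw command, and events without a severity are grouped as NONE.

```shell
# Hypothetical helper: count Claude Code events per severity in a JSONL file.
# Assumes one JSON object per line with "connector" and "severity" fields.
summarize_severity() {
  jq -r 'select(.connector == "claudecode") | .severity // "NONE"' \
    "${1:-$HOME/.defenseclaw/gateway.jsonl}" \
    | sort | uniq -c | sort -rn
}
```

Run it as summarize_severity ~/.defenseclaw/gateway.jsonl. A week with many HIGH rows suggests how often the approval prompt would appear once human approval is switched on.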

Inspect the rule pack

ls ~/.defenseclaw/policies/guardrail/default/
sed -n '80,125p' ~/.defenseclaw/policies/guardrail/default/rules/commands.yaml

The CMD-RM-RF rule is CRITICAL and matches / plus protected top-level prefixes such as /etc, /var, /home, /usr, and /opt. It deliberately does not flag every rm -rf command or arbitrary home subdirectory. The three bundled packs currently carry the same command-rule boundary; their policy thresholds and other configuration differ.
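To make that boundary concrete, here is a rough approximation in shell. It is not the pattern shipped in commands.yaml, which may differ. Both function names are hypothetical, word splitting is naive, and the check works on the literal command text, so ~/build or a relative path is not matched. Whether the real rule treats paths below /home the same way as paths below /var is not stated here, so the /home entry is an assumption.

```shell
# Illustrative only: approximates the CMD-RM-RF boundary described above.

# Succeeds when the path is / or sits under one of the listed prefixes.
is_protected_path() {
  case "$1" in
    /|/etc|/etc/*|/var|/var/*|/home|/home/*|/usr|/usr/*|/opt|/opt/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the command is an rm carrying both a recursive and a force
# flag and at least one protected target. Runs in a subshell so the globbing
# switch and the variables do not leak into the caller.
is_rm_rf_on_protected() (
  set -f            # do not expand wildcards while splitting into words
  set -- $1         # naive split on whitespace (no quoting support)
  [ "$1" = "rm" ] || return 1
  shift
  recursive=0; force=0; protected=0
  for word in "$@"; do
    case "$word" in
      --recursive) recursive=1 ;;
      --force) force=1 ;;
      --*) ;;
      -*)
        case "$word" in *[rR]*) recursive=1 ;; esac
        case "$word" in *f*) force=1 ;; esac
        ;;
      *) if is_protected_path "$word"; then protected=1; fi ;;
    esac
  done
  [ "$recursive" -eq 1 ] && [ "$force" -eq 1 ] && [ "$protected" -eq 1 ]
)
```

With this sketch, rm -rf /var/tmp/defenseclaw-demo-empty matches, while rm -rf ./build and rm -r /var/tmp/x do not.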

Promote to action mode with HITL

defenseclaw setup guardrail \
  --connector claudecode \
  --mode action \
  --rule-pack default \
  --human-approval \
  --hilt-min-severity high \
  --restart

Three flags matter here:

--mode action — CRITICAL findings now block.
--human-approval — HIGH findings become confirmation requests on ask-capable events.
--hilt-min-severity high — MEDIUM and LOW still log; only HIGH and above prompt.

Claude Code supports native PreToolUse ask, so the approval prompt surfaces inside the agent UI itself.
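The way the three flags interact can be sketched as a small decision ladder. This is an illustration of the behaviour described above, not DefenseClaw code. The function names are hypothetical, and two cases are assumptions because the page does not spell them out: any mode other than action only logs, and a HIGH finding without human approval only logs.

```shell
# Illustrative mapping from finding severity to gateway action.

# Rank a severity name (case-insensitive); unknown names rank 0.
severity_rank() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    LOW) echo 1 ;;
    MEDIUM) echo 2 ;;
    HIGH) echo 3 ;;
    CRITICAL) echo 4 ;;
    *) echo 0 ;;
  esac
}

# decide MODE SEVERITY HUMAN_APPROVAL(yes|no) MIN_SEVERITY
# prints one of: block, ask, log
decide() {
  # Assumption: outside action mode nothing is enforced.
  if [ "$1" != "action" ]; then echo log; return; fi
  # CRITICAL findings block in action mode.
  if [ "$(severity_rank "$2")" -eq 4 ]; then echo block; return; fi
  # With human approval on, findings at or above the threshold prompt.
  if [ "$3" = "yes" ] && [ "$(severity_rank "$2")" -ge "$(severity_rank "$4")" ]; then
    echo ask; return
  fi
  echo log
}
```

For the setup command above, decide action HIGH yes high prints ask and decide action MEDIUM yes high prints log.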

Trigger the rule

Create an empty target so the exercise remains safe even if the connector is not enforcing:

mkdir -p /var/tmp/defenseclaw-demo-empty

Then ask Claude Code:

Please run rm -rf /var/tmp/defenseclaw-demo-empty.

The PreToolUse hook fires, the gateway scores the command, the default rule pack returns CMD-RM-RF at CRITICAL, and Claude Code receives a deny decision before the shell runs.
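A quick way to confirm the deny took effect is to check that the disposable directory is still there. This is a minimal sketch: check_block_held is a hypothetical helper, and its default path is the demo target created above.

```shell
# Hypothetical helper: report whether the demo target survived the request.
check_block_held() {
  target="${1:-/var/tmp/defenseclaw-demo-empty}"
  if [ -d "$target" ]; then
    echo "still present: the delete did not run"
  else
    echo "missing: the directory was removed or never created"
    return 1
  fi
}

# When finished, remove the empty directory without any recursive flag:
#   rmdir /var/tmp/defenseclaw-demo-empty
```

Using rmdir for cleanup keeps the exercise safe, since it refuses to remove anything that is not an empty directory.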

Confirm in the audit log

defenseclaw alerts --limit 10
# or live-tail the JSONL fan-out filtered by severity:
tail -f ~/.defenseclaw/gateway.jsonl \
  | jq 'select(.connector == "claudecode" and (.severity == "HIGH" or .severity == "CRITICAL"))'

The blocked attempt shows up as an entry like this:

2026-05-08T14:02:11Z  CRITICAL  CMD-RM-RF  blocked
  command:    rm -rf /var/tmp/defenseclaw-demo-empty
  reason:     Recursive force delete from critical root path
  rule:       policies/guardrail/default/rules/commands.yaml
  connector:  claudecode
  hook:       PreToolUse
  decision:   block

Why this works the way it does

01  User -> Claude Code: rm -rf /var/tmp/defenseclaw-demo-empty
02  Claude Code -> PreToolUse: PreToolUse(command)
03  PreToolUse -> Gateway: POST /api/v1/claudecode/hook
04  Gateway -> Policy: evaluate(command)
05  Policy -> Gateway: CRITICAL · CMD-RM-RF
06  Gateway -> PreToolUse: block · recursive delete
07  PreToolUse -> Claude Code: block
08  Claude Code -> User: "I can't run that; DefenseClaw blocked it."

The block decision is made by the gateway, not Claude Code. The hook is the messenger.
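The messenger role can be sketched as a function that takes a verdict and prints the decision JSON that Claude Code reads from a PreToolUse hook's stdout. This is a sketch under assumptions: render_pretooluse_decision and the verdict names block, ask, and allow are hypothetical, and DefenseClaw's real hook may translate gateway responses differently. The hookSpecificOutput fields follow Claude Code's hook output format.

```shell
# Hypothetical messenger step: turn a gateway verdict into the JSON shape
# Claude Code reads from a PreToolUse hook's stdout.
render_pretooluse_decision() {
  case "$1" in
    block) decision=deny ;;
    ask)   decision=ask ;;
    allow) decision=allow ;;
    *) echo "unknown verdict: $1" >&2; return 1 ;;
  esac
  # Escape backslashes and double quotes so the reason stays valid JSON.
  # Newlines and other control characters are not handled in this sketch.
  reason=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"%s","permissionDecisionReason":"%s"}}\n' \
    "$decision" "$reason"
}
```

The function holds no policy of its own: it only relays a verdict that was decided elsewhere, which mirrors the split between gateway and hook described above.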
