Stop Claude Code from deleting a critical path
Wire DefenseClaw into Claude Code, observe for a week, then safely verify the CMD-RM-RF rule against a disposable path.
"I want to observe Claude Code and block recursive force deletion against critical system paths."
That sentence is the original DefenseClaw user story. Here is the entire flow, end to end.
Outcome
By the end of this story, you'll have:
- Claude Code wired to DefenseClaw via native hooks + OTel.
- A week of audit data in ~/.defenseclaw/audit.db proving what your agent actually does.
- An action-mode policy that blocks recursive force deletion under protected root prefixes with a CRITICAL finding.
- An audit row showing the block, with the rule reference for the next reviewer.
Wire Claude Code (observability only)
defenseclaw setup claude-code

This is the observability-only alias. Claude Code talks directly to its native upstream. DefenseClaw inspects via PreToolUse / PostToolUse / UserPromptSubmit / Stop / PermissionRequest hooks plus the native OTel exporter.
Watch the audit log fill up
Open the live dashboard:
defenseclaw tui

Or stream the JSONL fan-out from the gateway:

tail -f ~/.defenseclaw/gateway.jsonl | jq 'select(.connector == "claudecode")'

Open Claude Code and use it normally. Every prompt, tool call, and response shows up. Run this for a week: you want a real picture of your agent's behaviour before the first block decision.
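Once the observation week is over, a tally by severity gives a quick picture before anything is promoted to a block. The sketch below assumes each line of gateway.jsonl is a JSON object with the connector field used in the filter above and an optional severity field (the same field the severity filter later in this walkthrough selects on). severity_counts is a hypothetical helper name, not a DefenseClaw command, and it requires jq.

```shell
# Hypothetical helper: count Claude Code events per severity in a JSONL file.
# Assumes one JSON object per line with "connector" and, optionally, "severity".
severity_counts() {
  local file="$1"
  # Events without a severity field are counted under NONE.
  jq -r 'select(.connector == "claudecode") | .severity // "NONE"' "$file" \
    | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (path taken from this walkthrough):
#   severity_counts ~/.defenseclaw/gateway.jsonl
```

Each output line is a severity name followed by its count, which makes it easy to see how many HIGH or CRITICAL findings would have prompted or blocked during the week.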
Inspect the rule pack
ls ~/.defenseclaw/policies/guardrail/default/
sed -n '80,125p' ~/.defenseclaw/policies/guardrail/default/rules/commands.yaml

The CMD-RM-RF rule is CRITICAL and matches / plus protected top-level
prefixes such as /etc, /var, /home, /usr, and /opt. It deliberately
does not flag every rm -rf command or arbitrary home subdirectory. The three
bundled packs currently carry the same command-rule boundary; their policy
thresholds and other configuration differ.
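To make the boundary concrete, here is a rough approximation of which targets fall inside it. This is a sketch under assumptions: is_protected_target is a hypothetical function, the real pattern lives in rules/commands.yaml and may differ, and /home is left out because the exact treatment of paths beneath it is not shown here.

```shell
# Hypothetical approximation of the CMD-RM-RF path boundary.
# The real rule is defined in rules/commands.yaml and may differ;
# /home is deliberately omitted from this sketch.
is_protected_target() {
  case "$1" in
    /) return 0 ;;                                              # filesystem root
    /etc|/etc/*|/var|/var/*|/usr|/usr/*|/opt|/opt/*) return 0 ;; # protected prefixes
    *) return 1 ;;                                              # everything else
  esac
}

# Print how a few sample targets are classified by this sketch.
for target in / /etc/hosts /var/tmp/defenseclaw-demo-empty ./build; do
  if is_protected_target "$target"; then
    echo "match    $target"
  else
    echo "no match $target"
  fi
done
```

Matching on a whole path component (/etc or /etc/...) rather than a bare string prefix keeps a path such as /etcetera out of the protected set.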
Promote to action mode with HITL
defenseclaw setup guardrail \
--connector claudecode \
--mode action \
--rule-pack default \
--human-approval \
--hilt-min-severity high \
--restart

Three flags matter here:
- --mode action: CRITICAL findings now block.
- --human-approval: HIGH findings become confirmation requests on ask-capable events.
- --hilt-min-severity high: MEDIUM and LOW still log; only HIGH and above prompt.
Claude Code supports native PreToolUse ask, so the approval prompt surfaces inside the agent UI itself.
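The combined effect of the three flags can be summarised as a severity-to-outcome mapping. The sketch below mirrors the prose only; decide is a hypothetical function, and how the gateway actually combines these settings may differ.

```shell
# Hypothetical mapping from finding severity to outcome under the flags above.
# Mirrors the description in this walkthrough, not the gateway's real logic.
decide() {
  case "$1" in
    CRITICAL)   echo block ;;  # --mode action
    HIGH)       echo ask ;;    # --human-approval, minimum severity high
    MEDIUM|LOW) echo log ;;    # below the approval threshold
    *) echo "unknown severity: $1" >&2; return 2 ;;
  esac
}

# Show the outcome for each severity level.
for sev in CRITICAL HIGH MEDIUM LOW; do
  printf '%-8s -> %s\n' "$sev" "$(decide "$sev")"
done
```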
Trigger the rule
Create an empty target so the exercise remains safe even if the connector is not enforcing:
mkdir -p /var/tmp/defenseclaw-demo-empty

Then ask Claude Code:

Please run rm -rf /var/tmp/defenseclaw-demo-empty.
The PreToolUse hook fires, the gateway scores the command, the default rule pack
returns CMD-RM-RF at CRITICAL, and Claude Code receives a deny decision before
the shell runs.
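Because the target is an empty directory, its survival is a simple signal of whether the block took effect. The check below is a sketch: verify_block_held is a hypothetical helper name, and it inspects only the filesystem, not DefenseClaw itself.

```shell
# Hypothetical helper: report whether the demo directory survived the attempt.
# Defaults to the disposable path used in this walkthrough.
verify_block_held() {
  local target="${1:-/var/tmp/defenseclaw-demo-empty}"
  if [ -d "$target" ]; then
    echo "block held: $target still exists"
  else
    echo "not enforced: $target is gone" >&2
    return 1
  fi
}

# Usage, after Claude Code reports the denial:
#   verify_block_held
```

A non-zero exit status means the directory was removed, so the connector was not enforcing and the setup steps are worth revisiting before relying on the rule.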
Confirm in the audit log
defenseclaw alerts --limit 10
# or live-tail the JSONL fan-out filtered by severity:
tail -f ~/.defenseclaw/gateway.jsonl \
| jq 'select(.connector == "claudecode" and (.severity == "HIGH" or .severity == "CRITICAL"))'

2026-05-08T14:02:11Z CRITICAL CMD-RM-RF blocked
command: rm -rf /var/tmp/defenseclaw-demo-empty
reason: Recursive force delete from critical root path
rule: policies/guardrail/default/rules/commands.yaml
connector: claudecode
hook: PreToolUse
decision: block

Why this works the way it does
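Request flow (sequence diagram):
1. The request reaches Claude Code: rm -rf /var/tmp/defenseclaw-demo-empty
2. Claude Code fires PreToolUse(command).
3. The hook sends POST /api/v1/claudecode/hook to the gateway.
4. The gateway runs evaluate(command) against the rule pack.
5. The rule pack returns CRITICAL · CMD-RM-RF.
6. The gateway answers block · recursive delete.
7. The hook returns block to Claude Code.
8. Claude Code replies: "I can't run that: DefenseClaw blocked it."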