Stop Claude Code from running rm -rf
Wire DefenseClaw into Claude Code, observe for a week, then promote to action mode and watch a destructive shell command never reach the disk.
"I want to observe my Claude Code, and make sure it doesn't run
rm -rf on my home directory."
That sentence is the original DefenseClaw user story. Here is the entire flow, end to end.
Outcome
By the end of this story, you'll have:
- Claude Code wired to DefenseClaw via native hooks + OTel.
- A week of audit data in ~/.defenseclaw/audit.db proving what your agent actually does.
- An action mode policy that blocks rm -rf <home subtree> with a CRITICAL finding.
- An audit row showing the block, with the rule reference for the next reviewer.
Wire Claude Code (observability only)
defenseclaw setup claude-code
This is the observability-only alias. Claude Code talks directly to its native upstream. DefenseClaw inspects activity through the PreToolUse / PostToolUse / UserPromptSubmit / Stop / PermissionRequest hooks plus the native OTel exporter.
Watch the audit log fill up
Open the live dashboard:
defenseclaw tui
Or stream the JSONL fan-out from the gateway:
tail -f ~/.defenseclaw/gateway.jsonl | jq 'select(.connector == "claudecode")'
Open Claude Code and use it normally. Every prompt, tool call, and response shows up. Run this for a week: you want a real picture of your agent's behaviour before the first block decision.
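After a week of streaming, a per-severity count is often easier to read than the raw tail. The following is a minimal Python sketch of that summary, not part of DefenseClaw itself. It assumes only what the jq filters in this story already rely on: each line of gateway.jsonl is a JSON object with a connector field and, where relevant, a severity field. The function name summarize and the "NONE" bucket for events without a severity are illustrative choices.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(lines, connector="claudecode"):
    """Count events per severity for one connector, skipping malformed lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a partially written line while the gateway is appending
        if not isinstance(event, dict) or event.get("connector") != connector:
            continue
        # Assumption: events without a severity are grouped under "NONE".
        counts[event.get("severity") or "NONE"] += 1
    return counts


if __name__ == "__main__":
    log_path = Path.home() / ".defenseclaw" / "gateway.jsonl"
    if log_path.exists():
        with log_path.open(encoding="utf-8") as fh:
            for severity, n in summarize(fh).most_common():
                print(f"{n:6d}  {severity}")
```

If HIGH or CRITICAL counts are already non-zero during the observation week, those rows are worth reviewing before switching on blocking, since they show what action mode would have stopped.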
Inspect the rule pack
ls ~/.defenseclaw/policies/guardrail/default/
cat ~/.defenseclaw/policies/guardrail/default/shell.yaml
The default pack ships with a CRITICAL rule for recursive deletes against home-directory paths. The strict pack adds dot-file targeting and /var/lib/*; the permissive pack drops to HIGH and only matches ~/.
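To make "recursive deletes against home-directory paths" concrete, here is a rough Python sketch of the kind of check such a rule might perform. It is not the contents of shell.yaml and not DefenseClaw's matching logic. The function name is_recursive_home_delete, the list of home prefixes, and the flag heuristics are all assumptions made for illustration.

```python
import shlex

# Prefixes treated as "home-directory paths". This list is an assumption,
# not taken from any DefenseClaw rule pack.
HOME_PREFIXES = ("~/", "$HOME/", "${HOME}/")
HOME_EXACT = ("~", "$HOME", "${HOME}")


def _is_recursive_flag(token: str) -> bool:
    """True for --recursive or a short-flag cluster containing r or R."""
    if token.startswith("--"):
        return token == "--recursive"
    return token.startswith("-") and any(c in "rR" for c in token[1:])


def _targets_home(token: str) -> bool:
    """True when the argument is the home directory or a path beneath it."""
    return token in HOME_EXACT or token.startswith(HOME_PREFIXES)


def is_recursive_home_delete(command: str) -> bool:
    """Flag `rm` invocations that are recursive and aimed at a home path."""
    try:
        tokens = shlex.split(command)  # shlex does not expand ~ or $HOME
    except ValueError:
        return False  # unbalanced quotes; a real rule may prefer to fail closed
    if not tokens or tokens[0] != "rm":
        return False
    args = tokens[1:]
    recursive = any(_is_recursive_flag(t) for t in args)
    hits_home = any(_targets_home(t) for t in args if not t.startswith("-"))
    return recursive and hits_home
```

This sketch deliberately ignores wrappers such as sudo, command chains, and absolute paths like /home/<user>, so it under-matches compared with what a production rule would need. It is only meant to show the shape of the check.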
Promote to action mode with HITL
defenseclaw setup guardrail \
--connector claudecode \
--mode action \
--rule-pack default \
--human-approval \
--hilt-min-severity high \
--restart
Three flags matter here:
- --mode action: CRITICAL findings now block.
- --human-approval: HIGH findings pause for operator approval.
- --hilt-min-severity high: MEDIUM and LOW still log; only HIGH and above prompt.
Claude Code supports native PreToolUse ask, so the approval prompt surfaces inside the agent UI itself.
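The way these three flags interact can be sketched as a small decision function. This is an illustrative Python model of the behaviour described above, not DefenseClaw's code. The function name decide, its parameter names, and the returned strings "block", "ask", and "log" are assumptions. It also assumes that a CRITICAL finding blocks outright rather than prompting, and that any mode other than action only logs.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def decide(severity, mode="action", human_approval=True, hitl_min_severity="HIGH"):
    """Map a finding's severity to 'block', 'ask', or 'log'."""
    rank = SEVERITY_ORDER.index(severity.upper())
    if mode != "action":
        return "log"  # assumption: outside action mode nothing is enforced
    if rank >= SEVERITY_ORDER.index("CRITICAL"):
        return "block"  # CRITICAL findings block in action mode
    if human_approval and rank >= SEVERITY_ORDER.index(hitl_min_severity.upper()):
        return "ask"  # pause for operator approval
    return "log"  # below the approval threshold: record only


if __name__ == "__main__":
    for level in SEVERITY_ORDER:
        print(level, "->", decide(level))
```

Under these assumptions, lowering the threshold to medium would make MEDIUM findings prompt as well, which is usually too noisy until the observation week has shown how often they fire.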
Trigger the rule
In Claude Code:
Please run
rm -rf ~/projects/old-experiments to free disk space.
The PreToolUse hook fires, the gateway scores the command, the default rule pack flags it CRITICAL, and the hook returns block. Claude Code never executes the command and surfaces the block reason to the user.
Confirm in the audit log
defenseclaw alerts --limit 10
# or live-tail the JSONL fan-out filtered by severity:
tail -f ~/.defenseclaw/gateway.jsonl \
| jq 'select(.connector == "claudecode" and (.severity == "HIGH" or .severity == "CRITICAL"))'
2026-05-08T14:02:11Z CRITICAL shell.dangerous-rm blocked
command: rm -rf ~/projects/old-experiments
reason: recursive delete targeting user home subtree
rule: policies/guardrail/default/shell.yaml:14
connector: claudecode
hook: PreToolUse
decision: block
Why this works the way it does
- 01 User → Claude Code
rm -rf ~/projects/old-experiments
- 02 Claude Code → PreToolUse
PreToolUse(command)
- 03 PreToolUse → Gateway
POST /api/v1/claudecode/hook
- 04 Gateway → Policy
evaluate(command)
- 05 Policy → Gateway
CRITICAL · shell.dangerous-rm
- 06 Gateway → PreToolUse
block · recursive delete
- 07 PreToolUse → Claude Code
block
- 08 Claude Code → User
I can't run that — DefenseClaw blocked it.
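The hook side of this sequence, reading the tool call and handing a decision back to Claude Code, could look roughly like the Python sketch below. It is not DefenseClaw's hook implementation. The payload fields tool_name and tool_input.command and the hookSpecificOutput / permissionDecision response shape follow Claude Code's documented PreToolUse hook format. The decision strings "block", "ask", and "log", and the helper names, are assumptions about what a gateway might return. The HTTP call to the gateway is left out because its request and response schema is not described here.

```python
import json

# Assumed gateway decisions mapped onto Claude Code permission decisions.
_DECISION_MAP = {"block": "deny", "ask": "ask"}


def extract_command(payload: dict):
    """Return the shell command from a PreToolUse payload, or None for other tools."""
    if payload.get("tool_name") != "Bash":
        return None
    return payload.get("tool_input", {}).get("command")


def to_hook_output(decision: str, reason: str) -> dict:
    """Build the JSON a PreToolUse hook prints to stdout for Claude Code."""
    permission = _DECISION_MAP.get(decision)
    if permission is None:
        # Nothing to enforce: stay silent so Claude Code's own
        # permission flow continues unchanged.
        return {}
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": permission,
            "permissionDecisionReason": reason,
        }
    }


if __name__ == "__main__":
    example = {
        "hook_event_name": "PreToolUse",
        "tool_name": "Bash",
        "tool_input": {"command": "rm -rf ~/projects/old-experiments"},
    }
    print(extract_command(example))
    print(json.dumps(to_hook_output("block", "recursive delete targeting user home subtree")))
```

Returning an empty object for non-enforcing decisions, rather than an explicit allow, is a deliberate choice in this sketch: an explicit allow would skip Claude Code's built-in permission prompt for that tool call.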