Stop Claude Code from deleting a critical path

Wire DefenseClaw into Claude Code, observe for a week, then safely verify the CMD-RM-RF rule against a disposable path.

"I want to observe Claude Code and block recursive force deletion against critical system paths."

That sentence is the original DefenseClaw user story. Here is the entire flow, end to end.

Outcome

By the end of this story, you'll have:

  • Claude Code wired to DefenseClaw via native hooks + OTel.
  • A week of audit data in ~/.defenseclaw/audit.db proving what your agent actually does.
  • An action-mode policy that blocks recursive force deletion under protected root prefixes with a CRITICAL finding.
  • An audit row showing the block, with the rule reference for the next reviewer.

Wire Claude Code (observability only)

defenseclaw setup claude-code

This is the observability-only alias. Claude Code talks directly to its native upstream. DefenseClaw inspects via PreToolUse / PostToolUse / UserPromptSubmit / Stop / PermissionRequest hooks plus the native OTel exporter.

Watch the audit log fill up

Open the live dashboard:

defenseclaw tui

Or stream the JSONL fan-out from the gateway:

tail -f ~/.defenseclaw/gateway.jsonl | jq 'select(.connector == "claudecode")'

Open Claude Code and use it normally. Every prompt, tool call, and response shows up. Run this for a week — you want a real picture of your agent's behaviour before the first block decision.
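Once the observation week is over, a per-severity tally is a quick way to see what the agent has been doing before you pick a threshold. The following is a minimal sketch, assuming each line of gateway.jsonl is one JSON object carrying the connector and severity fields used in the filters in this story. The function name tally_severities is hypothetical, and events without a severity are counted as NONE.

```shell
# Hypothetical helper: count claudecode events per severity in a JSONL file.
# Assumes one JSON object per line with .connector and .severity fields.
# Usage: tally_severities [path-to-jsonl]
tally_severities() {
  jq -r 'select(.connector == "claudecode") | .severity // "NONE"' \
    "${1:-$HOME/.defenseclaw/gateway.jsonl}" \
    | sort | uniq -c | sort -rn
}
```

Calling tally_severities with no argument reads the default gateway log. A high count of HIGH or CRITICAL rows during observation is worth reviewing before you switch to action mode.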

Inspect the rule pack

ls ~/.defenseclaw/policies/guardrail/default/
sed -n '80,125p' ~/.defenseclaw/policies/guardrail/default/rules/commands.yaml

The CMD-RM-RF rule is CRITICAL and matches / plus protected top-level prefixes such as /etc, /var, /home, /usr, and /opt. It deliberately does not flag every rm -rf command or arbitrary home subdirectory. The three bundled packs currently carry the same command-rule boundary; their policy thresholds and other configuration differ.
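To build intuition for that boundary, the sketch below approximates it with a shell case statement. It is illustrative only and is not the shipped rule: the function name is_protected_target is hypothetical, the prefix list contains only the examples named above, and the sketch treats every path at or under a listed prefix the same way. How the real pattern handles deeper paths, in particular directories under /home, is something only commands.yaml can answer.

```shell
# Illustrative approximation of the CMD-RM-RF target boundary, not the real rule.
# Returns 0 when the target is / or sits at or under one of the example prefixes.
is_protected_target() {
  case "$1" in
    /|/etc|/etc/*|/var|/var/*|/home|/home/*|/usr|/usr/*|/opt|/opt/*)
      return 0 ;;  # root itself or a listed top-level prefix
    *)
      return 1 ;;  # relative paths and other locations fall through
  esac
}
```

Under these assumptions, is_protected_target /var/tmp/defenseclaw-demo-empty succeeds, while is_protected_target ./build does not, which mirrors the idea that not every rm -rf is flagged.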

Promote to action mode with HITL

defenseclaw setup guardrail \
  --connector claudecode \
  --mode action \
  --rule-pack default \
  --human-approval \
  --hilt-min-severity high \
  --restart

Three flags matter here:

  • --mode action — CRITICAL findings now block.
  • --human-approval — HIGH findings become confirmation requests on ask-capable events.
  • --hilt-min-severity high — MEDIUM and LOW still log; only HIGH and above prompt.

Claude Code supports native PreToolUse ask, so the approval prompt surfaces inside the agent UI itself.
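The combined effect of the three flags can be summarised as a severity-to-outcome mapping. The sketch below restates the bullets above as a shell function. The name outcome_for_severity and the lowercase outcome labels are illustrative, and the ask outcome assumes the event is ask-capable.

```shell
# Sketch of the severity-to-outcome mapping described by the three flags.
# The function name and outcome labels are illustrative, not DefenseClaw output.
outcome_for_severity() {
  case "$1" in
    CRITICAL)   echo block ;;    # --mode action
    HIGH)       echo ask ;;      # --human-approval, at the high threshold
    MEDIUM|LOW) echo log ;;      # below the approval threshold: recorded only
    *)          echo unknown ;;  # anything else is outside this sketch
  esac
}
```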

Trigger the rule

Create an empty target so the exercise remains safe even if the connector is not enforcing:

mkdir -p /var/tmp/defenseclaw-demo-empty

Then ask Claude Code:

Please run rm -rf /var/tmp/defenseclaw-demo-empty.

The PreToolUse hook fires, the gateway scores the command, the default rule pack returns CMD-RM-RF at CRITICAL, and Claude Code receives a deny decision before the shell runs.
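A deny decision means the directory should still exist afterwards, so checking for it from your own shell is a simple independent confirmation. A minimal sketch follows; the function name check_demo_target is hypothetical.

```shell
# Report whether the demo directory survived the attempted delete.
# Exit status 0 means the directory is still present.
check_demo_target() {
  target="${1:-/var/tmp/defenseclaw-demo-empty}"
  if [ -d "$target" ]; then
    echo "present: $target was not deleted"
  else
    echo "missing: $target was deleted, so the block did not take effect"
    return 1
  fi
}
```

When you are done, running rmdir /var/tmp/defenseclaw-demo-empty yourself removes the empty directory without a recursive force delete.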

Confirm in the audit log

defenseclaw alerts --limit 10
# or live-tail the JSONL fan-out filtered by severity:
tail -f ~/.defenseclaw/gateway.jsonl \
  | jq 'select(.connector == "claudecode" and (.severity == "HIGH" or .severity == "CRITICAL"))'

The blocked attempt appears as an entry like this:

2026-05-08T14:02:11Z  CRITICAL  CMD-RM-RF  blocked
  command:    rm -rf /var/tmp/defenseclaw-demo-empty
  reason:     Recursive force delete from critical root path
  rule:       policies/guardrail/default/rules/commands.yaml
  connector:  claudecode
  hook:       PreToolUse
  decision:   block

Why this works the way it does

  1. User → Claude Code: rm -rf /var/tmp/defenseclaw-demo-empty
  2. Claude Code → PreToolUse: PreToolUse(command)
  3. PreToolUse → Gateway: POST /api/v1/claudecode/hook
  4. Gateway → Policy: evaluate(command)
  5. Policy → Gateway: CRITICAL · CMD-RM-RF
  6. Gateway → PreToolUse: block · recursive delete
  7. PreToolUse → Claude Code: block
  8. Claude Code → User: "I can't run that — DefenseClaw blocked it."

The block decision is made by the gateway, not Claude Code. The hook is the messenger.
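To make the messenger role concrete, the sketch below translates a gateway verdict into the JSON shape that Claude Code's hook interface documents for PreToolUse decisions (hookSpecificOutput with permissionDecision set to allow, deny, or ask). Whether the DefenseClaw hook emits exactly this payload is an assumption, and the function name hook_response and the verdict labels are hypothetical.

```shell
# Hypothetical translation of a gateway verdict into a Claude Code PreToolUse
# hook response. The hook decides nothing itself; it only relays the verdict.
# Usage: hook_response <block|ask|allow> <reason>
hook_response() {
  case "$1" in
    allow) decision=allow ;;
    ask)   decision=ask ;;
    *)     decision=deny ;;  # block, or anything unrecognised: fail closed
  esac
  # jq builds the JSON so the reason string is escaped correctly.
  jq -cn --arg d "$decision" --arg r "$2" \
    '{hookSpecificOutput: {hookEventName: "PreToolUse",
      permissionDecision: $d, permissionDecisionReason: $r}}'
}
```

For example, hook_response block "Recursive force delete from critical root path" yields a deny decision carrying that reason. Mapping unknown verdicts to deny is a design choice of this sketch, made so that a malformed verdict never lets a command through.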
