Overview
Most problems belong to one subsystem and have a focused troubleshooting page.
This page is for issues that span subsystems. If you know which system is failing, jump to that section first.
First commands to run
defenseclaw doctor
defenseclaw status
defenseclaw-gateway status
sqlite3 ~/.defenseclaw/audit.db \
"SELECT timestamp, action, severity FROM audit_events ORDER BY timestamp DESC LIMIT 10;"
If those commands all look healthy and you still have a problem, the issue is likely at a subsystem boundary: sidecar configuration, audit persistence, sink delivery, or TUI filtering.
Boundary issues
Guardrail sees traffic but nothing in the audit store
- Confirm audit_db points at the database you are querying: defenseclaw config show --format json.
- The gateway process may have a stale audit DB handle after a crash: restart the sidecar with your supervisor.
- Disk full: df -h ~/.defenseclaw/ (the sidecar stops writing when the partition is full).
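The disk-full check above can be scripted. This is a minimal sketch, not part of DefenseClaw: the function names and the 1 GiB default threshold are illustrative assumptions, and only standard df and awk are used.

```shell
# Print the free space, in KiB, on the filesystem that holds a directory.
# -P selects the POSIX output format, so each filesystem stays on one line
# and column 4 is the "Available" figure.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Warn when free space falls below a threshold.
# The 1048576 KiB (1 GiB) default is an arbitrary example value.
check_free_space() {
  local dir min_kb avail
  dir=$1
  min_kb=${2:-1048576}
  avail=$(free_kb "$dir")
  [ -n "$avail" ] || return 2
  if [ "$avail" -lt "$min_kb" ]; then
    echo "LOW: ${avail} KiB free under ${dir}"
    return 1
  fi
  echo "OK: ${avail} KiB free under ${dir}"
}
```

Calling check_free_space ~/.defenseclaw prints one OK or LOW line and returns non-zero when space is low, so it can gate a supervisor health check.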
Audit store has the verdict but the TUI doesn't show it
- Restart the TUI so it re-reads local state.
- Check panel filters before assuming the row is missing.
- Export a small audit slice with defenseclaw-gateway audit export --limit 20 and compare it with the panel.
Policy reload returns success, but decisions don't change
- Confirm you edited the active policy directory, not a stale copy under another DEFENSECLAW_HOME.
- Re-run the relevant policy validation/test command from Policy testing.
- Retest with content that should match the new rule; additive rules only change decisions for matching traffic.
Sandbox runs but violations don't appear in Splunk
- First confirm the local audit database has the sandbox-related action.
- Then run defenseclaw setup observability list and defenseclaw setup observability test splunk-main.
- Check the sink actions and min_severity filters in audit_sinks[].
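The first step, confirming that the local audit database holds the sandbox-related action, might look like the sketch below. The table and column names are taken from the query under "First commands to run"; the function name is hypothetical, and matching on the keyword sandbox is an assumption about how those actions are named in your deployment.

```shell
# List the most recent audit rows whose action contains a keyword.
# Assumes the audit_events table with timestamp, action, and severity columns.
recent_actions() {
  local db keyword
  db=$1
  keyword=$2
  # Accept only plain words so the keyword can be embedded in the SQL safely.
  case "$keyword" in
    ''|*[!A-Za-z0-9_-]*) echo "keyword must be alphanumeric" >&2; return 2 ;;
  esac
  sqlite3 "$db" \
    "SELECT timestamp, action, severity FROM audit_events
     WHERE action LIKE '%${keyword}%'
     ORDER BY timestamp DESC LIMIT 10;"
}
```

For example, recent_actions ~/.defenseclaw/audit.db sandbox prints up to ten matching rows. If it prints nothing, the problem is upstream of the sink and the Splunk configuration is not yet the suspect.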
Clock skew
Timestamps matter. If your sinks are rejecting events with "too old" errors, check the system clock:
timedatectl status # Linux
sntp -sS time.apple.com # macOS
Clock drift of more than 5 minutes triggers rejection by many SaaS sinks.
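To put a number on the drift, the local clock can be compared with a reference Unix timestamp from a host you trust. The sketch below assumes you obtain that reference yourself (for example by running date +%s on a known-good machine); the function names are hypothetical and the 300-second default mirrors the 5-minute figure.

```shell
# Absolute difference, in seconds, between two Unix epoch timestamps.
skew_seconds() {
  local a b
  a=$1
  b=$2
  if [ "$a" -ge "$b" ]; then echo $((a - b)); else echo $((b - a)); fi
}

# Compare a reference epoch with the local clock (or an explicit "now").
# Arguments: reference epoch, optional local epoch, optional limit in seconds.
check_skew() {
  local ref now limit skew
  ref=$1
  now=${2:-$(date +%s)}
  limit=${3:-300}
  skew=$(skew_seconds "$now" "$ref")
  if [ "$skew" -gt "$limit" ]; then
    echo "SKEWED: ${skew}s"
    return 1
  fi
  echo "OK: ${skew}s"
}
```

The measurement includes however long it took to fetch the reference value, so treat results within a few seconds of the limit as inconclusive.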
Disk pressure
The gateway is disk-bound when audit.db or gateway.jsonl grow too large:
- gateway.jsonl uses lumberjack defaults from internal/gatewaylog/writer.go: 50 MB, 5 backups, 30 days, compressed.
- The audit DB is SQLite WAL-backed. Keep the data directory on a filesystem with enough free space and back it up like other local state.
- Use defenseclaw-gateway audit export --limit 1000 --output audit-events.jsonl before pruning or archiving externally.
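A quick way to see which file is growing is to compare each one against a size limit. The sketch below is illustrative: the function names are hypothetical and the limits are whatever you choose. Because the audit DB runs in WAL mode, SQLite also keeps a companion file with a -wal suffix next to it, which is worth checking too.

```shell
# File size in bytes. wc -c behaves the same on Linux and macOS.
size_bytes() {
  wc -c < "$1" | tr -d ' '
}

# Flag a file once it exceeds a limit given in megabytes.
check_size() {
  local file limit_mb bytes
  file=$1
  limit_mb=$2
  [ -f "$file" ] || { echo "MISSING: $file"; return 2; }
  bytes=$(size_bytes "$file")
  if [ "$bytes" -gt $((limit_mb * 1024 * 1024)) ]; then
    echo "LARGE: $file is $bytes bytes"
    return 1
  fi
  echo "OK: $file is $bytes bytes"
}
```

For instance, check_size ~/.defenseclaw/gateway.jsonl 50 flags the active log if it has grown past the 50 MB rotation size listed above, which would suggest rotation is not happening.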
Memory pressure
- The webhook dispatcher caps concurrent deliveries at 20 and suppresses duplicate target/action pairs during the cooldown window.
- The gatewaylog writer writes JSONL synchronously, then runs fanout callbacks outside the writer lock.
- OTel buffers are owned by the OTel SDK/exporter configuration.
If the sidecar uses more than 200 MB RSS at steady state, investigate: that is well above the expected footprint.
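The RSS figure can be read with ps. This is a rough sketch: rss_mb is a hypothetical helper, and finding the sidecar PID with pgrep -f defenseclaw-gateway assumes the process command line contains that name.

```shell
# Resident set size of a process in whole megabytes.
# ps reports RSS in kilobytes on both Linux and macOS.
rss_mb() {
  local kb
  kb=$(ps -o rss= -p "$1" | tr -d ' ')
  [ -n "$kb" ] || return 1
  echo $((kb / 1024))
}
```

Sample it several times over a few minutes, for example with rss_mb "$(pgrep -o -f defenseclaw-gateway)", since a single reading taken during a burst of traffic does not represent steady state.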
When to open a bug
Before opening a GitHub issue, run:
defenseclaw doctor --json-output > doctor.json
defenseclaw config show --format json > config.redacted.json
defenseclaw-gateway audit export --limit 1000 --output audit-events.jsonl
tail -n 1000 ~/.defenseclaw/gateway.jsonl > gateway.tail.jsonl
Attach those files after reviewing them for deployment-specific details.
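As a first pass of that review, a keyword search can surface lines that may hold credentials. The sketch below is a heuristic only: the function name and the pattern are assumptions, it will miss secrets that carry no telltale key name, and it does not replace reading the files.

```shell
# Print file name, line number, and content for lines that look sensitive.
# Returns non-zero when nothing matches, following grep's exit status.
scan_for_secrets() {
  grep -HnEi 'token|secret|password|passwd|api[_-]?key|authorization' "$@"
}
```

Running scan_for_secrets doctor.json config.redacted.json audit-events.jsonl gateway.tail.jsonl lists candidate lines to redact. Hostnames, internal URLs, and user names still need a manual look.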