Cisco DefenseClaw
Observability

Local observability stack

One-command OpenTelemetry + Loki + Tempo + Prometheus + Grafana stack, pre-wired to the DefenseClaw gateway. Grafana on :3000, dashboards seeded, no manual config.

defenseclaw setup local-observability brings up a full local OpenTelemetry stack and wires the gateway to it in one shot. It's what the project itself uses for development, demos, and rule-pack tuning.

defenseclaw setup local-observability up — five containers come up, the gateway picks up the new OTLP endpoint, and Grafana dashboards populate live.

What it brings up

The stack runs from bundles/local_observability_stack/docker-compose.yml:

| Service | Image / role | Default port |
| --- | --- | --- |
| otel-collector | OpenTelemetry Collector — fans out to Prom/Loki/Tempo | 4317 (gRPC), 4318 (HTTP) |
| prometheus | Metrics store | 9090 |
| loki | Log store (gateway audit + sink failures) | 3100 |
| tempo | Trace store (per-decision spans) | 3200 |
| grafana | UI with seeded DefenseClaw dashboards | 3000 |

All five run under one Compose project (defenseclaw-local-observability). They share a Docker network and the bundled Grafana datasources point at each container by service name.
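The bridge's readiness wait can be approximated as a TCP poll against each service port from the table above. This is an illustrative sketch, not the bridge's actual implementation (which drives Compose healthchecks); the `wait_ready` helper and the `SERVICES` map are assumptions for the example:

```python
import socket
import time

# Default ports from the table above (illustrative; the real bridge
# reads them from the Compose file, not a hardcoded map).
SERVICES = {
    "otel-collector": 4317,
    "prometheus": 9090,
    "loki": 3100,
    "tempo": 3200,
    "grafana": 3000,
}

def wait_ready(host: str, port: int, timeout: float = 180.0) -> bool:
    """Poll until a TCP connect succeeds, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(1.0)
    return False
```

A plain TCP connect only proves the port is open, not that the service is healthy, which is why the real bridge waits on Compose healthchecks instead.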

Bring it up

Start the stack.

defenseclaw setup local-observability up

The openclaw-observability-bridge binary (shipped in bundles/local_observability_stack/bin/) drives Compose, waits for healthchecks, and writes the OTLP endpoint into ~/.defenseclaw/config.yaml under otel:.

Open Grafana. http://localhost:3000 — anonymous viewer access is enabled by default for the demo dashboards. Sign in as admin/admin to edit.

Generate traffic. Run any DefenseClaw-managed agent, or send a synthetic event to every configured sink:

defenseclaw setup observability test

The test subcommand fans a probe through every enabled destination — OTel exporter, audit sinks, webhooks — so you can confirm Grafana, Splunk, and notifier wiring in a single pass.
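The fan-out behaviour matters: one broken sink must not mask the others. A minimal sketch of that pattern (the `probe_sinks` function and `ProbeResult` type are illustrative, not part of the CLI):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProbeResult:
    sink: str
    ok: bool
    detail: str

def probe_sinks(senders: dict[str, Callable[[dict], None]],
                event: dict) -> list[ProbeResult]:
    """Send one synthetic event to every sink. A failure in one sink
    is recorded but must not stop the probe reaching the others."""
    results = []
    for name, send in senders.items():
        try:
            send(event)
            results.append(ProbeResult(name, True, "delivered"))
        except Exception as exc:
            results.append(ProbeResult(name, False, str(exc)))
    return results
```

Each sink reports delivered/failed independently, which is what lets a single probe validate Grafana, Splunk, and webhook wiring together.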

Watch the dashboards populate. "DefenseClaw → Overview" lights up first; per-rail dashboards (Guardrail, Admission, Judge, HITL) follow as the events flow.

Useful flags

The up subcommand exposes the bridge configuration directly:

defenseclaw setup local-observability up \
  --endpoint 127.0.0.1:4317 \
  --signals traces,metrics,logs \
  --service-name defenseclaw-gateway \
  --with-audit-sink \
  --timeout 90
| Flag | What it does |
| --- | --- |
| --endpoint | OTLP endpoint the gateway writes into config.yaml. Defaults to whatever the bridge published (127.0.0.1:4317). |
| --signals | Comma-separated subset of traces,metrics,logs. Useful for low-volume installs that only want metrics. |
| --service-name | Resource attribute set on every emitted span. Defaults to defenseclaw. |
| --with-audit-sink / --no-audit-sink | Also (or don't) add an audit_sinks: otlp_logs entry so audit events flow as OTel logs. Default is on. |
| --no-config | Bring containers up but do not modify config.yaml. Useful when you manage gateway config out-of-band. |
| --no-wait | Don't block on healthchecks — fire-and-forget. |
| --timeout | Seconds to wait for healthchecks. Default 180. |
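The --signals flag accepts any subset of the three signal types. A sketch of how such a value could be parsed and validated (the `parse_signals` helper is illustrative, not the bridge's actual parser):

```python
ALL_SIGNALS = ("traces", "metrics", "logs")

def parse_signals(value: str) -> list[str]:
    """Parse a --signals style comma-separated list into an ordered,
    validated subset of the three OTel signal types."""
    requested = [s.strip() for s in value.split(",") if s.strip()]
    unknown = [s for s in requested if s not in ALL_SIGNALS]
    if unknown:
        raise ValueError(f"unknown signals: {unknown}")
    # Canonicalise order and drop duplicates.
    return [s for s in ALL_SIGNALS if s in requested]
```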

The OTLP protocol is read from the bridge readiness contract (currently grpc); there is no --protocol flag on up. Override it by editing otel.protocol in config.yaml if you need http/protobuf.
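Since there is no --protocol flag, the override is an edit to the otel block of config.yaml. In code form, validating against the two OTLP protocols, the change looks like this (a sketch: the `set_otel_protocol` helper is hypothetical, and loading/saving the YAML is left to your library of choice):

```python
VALID_PROTOCOLS = {"grpc", "http/protobuf"}

def set_otel_protocol(config: dict, protocol: str) -> dict:
    """Set otel.protocol on an already-parsed config mapping."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    config.setdefault("otel", {})["protocol"] = protocol
    return config
```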

What ends up in config.yaml

~/.defenseclaw/config.yaml (after up)
otel:
  endpoint: 127.0.0.1:4317
  protocol: grpc
  insecure: true
  signals: [traces, metrics, logs]
  service_name: defenseclaw-gateway

audit_sinks:
  - kind: otlp_logs
    name: local-otlp-logs
    endpoint: 127.0.0.1:4317
    protocol: grpc
    insecure: true

If a sink with the same name already exists, the bridge updates it in place rather than adding a duplicate. Other sinks (Splunk HEC, JSONL, webhooks) are left untouched.
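That update-in-place behaviour is a keyed upsert on the sink name. A minimal sketch of the idea (the `upsert_sink` function is illustrative; sink entries are modelled as plain dicts matching the config.yaml shape above):

```python
def upsert_sink(sinks: list[dict], new: dict) -> list[dict]:
    """Update the sink whose name matches new["name"] in place;
    otherwise append. Sinks with other names are left untouched."""
    for i, sink in enumerate(sinks):
        if sink.get("name") == new["name"]:
            sinks[i] = {**sink, **new}
            return sinks
    sinks.append(new)
    return sinks
```

Keying on name rather than kind is what keeps repeated runs of up idempotent while leaving Splunk HEC, JSONL, and webhook sinks alone.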

Bundled Grafana dashboards

Five dashboards ship in bundles/local_observability_stack/grafana/dashboards/:

overview.json
guardrail-verdicts.json
admission-and-scanners.json
llm-judge.json
hitl.json

They're auto-provisioned via bundles/local_observability_stack/grafana/provisioning/. The folder is named "DefenseClaw" inside Grafana so they're easy to find. Edits you make in the UI are not persisted back to disk by default — copy them out with Dashboard → JSON Model if you want to keep them.

Tear it down

defenseclaw setup local-observability down

Stops all five containers and removes the Compose project. Dashboards and Grafana state live in named Docker volumes so they survive down/up cycles.

To wipe everything including the volumes:

defenseclaw setup local-observability reset --yes

reset is destructive — your historical traces, logs, and metrics go with it. Splunk HEC and JSONL sinks are unaffected; down alone never touches the named volumes.

Status and logs

defenseclaw setup local-observability status
defenseclaw setup local-observability logs --service grafana --follow

status shows container health and the resolved OTLP endpoint. logs --service <name> tails one of the five containers — useful when Grafana isn't picking up dashboards or the OTel collector is dropping spans. Drop --service to fan out logs from every container at once.

Use it alongside Splunk

Local observability and Splunk are independent sinks. A common pattern:

  • Engineers and SREs use the local Grafana stack for live investigation.
  • The same gateway also forwards every event to the org Splunk for retention and SOC.

Just run both setup commands — they edit different blocks in config.yaml:

defenseclaw setup local-observability up
defenseclaw setup splunk --enterprise --hec-endpoint ... --hec-token ...

See Splunk integration for the dashboards we ship for the local Splunk app and for tips on building your own SPL queries against the gateway sourcetypes.

Troubleshooting