Local observability stack

One-command OpenTelemetry + Loki + Tempo + Prometheus + Grafana stack, pre-wired to the DefenseClaw gateway. Grafana on :3000, dashboards seeded, no manual config.

defenseclaw setup local-observability brings up a full local OpenTelemetry stack and wires the gateway to it in one shot. It's what the project itself uses for development, demos, and rule-pack tuning.

defenseclaw setup local-observability up — five containers come up, the gateway picks up the new OTLP endpoint, and Grafana dashboards populate live.

What it brings up

The stack runs from bundles/local_observability_stack/docker-compose.yml:

Service	Role	Default port
`otel-collector`	OpenTelemetry Collector; fans out to Prometheus, Loki, and Tempo	`4317` (gRPC), `4318` (HTTP)
`prometheus`	Metrics store	`9090`
`loki`	Log store (gateway audit + sink failures)	`3100`
`tempo`	Trace store (per-decision spans)	`3200`
`grafana`	UI with seeded DefenseClaw dashboards	`3000`

All five run under one Compose project (defenseclaw-local-observability). They share a Docker network, and the bundled Grafana datasources reach each backend by its service name.
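If you want to confirm from the host that each backend is answering on its default port, a small probe script can help. The sketch below assumes the ports in the table are published on 127.0.0.1 and uses the upstream readiness paths of Prometheus, Loki, Tempo, and Grafana; the bundle is not known to guarantee those paths, and the function names `stack_ready_urls` and `check_stack` are made up for this example.

```shell
#!/bin/sh
# Print "service url" pairs for the default ports listed in the table.
# The readiness paths are upstream defaults, assumed here rather than
# confirmed for the DefenseClaw bundle.
stack_ready_urls() {
  host="${1:-127.0.0.1}"
  cat <<EOF
prometheus http://$host:9090/-/ready
loki http://$host:3100/ready
tempo http://$host:3200/ready
grafana http://$host:3000/api/health
EOF
}

# Probe each URL once and print "ok" or "FAIL" per service.
check_stack() {
  stack_ready_urls "$@" | while read -r name url; do
    if curl -fsS --max-time 3 -o /dev/null "$url"; then
      echo "ok   $name"
    else
      echo "FAIL $name ($url)"
    fi
  done
}
```

Run `check_stack` after the stack is up; pass a hostname as the first argument if the ports are published somewhere other than 127.0.0.1.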

Bring it up

Start the stack.

defenseclaw setup local-observability up

The openclaw-observability-bridge binary (shipped in bundles/local_observability_stack/bin/) drives Compose, waits for healthchecks, and writes the OTLP endpoint into ~/.defenseclaw/config.yaml under otel:.

Open Grafana. http://localhost:3000 — anonymous viewer access is enabled by default for the demo dashboards. Sign in as admin/admin to edit.

Generate traffic. Run any DefenseClaw-managed agent, or send a synthetic event to every configured sink:

defenseclaw setup observability test

The test subcommand fans a probe through every enabled destination — OTel exporter, audit sinks, webhooks — so you can confirm Grafana, Splunk, and notifier wiring in one shot.

Watch the dashboards populate. "DefenseClaw → Overview" lights up first; per-rail dashboards (Guardrail, Admission, Judge, HITL) follow as the events flow.

Useful flags

The up subcommand exposes the bridge configuration directly:

defenseclaw setup local-observability up \
  --endpoint 127.0.0.1:4317 \
  --signals traces,metrics,logs \
  --service-name defenseclaw-gateway \
  --with-audit-sink \
  --timeout 90

Flag	What it does
`--endpoint`	OTLP endpoint written into `config.yaml` for the gateway. Defaults to whatever the bridge published (`127.0.0.1:4317`).
`--signals`	Comma-separated subset of `traces,metrics,logs`. Useful for low-volume installs that only want metrics.
`--service-name`	Resource attribute set on every emitted span. Defaults to `defenseclaw`.
`--with-audit-sink` / `--no-audit-sink`	Also (or don't) add an `audit_sinks: otlp_logs` entry so audit events flow as OTel logs. Default is on.
`--no-config`	Bring containers up but do not modify `config.yaml`. Useful when you manage gateway config out-of-band.
`--no-wait`	Don't block on healthchecks — fire-and-forget.
`--timeout`	Seconds to wait for healthchecks. Default 180.
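When `up` is scripted, it can be useful to check the `--signals` value before handing it to the CLI. The following is a minimal sketch: `valid_signals` and `up_with_signals` are hypothetical helper names, and the only DefenseClaw invocation used is the `up --signals` form documented in the table.

```shell
#!/bin/sh
# Succeeds only if $1 is a non-empty, comma-separated list drawn from
# traces, metrics, logs (empty items are rejected).
valid_signals() {
  rest="$1,"
  while [ -n "$rest" ]; do
    item="${rest%%,*}"   # text before the first comma
    rest="${rest#*,}"    # text after the first comma
    case "$item" in
      traces|metrics|logs) ;;
      *) return 1 ;;
    esac
  done
}

# Hypothetical wrapper: validate first, then call the documented command.
up_with_signals() {
  valid_signals "$1" || { echo "unknown signal in: $1" >&2; return 2; }
  defenseclaw setup local-observability up --signals "$1"
}
```

For example, `up_with_signals metrics` would start the stack with metrics only, while `up_with_signals metrics,profiles` fails before any container is touched.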

The OTLP protocol is read from the bridge readiness contract (currently grpc); there is no --protocol flag on up. Override it by editing otel.protocol in config.yaml if you need http/protobuf.
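If you prefer to script that edit, the sketch below rewrites only the `protocol:` line inside the top-level `otel:` block and leaves every other block alone, including any `protocol:` under `audit_sinks`. It assumes the block layout that `up` writes (two-space indentation, `otel:` at column zero); `set_otel_protocol` is a name invented for this example. The services table lists `4318` as the collector's HTTP port, so the endpoint probably needs to change as well, which this helper does not do.

```shell
#!/bin/sh
# Rewrite otel.protocol in place. $1 = new protocol, $2 = config path.
set_otel_protocol() {
  proto="$1"
  file="${2:-$HOME/.defenseclaw/config.yaml}"
  tmp="$(mktemp)" || return 1
  awk -v proto="$proto" '
    # Any unindented, non-comment line starts a new top-level block.
    /^[^ #]/ { in_otel = ($0 ~ /^otel:/) }
    # Only touch the protocol key while inside the otel block.
    in_otel && $1 == "protocol:" { sub(/protocol:.*/, "protocol: " proto) }
    { print }
  ' "$file" > "$tmp" && cp "$tmp" "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Usage: `set_otel_protocol http/protobuf`. Copying over the existing file, rather than moving the temp file into place, keeps the original file mode.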

What ends up in `config.yaml`

~/.defenseclaw/config.yaml (after up)

otel:
  endpoint: 127.0.0.1:4317
  protocol: grpc
  insecure: true
  signals: [traces, metrics, logs]
  service_name: defenseclaw-gateway

audit_sinks:
  - kind: otlp_logs
    name: local-otlp-logs
    endpoint: 127.0.0.1:4317
    protocol: grpc
    insecure: true

If a sink with the same name already exists, the bridge updates it in place rather than adding a duplicate. Other sinks (Splunk HEC, JSONL, webhooks) are left untouched.
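To check what `up` actually wrote without opening the file, you can read single keys out of the `otel:` block. This is a rough sketch that relies on the flat layout shown above rather than a real YAML parser; `otel_value` is a hypothetical helper, not part of the DefenseClaw CLI.

```shell
#!/bin/sh
# Print the value of one key under the top-level otel: block.
# $1 = key name, $2 = config path (defaults to the path used above).
otel_value() {
  key="$1"
  file="${2:-$HOME/.defenseclaw/config.yaml}"
  awk -v key="$key" '
    # Any unindented, non-comment line starts a new top-level block.
    /^[^ #]/ { in_otel = ($0 ~ /^otel:/) }
    # First matching key inside the otel block wins.
    in_otel && $1 == (key ":") { sub(/^[^:]*:[ ]*/, ""); print; exit }
  ' "$file"
}
```

For example, `otel_value endpoint` prints the endpoint the gateway will export to, and ignores the `endpoint:` line that belongs to the audit sink.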

Bundled Grafana dashboards

Five dashboards ship in bundles/local_observability_stack/grafana/dashboards/:

overview.json

guardrail-verdicts.json

admission-and-scanners.json

llm-judge.json

hitl.json

They're auto-provisioned via bundles/local_observability_stack/grafana/provisioning/. The folder is named "DefenseClaw" inside Grafana so they're easy to find. Edits you make in the UI are not persisted back to disk by default — copy them out with Dashboard → JSON Model if you want to keep them.
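As an alternative to copying the JSON Model by hand, Grafana's HTTP API can return a dashboard by UID. The sketch below assumes the default `admin/admin` login and the `localhost:3000` address mentioned earlier; `export_dashboard`, `GRAFANA_URL`, and `GRAFANA_AUTH` are names chosen for this example, and the UID has to be read from the dashboard's URL in the browser.

```shell
#!/bin/sh
# Save one dashboard to a file via Grafana's /api/dashboards/uid/<uid> endpoint.
# $1 = dashboard UID, $2 = output path (defaults to <uid>.json).
export_dashboard() {
  uid="$1"
  out="${2:-$uid.json}"
  curl -fsS -u "${GRAFANA_AUTH:-admin:admin}" \
    "${GRAFANA_URL:-http://localhost:3000}/api/dashboards/uid/$uid" > "$out"
}
```

The response wraps the model in a `dashboard` key next to a `meta` key, so extract the `dashboard` object before dropping the file into a provisioning folder.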

Tear it down

defenseclaw setup local-observability down

Stops all five containers and removes the Compose project. Dashboards and Grafana state live in named Docker volumes so they survive down/up cycles.

To wipe everything including the volumes:

defenseclaw setup local-observability reset --yes

reset is destructive: your historical traces, logs, and metrics go with it. Splunk HEC and JSONL sinks are unaffected.
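Before running reset, you may want to see which volumes are at stake. Compose labels the volumes it creates with the project name, so a filter on that label should list them. This sketch assumes the bridge uses standard Compose labeling and the project name given earlier (defenseclaw-local-observability); `stack_volumes` is a hypothetical helper.

```shell
#!/bin/sh
# List the named volumes that belong to the stack's Compose project.
# $1 = Compose project name (defaults to the one this page mentions).
stack_volumes() {
  project="${1:-defenseclaw-local-observability}"
  docker volume ls --quiet --filter "label=com.docker.compose.project=$project"
}
```

An empty result after `reset` is a quick way to confirm the wipe went through.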

Status and logs

defenseclaw setup local-observability status
defenseclaw setup local-observability logs --service grafana --follow

status shows container health and the resolved OTLP endpoint. logs --service <name> tails one of the five containers — useful when Grafana isn't picking up dashboards or the OTel collector is dropping spans. Drop --service to fan out logs from every container at once.

Use it alongside Splunk

Local observability and Splunk are independent sinks. A common pattern:

Engineers and SREs use the local Grafana stack for live investigation.
The same gateway also forwards every event to the org Splunk for retention and SOC.

Just run both setup commands — they edit different blocks in config.yaml:

defenseclaw setup local-observability up
defenseclaw setup splunk --enterprise --hec-endpoint ... --hec-token ...

See Splunk integration for the dashboards we ship for the local Splunk app and for tips on building your own SPL queries against the gateway sourcetypes.
