Overview
Testing is split by implementation surface and by CI cost. Contributors should add tests at the lowest layer that covers the change, then run the matching Makefile target before opening a PR.
Layers
| Layer | Where | Cost | What it covers |
|---|---|---|---|
| Python CLI | cli/tests/ | ~s | Click commands, config migration, Python scanner glue |
| Go packages | internal/**/*_test.go, test/**/*_test.go | ~s to minutes | Gateway, TUI, policy, audit, sandbox, race-sensitive paths |
| TypeScript plugin | extensions/defenseclaw/src/__tests__/ | ~s | Fetch interceptor, sidecar client, provider coverage |
| Rego | policies/rego/*_test.rego | ~s | Admission, firewall, guardrail, sandbox, skill-action decisions |
| E2E scripts | test/e2e/, scripts/test-e2e-*.sh, .github/workflows/e2e.yml | minutes | Full-stack runner scenarios outside the default make test target |
Running
make test # cli-test + gateway-test
make cli-test-cov # Python pytest with coverage XML
make go-test-cov # Go race tests with coverage.out
make ts-test # plugin Vitest suite
make rego-test # OPA policy tests
make check # parity gates and provider coverage
Per-package: go test ./internal/gateway/... or pytest cli/defenseclaw/commands/test_cmd_init.py.
Conventions
Go
- Table-driven tests. One function per unit. Subtests via t.Run(name, ...).
- Use testify/require for assertions that short-circuit and testify/assert for non-fatal ones.
- Prefer httptest.Server to real network. Prefer t.TempDir() to os/exec to filesystem fixtures.
- Fake the clock with clockwork.NewFakeClock; never read the wall clock in tests.
Python
- pytest + pytest-click. One test file per command.
- Use CliRunner.isolated_filesystem() to avoid leaking state.
- Mock sidecar calls via responses or a local test server from test/helpers/.
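The "local test server" option above can be sketched with the standard library alone. This is a minimal illustration, not the helper shipped in test/helpers/: the `FakeSidecarHandler` class, the `/health` route, its JSON payload, and `fetch_status` are all hypothetical names chosen for the example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeSidecarHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the sidecar; route and payload are assumed."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress per-request logging so test output stays quiet.
        pass


def start_fake_sidecar():
    """Bind to an ephemeral localhost port and serve from a daemon thread."""
    server = HTTPServer(("127.0.0.1", 0), FakeSidecarHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"


def fetch_status(base_url):
    """Illustrative code under test: read the status field from the sidecar."""
    # An empty ProxyHandler keeps the request on loopback even if proxy
    # environment variables are set on the machine running the tests.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/health", timeout=5) as resp:
        return json.loads(resp.read())["status"]


def test_fetch_status_uses_local_server():
    server, base_url = start_fake_sidecar()
    try:
        assert fetch_status(base_url) == "ok"
    finally:
        server.shutdown()
        server.server_close()
```

Binding to port 0 lets the operating system pick a free port, so parallel test runs do not collide on a fixed port number.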
Fixtures
- test/fixtures/skills/: skill directories exhibiting specific findings.
- test/fixtures/plugins/: plugin fixtures for scanner coverage.
- test/fixtures/mcps/: MCP server fixtures.
- test/fixtures/code/: code scanning fixtures.
- test/testdata/: shared endpoint/provider corpora.
Fixtures are versioned and treated as source. Changing a fixture to match new behavior requires an explicit update to the tests that depend on it.
Harnesses
Policy tests
Rego tests live next to the policy modules and run through the rego-test target:
make rego-test
Plugin tests
Vitest suites under extensions/defenseclaw/src/__tests__/ use mocks and local HTTP servers instead of live providers:
const res = await interceptFetch("https://api.openai.com/v1/chat/completions");
expect(res.headers["X-DC-Target-URL"]).toContain("api.openai.com");
What to test
- Parsers: every error path (malformed YAML, bad regex, unknown fields).
- Compilers: round-trip (parse → compile → serialize → parse == original).
- Decisions: every row of the actions matrix.
- Concurrency: race tests (go test -race) on every new shared state.
- Errors: exit codes match the reference table.
- Upgrades: migration from prior on-disk layouts — fixture directories preserved.
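As a rough illustration of the parser and compiler bullets above, the sketch below exercises two error paths and one round trip against a toy JSON rule format. Everything in it is assumed for the example: the `Rule` type, the `parse`, `compile_rule`, and `serialize` functions, and the allowed actions are hypothetical and do not describe the project's actual policy format.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_ACTIONS = {"allow", "block"}  # hypothetical action set
KNOWN_FIELDS = {"name", "action"}     # hypothetical schema


@dataclass(frozen=True)
class Rule:
    name: str
    action: str


def parse(text):
    """Parse JSON text into a dict, rejecting unknown or missing fields."""
    data = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    if set(data) != KNOWN_FIELDS:
        raise ValueError(f"unexpected field set: {sorted(data)}")
    return data


def compile_rule(data):
    """Turn a parsed dict into a validated Rule."""
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data['action']!r}")
    return Rule(name=data["name"], action=data["action"])


def serialize(rule):
    """Emit a Rule back to JSON text."""
    return json.dumps(asdict(rule), sort_keys=True)


def test_round_trip():
    # parse -> compile -> serialize -> parse must reproduce the original.
    original = parse('{"name": "deny-shell", "action": "block"}')
    assert parse(serialize(compile_rule(original))) == original


def test_parser_error_paths():
    bad_inputs = [
        '{"name": ',                                     # malformed input
        '{"name": "x", "action": "block", "extra": 1}',  # unknown field
    ]
    for text in bad_inputs:
        try:
            parse(text)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {text!r}")
```

Comparing parsed values rather than raw strings keeps the round-trip check insensitive to whitespace and key order.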
What not to test
- Don't test OPA itself. Assume OPA is correct; test your rules.
- Don't test SQLite itself.
- Don't test upstream providers with live keys.
- Don't rely on wall-clock time.
- Don't shell out to the system-installed defenseclaw; use the harness.
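One way to honor the wall-clock rule is to inject the clock into the code under test and advance it by hand. The sketch below is a Python analogue of the fake-clock convention, under the assumption that the code accepts a `now` callable; `FakeClock` and `TokenCache` are hypothetical names, not project types.

```python
from datetime import datetime, timedelta, timezone


class FakeClock:
    """Manually advanced clock for tests; never consults the system time."""

    def __init__(self, start):
        self._now = start

    def now(self):
        return self._now

    def advance(self, delta):
        self._now += delta


class TokenCache:
    """Hypothetical cache whose entries expire after ttl, per an injected clock."""

    def __init__(self, ttl, now):
        self._ttl = ttl
        self._now = now  # callable returning the current datetime
        self._entries = {}

    def put(self, key, value):
        self._entries[key] = (value, self._now() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._now() >= expires_at:
            del self._entries[key]
            return None
        return value


def test_entry_expires_after_ttl():
    clock = FakeClock(datetime(2024, 1, 1, tzinfo=timezone.utc))
    cache = TokenCache(ttl=timedelta(seconds=30), now=clock.now)
    cache.put("k", "v")

    clock.advance(timedelta(seconds=29))
    assert cache.get("k") == "v"   # still inside the TTL

    clock.advance(timedelta(seconds=1))
    assert cache.get("k") is None  # expired exactly at the TTL boundary
```

Because time only moves when the test calls `advance`, the expiry boundary is checked deterministically and the test needs no sleeps.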
CI gates
The following are required for a PR to land:
- make test
- make cli-test-cov
- make go-test-cov
- make ts-test
- make rego-test
- .github/workflows/e2e.yml for full-stack E2E coverage
- make lint (golangci-lint, ruff, prettier)
- make docs-check (no docgen drift)
- make docs-deadlinks (no broken doc links)