# Testing Standards (Canonical)

## Test Pyramid
- Unit tests for domain logic and policy checks.
- Integration tests with real Postgres/Redis/queue.
- Contract tests against OpenAPI.
- E2E tests for critical user journeys.
- Apply the execution discipline in [Evidence_First_Change_Protocol.md](./Evidence_First_Change_Protocol.md) when choosing baselines and verification scope.

## Evidence-First Verification Rules
- Verification must be reported relative to a baseline, not as an isolated pass/fail claim.
- The baseline should match the scope of the change; targeted checks are preferred over ritual full-suite runs.
- Every non-trivial behavior change must include one direct proof that the intended behavior changed.
- If a previously passing scoped check fails after the change, treat that as a regression until disproven.
- Unexpected failures are evidence about dependencies or ownership boundaries and must be recorded, not waved away.
- Broad UAT must validate product gap register rows from
  [Product_Gap_Readiness_Gate.md](./Product_Gap_Readiness_Gate.md);
  it must not be the first place where missing persona workflows, empty-state
  dependencies, cleanup paths, or owner boundaries are discovered.
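
The baseline rule above can be made mechanical. The following is a minimal sketch, assuming scoped check results are captured as a name-to-pass map before and after the change. `CheckResults` and `Regressions` are hypothetical names, not existing repo helpers, and treating a missing post-change check as a regression is an assumption.

```go
package main

import (
    "sort"
    "testing"
)

// CheckResults maps a scoped check name to whether it passed.
// Hypothetical shape; the repo may record results differently.
type CheckResults map[string]bool

// Regressions returns, in sorted order, the checks that passed in the
// baseline but fail (or are missing) after the change.
func Regressions(baseline, after CheckResults) []string {
    var out []string
    for name, passed := range baseline {
        if !passed {
            continue // already failing before the change: not a regression
        }
        if ok, present := after[name]; !present || !ok {
            out = append(out, name)
        }
    }
    sort.Strings(out)
    return out
}

func TestRegressions(t *testing.T) {
    baseline := CheckResults{"billing/unit": true, "auth/unit": true, "storage/unit": false}
    after := CheckResults{"billing/unit": false, "auth/unit": true, "storage/unit": false}

    got := Regressions(baseline, after)
    if len(got) != 1 || got[0] != "billing/unit" {
        t.Fatalf("regressions = %v, want [billing/unit]", got)
    }
}
```

A check that was already failing in the baseline is not reported, which keeps the verification claim relative to the baseline rather than an isolated pass/fail.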

## Mandatory Critical Flows
- Auth/login/session lifecycle.
- Provision -> active -> release.
- Billing accrual and low/depleted enforcement.
- Stripe webhook idempotency.
- Admin user/node operations.
- Storage CRUD with path-safety constraints.

## Workflow Regression Packs

Unit and integration tests are necessary but not sufficient for release
confidence. Critical browser/runtime workflows must be represented as workflow
regression packs with an owner, target environment, seed-data assumptions,
required proof, and artifact expectations.

Initial packs:

- auth and session: deployed app can start OIDC auth, Keycloak accepts the app
  callback URL, and an authenticated user lands in the product shell.
- V3 shell: authenticated user can navigate the main mode/project/tenant shell.
- app launch: user can select an app, submit launch, and see workload/runtime
  state.
- notebook/proxy: active notebook route opens through the intended proxy path.
- terminal: active allocation terminal opens through the browser websocket
  route.
- node inventory and lifecycle: operator can see nodes, readiness, blockers,
  and task evidence.
- platform proxy: host/path route resolves through the intended renderer and
  emits trace/correlation evidence.
- demo deploy: deployed app/API/auth hosts pass remote smoke and login/auth
  redirect checks.

The current pack catalog is `doc/governance/Workflow_Regression_Packs.yaml`.

```bash
make workflow-packs CMD=validate
make workflow-packs CMD=show PACK_ID=auth-session
```

Each bug fix that affects a workflow pack must either update that pack's
automated coverage or record a release-blocking task explaining why the
coverage cannot be added in the same slice.

Operational controls that depend on an external provider or a browser runtime
must also declare a pre-UAT readiness gate. UAT should not be the first place
where these are discovered:

- stale provider topology or hostnames;
- provider Admin API representation or endpoint semantics;
- missing rollback target proof;
- browser-launch failure on the exact execution surface;
- evidence redaction/runtime artifact failures;
- Fairway session, review-wait, or completion handback gaps.

For approval-gated live drills, the readiness gate must produce a bundle that
proves target readback, provider API request shape, exact browser runtime
launch, redaction behavior, and rollback scope before the live window is
requested. If a full provider mutation cannot be safely exercised before UAT,
the workflow pack must name the substitute gate, owner, and residual risk.

Before a workflow pack is added or broadened, confirm the affected product gap
register rows are `ready_for_dev`, `ready_for_uat`, or `accepted_gap`. If the
row is still `needs_design`, `needs_api_contract`, `needs_runtime_fix`,
`needs_env_fix`, or `blocked`, run focused proof/fix work instead of a broad
pack.
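
As an illustration of the status check above, here is a minimal sketch. `GapRow` and `BlockingRows` are hypothetical names, the row IDs are made up, and treating an unrecognized status as blocking is an assumption rather than a documented rule.

```go
package main

import "testing"

// broadPackStatuses lists the register statuses that permit adding or
// broadening a workflow pack.
var broadPackStatuses = map[string]bool{
    "ready_for_dev": true,
    "ready_for_uat": true,
    "accepted_gap":  true,
}

// GapRow is a hypothetical view of one product gap register row.
type GapRow struct {
    ID     string
    Status string
}

// BlockingRows returns the IDs of rows whose status calls for focused
// proof/fix work instead of a broad pack. Unknown statuses also block.
func BlockingRows(rows []GapRow) []string {
    var blocked []string
    for _, r := range rows {
        if !broadPackStatuses[r.Status] {
            blocked = append(blocked, r.ID)
        }
    }
    return blocked
}

func TestBlockingRows(t *testing.T) {
    rows := []GapRow{
        {ID: "PG-1", Status: "ready_for_uat"},
        {ID: "PG-2", Status: "needs_runtime_fix"},
    }
    got := BlockingRows(rows)
    if len(got) != 1 || got[0] != "PG-2" {
        t.Fatalf("blocking rows = %v, want [PG-2]", got)
    }
}
```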

## Bug-Fix Definition of Done

Every bug fix must record:

- root cause,
- owning layer,
- user-visible or operational impact,
- direct proof command for the fix,
- regression coverage added or updated,
- reason if the regression is deferred,
- residual risk.

Do not mark a bug-fix task done with only a symptom patch. If a failing check
reveals a different owning layer, create or link a blocker task and keep the
original task scoped.

Repo-local helper:

```bash
TASK_ID=<task-id> \
BUG_SUMMARY="deployed login rejected callback" \
ROOT_CAUSE="Keycloak client lacked deployed app callback URL" \
OWNING_LAYER=deploy-auth-config \
IMPACT="demo users could not log in" \
PROOF_COMMAND="scripts/ci/platform_control_smoke.sh" \
REGRESSION_COVERAGE="auth-session pack: OIDC authorize callback gate" \
make bugfix-review-packet
```

## Acceptance Matrix
- AT-001 login success and token issuance.
- AT-002 invalid login returns 401.
- AT-003 non-admin blocked from admin APIs.
- AT-010 marketplace capacity reflects online + unassigned nodes only.
- AT-020 provisioning fails on the offline, in-use, and insufficient-funds paths.
- AT-023 successful provision creates allocation + usage.
- AT-030 release by owner/admin succeeds.
- AT-031 release by unauthorized user fails.
- AT-032 release unassigns node.
- AT-033 removed: persistent server-side private-key download endpoint is retired.
- AT-040 billing loop accrues cost over time.
- AT-041 low-balance warning is emitted only once per low-state transition.
- AT-042 depleted balance triggers forced release.
- AT-050 Stripe charge session returns checkout URL.
- AT-051 webhook credits balance on valid event.
- AT-052 duplicate webhook does not double-credit.
- AT-053 webhook signature bypass attempt (reused signature with mutated body) is rejected with 400.
- AT-060 storage list/upload/download/mkdir/rename/delete works in user root.
- AT-061 storage traversal attempts are rejected.
- AT-070 rate limit enforced after configured threshold; subsequent requests return 429.
- AT-071 rate-limit response headers (`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `Retry-After`) are present.
- AT-072 rate-limit counter resets after window expires.
- AT-080 privileged mutations (provision, release, refund, admin node ops) each produce a structured audit log entry.
- AT-081 audit log entries contain actor_user_id, actor_role, action, target_type, target_id, result, correlation_id.
- AT-082 failed authorization attempts are recorded in audit log with result=failure.
- AT-083 audit log entries are immutable — no update/delete path exposed.
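
To illustrate the shape of AT-053, the sketch below replays a valid signature against a mutated body and expects a 400. It is a simplified stand-in: the `X-Signature` header, the `/webhooks/stripe` path, and the plain hex HMAC-SHA256 over the body are assumptions for illustration and are not Stripe's actual header format. A real test should exercise the service's own verification path.

```go
package main

import (
    "bytes"
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "io"
    "net/http"
    "net/http/httptest"
    "testing"
)

// sign returns the hex HMAC-SHA256 of body under secret.
func sign(secret, body []byte) string {
    mac := hmac.New(sha256.New, secret)
    mac.Write(body)
    return hex.EncodeToString(mac.Sum(nil))
}

// webhookHandler is a stand-in handler: it answers 400 unless the
// X-Signature header matches the request body.
func webhookHandler(secret []byte) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        body, err := io.ReadAll(r.Body)
        if err != nil {
            http.Error(w, "unreadable body", http.StatusBadRequest)
            return
        }
        want := sign(secret, body)
        if !hmac.Equal([]byte(want), []byte(r.Header.Get("X-Signature"))) {
            http.Error(w, "invalid signature", http.StatusBadRequest)
            return
        }
        w.WriteHeader(http.StatusOK)
    })
}

// post sends body with the given signature and returns the status code.
func post(h http.Handler, body []byte, sig string) int {
    req := httptest.NewRequest(http.MethodPost, "/webhooks/stripe", bytes.NewReader(body))
    req.Header.Set("X-Signature", sig)
    rec := httptest.NewRecorder()
    h.ServeHTTP(rec, req)
    return rec.Code
}

func TestWebhookRejectsReusedSignature(t *testing.T) {
    secret := []byte("whsec_test")
    h := webhookHandler(secret)
    original := []byte(`{"amount":100}`)
    sig := sign(secret, original)

    if code := post(h, original, sig); code != http.StatusOK {
        t.Fatalf("valid event: status = %d, want 200", code)
    }
    // Same signature, different body: must be rejected.
    mutated := []byte(`{"amount":100000}`)
    if code := post(h, mutated, sig); code != http.StatusBadRequest {
        t.Fatalf("mutated body: status = %d, want 400", code)
    }
}
```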

## Non-Functional Testing
- Load/performance tests with SLO thresholds.
- Chaos/failure tests for retries, DLQ, partial failures.
- Migration forward/rollback tests.
- Backup/restore verification.

### Authorization membership-resolution SLO (MVP)
- Target: p95 <= 20 ms and p99 <= 50 ms for membership-scoped authorization resolution.
- Required evidence:
  - integration test proving active-membership semantics (`deleted_at is null`) for both tenant and project memberships.
  - integration explain-plan evidence showing indexed membership paths are available and used for resolution.
- Optimization order:
  1. query/index tuning,
  2. only then consider permission/effective-access caching, and only if a sustained SLO breach persists.
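
One way to turn the p95/p99 targets into an assertion is a nearest-rank percentile over recorded latencies. This is a minimal sketch under that assumption. `percentile`, `meetsSLO`, `ms`, and `repeat` are hypothetical helpers, and the repo may measure the SLO differently (for example from metrics rather than in-test samples).

```go
package main

import (
    "math"
    "sort"
    "testing"
    "time"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100).
func percentile(samples []time.Duration, p float64) time.Duration {
    if len(samples) == 0 {
        return 0
    }
    sorted := append([]time.Duration(nil), samples...)
    sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
    rank := int(math.Ceil(p * float64(len(sorted)) / 100))
    if rank < 1 {
        rank = 1
    }
    return sorted[rank-1]
}

// meetsSLO reports whether samples satisfy p95 <= 20 ms and p99 <= 50 ms.
func meetsSLO(samples []time.Duration) bool {
    return percentile(samples, 95) <= 20*time.Millisecond &&
        percentile(samples, 99) <= 50*time.Millisecond
}

// ms converts whole milliseconds to a time.Duration.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// repeat returns n copies of d.
func repeat(d time.Duration, n int) []time.Duration {
    out := make([]time.Duration, n)
    for i := range out {
        out[i] = d
    }
    return out
}

func TestMembershipResolutionSLO(t *testing.T) {
    // 95 fast samples, 4 slower ones, 1 outlier beyond p99.
    samples := append(repeat(ms(5), 95), repeat(ms(30), 4)...)
    samples = append(samples, ms(200))
    if !meetsSLO(samples) {
        t.Fatalf("p95=%v p99=%v, want <= 20ms / <= 50ms",
            percentile(samples, 95), percentile(samples, 99))
    }
}
```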

## Security Testing
- SAST/DAST.
- Dependency and container scans.
- Secret scanning.
- IaC policy scanning.

## Quality Gates
- Minimum coverage threshold on critical domains.
- No flaky tests in release branch.
- Contract drift fails CI.

---

## Go Test Patterns

### Test pyramid for this codebase

| Layer | Build tag | Infra needed | Target |
|---|---|---|---|
| Unit | _(none)_ | None | Pure functions, HTTP middleware (httptest), service logic (mocked deps) |
| Integration | `integration` | Postgres + Redis + NATS | DB queries, policy client, rate limiter, full middleware chain |
| E2E | `e2e` | Full docker-compose stack | Acceptance matrix flows (AT-xxx) |

Run targets:
```bash
make test                  # unit only (fast, no infra)
make test-integration      # unit + integration (requires make dev-infra first)
```

### File and build tag conventions

```
packages/services/billing/
  service.go
  service_test.go          # unit — no build tag, package billing_test
  service_integration_test.go  # integration — //go:build integration at top
```

All integration test files must start with:
```go
//go:build integration

package billing_test
```

### Unit test pattern — table-driven subtests

This is the mandatory structure for all unit tests.

```go
func TestSanitize(t *testing.T) {
    tests := []struct {
        name  string
        input map[string]any
        want  map[string]any
    }{
        {
            name:  "redacts password field",
            input: map[string]any{"password": "secret", "username": "alice"},
            want:  map[string]any{"password": "[REDACTED]", "username": "alice"},
        },
        {
            name:  "redacts ssh_private_key prefix",
            input: map[string]any{"ssh_private_key_enc": "..."},
            want:  map[string]any{"ssh_private_key_enc": "[REDACTED]"},
        },
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := middleware.Sanitize(tt.input)
            if !reflect.DeepEqual(got, tt.want) {
                t.Errorf("got %v, want %v", got, tt.want)
            }
        })
    }
}
```

### Testing HTTP middleware with httptest

Use `httptest.NewRecorder` and `httptest.NewRequest`. Never spin up a real
HTTP server for unit tests.

```go
func TestCorrelationID_GeneratesWhenAbsent(t *testing.T) {
    next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        id := middleware.CorrelationIDFromContext(r.Context())
        if id == "" {
            t.Error("correlation ID should not be empty")
        }
        w.WriteHeader(http.StatusOK)
    })
    h := middleware.CorrelationID(next)

    rec := httptest.NewRecorder()
    req := httptest.NewRequest(http.MethodGet, "/", nil)
    h.ServeHTTP(rec, req)

    if rec.Code != http.StatusOK {
        t.Errorf("status = %d, want 200", rec.Code)
    }
    if rec.Header().Get("X-Correlation-ID") == "" {
        t.Error("X-Correlation-ID response header should be set")
    }
}
```

### Mocking dependencies in unit tests

Service dependencies (`policy.Client`, DB pool, Redis) should be replaced with
hand-written interface mocks in unit tests, not real infrastructure. Define the
mocks inline in `_test.go` files; do not use generated mock frameworks.

```go
// In service_test.go:
type stubPolicy struct{ values map[string]int64 }

func (s *stubPolicy) GetInt(_ context.Context, key string, _ ...policy.ScopeOption) (int64, error) {
    if v, ok := s.values[key]; ok {
        return v, nil
    }
    return 0, fmt.Errorf("key %q not found", key)
}
func (s *stubPolicy) GetBool(_ context.Context, _ string, _ ...policy.ScopeOption) (bool, error) { return false, nil }
func (s *stubPolicy) GetString(_ context.Context, _ string, _ ...policy.ScopeOption) (string, error) {
    return "", nil
}
```

### Integration test setup

Integration tests connect to real infrastructure started by `make dev-infra`.
Use the `DATABASE_URL`, `REDIS_URL`, and `NATS_URL` environment variables
(populated from `.env.local`).

```go
//go:build integration

package billing_test

import (
    "fmt"
    "os"
    "testing"
)

func TestMain(m *testing.M) {
    // Skip, with a visible notice, if infra is not configured.
    if os.Getenv("DATABASE_URL") == "" {
        fmt.Fprintln(os.Stderr, "DATABASE_URL not set; skipping integration tests")
        os.Exit(0)
    }
    os.Exit(m.Run())
}
```

A shared `packages/testhelpers` package will provide:
- `testhelpers.DB(t)` — creates a pool and registers `t.Cleanup(pool.Close)`
- `testhelpers.Redis(t)` — creates a Redis client with cleanup
- `testhelpers.NATS(t)` — creates a NATS connection with cleanup
- `testhelpers.TruncateTables(t, pool, tables...)` — cleans test data between runs

### What to test at each layer

| Concern | Layer | Notes |
|---|---|---|
| Pure functions (Sanitize, ErrCode constants) | Unit | No mocks needed |
| HTTP middleware (auth, correlation, ratelimit) | Unit | httptest + stub dependencies |
| Service logic (allocation state transitions, billing math) | Unit | Stub policy + mock DB via interface |
| Policy client cache behaviour | Integration | Real Postgres |
| Rate limiter window and reset | Integration | Real Redis |
| Outbox + DB transaction atomicity | Integration | Real Postgres |
| Full middleware chain + real JWT | Integration | Real Keycloak token |
| Acceptance matrix flows (AT-xxx) | E2E | Full stack |

### Testing the outbox pattern

Verify that a failing NATS publish does NOT roll back the DB write (the outbox
relay is responsible for retrying). Verify that a DB transaction rollback does
NOT leave a dangling outbox row.
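
The two assertions can be sketched against an in-memory stand-in. Everything here (`store`, `inTx`, `relay`, the event string) is hypothetical and only illustrates the expected semantics. The real tests belong at the integration layer against Postgres and NATS.

```go
package main

import (
    "errors"
    "testing"
)

var errBrokerDown = errors.New("broker unavailable")

// store is an in-memory stand-in for the business table plus outbox table.
type store struct {
    rows   []string
    outbox []string
}

// inTx runs fn against a scratch copy and commits only if fn returns nil.
func (s *store) inTx(fn func(tx *store) error) error {
    tx := &store{
        rows:   append([]string(nil), s.rows...),
        outbox: append([]string(nil), s.outbox...),
    }
    if err := fn(tx); err != nil {
        return err // rollback: the scratch copy is discarded
    }
    *s = *tx
    return nil
}

// relay tries to publish each pending outbox row; rows whose publish
// fails stay pending for a later retry.
func relay(s *store, publish func(event string) error) {
    var pending []string
    for _, ev := range s.outbox {
        if err := publish(ev); err != nil {
            pending = append(pending, ev)
        }
    }
    s.outbox = pending
}

func TestOutboxSemantics(t *testing.T) {
    s := &store{}
    if err := s.inTx(func(tx *store) error {
        tx.rows = append(tx.rows, "alloc-1")
        tx.outbox = append(tx.outbox, "allocation.created:alloc-1")
        return nil
    }); err != nil {
        t.Fatal(err)
    }

    // A failing publish must not undo the committed write.
    relay(s, func(string) error { return errBrokerDown })
    if len(s.rows) != 1 || len(s.outbox) != 1 {
        t.Fatalf("rows=%v outbox=%v, want one of each", s.rows, s.outbox)
    }

    // A rolled-back transaction must not leave an outbox row behind.
    _ = s.inTx(func(tx *store) error {
        tx.outbox = append(tx.outbox, "allocation.created:alloc-2")
        return errors.New("constraint violation")
    })
    if len(s.outbox) != 1 {
        t.Fatalf("outbox=%v, want no dangling row", s.outbox)
    }
}
```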

### Coverage targets

| Package | Minimum coverage |
|---|---|
| `packages/shared/errors` | 100% |
| `packages/shared/middleware` | 90% |
| `packages/shared/policy` | 85% |
| `packages/services/billing` | 85% |
| `packages/services/provisioning` | 80% |
| All other service packages | 70% |
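
A gate over these targets could look like the sketch below. `coverageTargets`, `minCoverage`, and `belowTarget` are hypothetical names, `packages/services/storage` is an illustrative package path, and applying the 70% default to every unlisted package is an assumption. How measured percentages are collected (for example from `go test -cover` output) is left out.

```go
package main

import (
    "sort"
    "testing"
)

// coverageTargets mirrors the table above (percent).
var coverageTargets = map[string]float64{
    "packages/shared/errors":         100,
    "packages/shared/middleware":     90,
    "packages/shared/policy":         85,
    "packages/services/billing":      85,
    "packages/services/provisioning": 80,
}

// defaultTarget applies to packages not listed explicitly.
const defaultTarget = 70.0

// minCoverage returns the minimum coverage for pkg.
func minCoverage(pkg string) float64 {
    if target, ok := coverageTargets[pkg]; ok {
        return target
    }
    return defaultTarget
}

// belowTarget returns, sorted, the packages whose measured coverage is
// under their minimum.
func belowTarget(measured map[string]float64) []string {
    var out []string
    for pkg, pct := range measured {
        if pct < minCoverage(pkg) {
            out = append(out, pkg)
        }
    }
    sort.Strings(out)
    return out
}

func TestCoverageGate(t *testing.T) {
    measured := map[string]float64{
        "packages/shared/errors":    100,
        "packages/services/billing": 82.5,
    }
    got := belowTarget(measured)
    if len(got) != 1 || got[0] != "packages/services/billing" {
        t.Fatalf("below target = %v, want [packages/services/billing]", got)
    }
}
```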

## Observability and Traceability Tests (Required)

Every feature touching runtime paths must include verification for traceability signals.

Minimum checks:
1. Error envelope tests assert `correlation_id` is present on failure responses.
2. API/runtime rejection tests assert `trace_id` appears in logs (or response details when exposed).
3. Async consumer tests verify context extraction path is exercised (`events.ExtractContextFromMsg`).
4. Local observability gate must pass:
   - `make ops-observability-trace-gate`

For incident-critical flows (allocation create/release, provisioning transitions, billing accrual):
1. Add or update tests that exercise failure path logging with:
   - `correlation_id`
   - catalog `error_code`
2. Validate that handler/worker code sets span error status on failure branches.
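
As a sketch of minimum check 1, the test below asserts that a failure response carries `correlation_id`. The `errorEnvelope` struct, the `failingHandler`, and the `"forbidden"` code are hypothetical stand-ins. The real envelope shape and catalog codes come from the service, and the correlation ID is normally set by the middleware rather than echoed from the request header as done here.

```go
package main

import (
    "encoding/json"
    "net/http"
    "net/http/httptest"
    "testing"
)

// errorEnvelope is a hypothetical failure body; only the fields the
// checks above name are modeled.
type errorEnvelope struct {
    ErrorCode     string `json:"error_code"`
    CorrelationID string `json:"correlation_id"`
}

// failingHandler always rejects and echoes the request's correlation ID.
func failingHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusForbidden)
    _ = json.NewEncoder(w).Encode(errorEnvelope{
        ErrorCode:     "forbidden",
        CorrelationID: r.Header.Get("X-Correlation-ID"),
    })
}

// envelopeFor calls h with the given correlation ID and decodes the
// error envelope from the response body.
func envelopeFor(h http.HandlerFunc, correlationID string) (int, errorEnvelope, error) {
    req := httptest.NewRequest(http.MethodGet, "/", nil)
    req.Header.Set("X-Correlation-ID", correlationID)
    rec := httptest.NewRecorder()
    h.ServeHTTP(rec, req)

    var env errorEnvelope
    err := json.NewDecoder(rec.Body).Decode(&env)
    return rec.Code, env, err
}

func TestErrorEnvelopeHasCorrelationID(t *testing.T) {
    code, env, err := envelopeFor(failingHandler, "req-123")
    if err != nil {
        t.Fatalf("decode envelope: %v", err)
    }
    if code != http.StatusForbidden {
        t.Errorf("status = %d, want 403", code)
    }
    if env.CorrelationID == "" {
        t.Error("correlation_id should be present on failure responses")
    }
    if env.ErrorCode == "" {
        t.Error("error_code should be present on failure responses")
    }
}
```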