# Platform-Control Release Promotion Policy

Purpose:
- prevent `release/platform-control` drift from `master`
- prevent piecemeal release-only hotfixes from silently dropping runtime files, config keys, generated artifacts, or tests
- make platform-control deploys reproducible from one exact source SHA

## Required policy

`release/platform-control` is a promotion branch, not a development branch.

It must always point to one exact source commit that is already integrated in `master`, unless an explicit temporary divergence override is being used during incident response or a bounded dev-test deploy.

Do not:
- cherry-pick individual files onto `release/platform-control`
- make release-only edits directly in the release worktree as a normal workflow
- treat release deploy success as proof that release contents are synchronized with `master`

## Required promotion flow

1. Integrate the intended change into `master`.
2. Promote one exact source SHA to `release/platform-control`.
3. Deploy from the frozen release candidate generated from that same SHA.

Use:

```bash
make platform-control-local-preflight
scripts/ci/platform_control_promote_release_branch.sh origin/master
```

Required sequence:
1. Confirm CI is green on the exact source SHA you intend to promote.
2. Run `make platform-control-local-preflight`.
3. When the change is production-impacting, generate or collect the platform evidence payload, then run `scripts/ci/platform_evidence_gate.sh`.
4. Review any local preflight failures and decide whether they represent a real release blocker.
5. Run `scripts/ci/platform_control_promote_release_branch.sh origin/master`.

The local preflight is required as a signal for platform-control release work, but it is advisory by default. Green CI on the exact source SHA is the primary promotion gate. A local preflight failure should block promotion only when it demonstrates a real release-risking defect rather than local-environment drift or local-only e2e instability.
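
The advisory-versus-blocking rule can be sketched as a small decision helper. This is an illustration only, not the promotion script's actual code: `preflight_decision` is a hypothetical function name, and the only name taken from this policy is the `PLATFORM_CONTROL_REQUIRE_LOCAL_PREFLIGHT` switch.

```bash
# Sketch only: map a local preflight exit code to a promotion decision.
# preflight_decision is a hypothetical helper, not part of the real scripts.
preflight_decision() {
  local exit_code="$1"
  if [ "$exit_code" -eq 0 ]; then
    echo "proceed"
  elif [ "${PLATFORM_CONTROL_REQUIRE_LOCAL_PREFLIGHT:-false}" = "true" ]; then
    # Blocking mode: any local preflight failure stops the promotion.
    echo "block"
  else
    # Advisory default: warn, then let the operator triage the failure.
    echo "warn"
  fi
}
```

A `warn` result still requires a human decision: the operator has to judge whether the failure is a real release-risking defect or local-environment noise.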

Platform evidence is the shared release-readiness record. The current gate runs
in report mode by default so existing release flow stays usable while evidence
coverage matures:

```bash
PLATFORM_EVIDENCE_PAYLOAD_DIR=dist/platform-evidence scripts/ci/platform_evidence_gate.sh
```

For production-impacting promotion hardening, switch the same gate to enforcing
mode:

```bash
PLATFORM_EVIDENCE_GATE_MODE=enforce PLATFORM_EVIDENCE_PAYLOAD_DIR=dist/platform-evidence scripts/ci/platform_evidence_gate.sh
```

Promotion can also require the platform evidence bundle before
`release/platform-control` is moved:

```bash
PLATFORM_CONTROL_REQUIRE_EVIDENCE_BUNDLE=true scripts/ci/platform_control_promote_release_branch.sh origin/master
```

That enforcement validates `bundle.json`, `items.json`, and
`release-gates.json`, confirms `bundle.source_commit` matches the exact source
SHA being promoted, and rejects missing, failing, blocked, or unapproved partial
release gates.

Missing, failing, or blocked gates must not be described as a clean promotion.
Partial gates require a named approval/debt path before they are allowed.
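
The gate rule above reduces to a small predicate. The sketch below is illustrative only: the status strings (`pass`, `partial`, and so on) and the approval reference are assumptions, because the actual `release-gates.json` schema is not described here.

```bash
# Sketch only: gate_allows_promotion STATUS [APPROVAL_REF]
# Status strings are hypothetical; the real release-gates.json may differ.
gate_allows_promotion() {
  local status="$1" approval="${2:-}"
  case "$status" in
    pass) return 0 ;;
    # A partial gate is acceptable only with a named approval/debt path.
    partial) [ -n "$approval" ] ;;
    # Missing, failing, blocked, and unknown statuses all reject.
    *) return 1 ;;
  esac
}
```

For example, `gate_allows_promotion partial "DEBT-123"` succeeds (the reference is a made-up placeholder), while `gate_allows_promotion partial` and `gate_allows_promotion blocked` both fail.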

Platform-foundation L2 claims have an additional blocking gate. Before a
promotion, release note, or readiness packet claims PSSM L2 maturity, run:

```bash
PLATFORM_FOUNDATION_L2_CLAIM=true scripts/ci/platform_foundation_l2_promotion_gate.sh <source-sha>
```

That wrapper runs the platform-foundation boundary guard in `blocking_new`
mode, executes the shared-service degradation harness, and validates the
platform evidence bundle in enforcing mode. The normal
`platform_evidence_payload` CI job remains report/artifact-only outside the L2
claim path because ordinary release pipelines can intentionally omit UAT,
security, or deployment-smoke artifacts.

Release execution mode defaults:
- `full` is the default trigger mode for platform-control release pipelines.
- use `deploy` only as an explicit lighter-weight override when post-deploy report-only security/runtime evidence is intentionally being skipped.

Temporary release profile override for demo-week web iteration:
- `PLATFORM_CONTROL_RELEASE_PROFILE=web-fast` is allowed only for low-risk web-only changes.
- `web-fast` keeps the promotion-branch discipline, but release CI intentionally narrows to:
  - contract checks
  - frontend build/test
  - web-only runtime publish/deploy
  - lightweight remote web smoke/runtime validation
- do not use `web-fast` for backend/API/schema/worker/infra changes.
- preferred wrapper:

```bash
scripts/ci/platform_control_release_web_fast.sh origin/master
```

Change-aware release lane selection:
- run `scripts/ci/platform_control_change_aware_preflight.sh --base origin/master --head HEAD --write-env`
  before choosing a deploy profile when a previous attempt failed late in CD or
  in remote validation.
- use `PLATFORM_CONTROL_RELEASE_PROFILE=validation-only` only when the change is
  limited to validation assertions/scripts or kind edge verifier logic and the
  already deployed manifest should be revalidated.
- use `PLATFORM_CONTROL_RELEASE_PROFILE=config-only` only when the change is
  limited to environment config, Kubernetes manifests, Keycloak redirects,
  edge/tunnel config, deploy scripts, or preflight logic and an existing release
  manifest is supplied through `PLATFORM_CONTROL_RELEASE_MANIFEST_FILE`,
  `PLATFORM_CONTROL_RELEASE_MANIFEST_JSON`,
  `PLATFORM_CONTROL_RELEASE_MANIFEST_B64`, or
  `PLATFORM_CONTROL_RELEASE_MANIFEST_URL`.
- do not use either lane for code, schema, contract, SDK, generated artifact, or
  runtime image changes.
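
The lane rules above can be read as a conservative mapping from change categories to a release profile. The sketch below is not the logic of `platform_control_change_aware_preflight.sh`; the category labels (`web`, `validation`, `config`) are hypothetical, and falling back to `standard` for mixed or unrecognized change sets is an assumption that follows the "only when the change is limited to" wording.

```bash
# Sketch only: choose_release_profile CATEGORY...
# Category labels are hypothetical; anything unrecognized is treated as code.
choose_release_profile() {
  local profile="" c candidate
  for c in "$@"; do
    case "$c" in
      web) candidate="web-fast" ;;
      validation) candidate="validation-only" ;;
      config) candidate="config-only" ;;
      # Code, schema, contract, SDK, generated artifact, or image changes.
      *) echo "standard"; return 0 ;;
    esac
    # Assumption: a change set spanning two narrow lanes uses the standard lane.
    if [ -n "$profile" ] && [ "$profile" != "$candidate" ]; then
      echo "standard"; return 0
    fi
    profile="$candidate"
  done
  echo "${profile:-standard}"
}
```

With these assumptions, `choose_release_profile config config` prints `config-only`, while `choose_release_profile web config` and `choose_release_profile schema` both print `standard`.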

## Platform-Control Dev-Test Bypass

Some platform features cannot be validated meaningfully in kind because the real
behavior depends on MAAS, physical GPU nodes, node-agent lifecycle, platform
Funnel/auth exposure, or platform-control-only secrets. For those cases, a
bounded dev-test bypass is allowed to shorten the feedback loop.

Use this only when:

- the change needs platform-control hardware/runtime behavior to validate;
- the source commit is disposable or under active development;
- the result is not being treated as a release;
- the follow-up plan is to merge to `master` and run the normal promotion flow
  after the implementation is meaningful and reviewed.

Use:

```bash
scripts/ci/platform_control_dev_test_deploy.sh <source-ref>
```

What the wrapper does:

- requires a clean working tree;
- allows the source SHA to be outside `origin/master`;
- skips local preflight by default;
- force-promotes that SHA to `release/platform-control`;
- triggers a deploy-mode GitLab pipeline with
  `PLATFORM_CONTROL_ALLOW_RELEASE_BRANCH_DIVERGENCE=true` and
  `PLATFORM_CONTROL_DEV_TEST_DEPLOY=true`.
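
The clean-working-tree requirement can be checked by hand before invoking the wrapper. This is a minimal sketch using plain Git; `require_clean_tree` is a hypothetical helper, and whether the real wrapper also treats untracked files as dirty is an assumption.

```bash
# Sketch only: require_clean_tree [REPO_DIR]
# Fails when the working tree has staged, unstaged, or untracked changes.
require_clean_tree() {
  local repo="${1:-.}"
  # --porcelain prints one line per changed or untracked path; empty means clean.
  if [ -n "$(git -C "$repo" status --porcelain)" ]; then
    echo "working tree is not clean; commit or stash first" >&2
    return 1
  fi
}
```

Running `require_clean_tree && scripts/ci/platform_control_dev_test_deploy.sh <source-ref>` surfaces a dirty tree before any branch is moved.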

What it does not do:

- it does not replace CI on `master`;
- it does not mark a change released;
- it does not waive contract/schema/test ownership for the final change;
- it does not allow release-only fixes to remain unmerged.

After a successful dev-test deploy, the required path is still:

```bash
git switch master
git merge --ff-only <validated-branch>
git push origin master
scripts/ci/platform_control_promote_release_branch.sh origin/master
PLATFORM_CONTROL_RELEASE_MODE=deploy PLATFORM_CONTROL_RELEASE_PROFILE=standard scripts/ci/gitlab_pipeline_trigger.sh
```

The local preflight still exists to catch issue classes that recently escaped into deploy:
- frontend admin route-guard regressions
- platform-role bind/list/revoke authz drift
- audit-log visibility regressions on the admin read path

Current default local preflight shape:
- full frontend Playwright e2e
- role authz smoke
- kind parity validation

Use scoped frontend preflight only as an explicit debugging override:

```bash
PLATFORM_CONTROL_LOCAL_PREFLIGHT_FRONTEND_MODE=scoped make platform-control-local-preflight
```

The promotion script:
- resolves one source SHA
- verifies it is reachable from `origin/master` by default
- runs `scripts/ci/platform_control_local_preflight.sh` unless `PLATFORM_CONTROL_SKIP_LOCAL_PREFLIGHT=true` is set intentionally
- treats local preflight failure as advisory by default and continues with a warning
- can be switched back to a blocking local gate with `PLATFORM_CONTROL_REQUIRE_LOCAL_PREFLIGHT=true`
- force-updates `release/platform-control` to that exact SHA
- regenerates `dist/platform-control-release-candidate.env`

## CI enforcement

Release branch CI now includes:

- `scripts/ci/platform_control_release_branch_guard.sh`

This guard fails `release/platform-control` pipelines if the branch contains commits not reachable from `origin/master`.
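
The reachability check can be approximated with a Git range count. This is a sketch under stated assumptions, not the contents of `platform_control_release_branch_guard.sh`; the function name and its arguments are hypothetical.

```bash
# Sketch only: release_branch_guard MAINLINE_REF RELEASE_REF [REPO_DIR]
# Fails when RELEASE_REF contains commits that MAINLINE_REF does not.
release_branch_guard() {
  local mainline="$1" release="$2" repo="${3:-.}" extra
  # A..B selects commits reachable from B but not from A.
  extra="$(git -C "$repo" rev-list --count "${mainline}..${release}")" || return 2
  if [ "$extra" -ne 0 ]; then
    echo "${release} has ${extra} commit(s) not reachable from ${mainline}" >&2
    return 1
  fi
}
```

A local dry run after `git fetch origin` would be `release_branch_guard origin/master origin/release/platform-control`.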

Temporary escape hatch:

```bash
PLATFORM_CONTROL_ALLOW_RELEASE_BRANCH_DIVERGENCE=true
```

Use this only for bounded incident response or the dev-test bypass described above, and merge any release-only fixes back to `master` immediately after stabilization.

## Incident lesson captured

Recent terminal and bootstrap recovery work exposed the failure mode this policy is meant to prevent:

- release branch lost terminal-gateway internal listener/runtime files
- release branch lost config keys required for internal terminal WebSocket transport
- release branch had stale generated SDK artifacts
- release branch had stale web test helpers/tests
- deploy/runtime failures were caused by release drift rather than by the owning feature code

The corrective rule is simple:

- fix on the source branch
- merge to `master`
- promote one exact SHA

## Operator checklist before release

Before triggering a platform-control deploy:

1. Confirm the intended fix is on `master`.
2. Confirm promotion will use `scripts/ci/platform_control_promote_release_branch.sh`.
3. Confirm CI is green on the exact SHA being promoted.
4. Run `make platform-control-local-preflight` and review the result.
5. If local preflight fails, confirm whether the failure is a real release blocker or local-environment-only noise before continuing.
6. Confirm no manual release-only edits are pending in the release worktree.
7. Confirm generated artifacts required by CI are committed.
8. Confirm bootstrap/config/runtime files for changed services are present on the promoted SHA, not only in local worktrees.

## Release order of operations

Use this order for platform-control releases, especially after app/runtime/API/schema work.

1. Stabilize on `master`.
   - Commit all code, docs, schema, seed, generated SDK artifacts, and web test updates on `master`.
   - Push `master` to the canonical remotes before touching `release/platform-control`.
   - Do not patch `release/platform-control` directly.

2. Run targeted local checks for the files changed.
   - API contract changes: run codegen/SDK checks and the relevant API tests.
   - Web/UI changes: run TypeScript plus the relevant Vitest/Playwright slice.
   - Runtime image changes: build the changed image locally when practical and inspect required OCI labels.
   - Schema/seed changes: verify the schema is idempotent for existing databases, not only correct for fresh databases.

3. Run release preflight.
   - Preferred: `make platform-control-local-preflight`.
   - If preflight is skipped because the local environment is noisy, record that decision and confirm the targeted checks from step 2 have passed before promoting.

4. Promote the exact source SHA.
   - Use `scripts/ci/platform_control_promote_release_branch.sh origin/master`.
   - Confirm the release candidate file reports the intended SHA.
   - Confirm `release/platform-control` points at that same SHA.

5. Trigger the release pipeline explicitly.
   - Standard deploy: `PLATFORM_CONTROL_RELEASE_MODE=deploy PLATFORM_CONTROL_RELEASE_PROFILE=standard scripts/ci/gitlab_pipeline_trigger.sh`.
   - Use the `full` mode when post-deploy report-only security/runtime evidence is required.
   - Use the `web-fast` profile only for low-risk web-only changes.
   - Use the `validation-only` profile to rerun remote validation after
     validation/assertion fixes without rebuilding or redeploying.
   - Use the `config-only` profile to redeploy config from an existing manifest
     without package or image-publish stages.
   - Retry CD from the release manifest, not from package/image stages. The
     manifest-only deploy wrapper is `scripts/ci/platform_control_deploy_from_manifest.sh`.

6. Watch the pipeline by stage, not only final status.
   - Contracts: OpenAPI/AsyncAPI and release branch guard must pass.
   - Build/test: backend, frontend, integration, SDK, and security gates must pass.
   - Package: every runtime in `scripts/ci/platform_control_runtime_matrix.sh` must have a matching publish job or an intentional filter.
   - Digest fan-in: `platform_control_assemble_runtime_digests` must find every runtime digest snippet.
   - Manifest-only deploy: remote preflight, schema, seed, manifests, image annotations, rollout, and controller secret reconciliation must pass without rebuilding artifacts or images.
   - Remote validation: public endpoints, observability, runtime image conformance, app/runtime smoke, and disposable cleanup must pass against the deployed manifest.

7. Fix failures at the owning layer.
   - Frontend failures: update the UI contract and the matching test expectations together.
   - SDK failures: regenerate and commit generated artifacts from the API contract.
   - Missing runtime artifact failures: update the runtime matrix, publish job graph, and digest fan-in together.
   - Image label failures: fix the runtime Dockerfile/build path, not the publish check.
   - Schema/seed failures: make `db_schema_v1.sql` safe for existing databases and keep `scripts/seed.sql` aligned.
   - Remote cleanup failures: treat them as lifecycle/read-model defects unless proven to be external infrastructure failure.

8. Repeat from `master`.
   - Commit the fix on `master`.
   - Push `master` to all canonical remotes.
   - Promote the new exact SHA.
   - Trigger a new pipeline.
   - Do not retry the same failed release SHA after code/schema fixes unless the failure was proven to be transient infrastructure.

9. Close the release only after final validation.
   - Confirm the final pipeline status is `success`.
   - Confirm deploy and remote validation jobs passed.
   - Record the final SHA and pipeline URL in the handoff or task notes.
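
Steps 2 and 7 refer to required OCI labels on runtime images. A quick local pre-check could scan a runtime Dockerfile for the three standard OCI label keys. This is a text-match sketch with a hypothetical helper name; it assumes labels are declared in the Dockerfile itself and will report false gaps if the build path injects them another way.

```bash
# Sketch only: missing_oci_labels DOCKERFILE
# Prints each required OCI label key that does not appear in the file.
missing_oci_labels() {
  local dockerfile="$1" key
  for key in \
    org.opencontainers.image.version \
    org.opencontainers.image.revision \
    org.opencontainers.image.created; do
    # -F matches the key literally, so the dots are not regex wildcards.
    grep -qF -- "$key" "$dockerfile" || echo "$key"
  done
}
```

Empty output means all three keys are mentioned. For an image that is already built, `docker inspect --format '{{json .Config.Labels}}' <image>` shows the labels that were actually applied.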

## Common platform-control release failure classes

These are release blockers observed in recent platform-control work:

- Stale frontend tests after UI/API payload changes.
- Stale SDK/generated artifacts after OpenAPI changes.
- Runtime present in `platform_control_runtime_matrix.sh` but missing a split publish CI job.
- Runtime Dockerfile missing required OCI labels: `org.opencontainers.image.version`, `org.opencontainers.image.revision`, and `org.opencontainers.image.created`.
- Fresh-schema-only changes that do not backfill existing deployed databases with `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or refreshed constraints.
- Seed data that introduces a new enum/check value before the deployed database constraint allows it.
- Disposable validation cleanup surfacing read-model or lifecycle API defects after the main smoke path passed.
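
For the fresh-schema-only failure class, a rough scan can flag `ADD COLUMN` statements that lack the `IF NOT EXISTS` guard. This is a line-based sketch with a hypothetical helper name; it misses statements split across lines and says nothing about constraints or seed ordering.

```bash
# Sketch only: find_unguarded_add_column SQL_FILE
# Prints "line:text" for each ADD COLUMN without an IF NOT EXISTS guard.
find_unguarded_add_column() {
  # -n keeps line numbers; "|| true" keeps the exit status 0 when nothing matches.
  grep -inE 'ADD[[:space:]]+COLUMN' "$1" \
    | grep -viE 'ADD[[:space:]]+COLUMN[[:space:]]+IF[[:space:]]+NOT[[:space:]]+EXISTS' \
    || true
}
```

Running it against `db_schema_v1.sql` before promotion gives a short list of statements to review by hand.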
