# Critical Flow Preflight And Drill Gate

Status: governance baseline

This gate applies to approval-gated, user-visible, operator-visible, security,
deploy, UAT, and live-environment workflows where failure can consume long
coordination time or create safety risk.

The June 2026 MFA drill showed that documentation and review are not enough
when the causal flow, provider assumptions, execution surface, and retry budget
are not mapped before live validation starts. The durable rule is:

```text
Flow map before implementation.
Non-live preflight before live window.
Bounded retry before causal reset.
Fairway evidence before handoff.
```

## Operating Principle

The goal is to deliver features faster, with higher quality, and with enough
safety for the current stage of the product. Process is valuable only when it
improves one of those outcomes.

Before adding a new review layer, packet, approval step, handoff, or gate,
state the value hypothesis:

- speed: what wait, retry loop, or rework should this reduce?
- quality: what missed flow, defect class, or regression should this catch?
- safety: what mutation, credential, exposure, or rollback risk should this
  reduce?

Pilot new process on a bounded slice before making it permanent. The pilot must
record whether the process actually improved speed, quality, or safety. If it
does not, remove or narrow the process and invest in better preflight, tests,
UAT flow coverage, or tool automation instead.

## Automate After Repetition

Repeated coordination work is a delivery smell. Use this rule:

- first time: do the work manually and learn the real shape;
- second time: capture the checklist, command, packet, or validation query;
- third time: automate it or create a scoped automation task with an owner.
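
The three-step rule can be tracked mechanically. The following is a minimal sketch, not an existing Fairway feature: `RepetitionTracker` and the task keys are hypothetical names, and only the action wording comes from the rule above.

```python
from collections import Counter

# Action wording follows the rule above; the names below are hypothetical.
ACTIONS = {
    1: "do the work manually and learn the real shape",
    2: "capture the checklist, command, packet, or validation query",
}
AUTOMATE = "automate it or create a scoped automation task with an owner"


class RepetitionTracker:
    """Counts how often a coordination task recurs and suggests the next step."""

    def __init__(self) -> None:
        self._counts = Counter()

    def record(self, task_key: str) -> str:
        """Record one occurrence of task_key and return the suggested action."""
        self._counts[task_key] += 1
        # The third and any later occurrence map to the automation step.
        return ACTIONS.get(self._counts[task_key], AUTOMATE)
```

Calling `record("merge-ready check")` three times would return the manual, capture, and automate actions in turn.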

Preferred automation targets are Fairway state summaries, review-wait and
merge-ready checks, commit-boundary handling, preflight packet generation, UAT
coverage diffs, CI/deploy monitor handbacks, evidence redaction/validation, and
delivery/token/process overhead reporting.

Automation proposals must still state the same speed, quality, or safety value
hypothesis before becoming default process. If the proposed automation does not
reduce waiting, rework, missed defects, rollback risk, or cycle time, do not add
it as another mandatory gate.

## Required Before Live Or Broad UAT

For each P0/P1 critical flow, create or update a Product Quality flow row before
feature work, broad UAT, deploy validation, or live drill scheduling. The row
must name:

- persona and canonical entry point;
- happy, empty, blocked, recovery, negative, and cleanup paths;
- contract/API/CLI/runtime owners;
- fixture, identity, permission, provider, DNS/edge, browser, CI/CD, and
  environment prerequisites;
- non-live or disposable preflight proof;
- rollback/cleanup owner and evidence owner;
- accepted residual gaps and explicit no-go conditions.
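
One way to keep flow rows complete is to validate them before scheduling. This sketch assumes a flow row can be represented as a simple record; `FlowRow`, its field names, and `missing_fields` are illustrative and not an existing Product Quality schema. It also assumes an empty residual-gap list is acceptable, meaning "none accepted".

```python
from dataclasses import dataclass, field, fields

# Path names come from the list above.
REQUIRED_PATHS = ("happy", "empty", "blocked", "recovery", "negative", "cleanup")


@dataclass
class FlowRow:
    """Hypothetical shape for a Product Quality flow row."""
    persona: str = ""
    entry_point: str = ""
    paths: dict = field(default_factory=dict)          # path name -> description
    owners: dict = field(default_factory=dict)         # contract/API/CLI/runtime -> owner
    prerequisites: list = field(default_factory=list)  # fixture, identity, provider, ...
    preflight_proof: str = ""
    rollback_owner: str = ""
    evidence_owner: str = ""
    residual_gaps: list = field(default_factory=list)
    no_go_conditions: list = field(default_factory=list)


def missing_fields(row: FlowRow) -> list:
    """Return the names of required entries that are still empty."""
    # Assumption: residual_gaps may be empty; paths are checked one by one below.
    optional = {"residual_gaps", "paths"}
    missing = [f.name for f in fields(row)
               if f.name not in optional and not getattr(row, f.name)]
    missing += [f"paths.{p}" for p in REQUIRED_PATHS if not row.paths.get(p)]
    return missing
```

A row would be ready for scheduling only when `missing_fields(row)` returns an empty list.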

## Required Preflight Shape

A critical-flow preflight must be checked in, reproducible, and reviewed before
being used as live-window evidence. It must:

- run without source/prod mutation unless the packet explicitly authorizes an
  isolated disposable target;
- validate setup/readback evidence before browser, credential, token, or
  sensitive-operation steps;
- fail closed with sanitized findings;
- prove rollback or cleanup for disposable resources;
- emit stable JSON/markdown artifacts with redaction self-test evidence;
- avoid one-off scripts for credential submission, browser automation, or
  provider mutation.
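
The fail-closed and redaction requirements can be sketched as a small runner. This is an illustration under stated assumptions, not the project's preflight harness: the secret pattern, function names, and artifact keys are all hypothetical, and a real packet would define its own secret shapes.

```python
import json
import re

# Assumed secret shapes for illustration only.
SECRET_PATTERN = re.compile(r"(?i)(password|token|secret)=\S+")


def sanitize(text: str) -> str:
    """Replace secret-looking values while keeping the key name."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def redaction_self_test() -> bool:
    """Prove the sanitizer removes known probe values."""
    cleaned = sanitize("token=abc123 password=hunter2")
    return "abc123" not in cleaned and "hunter2" not in cleaned


def run_preflight(checks) -> dict:
    """Run (name, callable) checks in order; a check fails by raising."""
    findings = []
    for name, check in checks:
        try:
            check()
        except Exception as exc:
            # Fail closed: any error is a finding, recorded in sanitized form.
            findings.append({"check": name, "finding": sanitize(str(exc))})
            break  # do not continue to later, more sensitive steps
    self_test_ok = redaction_self_test()
    return {
        "passed": not findings and self_test_ok,
        "redaction_self_test": self_test_ok,
        "findings": findings,
    }


def to_artifact(result: dict) -> str:
    """Serialize with sorted keys so repeated runs produce stable output."""
    return json.dumps(result, sort_keys=True, indent=2)
```

Ordering the checks so that setup and readback come first means a failure there stops the run before any browser, credential, or token step executes.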

## Reviewer Packet Rule

Reviewers need causal context, not just a narrow diff. Every critical-flow
review packet must include:

- the current goal and why this change exists;
- the last blocker or missing proof this slice addresses;
- what the change is allowed to prove;
- what the change must not authorize;
- exact commands and artifact paths reviewed;
- next owner/action if the proof fails.

## Review Scaling Rule

The lightweight review model, defined by the default policy below, is the
default for GPUaaS critical-flow stabilization. Critical flows still need full
review at the actual approval
boundary: live window, deploy, production-readiness claim, external compliance
claim, source/prod mutation, credential action, break-glass, or
sensitive-operation gate implementation.

They do not need the full critical matrix on every small child task. The June
2026 MFA drill showed that repeatedly routing full-domain reviews for
docs-only deferrals, offline harness guardrails, and stale-blocker cleanup
burned coordination and LLM tokens without improving safety.

Default policy:

- micro slices that only narrow, document, or preserve an already-blocked
  posture require one independent accountable reviewer from the primary
  affected domain;
- grouped cleanup or grouped harness/docs slices should use one batch artifact
  and one accountable reviewer per primary affected domain;
- epic, launch, live-window, deploy, production-readiness, and compliance
  claims require the full configured review matrix;
- if a child task expands authority, enables enforcement, weakens a safety
  gate, performs mutation, handles credentials, or changes public exposure, it
  loses inheritance and must use the full relevant review matrix.
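
The default policy can be expressed as a routing function. This is a sketch of how such routing might look, not Fairway's configuration format: the enum, the string labels for task kinds and flags, and the choice to send unknown kinds to full review are assumptions.

```python
from enum import Enum


class ReviewProfile(Enum):
    NARROW = "one accountable reviewer from the primary affected domain"
    GROUPED = "one batch artifact and one accountable reviewer per primary affected domain"
    FULL = "full configured review matrix"


# Triggers and kinds come from the policy above; the labels are hypothetical.
ESCALATION_TRIGGERS = frozenset({
    "expands_authority", "enables_enforcement", "weakens_safety_gate",
    "performs_mutation", "handles_credentials", "changes_public_exposure",
})
FULL_MATRIX_KINDS = frozenset({
    "epic", "launch", "live_window", "deploy",
    "production_readiness", "compliance",
})


def select_profile(kind: str, flags: frozenset = frozenset()) -> ReviewProfile:
    """Pick a review profile from the task kind and its risk flags."""
    # Any escalation trigger removes inheritance, whatever the task kind.
    if kind in FULL_MATRIX_KINDS or flags & ESCALATION_TRIGGERS:
        return ReviewProfile.FULL
    if kind == "grouped":
        return ReviewProfile.GROUPED
    if kind == "micro":
        return ReviewProfile.NARROW
    # Assumption: unknown kinds fail closed to the full matrix.
    return ReviewProfile.FULL
```

For example, `select_profile("micro", frozenset({"handles_credentials"}))` returns `ReviewProfile.FULL`, because a micro slice that handles credentials loses inheritance.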

The review record must state whether the decision is a narrow child-slice
review, grouped review, or epic/release review. Any request to make the heavier
model the default again must be explicit, time-bounded, and justified by
evidence that it improves defect discovery, rollback safety, or delivery speed,
or that it reduces production risk. Fairway configuration should eventually
encode these review profiles so reviewers are routed by actual risk, not by a
one-size-fits-all task template.

## Safe Iteration Boundary

For pre-production work with no active users, the primary control is a safe
engineering boundary, not review ceremony. Once Architecture Control approves a
non-live or disposable preflight boundary, agents should optimize for causal
learning inside that boundary:

```text
Make it work in disposable/non-live.
Capture evidence and rollback proof.
Review the boundary transition.
```

Inside the safe boundary, setup, readback, classifier, harness, and provider
shape fixes should use lightweight review unless they expand authority or
weaken a safety gate. Requiring the full matrix for each internal fix is a
process smell; it should be challenged when it does not improve product quality
or reduce risk.

The full review matrix returns when work crosses a decision boundary:

- live drill/window authorization;
- source/prod mutation;
- deploy, release, or public exposure;
- credential reset/submission outside an approved disposable packet;
- token/API sensitive-operation matrix;
- break-glass;
- sensitive-operation enforcement;
- production-readiness, customer, or compliance claim.

If a process step is not improving defect discovery, product quality, or risk
control, reduce the review load and invest in better preflight, tests, UAT flow
coverage, and tool automation.

## Control Maturity Ramp

Do not apply mature CISO/control-attestation ceremony to every early
development feature. The default maturity path is:

1. make the feature work in dev/kind/staging with reproducible tests;
2. prove representative user/admin/ops flows with UAT or e2e coverage;
3. keep product/security claims limited to what the evidence proves;
4. add production cutover controls only at the environment boundary;
5. add compliance, custody, and multi-approver controls when an external or
   regulated claim is actually being made.

For MFA and comparable security features, early iterations should look like
normal product engineering: code/config, tests, preflight, and UAT. Formal
control evidence is still valuable, but it belongs at cutover, release,
break-glass, credential, sensitive-operation, or compliance boundaries. Adding
it earlier must be a deliberate exception with a named risk and expected
benefit.

## Retry Budget Rule

Approval-gated reruns must be bounded. A meaningful failure is one that occurs
after the approved preflight path reaches the behavior being tested.
Coordination-only failures, such as stale session cleanup or missing approval
packet metadata, do not count against the behavior retry budget, but must still
be recorded.

Default policy:

- after the first meaningful failure, create a scoped blocker task;
- after the second meaningful failure, verify the causal model before another
  rerun packet;
- after the third meaningful failure, stop narrow reruns and create a causal
  reset task before requesting another live or disposable rerun.

The causal reset must explain whether the failure is caused by the product,
provider semantics, harness, environment, execution surface, or review packet.
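
The budget can be sketched as a small counter that separates meaningful failures from coordination-only ones. `RetryBudget`, `record_failure`, and the coordination-only return string are hypothetical. The other action strings follow the default policy above.

```python
from dataclasses import dataclass, field

# Action wording follows the default policy; all names are hypothetical.
POLICY = {
    1: "create a scoped blocker task",
    2: "verify the causal model before another rerun packet",
}
CAUSAL_RESET = "stop narrow reruns and create a causal reset task"
COORDINATION_ONLY = "record and fix the coordination issue"


@dataclass
class RetryBudget:
    meaningful_failures: int = 0
    coordination_failures: list = field(default_factory=list)

    def record_failure(self, note: str, reached_behavior: bool) -> str:
        """Record a failure and return the required next action."""
        if not reached_behavior:
            # Coordination-only: recorded, but does not consume the budget.
            self.coordination_failures.append(note)
            return COORDINATION_ONLY
        self.meaningful_failures += 1
        # The third and any later meaningful failure require a causal reset.
        return POLICY.get(self.meaningful_failures, CAUSAL_RESET)
```

The `reached_behavior` flag carries the definition of a meaningful failure: it is true only when the approved preflight path reached the behavior being tested.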

## Step-Back Rule

The control track must stop endless retry loops. If repeated iterations keep
discovering new blockers instead of converging, the work is no longer in
closeout; it is in causal discovery.

Step back when any of these are true:

- two or more meaningful failures happen after the work was described as
  nearly ready;
- fixes keep landing in the same layer without end-to-end progress;
- reviews approve narrow slices but the overall flow still does not work;
- new blockers appear only during live/preflight execution;
- coordination and approval effort exceeds engineering progress;
- the next proposed action is another approval packet without a clearer causal
  model.

When this happens:

1. stop new live/disposable retry packets;
2. summarize the failure chain and real unknowns;
3. switch to causal discovery inside a safe non-live or disposable boundary;
4. reduce review to one accountable reviewer for internal child fixes;
5. define the exact proof required before another retry;
6. return to full review only at the next boundary decision.

## Execution Surface Rule

Browser, provider, git, deploy, and long-running operations must prove that the
execution surface works before the operation starts. If a Desktop provider
surface cannot create `.git/index.lock`, write to the Go cache, launch Chrome,
or keep provider processes healthy, use an approved tmux/SSH/CLI execution lane
and record that handoff in Fairway.
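
A surface probe might look like the following sketch. It uses weaker proxies than the real operations: it creates a temporary file inside `.git` instead of `index.lock` itself, and it only checks that a browser binary is on `PATH` instead of launching it. The function names, the report keys, and the browser binary names are assumptions.

```python
import os
import shutil
import tempfile


def can_write(directory: str) -> bool:
    """Return True if a temporary file can be created and removed in directory."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


def probe_surface(repo_dir: str, cache_dir: str,
                  browser_names=("google-chrome", "chromium", "chrome")) -> dict:
    """Collect readiness signals without mutating the repository state."""
    return {
        # Proxy for being able to create .git/index.lock.
        "git_dir_writable": can_write(os.path.join(repo_dir, ".git")),
        # Proxy for being able to write the Go cache at the given location.
        "cache_writable": can_write(cache_dir),
        # Assumed binary names; presence on PATH does not prove launchability.
        "browser_found": any(shutil.which(name) for name in browser_names),
    }


def surface_ready(report: dict) -> bool:
    """The surface is ready only if every probe passed."""
    return all(report.values())
```

If `surface_ready` returns false, the report itself can be attached to the Fairway handoff record as the reason for switching execution lanes.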

## Fairway Boundary

Fairway owns generic coordination primitives: track memory, wait/wake,
notification audit, retry packets, reviewer packets, routability validation,
execution-surface readiness, and causal-reset policy.

GPUaaS owns product-specific flow rows, scripts, fixtures, Keycloak behavior,
UAT matrices, runbooks, and evidence contracts.