# Billing Platform Overhaul v1

## Purpose

Billing is moving from allocation-duration accounting into a platform control plane.
This document defines the target model, boundaries, and implementation phases before
we add v3 production billing surfaces or contract changes.

The current implementation is intentionally simple:

1. allocation usage records,
2. immutable ledger entries,
3. balance queries,
4. Stripe checkout credits,
5. basic payment-session operations.

That baseline remains valid, but it is not enough for production billing across
bare-metal, GPU slices, apps, storage, network, reserved capacity, and tenant/project
administration.

## Non-Goals

This document does not:

1. change current v1 demo behavior;
2. define final pricing numbers;
3. implement invoices, budgets, reserved capacity, or delinquency;
4. replace the immutable ledger invariant;
5. make UI mock data contractual.

All API and schema work must still follow the normal contract-first process in
`doc/api/`.

## Design Principles

1. The ledger is immutable. Corrections are new ledger entries.
2. Usage capture, rating, ledger posting, invoicing, payment collection, and
   delinquency are separate concerns.
3. Pricing and billing policy must be explainable by tenant, project, user, SKU,
   app, and region.
4. Launch-time choices such as pricing mode and idle policy must be snapshotted onto
   the billable runtime, not inferred later from mutable catalog policy.
5. Billing must support multiple usage units, not only GPU-hours.
6. Billing operations are privileged and auditable.
7. User-facing billing language must hide internal implementation terms such as
   idempotency keys while preserving retry safety.
8. Finance read models may aggregate across domains, but billing services must not
   bypass domain ownership boundaries with ad-hoc joins.
9. Currency, tax, reseller, and revenue-recognition context must be modeled before
   production invoice or partner-channel commitments are made.

## Target Billing Domains

### Metering

Metering owns raw billable measurements. Examples:

1. GPU allocation runtime,
2. app runtime uptime,
3. model-serving requests or tokens,
4. storage byte-hours,
5. data ingress and egress,
6. reserved-capacity commitment windows,
7. idle/suspended runtime state windows.

Raw measurements should be append-only or replayable. They should include enough
correlation to reconstruct source context without depending on mutable current-state
tables.

### Rating

Rating converts measured usage into priced line items using a snapshotted pricing
context.

Rating inputs:

1. usage source and unit,
2. pricing mode,
3. SKU, app, storage class, network class, or service tier,
4. region,
5. tenant/project/user attribution,
6. applicable discounts or commitments,
7. billable state such as active, idle, suspended, or reserved.

Rating outputs are not ledger entries yet. They are explainable rated charges that can
be posted to ledger or invoice lines.
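
As a minimal sketch of the rating step, assuming a per-unit price held in integer minor units and a usage quantity expressed in thousandths of a unit, rating can be written as a pure function of the usage record and the pricing snapshot. All type names, field names, and the round-half-up policy below are illustrative assumptions, not an existing contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingSnapshot:
    """Hypothetical launch-time pricing context, frozen so it cannot drift."""
    pricing_mode: str
    sku: str
    region: str
    currency: str
    unit_price_minor: int  # price per usage unit, in integer minor units
    plan_version: str


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical raw measurement; quantity is in thousandths of a unit."""
    usage_id: str
    usage_unit: str
    quantity_milli: int
    billable_state: str


@dataclass(frozen=True)
class RatedLine:
    """Explainable rated charge; not yet a ledger entry."""
    usage_id: str
    plan_version: str
    amount_minor: int
    currency: str
    explanation: str


def rate(usage: UsageRecord, snapshot: PricingSnapshot) -> RatedLine:
    """Price one usage record. Round-half-up is an assumed rounding policy."""
    if usage.quantity_milli < 0:
        raise ValueError("usage quantity must be non-negative")
    amount = (usage.quantity_milli * snapshot.unit_price_minor + 500) // 1000
    explanation = (
        f"{usage.quantity_milli}/1000 {usage.usage_unit} ({usage.billable_state}) "
        f"x {snapshot.unit_price_minor} {snapshot.currency} minor units, "
        f"{snapshot.pricing_mode}/{snapshot.sku}/{snapshot.region}, "
        f"plan {snapshot.plan_version}"
    )
    return RatedLine(usage.usage_id, snapshot.plan_version, amount,
                     snapshot.currency, explanation)


# Example: 1.5 GPU-hours at 250 minor units per hour.
usage = UsageRecord("u-1", "gpu_hour", 1500, "active")
snapshot = PricingSnapshot("on_demand", "gpu-a", "region-1", "USD", 250, "plan-v1")
line = rate(usage, snapshot)
```

Because the function reads only its two arguments, rating the same record against the same snapshot always yields the same line, which is what makes the charge explainable later.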

### Ledger

Ledger remains the source of financial truth.

Ledger responsibilities:

1. debit and credit postings,
2. adjustments,
3. refunds and credits,
4. balance computation,
5. reconciliation anchors to payment or invoice records.

Ledger must stay append-only. There must never be direct balance mutation.
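
The append-only rule can be illustrated with a small in-memory sketch. The class and field names are hypothetical, and a real implementation would sit on `ledger_entries` in the database; the point is only that balances are derived from entries and that a correction is another entry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LedgerEntry:
    """Hypothetical immutable posting; positive = credit, negative = debit."""
    entry_id: str
    account_id: str
    amount_minor: int
    currency: str
    reason: str
    corrects_entry_id: Optional[str] = None


class Ledger:
    """Append-only store: there is no update or delete operation."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def post(self, entry: LedgerEntry) -> bool:
        """Append an entry; re-posting the same entry_id is a no-op."""
        if any(e.entry_id == entry.entry_id for e in self._entries):
            return False
        self._entries.append(entry)
        return True

    def balance(self, account_id: str) -> Dict[str, int]:
        """Compute per-currency balances by summing entries."""
        totals: Dict[str, int] = defaultdict(int)
        for e in self._entries:
            if e.account_id == account_id:
                totals[e.currency] += e.amount_minor
        return dict(totals)


ledger = Ledger()
ledger.post(LedgerEntry("e1", "proj-a", 10_000, "USD", "checkout_credit"))
ledger.post(LedgerEntry("e2", "proj-a", -2_500, "USD", "gpu_hour_usage"))
# Over-charge of 500 is corrected by a new credit, not by editing e2.
ledger.post(LedgerEntry("e3", "proj-a", 500, "USD", "usage_correction",
                        corrects_entry_id="e2"))
usd_balance = ledger.balance("proj-a")["USD"]
```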

### Invoicing

Invoicing groups rated charges, credits, taxes or adjustments, and payment state into
a customer-facing financial artifact.

Initial invoice support can be lightweight, but the model must support:

1. invoice headers,
2. invoice lines,
3. invoice lifecycle state,
4. due dates,
5. paid/partially paid/void/write-off outcomes,
6. exportable evidence for finance operations.

### Payments

Payments owns payment-session lifecycle and provider reconciliation.

The current Stripe checkout flow remains a valid payment rail. Future rails may
include invoice payment, credits, enterprise purchase orders, manual adjustments, or
reserved-commitment contracts.

### Budgets and Financial Controls

Budgets are guardrails and alerting controls, not ledger entries.

Budget scopes:

1. tenant,
2. project,
3. user override,
4. app or SKU class,
5. region,
6. storage/network class.

Budget actions:

1. notify only,
2. require approval,
3. block new launches,
4. suspend eligible workloads,
5. force release only when the product policy explicitly allows it.

### Delinquency

Delinquency handles unpaid, overdue, or exhausted-credit states.

Delinquency must be policy-driven and auditable. It must not be hidden in the billing
worker loop.

Typical states:

```text
healthy -> at_risk -> restricted -> suspended -> collections_hold
```

Initial implementation can use fewer states, but it should avoid baking direct
balance-depleted behavior into unrelated runtime handlers.
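
One way to keep transitions policy-driven and auditable is to route every state change through a single function that validates the move and returns an audit record. The sketch below assumes that escalation moves one step at a time and that any non-healthy state may recover directly to `healthy`; both rules, and all names, are illustrative assumptions pending product policy.

```python
from dataclasses import dataclass
from enum import Enum


class FinancialState(Enum):
    HEALTHY = "healthy"
    AT_RISK = "at_risk"
    RESTRICTED = "restricted"
    SUSPENDED = "suspended"
    COLLECTIONS_HOLD = "collections_hold"


# Enum iteration preserves definition order, which matches the escalation chain.
ESCALATION_ORDER = list(FinancialState)


def is_allowed(current: FinancialState, target: FinancialState) -> bool:
    """Assumed policy: escalate one step at a time, or recover to healthy."""
    if target is FinancialState.HEALTHY:
        return current is not FinancialState.HEALTHY
    return ESCALATION_ORDER.index(target) == ESCALATION_ORDER.index(current) + 1


@dataclass(frozen=True)
class TransitionRecord:
    """Hypothetical audit record emitted for every accepted transition."""
    from_state: FinancialState
    to_state: FinancialState
    policy: str
    actor: str
    correlation_id: str
    evidence: dict


def transition(current: FinancialState, target: FinancialState, *, policy: str,
               actor: str, correlation_id: str, evidence: dict) -> TransitionRecord:
    """Validate the move and return the record the caller must persist."""
    if not is_allowed(current, target):
        raise ValueError(f"transition {current.value} -> {target.value} not allowed")
    return TransitionRecord(current, target, policy, actor, correlation_id, evidence)


record = transition(
    FinancialState.AT_RISK, FinancialState.RESTRICTED,
    policy="billing.prepaid_depleted_action", actor="delinquency-worker",
    correlation_id="corr-123", evidence={"balance_minor": 0},
)
```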

### Prepaid Credit Exhaustion Workflow

The current prepaid model must not treat `balance <= 0` as an opaque worker-side
force-release. Running out of prepaid credit is a financial state transition with
customer-visible impact, so it needs the same evidence, notification, and audit
discipline as other privileged lifecycle decisions.

Required v1 behavior:

1. runway warning estimates remaining prepaid funds from current burn rate and emits
   a notification when the remaining runway is below a tenant-admin configurable
   threshold; the platform default is 24 hours via
   `billing.prepaid_runway_warning_hours`;
2. low-balance warning emits a user/project notification with balance, threshold,
   affected active workloads, estimated runway, and correlation id;
3. disable/restrict step blocks new launches or disables eligible non-critical
   actions before touching running workloads;
4. auto-release-pending emits a second notification before destructive action when
   policy allows a grace window;
5. balance-depleted records a financial restriction or delinquency transition
   (`at_risk`, `restricted`, or `suspended`) with evidence that names the balance,
   policy values, active allocations, and affected app instances; the action taken
   is selected by `billing.prepaid_depleted_action`, which defaults to `restrict`,
   with `force_release` as an explicit policy choice;
6. any force release, suspension, or removal must be a policy decision, not an
   implicit side effect of the billing loop;
7. audit and outbox events must share the same correlation id from the billing
   decision through provisioning release, app-runtime failure, notifications, and
   finance/operator evidence;
8. the billing page must show the current financial posture, estimated runway, what was affected, when
   it happened, and the next action to restore service.
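
The runway warning in item 1 reduces to a small calculation. The sketch below assumes the burn rate is already available as integer minor units per hour; the function names are hypothetical, and only the 24-hour default comes from `billing.prepaid_runway_warning_hours`.

```python
from typing import Optional

# Platform default for billing.prepaid_runway_warning_hours.
DEFAULT_WARNING_HOURS = 24


def runway_hours(balance_minor: int, burn_minor_per_hour: int) -> Optional[float]:
    """Estimated hours of prepaid funds left; None when nothing is burning.

    The float is a display estimate only; it is never stored as money.
    """
    if burn_minor_per_hour <= 0:
        return None
    return max(balance_minor, 0) / burn_minor_per_hour


def should_warn(balance_minor: int, burn_minor_per_hour: int,
                warning_hours: int = DEFAULT_WARNING_HOURS) -> bool:
    """True when remaining runway is below the threshold (integer comparison)."""
    if burn_minor_per_hour <= 0:
        return False
    return balance_minor < burn_minor_per_hour * warning_hours
```

The threshold check is done in integers (`balance < burn * hours`) so the warning decision does not depend on floating-point rounding.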

Monthly invoice or postpaid accounts use the same state model, but the trigger is
invoice due/overdue policy rather than prepaid balance. Postpaid delinquency should
normally block new launches and escalate collections state before affecting running
workloads, unless the customer contract explicitly allows suspension or release.

## Pricing Modes

### On-Demand

Default pay-as-you-go mode.

Required snapshot fields:

1. pricing mode,
2. SKU or service tier,
3. unit price,
4. currency,
5. region,
6. effective pricing plan version.

### Spot

Discounted interruptible capacity.

Required model additions:

1. interruption policy,
2. notice window,
3. price source,
4. eligibility by SKU/region,
5. billing treatment for interrupted windows.

Spot should not be implemented as a SKU name convention. It is a pricing and
availability mode.

### Reserved Duration

Commitment-backed usage for a fixed duration or capacity pool.

Required model additions:

1. reservation contract,
2. committed capacity,
3. start/end window,
4. overage policy,
5. early termination policy,
6. unused commitment reporting.

Reserved capacity can be assigned to tenant, project, or a specific workload family.
The raw usage record should still show actual consumption; commitment application is a
rating step.

Revenue-recognition baseline:

1. committed payment should be treated as deferred revenue until earned;
2. revenue is recognized over the commitment period, not entirely at contract
   signature;
3. unused commitment expires at the end of the commitment window unless the customer
   contract explicitly grants carry-forward credit;
4. overage is rated separately from committed usage;
5. a deterministic application order for overlapping reservations must be defined
   before implementation.

The first reserved-capacity implementation should be tenant-level unless a reviewed
customer contract requires project-level assignment. Secondary markets or transferable
reserved capacity are out of scope for v1.

## Idle and Suspension Policy

Idle policy is selected at launch or inherited from project/tenant policy.

Examples:

1. no suspension,
2. suspend after 30 minutes idle,
3. suspend after 60 minutes idle,
4. notify before suspend,
5. auto-release after prolonged suspension.

Billing implications:

1. active runtime and idle runtime may rate differently;
2. suspended runtime may still charge for storage or reserved capacity;
3. app runtimes may define idle using app-specific signals;
4. billing must record state windows, not only final state.

Provisioning owns the runtime action. Billing owns the financial interpretation.
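
To illustrate why state windows matter, the sketch below rates a runtime from its recorded windows rather than from its final state. The window shape, the per-state hourly prices, and the round-half-up rule are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Mapping


@dataclass(frozen=True)
class StateWindow:
    """Hypothetical state window reported by the runtime owner."""
    state: str    # e.g. "active", "idle", "suspended"
    start_s: int  # window start, epoch seconds
    end_s: int    # window end (exclusive), epoch seconds


def rate_state_windows(windows: Iterable[StateWindow],
                       hourly_price_minor: Mapping[str, int]) -> Dict[str, int]:
    """Sum seconds per state, then price each state at its own hourly rate."""
    seconds: Dict[str, int] = defaultdict(int)
    for w in windows:
        if w.end_s < w.start_s:
            raise ValueError("window ends before it starts")
        if w.state not in hourly_price_minor:
            raise KeyError(f"no price for state {w.state!r}")
        seconds[w.state] += w.end_s - w.start_s
    # Round half up to whole minor units (assumed rounding policy).
    return {state: (secs * hourly_price_minor[state] + 1800) // 3600
            for state, secs in seconds.items()}


windows = [
    StateWindow("active", 0, 5400),        # 1.5 hours active
    StateWindow("idle", 5400, 9000),       # 1 hour idle
    StateWindow("suspended", 9000, 12600)  # 1 hour suspended
]
charges = rate_state_windows(windows, {"active": 200, "idle": 100, "suspended": 0})
```

A runtime that ends in `suspended` would look free if only the final state were recorded; the windows preserve the active and idle time that was actually consumed.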

## Usage Attribution Model

This section is the canonical attribution shape for billing. It reconciles the
general billing model with `doc/architecture/App_Runtime_Billing_Model_v1.md` and
managed-ingress route identity.

Every billable record should resolve to:

1. `org_id`;
2. `project_id`;
3. `actor_type` and `actor_id` when there is a request submitter or launch actor;
4. optional `submitter_user_id` or submitter service-account identity for shared
   resources;
5. `resource_type`;
6. `resource_id`;
7. `allocation_id` when the usage is allocation-backed;
8. `app_instance_id` when the usage originates from an app instance;
9. `app_slug` and app version when app-created;
10. `operating_mode` for app runtimes;
11. `control_plane_scope` for app runtimes and shared control planes;
12. `runtime_backend` when applicable;
13. `route_id`, `endpoint_name`, `route_family`, `client_auth_mode`, and
    `proxy_pool_id` for managed-ingress usage;
14. `sku`, service tier, storage class, network class, or building-block key;
15. `region`;
16. `usage_source`;
17. `usage_unit`;
18. `correlation_id`.

Resource examples:

1. allocation,
2. app workload,
3. storage bucket,
4. network endpoint,
5. public IP,
6. load balancer,
7. VPN connection,
8. reserved-capacity contract.

App billing must follow `doc/architecture/App_Runtime_Billing_Model_v1.md`; it must
not create a separate balance model.

### Attribution Cross-Walk

| Source model | Required anchors | Billing canonical mapping |
|---|---|---|
| App runtime billing | `org_id`, `project_id`, `app_instance_id`, `app_slug`, `operating_mode`, `control_plane_scope`, `runtime_backend`, `correlation_id` | Adopted unchanged and extended with actor/resource/region/unit fields |
| Managed ingress route intent | `org_id`, `project_id`, `app_instance_id`, `endpoint_name`, `route_id`, `proxy_pool_id`, `client_auth_mode`, `route_family` | Adopted for route billing and route usage evidence |
| Allocation usage | allocation, user, SKU, region, time window | Maps to `resource_type=allocation`, `allocation_id`, actor, SKU, region, usage window |
| Storage/network usage | bucket/endpoint/network class, owner project, region | Maps to resource, service class, region, usage unit, and correlation where available |

When a source domain lacks one of these fields, the implementation must either:
1. add the missing field at the source;
2. mark the usage unit as not billable yet; or
3. document why the field is not meaningful for that usage unit.
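
The three-way rule above can be enforced mechanically at ingestion. The sketch below copies the required anchors for two source models from the cross-walk table and reports which ones a record lacks; the dictionary keys, function names, and the `documented_exemptions` mechanism are hypothetical.

```python
from typing import Dict, FrozenSet, List, Mapping, Tuple

# Required anchors taken from the cross-walk table; model keys are illustrative.
REQUIRED_ANCHORS: Dict[str, Tuple[str, ...]] = {
    "app_runtime": (
        "org_id", "project_id", "app_instance_id", "app_slug",
        "operating_mode", "control_plane_scope", "runtime_backend",
        "correlation_id",
    ),
    "managed_ingress": (
        "org_id", "project_id", "app_instance_id", "endpoint_name",
        "route_id", "proxy_pool_id", "client_auth_mode", "route_family",
    ),
}


def missing_anchors(source_model: str, record: Mapping[str, object],
                    documented_exemptions: FrozenSet[str] = frozenset()) -> List[str]:
    """Return required anchors that are absent and not explicitly exempted."""
    return [
        field for field in REQUIRED_ANCHORS[source_model]
        if record.get(field) in (None, "") and field not in documented_exemptions
    ]


def is_billable(source_model: str, record: Mapping[str, object],
                documented_exemptions: FrozenSet[str] = frozenset()) -> bool:
    """A record is billable only when no required anchor is missing."""
    return not missing_anchors(source_model, record, documented_exemptions)


ingress_record = {
    "org_id": "org-1", "project_id": "proj-1", "app_instance_id": "app-1",
    "endpoint_name": "chat", "route_id": "r-1", "client_auth_mode": "token",
    "route_family": "http",
}
gaps = missing_anchors("managed_ingress", ingress_record)
```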

### Attribution Within Shared Resources

Shared resources are billed to the owning tenant/project, but usage should preserve
submitter attribution for chargeback and abuse investigation.

Examples:

1. A shared Ray or Slurm control plane may be paid by one tenant/project owner while
   each job preserves the submitting user or service account.
2. A multi-allocation scheduler workload may distribute worker cost by source project
   while preserving submitter identity at job level.
3. A managed-ingress request bills to the route/app/project owner while preserving
   the caller identity from route authorization where the caller is authenticated.

The default product rule is:

1. ledger impact belongs to the paying project or tenant financial account;
2. usage attribution preserves submitter identity where known;
3. chargeback reports can group by submitter, but raw ledger corrections remain
   additive and project/tenant-owned.

## Currency Strategy

Money values use integer minor units plus an ISO-4217 `currency`. No floating point
money values are allowed.

Default strategy:

1. ledger entries carry their native transaction currency;
2. balances are computed per currency, not collapsed across currencies by default;
3. invoice headers have one invoice currency;
4. invoice lines should share the invoice currency unless an explicit conversion
   line is present;
5. when conversion is required, the rated or invoice line stores an FX snapshot:
   source currency, target currency, rate, provider/source, and timestamp;
6. historical entries are never revalued when FX rates change.

Initial production may be single-currency per tenant, but schema and APIs must not
assume the whole platform is single-currency. Supported customer currency should be a
tenant/account setting. Cross-currency balance presentation is a reporting feature, not
a ledger mutation.

Reseller billing may involve different currencies at different layers. For example,
the platform may charge a reseller in USD while the reseller invoices an end customer
in AED. That case must use the same FX snapshot mechanism as any other conversion,
with wholesale and customer-facing rated lines kept explainable.
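
A minimal sketch of the FX snapshot follows, assuming both currencies use two minor-unit digits and that banker's rounding is the chosen policy. The type names, the rate value, and the provider label are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_EVEN


@dataclass(frozen=True)
class FxSnapshot:
    """Hypothetical immutable FX snapshot stored on a rated or invoice line."""
    source_currency: str
    target_currency: str
    rate: Decimal  # target units per one source unit
    provider: str
    captured_at: datetime


def convert_minor(amount_minor: int, currency: str, fx: FxSnapshot) -> int:
    """Convert integer minor units using the snapshot, never a live rate.

    Assumes source and target currencies share the same minor-unit exponent.
    """
    if currency != fx.source_currency:
        raise ValueError("amount currency does not match the FX snapshot")
    converted = (Decimal(amount_minor) * fx.rate).quantize(
        Decimal("1"), rounding=ROUND_HALF_EVEN
    )
    return int(converted)


fx = FxSnapshot(
    source_currency="USD",
    target_currency="AED",
    rate=Decimal("3.6725"),           # illustrative rate
    provider="example-rate-provider",  # hypothetical provider label
    captured_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
aed_minor = convert_minor(10_000, "USD", fx)
```

The rate is a `Decimal` rather than a float, and the result is rounded back to an integer, so no floating-point money value is created at any point.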

## Tax And VAT Model

Tax is part of invoicing and rated charge presentation, not raw metering.

Initial tax baseline:

1. tax is calculated at invoice issuance;
2. tax jurisdiction, tax rate, tax category, and tax registration identifiers are
   snapshotted onto invoice headers and lines;
3. per-line tax is preferred so mixed taxable/exempt services can be represented;
4. invoice-level tax summaries may be derived from lines;
5. tax-exempt and reverse-charge customers must be representable even if the first
   implementation supports only one active jurisdiction;
6. tax corrections create adjustment lines or credit memos, not rewrites.

The first implementation may support a single configured jurisdiction and rate, but
invoice schemas must leave room for UAE VAT, EU VAT/reverse-charge, tax-exempt
customers, and customer tax registration numbers.
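
Per-line tax with a derived invoice-level summary might look like the sketch below. The field names, the basis-point representation of the rate, and the category strings are assumptions; the 5% rate is used purely as an example value.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class InvoiceLine:
    """Hypothetical invoice line carrying its own tax snapshot."""
    description: str
    net_minor: int
    jurisdiction: str
    tax_category: str  # "standard", "exempt", or "reverse_charge"
    tax_rate_bps: int  # rate in basis points, e.g. 500 = 5%

    @property
    def tax_minor(self) -> int:
        """Tax on this line; exempt and reverse-charge lines carry zero tax."""
        if self.tax_category != "standard":
            return 0
        return (self.net_minor * self.tax_rate_bps + 5000) // 10000


def tax_summary(lines: Iterable[InvoiceLine]) -> Dict[Tuple[str, str, int], int]:
    """Derive invoice-level tax totals from the lines, never the reverse."""
    totals: Dict[Tuple[str, str, int], int] = defaultdict(int)
    for line in lines:
        key = (line.jurisdiction, line.tax_category, line.tax_rate_bps)
        totals[key] += line.tax_minor
    return dict(totals)


invoice_lines = [
    InvoiceLine("GPU usage", 10_000, "AE", "standard", 500),
    InvoiceLine("Exempt service", 4_000, "AE", "exempt", 0),
]
summary = tax_summary(invoice_lines)
```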

## Reseller And Channel Billing

Reseller support adds a billing layer above the tenant. It must be represented before
the partner/reseller program becomes contractual.

Concepts:

1. platform seller account;
2. reseller or partner account;
3. end-customer tenant;
4. reseller-visible usage and invoice evidence;
5. wholesale platform price;
6. reseller customer-facing price, markup, or margin;
7. credit-risk owner: platform or reseller;
8. tax responsibility owner.

Default v1 direction:

1. raw usage remains attributed to the end-customer tenant/project/resource;
2. the billing account may be the reseller rather than the tenant;
3. reseller margin is a rating/invoice layer, not a mutation of raw usage;
4. white-label invoice generation is a future presentation layer unless required by
   a signed partner;
5. reseller and end-customer views must have separate RBAC.

Partner/channel billing must not be implemented as ad hoc tenant impersonation.

## Refund And Dispute Model

Refunds and disputes are audited financial workflows, not direct ledger edits.

Baseline behavior:

1. provider refund is allowed when the payment provider and refund window permit it;
2. internal credit is the fallback when provider refund is unavailable or out of
   policy;
3. partial refunds are supported by amount and reason;
4. every refund records its requester, approver/actor, target user/customer, policy
   outcome, provider reference where applicable, and correlation ID;
5. chargebacks/disputes are represented separately from voluntary refunds because
   provider state can move independently of platform approval.

Dispute handling should eventually support:

1. dispute opened;
2. evidence submitted;
3. dispute won/lost;
4. provider debit or reversal posted;
5. account restriction or collections review if needed.

Existing refund APIs remain the baseline until this model is expanded.

## Operation Latency Expectations

| Operation | Target latency | Notes |
|---|---:|---|
| Balance check at launch/allocation request | < 100 ms | Must be local/control-plane fast path |
| Budget posture check at launch | < 200 ms | Notify-only budget can be stale; blocking budget cannot |
| Usage record creation | < 1 s after source event | Source domain may buffer but must preserve event time |
| Rated line creation | < 5 min | Worker lag must be observable |
| Ledger posting from rated lines | < 15 min | Posting lag must not silently affect balance guarantees |
| Budget threshold event | < 5 min after rated/posting input | First-seen/last-seen/count semantics |
| Invoice generation | Periodic, usually monthly | Manual/on-demand generation allowed for finance ops |
| Refund processing | Async/manual or provider-dependent | Must expose current state and correlation |
| Dispute/chargeback update | Provider webhook latency + reconciliation | Must be idempotent |

Latency targets are product/SLO defaults, not hardcoded business constants. If a
worker cannot meet them, V3 finance/ops surfaces must show lag and disabled reasons.

## Future Usage Units

Initial implemented unit:

1. `gpu_hour`.

Expected future units:

1. `gpu_second`,
2. `gpu_slice_hour`,
3. `storage_gb_hour`,
4. `egress_gb`,
5. `ingress_gb`,
6. `public_ip_hour`,
7. `load_balancer_hour`,
8. `request`,
9. `token_input`,
10. `token_output`,
11. `job_second`,
12. `reserved_capacity_hour`.

The usage table should not encode pricing logic in the unit name. The unit describes
the measurement; rating policy determines the cost.

`node_hour` and `vcpu_hour` are expected to ship first through the SKU Resource
Model pilot for CPU VM SKUs. That pilot must use the same rating/posting
separation, snapshot discipline, and shadow-rating evidence as GPU-hour usage.

Token Factory or model-serving token meters must flow into this unified billing
platform as usage events or usage records. They must not maintain a separate financial
ledger.

## Token Factory Metering Integration

Token Factory and model-serving runtimes emit request/token usage into the canonical
usage shape. Token usage should carry `token_input`, `token_output`, model/runtime
identity, tenant/project/app attribution where applicable, submitter identity when
known, and correlation ID.

Integration follows the same rule as app runtime metering: raw token events are not
ledger entries, rating converts them into rated lines, and ledger posting is an
explicit later step. The remaining open decision is the exact source adapter and
event contract for the first shipping Token Factory meter.

## Data Ingress and Egress Accounting

Network usage should be modeled separately from firewall or connectivity
configuration.

Initial direction:

1. collect traffic by tenant/project/resource where possible;
2. distinguish public egress, private fabric, and storage backend traffic;
3. avoid billing internal control-plane traffic to users;
4. keep raw measurements separate from rated charges;
5. make exemptions explicit in policy.

This is closely tied to future Network and Security surfaces under Access and to the
infra VRF design.

## Budget and Alert Semantics

Budgets are optional controls. They may be configured at tenant or project scope and
optionally narrowed by user, SKU class, app, or region.

Alert thresholds:

1. percent consumed,
2. absolute remaining amount,
3. projected depletion date,
4. abnormal spend increase,
5. reserved commitment under-utilization.

Budget enforcement must be clear in the UI:

1. advisory only,
2. launch blocked,
3. approval required,
4. existing workloads unaffected,
5. eligible workloads may be suspended.

Budget state should be surfaced in v3 shell/account/tenant/project read models without
making those pages own billing rules.

## Roles and Privileged Actions

Expected billing roles:

1. platform finance operator,
2. platform support operator,
3. tenant billing admin,
4. tenant billing viewer,
5. project admin with budget privileges,
6. regular user with self-spend visibility.

Privileged actions requiring audit:

1. create or change pricing plan,
2. apply credit or adjustment,
3. issue refund,
4. change budget enforcement,
5. change delinquency state,
6. write off invoice,
7. override reservation assignment,
8. reconcile failed payment session.

## API Surface Direction

Keep contract fragments under the billing/payments domain.

Planned endpoint groups:

```text
/api/v1/billing/balance
/api/v1/billing/usage
/api/v1/billing/budgets
/api/v1/billing/invoices
/api/v1/billing/pricing-plans
/api/v1/billing/reservations
/api/v1/payments/*
/api/v1/admin/finance/*
/api/v1/v3/... finance read models
```

V3 read models can aggregate billing with workload/storage/app context, but mutations
should remain in billing, payments, or admin finance contracts.

Endpoint authority baseline:

| Surface | Typical actor | Notes |
|---|---|---|
| `GET /billing/balance` | self, project admin where scoped | Per-currency balance output |
| `GET /billing/usage` | self, project admin, tenant billing admin | Scope-limited and export-governed |
| `GET/POST /billing/budgets` | tenant/project billing admin | Mutations privileged and audited |
| `GET /billing/invoices` | tenant billing viewer/admin | Customer-facing artifacts |
| `POST /admin/finance/*` | platform finance/support operator | Privileged, audited, correlation required |
| `POST /billing/pricing-plans` | platform finance operator | Requires review/approval workflow |
| `GET /v3/... finance read models` | role-scoped operator/admin | Aggregated read models only |

## Schema Direction

Current tables such as `usage_records`, `ledger_entries`, `payment_sessions`,
`invoice_headers`, and `invoice_lines` are the baseline.

Expected additions or evolutions:

1. `pricing_plan_versions`,
2. `rated_usage_lines`,
3. `budget_policies`,
4. `budget_events`,
5. `reservation_contracts`,
6. `reservation_assignments`,
7. `delinquency_states`,
8. `network_usage_records`,
9. `storage_usage_records` or normalized multi-source usage records,
10. provider reconciliation records.

Do not add a mutable balance column.

## Event Direction

Potential future events:

1. `billing.usage_recorded`,
2. `billing.usage_rated`,
3. `billing.ledger_posted`,
4. `billing.budget_threshold_crossed`,
5. `billing.budget_enforcement_changed`,
6. `billing.invoice_generated`,
7. `billing.invoice_due`,
8. `billing.invoice_paid`,
9. `billing.delinquency_state_changed`,
10. `billing.reservation_created`,
11. `billing.reservation_applied`.

Events that cause cross-domain action must go through outbox.

## Worker Direction

Likely workers:

1. usage aggregation worker,
2. rating worker,
3. ledger posting worker,
4. invoice generation worker,
5. delinquency worker,
6. payment reconciliation worker,
7. budget alert worker.

These can remain binaries in the same monorepo until operational scale requires
separation.

Decision:
Ledger posting is a separate responsibility from rating. The first implementation may
run both responsibilities in one deployed binary for simplicity, but the code boundary,
idempotency keys, metrics, and tests must treat rated-line creation and ledger posting
as separate stages.

Worker dependency model:

```text
metering source -> usage aggregation -> rating -> ledger posting -> invoice/budget/delinquency
                                            \-> finance evidence/read models
payments provider -> payment reconciliation -> ledger posting / refund / dispute evidence
```

Failure rules:

1. rating can lag usage, but lag must be visible;
2. ledger posting must not run ahead of rating for rated usage;
3. invoice generation must not include unrated usage unless explicitly marked
   estimated;
4. budget enforcement cannot depend on stale data when enforcement mode is blocking;
5. payment reconciliation retries must be idempotent by provider event/session ID;
6. workers must emit operational metrics for queue depth, lag, failed records, and
   oldest unprocessed item.
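
Failure rule 5 can be sketched as a lookup keyed by provider event ID, so a redelivered webhook or a worker retry never posts twice. The class below is an in-memory illustration with hypothetical names; a real implementation would rely on a unique constraint in the database rather than a dictionary.

```python
from typing import Dict, List, Tuple


class PaymentReconciler:
    """Illustrative reconciler: one ledger posting per provider event ID."""

    def __init__(self) -> None:
        self.ledger: List[dict] = []
        self._applied: Dict[str, str] = {}  # provider_event_id -> entry_id

    def apply(self, provider_event_id: str, session_id: str,
              amount_minor: int, currency: str) -> Tuple[str, bool]:
        """Post a credit once; return (entry_id, whether it was newly posted)."""
        existing = self._applied.get(provider_event_id)
        if existing is not None:
            return existing, False  # retry or duplicate delivery: no-op
        entry_id = f"pay-{provider_event_id}"
        self.ledger.append({
            "entry_id": entry_id,
            "session_id": session_id,
            "amount_minor": amount_minor,
            "currency": currency,
        })
        self._applied[provider_event_id] = entry_id
        return entry_id, True


reconciler = PaymentReconciler()
first = reconciler.apply("evt_1", "sess_1", 5_000, "USD")
retry = reconciler.apply("evt_1", "sess_1", 5_000, "USD")
```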

## UX Implications

V3 should treat billing as a first-class family, but implementation can land in phases.

User/account view:

1. balance,
2. current burn,
3. personal payments,
4. self usage,
5. low-balance or budget warnings.

Tenant admin view:

1. tenant spend,
2. project/user attribution,
3. budgets,
4. payment methods or invoice status,
5. policy alerts.

Project admin view:

1. project budget posture,
2. user/SKU/app spend,
3. launch-blocking budget state,
4. storage/network contribution.

Platform finance view:

1. payment sessions,
2. failed reconciliation,
3. invoices,
4. credits/refunds,
5. delinquency,
6. exportable evidence.

Ops view:

1. billing worker health,
2. outbox lag,
3. stuck payment sessions,
4. budget enforcement signals affecting active workloads.

## Migration Phases

### Phase 0: Baseline Freeze

Keep current allocation-duration billing working. Only fix correctness bugs.

### Phase 1: Read-Model Alignment

Expose v3 billing summaries using existing ledger, usage, and payment records.
No new pricing mode behavior yet.

The first platform-finance workbench slice is payment-session triage:

1. list payment sessions with backend-owned `signal_key` and acknowledgement fields,
2. open a focused session detail with diagnostics, evidence, activity, and stable
   manual operation targets,
3. support manual internal credit, manual state update, and manual refund workflow,
4. do not expose payment-provider replay from the v3 workbench yet.

Provider replay/reconcile semantics remain a follow-up backend decision. Until that
lands, finance UI must label recovery as auditable manual recovery and link to
diagnostics/evidence rather than implying a provider replay exists.

### Phase 2: Rating Separation

Introduce rated usage lines and pricing plan versions. Existing allocation usage can be
rated through the new path before adding new units.

### Phase 3: Budgets

Add tenant/project budgets and notification-only alerts first. Blocking or suspension
comes later after policy review.

### Phase 4: Pricing Modes and Idle Policy

Add launch-time pricing mode and idle policy snapshots. Keep disabled UI choices hidden
or clearly unavailable until the backend supports them.

### Phase 5: Invoices and Delinquency

Add invoice lifecycle and delinquency state machine. Keep finance operations auditable.

### Phase 6: Storage, Network, and App Units

Add storage byte-hour, data ingress/egress, public IP, load balancer, and app-specific
usage units once the owning domains can emit reliable metering signals.

## Implementation Work Packages

The overhaul should be implemented as small, reviewable slices. Do not treat the
original architecture epic as implementation complete.

### 1. Billing Domain Inventory And Gap Review

Goal:
1. reconcile current code, schema, contracts, workers, and V3 finance surfaces with
   this target model;
2. classify what is production-ready, what is read-model only, and what is still
   design-only;
3. produce an ordered implementation backlog with blocked prerequisites.

Required output:
1. current-state matrix for metering, rating, ledger, invoicing, payments, budgets,
   delinquency, and finance UX;
2. list of existing tables and endpoints that can be reused unchanged;
3. list of schema/API/event additions needed before implementation;
4. explicit “do not touch yet” areas where product policy is undecided.

### 2. Rating Foundation

Goal:
Separate usage measurement from priced financial impact.

Required capabilities:
1. `pricing_plan_versions` or equivalent immutable pricing snapshots;
2. `rated_usage_lines` or equivalent explainable rated-charge records;
3. deterministic rating for existing allocation usage before adding new units;
4. idempotent re-rating behavior for retry and correction cases;
5. ledger posting from rated lines without direct balance mutation.

Acceptance gates:
1. the same usage window rated twice with the same pricing snapshot produces the same
   rated line identity or a no-op;
2. price changes do not rewrite historical rated lines;
3. corrections create new rated/ledger records, not updates to immutable money rows.
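
One way to meet the first two gates is to derive the rated line identity from the usage window and the pricing snapshot version. The sketch below hashes those inputs; the choice of SHA-256, the key layout, and the store class are assumptions, not a decided design.

```python
import hashlib
from typing import Dict


def rated_line_id(usage_id: str, window_start: str, window_end: str,
                  plan_version: str) -> str:
    """Deterministic identity: same window and snapshot give the same ID."""
    key = "\x1f".join((usage_id, window_start, window_end, plan_version))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


class RatedLineStore:
    """Illustrative insert-only store keyed by the deterministic line ID."""

    def __init__(self) -> None:
        self._lines: Dict[str, int] = {}

    def insert_once(self, line_id: str, amount_minor: int) -> bool:
        """Insert a rated line; a repeat of the same ID is a no-op."""
        if line_id in self._lines:
            return False
        self._lines[line_id] = amount_minor
        return True

    def __len__(self) -> int:
        return len(self._lines)


store = RatedLineStore()
id_v1 = rated_line_id("u-1", "2025-01-01T00:00Z", "2025-01-01T01:00Z", "plan-v1")
id_v1_again = rated_line_id("u-1", "2025-01-01T00:00Z", "2025-01-01T01:00Z", "plan-v1")
id_v2 = rated_line_id("u-1", "2025-01-01T00:00Z", "2025-01-01T01:00Z", "plan-v2")
inserted = store.insert_once(id_v1, 250)
repeated = store.insert_once(id_v1_again, 250)
```

Because the plan version is part of the identity, a later price change produces a different ID and therefore a new line, leaving the historical line untouched.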

### 3. Pricing Plan And Launch Snapshot Model

Goal:
Make launch-time billing choices explicit and durable.

Required capabilities:
1. pricing mode snapshot: `on_demand`, future `spot`, future `reserved`;
2. unit price, currency, region, SKU/service tier, and pricing plan version snapshot;
3. app/runtime-specific pricing context where the workload originates from an app;
4. disabled state for pricing modes that are designed but not implemented.

This slice should not implement spot or reserved execution. It only creates the
contract shape so future implementation is not SKU-name driven.

### 4. Budget And Alert Model

Goal:
Introduce budgets as guardrails separate from ledger and payment state.

Required capabilities:
1. tenant and project budget policies;
2. threshold events with first-seen, last-seen, and count semantics;
3. notification-only enforcement first;
4. launch-blocking or suspension modes represented as explicit future states;
5. V3 shell/account/project/tenant read-model fields for budget posture.

Budget enforcement must not be hidden inside provisioning or billing worker code.
Provisioning may consume a policy decision, but billing owns the financial state and
budget event semantics.
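
The first-seen, last-seen, and count semantics can be sketched as a tracker that creates one event per budget and threshold, then updates it on every later observation instead of emitting duplicates. Names and the percentage-based thresholds are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Sequence, Tuple


@dataclass
class ThresholdEvent:
    """Hypothetical budget event with first-seen/last-seen/count semantics."""
    budget_id: str
    threshold_percent: int
    first_seen: datetime
    last_seen: datetime
    count: int = 1


class ThresholdTracker:
    def __init__(self) -> None:
        self._events: Dict[Tuple[str, int], ThresholdEvent] = {}

    def observe(self, budget_id: str, limit_minor: int, spent_minor: int,
                thresholds: Sequence[int], now: datetime) -> List[ThresholdEvent]:
        """Record crossings; return only events seen for the first time."""
        newly_crossed: List[ThresholdEvent] = []
        for pct in thresholds:
            if spent_minor * 100 < limit_minor * pct:
                continue  # threshold not reached; integer comparison only
            key = (budget_id, pct)
            event = self._events.get(key)
            if event is None:
                event = ThresholdEvent(budget_id, pct, now, now)
                self._events[key] = event
                newly_crossed.append(event)
            else:
                event.last_seen = now
                event.count += 1
        return newly_crossed

    def get(self, budget_id: str, pct: int) -> ThresholdEvent:
        return self._events[(budget_id, pct)]


tracker = ThresholdTracker()
t1 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2025, 1, 1, 12, 5, tzinfo=timezone.utc)
first_pass = tracker.observe("b1", 100_000, 82_000, (50, 80, 100), t1)
second_pass = tracker.observe("b1", 100_000, 85_000, (50, 80, 100), t2)
```

Returning only first-time crossings lets a notification-only mode alert once per threshold while the stored event still shows how long the budget has remained above it.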

### 5. Invoice Lifecycle

Goal:
Create a customer-facing financial artifact without weakening the prepaid ledger
baseline.

Required capabilities:
1. invoice header and line lifecycle;
2. invoice lines sourced from rated usage, credits, adjustments, and taxes or fees
   when those exist;
3. lifecycle states for draft, issued, due, paid, partially paid, void, and write-off
   where supported;
4. exportable invoice evidence for finance operators;
5. reconciliation anchors back to ledger entries and payment sessions.

Initial invoices may be internal/export-only. Provider-hosted invoice payment can be
a later payment-rail slice.

### 6. Delinquency And Financial Restrictions

Goal:
Replace ad hoc low-balance behavior with a policy-driven financial state machine.

Required capabilities:
1. tenant/project/user delinquency or financial restriction state;
2. policy-driven transitions such as `healthy`, `at_risk`, `restricted`, `suspended`,
   and `collections_hold`;
3. clear effects on new launches, existing workloads, app runtimes, and storage;
4. audited operator overrides;
5. V3-visible disabled reasons and recovery actions.

This must be designed with product/legal policy before destructive runtime actions
such as forced release or suspension are enabled.

### 7. Additional Usage Units

Goal:
Add non-allocation usage without inventing parallel billing stores.

Candidate units:
1. managed ingress request, byte, and connection-second;
2. app runtime uptime and request units;
3. storage byte-hours;
4. public egress and private fabric traffic;
5. scheduler job-seconds;
6. token input/output for model serving.

Each new unit requires:
1. owning domain signal source;
2. attribution anchors;
3. replay or dedupe model;
4. rating policy;
5. ledger/invoice behavior;
6. operator evidence.

### 8. Billing Test Harness

Goal:
Create a money-domain regression pack before broad billing implementation.

Required test classes:
1. ledger immutability;
2. idempotent retry;
3. rating determinism;
4. time-window and daylight-saving boundaries;
5. numeric precision and rounding;
6. payment provider reconciliation;
7. invoice total reconciliation;
8. budget threshold crossing;
9. delinquency transition;
10. privileged-action audit.

Tests that write JSON metadata or money amounts must run against a real Postgres
instance, not a mock or in-memory substitute, wherever bind-parameter typing or
numeric precision could fail.
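
To illustrate the numeric precision and rounding class, the sketch below shows the in-process half of such a test using Python's `decimal` module. `rate_usage` and the cent-level quantum are assumptions for illustration; these checks complement, and do not replace, the Postgres-backed tests.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")  # assumed minor-unit precision for the example


def rate_usage(quantity: Decimal, unit_price: Decimal, rounding=ROUND_HALF_EVEN) -> Decimal:
    """Hypothetical rating step: multiply exactly, then round once, explicitly."""
    return (quantity * unit_price).quantize(CENT, rounding=rounding)


def test_binary_floats_drift_but_decimals_do_not():
    assert 0.1 + 0.2 != 0.3
    assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")


def test_rounding_mode_changes_the_charge():
    # 0.125 is an exact tie between 0.12 and 0.13.
    assert rate_usage(Decimal("1"), Decimal("0.125")) == Decimal("0.12")
    assert rate_usage(Decimal("1"), Decimal("0.125"), ROUND_HALF_UP) == Decimal("0.13")


def test_per_line_rounding_differs_from_total_rounding():
    lines = [Decimal("0.005")] * 3
    sum_of_rounded = sum((x.quantize(CENT, rounding=ROUND_HALF_EVEN) for x in lines), Decimal("0"))
    rounded_sum = sum(lines, Decimal("0")).quantize(CENT, rounding=ROUND_HALF_EVEN)
    assert sum_of_rounded == Decimal("0.00")
    assert rounded_sum == Decimal("0.02")


def test_rating_is_deterministic():
    args = (Decimal("3.333"), Decimal("1.999"))
    assert rate_usage(*args) == rate_usage(*args)
```

The third test is the reason invoice total reconciliation needs its own class: where rounding happens (per line or per total) has to be a stated rule, not an accident of implementation.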

## Review Gates

Before implementation starts:
1. review this document with product, finance/operator, backend, and UX owners;
2. decide whether the first production tenant is prepaid-only, invoice/postpaid, or
   hybrid;
3. decide whether budgets initially notify only or can block launches;
4. decide whether invoices are internal/export-only or provider-issued in the first
   production release;
5. confirm whether rating and ledger posting run in one binary initially or as separate
   deployed workers; the code boundary remains separate either way.

Before enabling any production money-affecting mutation:
1. The OpenAPI contract must be updated first.
2. Every privileged mutation must write an audit record.
3. Every worker retry path must be idempotent.
4. Every correction path must create additive records.
5. Finance V3 surfaces must show correlation/evidence without direct DB inspection.
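
Gates 2 through 4 can be exercised together in a small in-memory model before any real worker exists. The sketch below is an assumption-laden illustration: `InMemoryLedger`, its method names, and the audit record shape are hypothetical and do not describe the current ledger implementation.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: int
    account: str
    amount: Decimal
    idempotency_key: str
    corrects_entry_id: Optional[int] = None


class InMemoryLedger:
    """Append-only ledger sketch: no update or delete methods exist."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []
        self._by_key: Dict[str, LedgerEntry] = {}
        self.audit_log: List[dict] = []

    def post(self, account: str, amount: Decimal, idempotency_key: str,
             actor: str, corrects_entry_id: Optional[int] = None) -> LedgerEntry:
        # Gate 3: a retried call with the same key returns the first result.
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing
        entry = LedgerEntry(len(self._entries) + 1, account, amount,
                            idempotency_key, corrects_entry_id)
        self._entries.append(entry)
        self._by_key[idempotency_key] = entry
        # Gate 2: every mutation leaves an audit record naming the actor.
        self.audit_log.append({"actor": actor, "entry_id": entry.entry_id,
                               "corrects": corrects_entry_id})
        return entry

    def correct(self, entry_id: int, delta: Decimal,
                idempotency_key: str, actor: str) -> LedgerEntry:
        # Gate 4: a correction is a new entry that references the original.
        original = self._entries[entry_id - 1]
        return self.post(original.account, delta, idempotency_key, actor,
                         corrects_entry_id=original.entry_id)

    def balance(self, account: str) -> Decimal:
        return sum((e.amount for e in self._entries if e.account == account),
                   Decimal("0"))

    def entry_count(self) -> int:
        return len(self._entries)
```

A retry of `post` with the same key changes neither the balance nor the audit log, and a correction changes the balance only by adding a row.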

## Initial Sequencing Recommendation

1. Run the domain inventory and gap review.
2. Implement the billing test harness.
3. Implement rating foundation for current allocation usage only.
4. Add pricing plan/version snapshots and launch-time context.
5. Add V3 read models for rated usage and finance evidence.
6. Add budgets in notification-only mode.
7. Add invoice lifecycle in internal/export-only mode.
8. Add delinquency states without destructive enforcement.
9. Add new usage units only after the owning domain signal is reliable.

This order keeps the existing ledger safe while building the next billing control
plane underneath it.

## Testing Requirements

Billing changes need stronger tests than ordinary CRUD:

1. ledger immutability tests;
2. idempotent retry tests;
3. rating determinism tests;
4. time-window boundary tests;
5. payment provider replay tests;
6. invoice total reconciliation tests;
7. budget threshold crossing tests;
8. delinquency transition tests;
9. audit tests for privileged mutations;
10. integration tests against Postgres for JSON and numeric precision;
11. shadow-rating tests that run a new pricing plan against historical usage before
    activation;
12. pricing rollout tests proving new pricing applies only to eligible new launches
    or explicitly migrated resources;
13. currency and tax rounding tests;
14. reseller attribution tests once channel billing is enabled.

Shadow-rating deployment pattern:

1. run a candidate pricing plan against historical usage in a sandbox or dry-run mode;
2. compare old-plan and new-plan rated output without ledger writes;
3. review deltas by tenant/project/SKU/unit before activation;
4. roll out new plans to new launches by default;
5. keep existing launches on their snapshotted pricing plan unless an explicit,
   audited migration is approved.
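
Steps 1 through 3 amount to running two pure rating passes over the same usage and diffing the results. The sketch below assumes a deliberately simple plan shape (SKU to unit price) and hypothetical `rate` and `shadow_rate` helpers; real plans will carry more dimensions.

```python
from collections import defaultdict
from decimal import Decimal


def rate(usage, plan):
    """Return rated totals keyed by (tenant, sku). Pure: no ledger writes.

    A SKU missing from the plan raises KeyError, which surfaces plan gaps
    during the dry run instead of after activation.
    """
    totals = defaultdict(lambda: Decimal("0"))
    for record in usage:
        unit_price = plan[record["sku"]]
        totals[(record["tenant"], record["sku"])] += record["quantity"] * unit_price
    return dict(totals)


def shadow_rate(usage, current_plan, candidate_plan):
    """Compare old-plan and new-plan output for review before activation."""
    old = rate(usage, current_plan)
    new = rate(usage, candidate_plan)
    return {
        key: {"old": old[key], "new": new[key], "delta": new[key] - old[key]}
        for key in sorted(old)
    }


# Illustrative historical usage and plans; all values are made up.
historical_usage = [
    {"tenant": "t1", "sku": "gpu.a100", "quantity": Decimal("10")},
    {"tenant": "t1", "sku": "gpu.a100", "quantity": Decimal("5")},
    {"tenant": "t2", "sku": "storage", "quantity": Decimal("100")},
]
current_plan = {"gpu.a100": Decimal("2.00"), "storage": Decimal("0.01")}
candidate_plan = {"gpu.a100": Decimal("2.50"), "storage": Decimal("0.01")}

report = shadow_rate(historical_usage, current_plan, candidate_plan)
```

For this sample the report shows a delta of `7.50` for `("t1", "gpu.a100")` and zero for `("t2", "storage")`, which is the per-tenant, per-SKU view step 3 asks reviewers to inspect.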

## Data Retention

Retention must satisfy audit, tax, customer dispute, and operational cost needs.
The executable table policy lives in
`doc/architecture/Billing_Retention_and_Compaction_Policy_v1.md`.

Baseline direction:

1. ledger entries are retained indefinitely or for the longest legal/accounting
   window required by deployed jurisdictions;
2. invoice headers, invoice lines, tax evidence, payment records, refunds, disputes,
   and adjustments are retained for at least the applicable tax/accounting period;
3. raw high-volume usage may be compacted after rating/invoice finalization, but
   enough aggregate and source evidence must remain to explain the charge;
4. rated usage lines are retained at least as long as invoices and disputes can be
   reopened;
5. PII minimization and credential redaction still apply to billing metadata.

Do not implement raw usage deletion until invoice/dispute evidence requirements are
reviewed.
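
As a sketch of what "enough aggregate and source evidence" in item 3 might look like, the example below summarizes raw events for one rated line into a count, a total, a time window, and a digest of the source records. It deletes nothing. The event shape (`event_id`, `ts` as a UTC ISO-8601 string, `quantity` as a decimal string), `CompactedUsage`, and `compact` are all assumptions; the binding policy is the one in the retention document.

```python
import hashlib
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CompactedUsage:
    rated_line_id: str
    event_count: int
    total_quantity: Decimal
    window_start: str
    window_end: str
    source_digest: str  # SHA-256 over the canonicalized source events


def compact(rated_line_id: str, events: list) -> CompactedUsage:
    """Build an aggregate that can still explain the charge.

    Events are sorted before hashing so the digest does not depend on
    arrival order. Uniform UTC timestamps are assumed so that string
    ordering matches chronological ordering.
    """
    if not events:
        raise ValueError("cannot compact an empty event set")
    ordered = sorted(events, key=lambda e: (e["ts"], e["event_id"]))
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return CompactedUsage(
        rated_line_id=rated_line_id,
        event_count=len(ordered),
        total_quantity=sum((Decimal(e["quantity"]) for e in ordered), Decimal("0")),
        window_start=ordered[0]["ts"],
        window_end=ordered[-1]["ts"],
        source_digest=hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    )
```

If the raw events are later archived rather than deleted, the digest gives a dispute reviewer a way to confirm that an archived set is the one the charge was rated from.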

## Team and Ownership Implications

Billing/Finance should be owned by a small domain team with:

1. Go backend ownership for ledger, workers, payments, and reconciliation;
2. TypeScript support for account, tenant, and platform finance surfaces;
3. finance/product analyst support for pricing and invoice semantics;
4. QA ownership for money, idempotency, and reconciliation paths.

Agents can accelerate implementation, but a human domain owner must own the financial
invariants and release gates.

## Open Decisions

1. prepaid-only versus invoice/postpaid support for first production tenant;
2. whether budgets initially block launch or only alert;
3. whether reserved capacity is tenant-level first or project-level first;
4. how idle detection is normalized across compute, Jupyter, vLLM, training, and
   scheduler-backed apps;
5. how network usage will be measured once VRF and firewall design lands;
6. whether invoice documents are generated internally or delegated to a billing provider;
7. first supported customer currencies and whether tenants are single-currency;
8. first tax/VAT jurisdiction and tax-exempt/reverse-charge support level;
9. reseller/channel billing scope for the first partner program;
10. Token Factory/model-serving source adapter and event contract for the first
    production meter;
11. data retention periods for raw usage, rated lines, invoices, payment records,
    refunds, disputes, and ledger entries;
12. revenue-recognition policy for reserved capacity and unused commitments;
13. dispute/chargeback handling and account restriction policy.

## Related Documents

1. `doc/architecture/App_Runtime_Billing_Model_v1.md`
2. `doc/architecture/Tenant_Admin_Quota_Delegation_v1.md`
3. `doc/architecture/App_Runtime_Metering_v1.md`
4. `doc/architecture/State_Machines.md`
5. `doc/architecture/Seed_Data_Spec.md`
6. `doc/architecture/AI_Factory_Team_Domain_Operating_Model_v1.md`
7. `doc/product/Product_Surface_IA_and_Role_Model_v1.md`
8. `doc/architecture/Billing_Retention_and_Compaction_Policy_v1.md`