# App Runtime Billing Model v1

## Goal
Define the minimum billing contract for platform apps so GPUaaS can meter tenant-dedicated and future platform-managed runtimes without rewriting billing ownership later.

This is a baseline attribution model, not the final pricing engine.

## Core Rules
1. Billing ownership remains anchored to tenant and project context.
2. App runtime billing must not bypass the immutable ledger model.
3. Billing contracts must work for both `tenant_dedicated` and `platform_managed` operating modes.
4. Control-plane footprint and workload consumption must be explainable separately when needed.
5. Internal reference apps and third-party apps use the same attribution model.

## Attribution Anchors
Every billable app-runtime record must be attributable by:
1. `org_id`
2. `project_id`
3. `app_instance_id`
4. `app_slug`
5. `operating_mode`
6. `control_plane_scope`
7. `runtime_backend`
8. `correlation_id`

These fields are required even when the underlying runtime exports richer scheduler-specific signals.
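
As a minimal sketch of how the required anchors could be enforced at ingestion, the following assumes usage arrives as a plain dictionary. The class and function names (`AttributionAnchors`, `missing_anchors`, `parse_anchors`) are hypothetical and not part of any existing contract; only the eight field names come from this document.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AttributionAnchors:
    """The eight anchor fields required on every billable app-runtime record."""
    org_id: str
    project_id: str
    app_instance_id: str
    app_slug: str
    operating_mode: str
    control_plane_scope: str
    runtime_backend: str
    correlation_id: str


def missing_anchors(record: dict) -> list:
    """Return the anchor names that are absent or empty in a raw record."""
    return [f.name for f in fields(AttributionAnchors) if not record.get(f.name)]


def parse_anchors(record: dict) -> AttributionAnchors:
    """Build anchors from a raw record, rejecting records with gaps.

    Extra scheduler-specific keys are ignored rather than rejected, so richer
    runtime signals can travel alongside the anchors.
    """
    missing = missing_anchors(record)
    if missing:
        raise ValueError("missing attribution anchors: " + ", ".join(missing))
    return AttributionAnchors(**{f.name: record[f.name] for f in fields(AttributionAnchors)})
```

A record that carries extra scheduler fields still parses, while a record that lacks any anchor is rejected before it can reach the ledger.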

## Billing Shapes by Operating Mode

### 1. `tenant_dedicated`
Likely billable components:
1. dedicated control-plane footprint
2. tenant-bounded worker or compute capacity
3. runtime-specific usage signals such as jobs, pods, or serving uptime

Attribution behavior:
1. control-plane overhead may be billed to the owning project or distributed within the tenant by policy
2. worker usage remains project-attributed whenever the workload originated from a project-owned app instance
3. project-scoped control planes are the clean default for `dev`/`test`/`stage`/`prod`-style environments

### 2. `platform_managed`
Likely billable components:
1. shared managed-service consumption
2. per-request, per-job, or per-runtime usage
3. service-tier or quota-based overhead allocation

Attribution behavior:
1. shared-service runtime cost must still resolve to project and tenant usage records
2. platform overhead allocation rules must be policy-driven, not embedded in app-specific code
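
To illustrate what a policy-driven allocation rule might look like, the sketch below distributes a shared overhead amount across projects in proportion to their usage. Proportional allocation is only one possible policy and is an assumption here, as are the function name `allocate_overhead` and the use of integer minor currency units. Leftover units are assigned by largest remainder so the shares always sum to the total.

```python
def allocate_overhead(total_overhead: int, usage_by_project: dict) -> dict:
    """Split total_overhead across projects in proportion to their usage.

    Amounts are integers (for example minor currency units) so that the
    allocated shares sum exactly to total_overhead.
    """
    if total_overhead < 0:
        raise ValueError("total_overhead must be non-negative")
    total_usage = sum(usage_by_project.values())
    if total_usage <= 0:
        raise ValueError("cannot allocate overhead without recorded usage")

    shares = {}
    remainders = []
    for project_id, usage in usage_by_project.items():
        numerator = total_overhead * usage
        shares[project_id] = numerator // total_usage
        remainders.append((numerator % total_usage, project_id))

    # Hand out the units lost to integer division, largest remainder first.
    # Ties are broken by project id so the result is deterministic.
    leftover = total_overhead - sum(shares.values())
    ranked = sorted(remainders, key=lambda item: (-item[0], item[1]))
    for _, project_id in ranked[:leftover]:
        shares[project_id] += 1
    return shares
```

Keeping this logic in a standalone policy function, rather than inside app-specific code, is what lets the rule be swapped without touching the apps that emit usage.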

## Default Baseline Direction
The initial baseline should assume:
1. `operating_mode = tenant_dedicated`
2. `control_plane_scope = project`
3. project-owned app instances are the billing anchor for both runtime usage and any directly attached control-plane cost

This preserves clean environment-level attribution and avoids hidden cross-project subsidy in the initial model.

Tenant-scoped shared control planes are still supported, but cost-sharing must be explicit and policy-driven.
See:
- `doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md`

## Usage Record Direction
App runtime billing should eventually emit usage records that can be reconciled into ledger entries with:
1. `usage_source = app_runtime`
2. `usage_unit` appropriate to the runtime backend
3. `app_instance_id`
4. optional `control_plane_component = true|false`

Examples:
1. Slurm job runtime attributed to a project-owned app instance
2. model-serving uptime plus request volume
3. Ray cluster head/control overhead plus worker execution time
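
The record shape above could be sketched as follows, using the Ray example. The `quantity` field and the unit names (`head_node_seconds`, `worker_seconds`) are illustrative assumptions; this document does not define a unit vocabulary or a final schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppRuntimeUsageRecord:
    app_instance_id: str
    usage_unit: str                                  # depends on the runtime backend
    quantity: int                                    # assumed field: amount of usage_unit consumed
    control_plane_component: Optional[bool] = None   # optional, per the list above
    usage_source: str = "app_runtime"


# Hypothetical records for one Ray cluster owned by a single app instance.
ray_usage = [
    AppRuntimeUsageRecord(
        app_instance_id="inst-ray-1",
        usage_unit="head_node_seconds",
        quantity=3600,
        control_plane_component=True,
    ),
    AppRuntimeUsageRecord(
        app_instance_id="inst-ray-1",
        usage_unit="worker_seconds",
        quantity=7200,
        control_plane_component=False,
    ),
]


def split_by_component(records):
    """Return (control_plane_total, workload_total) quantities.

    Records that leave control_plane_component unset are counted as workload.
    """
    control = sum(r.quantity for r in records if r.control_plane_component)
    workload = sum(r.quantity for r in records if not r.control_plane_component)
    return control, workload
```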

## Separation of Concerns
Core platform is responsible for:
1. usage record schema and ledger integration
2. tenant/project ownership enforcement
3. auditability and reconciliation
4. policy-driven thresholds and entitlement limits

App operators are responsible for:
1. mapping runtime-native signals into the billing contract
2. identifying which runtime signals are billable
3. preserving correlation and project context in those signals

## Policy Direction
Future policy overlays should be able to constrain:
1. whether control-plane overhead is billable
2. whether tenant-scoped shared-control costs may be distributed across projects
3. which runtime usage units are enabled for a given app
4. per-project cost ceilings or quotas for app instances
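
One way to picture such an overlay is a small immutable policy object that billing code consults before accepting usage. This is a sketch only: `AppBillingPolicy`, its field names, and the two helper functions are hypothetical, and the defaults shown are not a recommendation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AppBillingPolicy:
    control_plane_billable: bool = True
    allow_shared_control_distribution: bool = False
    enabled_usage_units: FrozenSet[str] = frozenset()
    project_cost_ceiling: Optional[int] = None  # minor currency units; None means no ceiling


def is_billable(policy: AppBillingPolicy, usage_unit: str, control_plane_component: bool) -> bool:
    """Decide whether a usage signal is billable under the given policy."""
    if usage_unit not in policy.enabled_usage_units:
        return False
    if control_plane_component and not policy.control_plane_billable:
        return False
    return True


def exceeds_ceiling(policy: AppBillingPolicy, accrued_cost: int, new_cost: int) -> bool:
    """Report whether adding new_cost would push a project past its ceiling."""
    if policy.project_cost_ceiling is None:
        return False
    return accrued_cost + new_cost > policy.project_cost_ceiling
```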

## Reconciliation Requirements
Billing correctness for platform apps must support:
1. timeline reconstruction by `correlation_id`
2. separation of control-plane and workload consumption where applicable
3. deterministic attribution to tenant and project
4. audit-safe corrections through new ledger entries only
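
The correction and timeline requirements can be illustrated with a toy append-only ledger. This is a stand-in for illustration, not the platform's actual ledger: `LedgerEntry`, `AppendOnlyLedger`, and their fields are assumptions. A correction never mutates an existing entry; it appends a reversal and a replacement that both point back to the original.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: int
    correlation_id: str
    project_id: str
    amount: int                              # signed, minor currency units
    corrects_entry_id: Optional[int] = None  # set on reversal and replacement entries


class AppendOnlyLedger:
    def __init__(self):
        self._entries = []

    def append(self, correlation_id, project_id, amount, corrects_entry_id=None):
        entry = LedgerEntry(len(self._entries) + 1, correlation_id, project_id,
                            amount, corrects_entry_id)
        self._entries.append(entry)
        return entry

    def correct(self, entry_id, new_amount):
        """Correct an entry by appending a reversal and a replacement."""
        original = self._entries[entry_id - 1]
        reversal = self.append(original.correlation_id, original.project_id,
                               -original.amount, corrects_entry_id=entry_id)
        replacement = self.append(original.correlation_id, original.project_id,
                                  new_amount, corrects_entry_id=entry_id)
        return reversal, replacement

    def timeline(self, correlation_id):
        """Return all entries for one correlation_id in append order."""
        return [e for e in self._entries if e.correlation_id == correlation_id]

    def net_amount(self, project_id):
        """Sum all entries attributed to a project, corrections included."""
        return sum(e.amount for e in self._entries if e.project_id == project_id)
```

Because the original entry is left in place, the timeline for a `correlation_id` shows both what was first recorded and how it was later corrected.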

## Examples

### Slurm
1. project-scoped Slurm control plane in `dev`
   - control-plane cost attributed to that project
   - job runtime attributed to the same project
2. tenant-scoped shared Slurm
   - tenant-owned controller and tenant-reserved capacity are charged to the tenant-shared runtime owner record
   - project-contributed worker capacity remains charged to the contributing source project
   - submitted jobs should remain attributable to the submitting project when the scheduler/runtime emits that signal

### First Tenant-Shared Billing Rule
For the first productized tenant-shared scheduler flow:
1. controller/control-plane allocations are billed to the tenant-owned shared runtime owner record
2. worker allocations contributed through an attached project remain billed to that source project
3. worker contribution must therefore preserve:
   - `source_project_id`
   - `attachment_id`
   - `allocation_id`
4. any later cross-project redistribution is a reporting/policy layer, not a rewrite of raw usage attribution
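
The rule above could be sketched as a single resolution function. The `Allocation` type, the `role` labels (`controller`, `worker`), and the returned owner kinds are hypothetical; only `source_project_id`, `attachment_id`, and `allocation_id` come from this document.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Allocation:
    allocation_id: str
    role: str                                # assumed labels: "controller" or "worker"
    source_project_id: Optional[str] = None
    attachment_id: Optional[str] = None


def billed_owner(allocation: Allocation, shared_runtime_owner_id: str) -> Tuple[str, str]:
    """Return (owner_kind, owner_id) for the raw usage attribution."""
    if allocation.role == "controller":
        # Controller and control-plane allocations bill to the tenant-owned
        # shared runtime owner record.
        return ("tenant_shared_runtime", shared_runtime_owner_id)
    if allocation.role == "worker":
        # Worker contributions must keep their source project and attachment.
        if not (allocation.source_project_id and allocation.attachment_id):
            raise ValueError(
                "worker allocation %s lacks source_project_id or attachment_id"
                % allocation.allocation_id
            )
        return ("project", allocation.source_project_id)
    raise ValueError("unknown allocation role: %s" % allocation.role)
```

Any cross-project redistribution would then be computed from these raw attributions in a separate reporting step, leaving the attributions themselves unchanged.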

### Model Serving
1. tenant-dedicated private model serving
   - serving instance uptime attributed to the owning project
   - request usage attributed to the same project unless tenant policy says otherwise
2. platform-managed inference tier
   - request usage attributed to project
   - shared platform overhead allocation defined by service tier policy

## Non-Negotiable Invariants
1. App billing must not invent a separate balance model.
2. App billing must not require app runtimes to write directly to the database; usage must flow only through public contracts and events.
3. Shared-service cost allocation must be explicit and explainable.
4. Project remains the primary usage attribution anchor even when runtime control planes are tenant- or platform-scoped.

## Related Docs
1. `doc/architecture/App_Control_Plane_v1.md`
2. `doc/architecture/App_Runtime_Operating_Modes_v1.md`
3. `doc/architecture/Scheduler_as_Platform_App_v1.md`
4. `doc/architecture/State_Machines.md`
5. `doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md`
