# Role and Policy Lifecycle Model (Review Baseline)

## Purpose

Define a single lifecycle model for:

1. built-in platform and tenant/project roles,
2. tenant-defined custom roles,
3. policy constraints whose decision interface is OPA-ready while OPA adoption itself is deferred.

This document is the pre-implementation baseline for feedback before endpoint/schema changes.

## Scope

In scope:

1. role taxonomy and boundaries (platform vs tenant/project),
2. role lifecycle rules (create/update/delete/assignment/audit),
3. policy lifecycle rules and decision interface,
4. phased rollout plan (MVP+1).

Out of scope:

1. full OPA/OPAL deployment,
2. enterprise IdP federation admin UX implementation details,
3. final API contract shapes for role CRUD.

## Design Principles

1. Ownership remains `tenant -> project -> resource`.
2. Role grants are permission-based, not UI-label-based.
3. Built-in roles are immutable and non-deletable.
4. Membership and role changes are soft-deleted and audited, never silently overwritten.
5. Authorization decision call shape must be stable before swapping to OPA.

## Current vs Target

Current runtime:

1. Platform role in `users.role` (`user` / `admin`) gates platform admin surfaces.
2. Tenant/project membership tables exist and are the ownership/authz baseline.

Target extension (this model):

1. Keep platform role layer for platform operations.
2. Add explicit tenant/project role lifecycle with custom roles.
3. Keep the same decision contract so in-code evaluation and OPA can share inputs/outputs.

## Platform Role Mapping and Cutover

### Transitional Mapping (while `users.role` is runtime source)

1. `users.role='admin'` maps to `platform_superadmin`.
2. `users.role='user'` maps to `platform_user`.
3. `platform_ops` is introduced via explicit binding/claims in Phase 2 and is not inferred from `users.role`.

### Cutover Rule

1. Phase 1: `users.role` remains authoritative for platform role checks.
2. Phase 2: platform role bindings become authoritative; `users.role` is compatibility/read-model only.
3. During transition, when both sources exist:
   - binding-based platform role wins,
   - a mismatch must emit an audit warning with `correlation_id`.
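
The transitional mapping and the "binding wins" rule can be sketched as below. This is a minimal illustration, not the runtime implementation: the function and type names are hypothetical, and the mismatch test (compare only superadmin status, so a `platform_ops` binding on a `users.role='user'` account is not flagged) is an assumption, because this document does not define how `platform_ops` compares with `users.role`.

```go
package main

import "fmt"

// PlatformRole names follow the built-in platform roles in this document.
type PlatformRole string

const (
	PlatformSuperadmin PlatformRole = "platform_superadmin"
	PlatformOps        PlatformRole = "platform_ops"
	PlatformUser       PlatformRole = "platform_user"
)

// mapLegacyRole applies the transitional mapping from users.role.
// Any value other than "admin" maps to platform_user (sketch assumption).
func mapLegacyRole(usersRole string) PlatformRole {
	if usersRole == "admin" {
		return PlatformSuperadmin
	}
	return PlatformUser
}

// ResolvePlatformRole returns the effective platform role during transition.
// binding is empty when no active platform role binding exists.
// The second result reports a mismatch between the two sources; the caller
// is expected to emit the audit warning with the correlation_id.
func ResolvePlatformRole(usersRole string, binding PlatformRole) (PlatformRole, bool) {
	legacy := mapLegacyRole(usersRole)
	if binding == "" {
		return legacy, false
	}
	// Assumption: only disagreement about superadmin status is a mismatch.
	mismatch := (legacy == PlatformSuperadmin) != (binding == PlatformSuperadmin)
	return binding, mismatch
}

func main() {
	role, mismatch := ResolvePlatformRole("admin", PlatformUser)
	fmt.Println(role, mismatch) // binding wins; mismatch should be audited
}
```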

## Role Taxonomy

### Platform Roles (Global)

Built-in:

1. `platform_superadmin`
2. `platform_ops`
3. `platform_user` (default non-platform-admin)

### Tenant Roles (Tenant Scope)

Built-in:

1. `tenant_owner`
2. `tenant_admin`
3. `tenant_member`
4. `tenant_billing_manager`
5. `tenant_billing_viewer`
6. `tenant_viewer`

### Project Roles (Project Scope)

Built-in:

1. `project_owner`
2. `project_admin`
3. `project_member`
4. `project_viewer`

### Service Identity Role

1. `service_account` is a project-scoped actor type.
2. Service accounts never inherit platform roles.

## Built-in Role Permission Baseline (MVP+1)

Permissions are `resource.action` keys.

| Role | Baseline permissions |
|---|---|
| `platform_superadmin` | `authorization.override.all` |
| `platform_ops` | `platform.ops.read`, `platform.ops.runbook.read`, `platform.node.read`, `platform.node.probe`, `platform.audit.read` |
| `platform_user` | no platform-admin permissions |
| `tenant_owner` | `tenant.user.invite`, `tenant.user.remove`, `tenant.role.assign`, `tenant.policy.write`, `tenant.project.create`, `tenant.billing.read`, `tenant.billing.write` |
| `tenant_admin` | `tenant.user.invite`, `tenant.user.remove`, `tenant.role.assign`, `tenant.project.read`, `tenant.project.update`, `tenant.billing.read` |
| `tenant_member` | `tenant.read`, `project.read`, `tenant.user.read` |
| `tenant_billing_manager` | `tenant.billing.read`, `tenant.billing.write`, `tenant.invoice.read` |
| `tenant_billing_viewer` | `tenant.billing.read`, `tenant.invoice.read` |
| `tenant_viewer` | `tenant.read` |
| `project_owner` | `project.role.assign`, `allocation.create`, `allocation.release`, `allocation.read`, `storage.read`, `storage.write`, `terminal.connect` |
| `project_admin` | `project.member.invite`, `allocation.create`, `allocation.release`, `allocation.read`, `storage.read`, `storage.write`, `terminal.connect` |
| `project_member` | `allocation.create`, `allocation.release`, `allocation.read`, `storage.read`, `storage.write`, `terminal.connect` |
| `project_viewer` | `allocation.read`, `storage.read` |

Notes:

1. This table is the implementation baseline and must be represented in code as permission sets.
2. Handler checks evaluate permission keys, not role-name string comparisons.
3. `authorization.override.all` is a reserved internal permission. It is an explicit allow-all override for platform-superadmin evaluation and is not treated as a prefix wildcard match.
4. `tenant.role.assign` authorizes assignment attempts, but the assignment ceiling is enforced separately by the assignment handler, which compares the grantor's highest active role against the target role.
5. `is_assignable_to_service_accounts` baseline for built-ins:
   - true: `project_member`, `project_viewer`
   - false: all platform roles, all tenant roles, `project_owner`, `project_admin`
6. Current runtime-enforced platform action keys are:
   - `platform.admin`
   - `platform.ops.read`
   - `platform.ops.runbook.read`
   - `platform.node.read`
   - `platform.audit.read`
   `platform.admin` is enforced at runtime but is not listed in the table above. The remaining keys in the table are staged for incremental handler adoption.
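
As one possible shape for notes 1 to 3, the baseline could be held as permission sets keyed by role, with handlers checking exact permission keys. The type and function names here are hypothetical, and only three rows of the table are reproduced to keep the sketch short.

```go
package main

import "fmt"

// PermissionSet is a set of exact resource.action keys.
type PermissionSet map[string]struct{}

// NewPermissionSet builds a set from the given keys.
func NewPermissionSet(keys ...string) PermissionSet {
	s := make(PermissionSet, len(keys))
	for _, k := range keys {
		s[k] = struct{}{}
	}
	return s
}

// Has reports exact-key membership. There is no prefix or wildcard matching,
// so authorization.override.all does not implicitly match other keys.
func (s PermissionSet) Has(key string) bool {
	_, ok := s[key]
	return ok
}

// builtinPermissions reproduces a subset of the baseline table.
var builtinPermissions = map[string]PermissionSet{
	"platform_superadmin": NewPermissionSet("authorization.override.all"),
	"tenant_viewer":       NewPermissionSet("tenant.read"),
	"project_viewer":      NewPermissionSet("allocation.read", "storage.read"),
}

// Authorize checks a permission key against a role's set; it never compares
// role-name strings to decide access.
func Authorize(role, permission string) bool {
	return builtinPermissions[role].Has(permission)
}

func main() {
	fmt.Println(Authorize("project_viewer", "storage.read"))
	fmt.Println(Authorize("project_viewer", "storage.write"))
}
```

The override permission is handled as a separate, explicit step in the merge algorithm rather than through `Has`.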

## Role Inheritance and Scope Rules

1. Inheritance is only within the same scope tier:
   - `tenant_owner` includes `tenant_admin` includes `tenant_member`.
   - `project_owner` includes `project_admin` includes `project_member` includes `project_viewer`.
2. No automatic tenant-to-project runtime grant:
   - tenant role alone does not grant project runtime access unless a project membership exists.
3. Platform override:
   - `platform_superadmin` may bypass tenant/project checks for explicit platform-admin endpoints.
4. Assignment ceiling rule:
   - Role assignment cannot exceed grantor authority.
   - `tenant_admin` cannot grant `tenant_owner`.
   - only `tenant_owner` (or platform override) can grant `tenant_owner`.
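
A minimal sketch of same-tier inheritance expansion and the assignment ceiling follows. The names are hypothetical. It covers only the two inheritance chains listed in rule 1; billing roles and `tenant_viewer` are not ranked by this document, so the sketch leaves them out. The `tenant.role.assign` permission check is assumed to happen separately.

```go
package main

import "fmt"

// includes maps each built-in role to the role it directly includes,
// following the two chains in rule 1.
var includes = map[string]string{
	"tenant_owner":   "tenant_admin",
	"tenant_admin":   "tenant_member",
	"project_owner":  "project_admin",
	"project_admin":  "project_member",
	"project_member": "project_viewer",
}

// Expand returns the role followed by every role it includes in its tier.
func Expand(role string) []string {
	out := []string{role}
	for next, ok := includes[role]; ok; next, ok = includes[next] {
		out = append(out, next)
	}
	return out
}

// CanGrant applies the assignment ceiling: the target role must be the
// grantor's highest active role or one it includes, unless a platform
// override applies.
func CanGrant(grantorHighestRole, targetRole string, platformOverride bool) bool {
	if platformOverride {
		return true
	}
	for _, r := range Expand(grantorHighestRole) {
		if r == targetRole {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Expand("project_owner"))
	fmt.Println(CanGrant("tenant_admin", "tenant_owner", false))
}
```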

## Built-in vs Custom Roles

### Built-in Roles

1. `is_builtin = true`, immutable ID and immutable baseline permission set.
2. Cannot be deleted.
3. Can be disabled only by platform policy control.

### Custom Roles (Tenant-Defined)

1. Scope-limited to tenant or project.
2. Editable permission set with versioning.
3. Soft-delete only (`deleted_at`, `deleted_by_user_id`, `delete_reason`).
4. No hard delete in runtime paths.

## Lifecycle Rules

### Role Definition Lifecycle

1. `create` -> role version `v1`.
2. `update` -> append new role version.
3. `disable` -> deny new assignments immediately; active bindings remain valid during the configured grace period.
4. `delete` -> soft-delete marker only (custom roles only).

Disable controls:

1. Default mode: graceful disable (`block_new_only`) with policy-controlled grace window.
2. Emergency mode: immediate disable (`block_all_now`) for security incidents.
3. Rollback: a role can be re-enabled by the same authority scope; the rollback must be audit-logged with a reason.
4. Planned policy key: `authorization.role_disable_grace_window_seconds` (not seeded yet in MVP baseline).
5. Until the key is seeded, runtime behavior is deterministic:
   - `block_new_only` requests are rejected as `invalid_request` (unconfigured policy),
   - only `block_all_now` is permitted.
6. Tracking requirement: add this key to `doc/architecture/Seed_Data_Spec.md`, `scripts/seed.sql`, and queue before enabling graceful disable in production.
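
The deterministic behavior in item 5 could be validated roughly as follows. The function name and the use of a nil pointer to represent the unseeded policy key are assumptions of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// DisableMode mirrors the two disable modes described above.
type DisableMode string

const (
	BlockNewOnly DisableMode = "block_new_only"
	BlockAllNow  DisableMode = "block_all_now"
)

// ErrInvalidRequest corresponds to the invalid_request rejection.
var ErrInvalidRequest = errors.New("invalid_request")

// ValidateDisable checks a disable request against the seeded policy.
// graceWindowSeconds is nil while
// authorization.role_disable_grace_window_seconds has not been seeded.
func ValidateDisable(mode DisableMode, graceWindowSeconds *int) error {
	switch mode {
	case BlockAllNow:
		return nil // emergency mode never depends on the grace window
	case BlockNewOnly:
		if graceWindowSeconds == nil {
			return fmt.Errorf("%w: unconfigured policy", ErrInvalidRequest)
		}
		return nil
	default:
		return fmt.Errorf("%w: unknown disable mode %q", ErrInvalidRequest, mode)
	}
}

func main() {
	fmt.Println(ValidateDisable(BlockNewOnly, nil))
	fmt.Println(ValidateDisable(BlockAllNow, nil))
}
```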

### Assignment Semantics (Versioning)

1. Assignments are pinned to `role_version_id` at grant time.
2. Role updates do not auto-upgrade existing assignments.
3. Optional bulk-upgrade operation can rebind assignments to a newer version (audit required).
4. Bulk-upgrade authority:
   - tenant-scope roles: `tenant_owner` or platform override.
   - project-scope roles: `project_owner` or platform override.
5. Bulk-upgrade execution requires explicit target role, from-version, to-version, reason, and `correlation_id` in audit metadata.
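
The required bulk-upgrade metadata in item 5 might be validated before execution along these lines. The struct and field names are hypothetical, versions are assumed to be integers starting at 1, and the authority check from item 4 is out of scope for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// BulkUpgradeRequest carries the audit metadata required for a bulk upgrade.
type BulkUpgradeRequest struct {
	RoleDefinitionID string // explicit target role
	FromVersion      int
	ToVersion        int
	Reason           string
	CorrelationID    string
}

// Validate rejects requests that lack any required field or that do not
// move assignments to a newer version.
func (r BulkUpgradeRequest) Validate() error {
	switch {
	case r.RoleDefinitionID == "":
		return errors.New("target role is required")
	case r.FromVersion < 1:
		return errors.New("from-version is required")
	case r.ToVersion <= r.FromVersion:
		return errors.New("to-version must be newer than from-version")
	case r.Reason == "":
		return errors.New("reason is required")
	case r.CorrelationID == "":
		return errors.New("correlation_id is required")
	}
	return nil
}

func main() {
	req := BulkUpgradeRequest{RoleDefinitionID: "role-1", FromVersion: 1, ToVersion: 2, Reason: "add storage.read"}
	fmt.Println(req.Validate()) // missing correlation_id
}
```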

### Membership Binding Lifecycle

1. A membership grant creates an active binding row.
2. A membership revoke sets `deleted_at` fields (soft delete).
3. Authorization reads only active rows (`deleted_at is null`).
4. Every grant/revoke/change writes an audit log entry with `correlation_id`.
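
An in-memory sketch of the soft-delete rules is shown below. It stands in for the database rows; the type, the package-level clock, and the function names are hypothetical, and the audit write from item 4 is omitted.

```go
package main

import (
	"fmt"
	"time"
)

// now is a replaceable clock so the revoke time can be controlled in tests.
var now = time.Now

// Binding is a simplified membership binding row.
type Binding struct {
	PrincipalID   string
	RoleVersionID string
	DeletedAt     *time.Time // nil means the binding is active
}

// Revoke soft-deletes the binding. It never removes the row, and revoking
// an already revoked binding keeps the original timestamp.
func (b *Binding) Revoke() {
	if b.DeletedAt == nil {
		t := now()
		b.DeletedAt = &t
	}
}

// Active returns only bindings with no deleted_at marker, which is the
// set that authorization is allowed to read.
func Active(bindings []Binding) []Binding {
	var out []Binding
	for _, b := range bindings {
		if b.DeletedAt == nil {
			out = append(out, b)
		}
	}
	return out
}

func main() {
	bindings := []Binding{
		{PrincipalID: "u-1", RoleVersionID: "rv-1"},
		{PrincipalID: "u-2", RoleVersionID: "rv-1"},
	}
	bindings[0].Revoke()
	fmt.Println(len(bindings), len(Active(bindings)))
}
```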

### Break-Glass Lifecycle

1. Break-glass elevation is time-bound with explicit reason.
2. In MVP+1, only `platform_superadmin` can grant break-glass elevation.
3. Expiry auto-revokes elevation.
4. All break-glass actions are high-severity audit events.
5. Break-glass implementation phase: Phase 3 (same phase as OPA cutover preparation).
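
A time-bound elevation could be modeled as a record with an expiry that is checked on every read, so expiry revokes access without a separate cleanup job. This is only a sketch under that assumption; the names are hypothetical and the high-severity audit event is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// now is a replaceable clock.
var now = time.Now

// Elevation is a break-glass grant with an explicit reason and expiry.
type Elevation struct {
	GrantorRole string
	Reason      string
	ExpiresAt   time.Time
}

// GrantElevation creates an elevation that expires ttl after the current time.
// Only platform_superadmin may grant, and a reason and positive ttl are required.
func GrantElevation(grantorRole, reason string, ttl time.Duration) (Elevation, error) {
	if grantorRole != "platform_superadmin" {
		return Elevation{}, errors.New("only platform_superadmin can grant break-glass elevation")
	}
	if reason == "" {
		return Elevation{}, errors.New("reason is required")
	}
	if ttl <= 0 {
		return Elevation{}, errors.New("elevation must be time-bound")
	}
	return Elevation{GrantorRole: grantorRole, Reason: reason, ExpiresAt: now().Add(ttl)}, nil
}

// ActiveAt reports whether the elevation is still in force at time t.
func (e Elevation) ActiveAt(t time.Time) bool {
	return t.Before(e.ExpiresAt)
}

func main() {
	e, err := GrantElevation("platform_superadmin", "incident response", 15*time.Minute)
	fmt.Println(err, e.ActiveAt(now()))
}
```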

### Custom Role Governance

1. Tenant-scoped custom roles can be created/updated/deleted by `tenant_owner`.
2. Project-scoped custom roles can be created/updated/deleted by `project_owner`.
3. `platform_superadmin` can perform override operations.

## Policy Lifecycle (OPA-Ready, OPA-Deferred)

### Policy Scope

1. Runtime scope chain stays `global -> tenant -> department -> project` (most-specific wins).
2. Roles grant baseline permissions.
3. Policy adds constraints or denies based on context attributes.

### Decision Interface (Stable Contract)

Input:

1. actor (`user_id` or `service_account_id`, actor type),
2. platform role,
3. tenant/department/project scope context, memberships, and resolved permissions,
4. action,
5. resource descriptor (`resource_name`, type, owner tenant/department/project),
6. request attributes (region, sku, time, flags).

Output:

1. `allow` / `deny`,
2. `reason_code`,
3. `applied_scope` (`global|tenant|department|project`),
4. `policy_source` (`in_code|platform_policy_values|opa`).

Reason codes (baseline enum):

1. `permission_denied`
2. `membership_missing`
3. `scope_mismatch`
4. `policy_constraint_denied`
5. `role_disabled`
6. `actor_disabled`

Implementation rule:

1. This input/output shape is mandatory in MVP+1 even while evaluation remains in Go.
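
One way to pin the contract in Go is a small interface that both the in-code evaluator and a later OPA adapter can satisfy. The type and field names below are illustrative assumptions; final shapes are out of scope for this baseline.

```go
package main

import "fmt"

// Actor identifies a user or a service account.
type Actor struct {
	ID   string
	Type string // "user" or "service_account"
}

// ScopeContext carries the tenant/department/project identifiers.
type ScopeContext struct {
	TenantID, DepartmentID, ProjectID string
}

// Resource describes the target resource and its owning scope.
type Resource struct {
	Name, Type string
	Owner      ScopeContext
}

// DecisionInput mirrors the six input items listed above.
type DecisionInput struct {
	Actor        Actor
	PlatformRole string
	Scope        ScopeContext
	Memberships  []string // hypothetical encoding of active memberships
	Permissions  []string // resolved permission keys
	Action       string
	Resource     Resource
	Attributes   map[string]string // region, sku, time, flags
}

// Decision mirrors the four output items listed above.
type Decision struct {
	Allow        bool
	ReasonCode   string // one of the baseline reason codes on deny
	AppliedScope string // global|tenant|department|project
	PolicySource string // in_code|platform_policy_values|opa
}

// Decider is the stable contract shared by every evaluation backend.
type Decider interface {
	Decide(in DecisionInput) Decision
}

// DeciderFunc adapts a plain function to the Decider interface.
type DeciderFunc func(DecisionInput) Decision

// Decide calls the wrapped function.
func (f DeciderFunc) Decide(in DecisionInput) Decision { return f(in) }

func main() {
	var d Decider = DeciderFunc(func(in DecisionInput) Decision {
		return Decision{ReasonCode: "permission_denied", AppliedScope: "project", PolicySource: "in_code"}
	})
	fmt.Printf("%+v\n", d.Decide(DecisionInput{Action: "storage.write"}))
}
```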

### Deterministic Authorization Merge Algorithm

1. Deny if actor is disabled (`actor_disabled`).
2. If `authorization.override.all` is present and action is override-eligible, allow immediately and record policy source `in_code`.
3. Resolve active tenant/project memberships and project department context for requested scope; deny on missing required scope (`membership_missing`).
4. Resolve active bound roles in scope and expand inherited roles within same scope tier.
5. Build effective permission set as union of resolved role permission sets.
6. If action not present in effective set, deny (`permission_denied`).
7. Apply policy constraints on granted action.
8. Scope precedence: project -> department -> tenant -> global (most-specific wins).
9. Conflict rule: explicit policy deny overrides role grant (`policy_constraint_denied`).
10. Precedence rule for superadmin override: when step 2 matches (`authorization.override.all` + `override_eligible=true`), that allow decision is final and is not overridden by step 9.
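
The ordered steps above can be sketched as a single function. This is a simplified illustration with hypothetical names: role resolution and inheritance (steps 4 and 5) are assumed to have already produced `Effective`, policy constraints are reduced to one explicit allow or deny per scope, and the `policy_source` values chosen for each branch are assumptions.

```go
package main

import "fmt"

// Decision is a reduced form of the decision output.
type Decision struct {
	Allow        bool
	ReasonCode   string
	AppliedScope string
	PolicySource string
}

// Request holds pre-resolved inputs for the merge.
type Request struct {
	ActorDisabled    bool
	HasOverrideAll   bool            // actor holds authorization.override.all
	OverrideEligible bool            // action is marked override_eligible=true
	HasMembership    bool            // active membership in the requested scope
	Effective        map[string]bool // union of resolved role permissions
	Action           string
	PolicyDeny       map[string]bool // scope -> true (deny) or false (explicit allow)
}

// precedence lists scopes from most to least specific.
var precedence = []string{"project", "department", "tenant", "global"}

func deny(reason string) Decision {
	return Decision{ReasonCode: reason, PolicySource: "in_code"}
}

// Evaluate applies the merge steps in order.
func Evaluate(r Request) Decision {
	if r.ActorDisabled { // step 1
		return deny("actor_disabled")
	}
	if r.HasOverrideAll && r.OverrideEligible { // steps 2 and 10: final allow
		return Decision{Allow: true, PolicySource: "in_code"}
	}
	if !r.HasMembership { // step 3
		return deny("membership_missing")
	}
	if !r.Effective[r.Action] { // step 6
		return deny("permission_denied")
	}
	for _, scope := range precedence { // steps 7 to 9: most-specific wins
		denied, defined := r.PolicyDeny[scope]
		if !defined {
			continue
		}
		if denied {
			return Decision{ReasonCode: "policy_constraint_denied", AppliedScope: scope, PolicySource: "platform_policy_values"}
		}
		return Decision{Allow: true, AppliedScope: scope, PolicySource: "platform_policy_values"}
	}
	return Decision{Allow: true, PolicySource: "in_code"}
}

func main() {
	d := Evaluate(Request{
		HasMembership: true,
		Effective:     map[string]bool{"storage.read": true},
		Action:        "storage.read",
		PolicyDeny:    map[string]bool{"tenant": true},
	})
	fmt.Printf("%+v\n", d)
}
```

Note that under "most-specific wins", an explicit project-level allow in this sketch takes precedence over a tenant-level deny; whether that is the intended reading of steps 8 and 9 is worth confirming in review.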

Override-eligible source of truth:

1. Override eligibility is defined by an explicit action metadata registry (`override_eligible: true|false`, default false).
2. Runtime handlers must resolve action identity from the same action registry consumed by authorization evaluation.
3. Superadmin bypass is allowed only for actions explicitly marked `override_eligible=true`; no implicit endpoint-path heuristics are permitted.
4. Registry home: code-owned static registry at `packages/shared/authz/action_registry.go`; changes are code-reviewed and not runtime-configurable in MVP+1.
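
A code-owned static registry with a default of `false` might look like the sketch below. The entries and their eligibility markings are hypothetical examples, not the contents of `packages/shared/authz/action_registry.go`.

```go
package main

import "fmt"

// ActionMeta holds per-action metadata; the zero value means not eligible.
type ActionMeta struct {
	OverrideEligible bool
}

// actionRegistry is static and changed only through code review.
// The markings below are illustrative assumptions.
var actionRegistry = map[string]ActionMeta{
	"platform.audit.read": {OverrideEligible: true},
	"allocation.create":   {},
}

// Registered reports whether the action identity exists in the registry.
func Registered(action string) bool {
	_, ok := actionRegistry[action]
	return ok
}

// OverrideEligible returns false for unmarked and unknown actions, so a
// superadmin bypass is never inferred from anything but an explicit marking.
func OverrideEligible(action string) bool {
	return actionRegistry[action].OverrideEligible
}

func main() {
	fmt.Println(OverrideEligible("platform.audit.read"))
	fmt.Println(OverrideEligible("allocation.create"))
	fmt.Println(OverrideEligible("not.registered"))
}
```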

## Service Account Permission Rules

1. Service accounts use the same permission key model (`resource.action`).
2. Service accounts may be bound only to project-scope built-in/custom roles.
3. Service accounts cannot be granted platform roles or break-glass elevation.
4. Service accounts cannot call platform-admin endpoints.
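
Rule 2, combined with the `is_assignable_to_service_accounts` flag, could be enforced at binding time as in this sketch. The `Role` struct and function name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Role is a reduced view of a role definition.
type Role struct {
	Name                        string
	ScopeType                   string // "platform", "tenant", or "project"
	AssignableToServiceAccounts bool
}

// ValidateServiceAccountBinding rejects any role that is not project-scoped
// or that is not flagged as assignable to service accounts.
func ValidateServiceAccountBinding(r Role) error {
	if r.ScopeType != "project" {
		return errors.New("service accounts may be bound only to project-scope roles")
	}
	if !r.AssignableToServiceAccounts {
		return fmt.Errorf("role %q is not assignable to service accounts", r.Name)
	}
	return nil
}

func main() {
	member := Role{Name: "project_member", ScopeType: "project", AssignableToServiceAccounts: true}
	owner := Role{Name: "project_owner", ScopeType: "project"}
	fmt.Println(ValidateServiceAccountBinding(member))
	fmt.Println(ValidateServiceAccountBinding(owner))
}
```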

## Data Model Direction

Implemented baseline tables:

1. `role_definitions` (`id`, `scope_type`, `scope_id`, `name`, `is_builtin`, `is_assignable_to_service_accounts`, `state`, `current_version_id`).
2. `role_definition_versions` (`id`, `role_definition_id`, `version`, `created_at`, `created_by_user_id`).
3. `role_permissions` (`role_version_id`, `permission_key`, `effect`).
4. `platform_role_bindings` (`id`, `principal_type`, `principal_id`, `role_definition_id`, `created_at`, `deleted_at`).
5. `tenant_role_bindings` (`id`, `tenant_id`, `principal_type`, `principal_id`, `role_version_id`, `created_at`, `deleted_at`).
6. `project_role_bindings` (`id`, `project_id`, `principal_type`, `principal_id`, `role_version_id`, `created_at`, `deleted_at`).

Constraint direction:

1. Tenant/project bindings: unique active binding per principal/scope/role version.
2. Platform bindings: unique active binding per (`principal_type`, `principal_id`, `role_definition_id`).
3. FK from tenant/project bindings to role version and owner scope.
4. Custom role rows must reference tenant/project scope owner.
5. Bindings extend membership anchors; they do not replace `tenant_memberships` or `project_memberships`.
6. `role_definitions.current_version_id` must reference a version row for the same `role_definition_id`.
7. `platform_role_bindings` references `role_definition_id` directly because platform built-in roles are immutable and do not participate in role-version lifecycle.
8. `platform_role_bindings` is the Phase 2 authority source for platform roles.

## Rollout Plan

### Phase 1 (Near-Term)

1. Keep current platform role gating (`users.role`) intact.
2. Add explicit role-permission mapping in code for built-in tenant/project roles.
3. Seed test users for:
   - platform superadmin,
   - tenant admin,
   - project member,
   - project viewer.

### Phase 2 (Extension in Same Track)

1. Add platform role bindings as authority source and enable `platform_ops` runtime role path.
2. Seed platform-ops test users via binding path.
3. Use platform-role management APIs for bind/revoke/list as primary control plane path.
   Controlled scripts remain break-glass fallback only:
   - `scripts/ops/bind_platform_role.sh`
   - `scripts/ops/revoke_platform_role.sh`
   (correlation-id required, audit write required; no ad-hoc SQL).
4. Add custom role definitions and bindings.
5. Add role versioning and soft-delete lifecycle.
6. Keep decision interface unchanged.

### Phase 3 (Later)

1. Introduce OPA engine behind existing decision interface.
2. Run shadow-mode decision parity checks.
3. Flip enforcement after parity SLO is met.
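
Shadow-mode parity checking can be sketched as a wrapper that always enforces the existing decision while recording how often the candidate engine disagrees. The names are hypothetical, the decision type is reduced, the parity SLO threshold is not defined by this document, and the counters are not safe for concurrent use.

```go
package main

import "fmt"

// Decision is a reduced decision output used for comparison.
type Decision struct {
	Allow      bool
	ReasonCode string
}

// Decider evaluates a single action; a stand-in for the full contract.
type Decider func(action string) Decision

// Shadow enforces Primary and compares it against Candidate.
type Shadow struct {
	Primary    Decider // current in-code evaluation (enforced)
	Candidate  Decider // OPA-backed evaluation (observed only)
	Total      int
	Mismatches int
}

// Decide returns the primary decision and records any disagreement.
func (s *Shadow) Decide(action string) Decision {
	p := s.Primary(action)
	c := s.Candidate(action)
	s.Total++
	if p != c {
		s.Mismatches++
	}
	return p
}

// Parity returns the fraction of decisions on which both engines agreed.
func (s *Shadow) Parity() float64 {
	if s.Total == 0 {
		return 1
	}
	return float64(s.Total-s.Mismatches) / float64(s.Total)
}

func main() {
	s := &Shadow{
		Primary:   func(a string) Decision { return Decision{Allow: a == "storage.read"} },
		Candidate: func(a string) Decision { return Decision{Allow: true} },
	}
	s.Decide("storage.read")
	s.Decide("storage.write")
	fmt.Println(s.Total, s.Mismatches, s.Parity())
}
```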

## Observability and Audit Requirements

Every authorization failure or privileged role mutation should include:

1. `correlation_id`
2. `actor_type`
3. `actor_id`
4. `platform_role`
5. `tenant_id`
6. `project_id`
7. `resource_name` (when applicable)
8. `reason_code`
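
For illustration, the eight fields could be carried in one structured event type. The struct name and the choice of JSON encoding are assumptions; only the field names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AuthzAuditEvent carries the fields required on every authorization
// failure or privileged role mutation.
type AuthzAuditEvent struct {
	CorrelationID string `json:"correlation_id"`
	ActorType     string `json:"actor_type"`
	ActorID       string `json:"actor_id"`
	PlatformRole  string `json:"platform_role"`
	TenantID      string `json:"tenant_id"`
	ProjectID     string `json:"project_id"`
	ResourceName  string `json:"resource_name,omitempty"` // when applicable
	ReasonCode    string `json:"reason_code"`
}

// JSON encodes the event as a single JSON object.
func (e AuthzAuditEvent) JSON() (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	e := AuthzAuditEvent{
		CorrelationID: "c-1",
		ActorType:     "user",
		ActorID:       "u-1",
		PlatformRole:  "platform_user",
		TenantID:      "t-1",
		ProjectID:     "p-1",
		ReasonCode:    "permission_denied",
	}
	s, err := e.JSON()
	fmt.Println(s, err)
}
```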

## Decision Status for This Baseline

1. `platform_ops` is part of MVP+1 built-ins.
2. Billing split is active in model (`tenant_billing_manager` and `tenant_billing_viewer`).
3. Phase 2 supports both tenant-level and project-level custom roles.

## References

1. `doc/architecture/Tenant_Project_Ownership_Baseline.md`
2. `doc/architecture/User_Onboarding_Model.md`
3. `doc/architecture/Service_Account_Model.md`
4. `doc/architecture/adrs/ADR-004-identity-authz-model.md`
5. `doc/governance/Assumptions_Register.md` (A-015)