# Platform IAM Model v1

## Purpose

Define the platform IAM model in terms of:

1. what the platform already implements,
2. what should be treated as the canonical long-term model,
3. what must still be built or modified to avoid hardening the wrong assumptions.

This document is intentionally not a Keycloak design doc. In GPUaaS, Keycloak is an authentication and federation component, not the authoritative product IAM model.

## Core Design Rule

Model IAM in three dimensions:

1. resource hierarchy
2. subject model
3. scoped role bindings over capability bundles

Do not start from a large catalog of named roles. Default roles are product bundles built on top of capability families.

## Boundary: What Keycloak Does vs What Platform IAM Owns

### Keycloak in current GPUaaS

Keycloak is currently responsible for:

1. OIDC login and auth-code exchange
2. refresh-token exchange
3. logout / token revocation
4. JWT issuance for browser/API sessions
5. JWKS publication for token validation
6. identity federation entry point for OIDC/SAML-style flows
7. human MFA enrollment and enforcement when MFA is enabled in the
   authentication flow

Keycloak is not the authoritative source for:

1. tenant/project membership
2. tenant/project/platform scoped role bindings
3. service-account ownership and platform authorization
4. scoped audit visibility
5. project/tenant governance semantics
6. service-account/API-key authorization or rotation policy

The platform database is the product IAM authority.

MFA boundary:

1. Keycloak verifies human factors such as TOTP and WebAuthn/passkeys.
2. GPUaaS exposes MFA posture and effective requirement through the existing
   V3 account security read model at `GET /api/v1/v3/account/security`.
3. The product UI extends the existing `/account/security` surface; it must not
   collect MFA secrets or create a parallel MFA page.
4. Sensitive-operation MFA gates may use token/session claims such as `amr` or
   `acr` only after the Keycloak realm proves those claims are reliable.
5. Service accounts and API keys are not MFA subjects.
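
The MFA boundary rules above can be illustrated with a small gate function. This is a minimal sketch, not the platform's implementation: `SessionClaims`, `mfa_gate_allows`, the `realm_claims_trusted` flag, and the chosen `amr` values are all hypothetical, and failing closed when the realm's claims are not yet trusted is an assumption rather than a stated requirement.

```python
from dataclasses import dataclass

# Hypothetical set of `amr` values treated as evidence of a second factor.
# Which values the Keycloak realm actually emits is an assumption to verify.
STRONG_AMR_VALUES = frozenset({"otp", "hwk", "mfa"})


@dataclass(frozen=True)
class SessionClaims:
    principal_type: str            # "human" or "service_account"
    amr: tuple[str, ...] = ()      # authentication methods from the token, if any


def mfa_gate_allows(claims: SessionClaims, realm_claims_trusted: bool) -> bool:
    """Decide whether a sensitive operation passes the MFA gate."""
    if claims.principal_type != "human":
        # Service accounts and API keys are not MFA subjects; this gate does
        # not apply to them (platform authorization still does).
        return True
    if not realm_claims_trusted:
        # Assumption: fail closed until the realm proves `amr`/`acr` reliable.
        return False
    return any(value in STRONG_AMR_VALUES for value in claims.amr)
```

Passing this gate says nothing about authorization; it only answers whether MFA evidence is sufficient for the session.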

## Canonical Objects

### 1. Principals

Canonical product actors.

Principal types:

1. `human`
2. `service_account`
3. `group`

Future:

1. external group references
2. workload identities if they become first-class beyond service accounts

### 2. External Identity Bindings

Authentication anchors attached to principals.

Examples:

1. OIDC issuer + subject
2. local password credential
3. tenant federation provider binding

These are authn bindings, not authorization truth.
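
As a rough illustration of what "authn binding, not authorization truth" means, the sketch below resolves an authenticated OIDC identity to a principal and does nothing else. The `ExternalIdentityBinding` type, the `resolve_principal` helper, and the ID formats are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ExternalIdentityBinding:
    principal_id: str   # immutable platform principal (hypothetical ID format)
    issuer: str         # OIDC issuer URL
    subject: str        # OIDC `sub` value, unique only within its issuer


def resolve_principal(
    bindings: Iterable[ExternalIdentityBinding], issuer: str, subject: str
) -> Optional[str]:
    """Map an authenticated (issuer, subject) pair to a principal ID, if bound."""
    for binding in bindings:
        if binding.issuer == issuer and binding.subject == subject:
            return binding.principal_id
    return None
```

The lookup key is the pair, not the subject alone, so one principal can hold bindings from several providers. What the resolved principal may do is decided separately by memberships and role bindings.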

### 3. Memberships

Memberships place a principal into tenant/project scope.

Current conceptual model:

1. tenant membership
2. project membership

Memberships are the current authorization root for tenant/project access.

### 4. Role Bindings

Role bindings attach a subject to a role bundle at a scope.

Scope hierarchy:

1. `platform`
2. `tenant`
3. `project`
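
A role binding can be pictured as a small record of subject, role bundle, and scope. The following is a sketch under assumptions: the `RoleBinding` fields and the `bindings_at` helper are hypothetical and do not describe the existing `*_role_bindings` tables.

```python
from dataclasses import dataclass
from typing import Optional

SCOPE_TYPES = ("platform", "tenant", "project")  # the scope hierarchy above


@dataclass(frozen=True)
class RoleBinding:
    subject_id: str            # principal receiving the binding (hypothetical ID)
    role: str                  # name of a role bundle, e.g. "tenant_admin"
    scope_type: str            # one of SCOPE_TYPES
    scope_id: Optional[str]    # None for platform scope, else a tenant/project ID

    def __post_init__(self) -> None:
        if self.scope_type not in SCOPE_TYPES:
            raise ValueError(f"unknown scope type: {self.scope_type}")
        if (self.scope_type == "platform") != (self.scope_id is None):
            raise ValueError("platform bindings take no scope_id; others require one")


def bindings_at(bindings, subject_id, scope_type, scope_id=None):
    """Return the subject's bindings made at exactly this scope (no inheritance)."""
    return [
        b for b in bindings
        if b.subject_id == subject_id
        and b.scope_type == scope_type
        and b.scope_id == scope_id
    ]
```

The lookup deliberately matches the exact scope only. Whether and how a binding at a parent scope applies to child scopes is a separate policy decision.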

### 5. Role Bundles / Capability Sets

Roles should be treated as named bundles over capabilities, not the base model.

Examples of capability families:

1. `iam.*`
2. `billing.*`
3. `ops.*`
4. `project.*`
5. `resource.*`
6. `audit.*`
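
To make "roles are bundles over capabilities" concrete, the sketch below treats each role as a set of capability patterns and checks a requested capability against them. The bundle contents and concrete capability names such as `billing.invoice.read` are hypothetical; only the family prefixes come from the list above.

```python
from fnmatch import fnmatchcase

# Hypothetical bundle contents, for illustration only.
ROLE_BUNDLES = {
    "tenant_billing_admin": frozenset({"billing.*"}),
    "tenant_iam_admin": frozenset({"iam.*", "audit.read"}),
}


def role_grants(role: str, capability: str) -> bool:
    """Return True if any pattern in the role's bundle matches the capability."""
    patterns = ROLE_BUNDLES.get(role, frozenset())
    return any(fnmatchcase(capability, pattern) for pattern in patterns)
```

With these example bundles, `role_grants("tenant_billing_admin", "billing.invoice.read")` is true while `role_grants("tenant_billing_admin", "iam.member.add")` is false, which keeps billing authority separate from IAM authority.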

### 6. Invitations

Invitation flows should be first-class IAM objects, not implicit user creation side effects.

Examples:

1. tenant invite
2. project invite
3. tenant-admin invite

### 7. Integration References

External tenant-owned systems should appear as integrations, not as platform-owned user stores.

Examples:

1. tenant Kubernetes cluster integration
2. tenant database integration
3. tenant external IdP configuration

The platform may store integration metadata and delegated credential references, but it should not mirror the external system's full user/role model.

## Resource Hierarchy

Canonical hierarchy:

1. `platform`
2. `tenant`
3. `project`

This hierarchy defines where authority is bound.

Important rule:

Hierarchy does not imply universal content visibility.

Example:

1. a `tenant_admin` may be allowed to create/delete projects
2. that does not automatically mean they can inspect all data/content inside every child project

Management rights and content visibility must remain separable.
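
One way to keep management rights and content visibility separable is to let only management capabilities flow from a tenant-scope grant to child projects. The sketch below assumes that split: the `Grant` type, the `tenant:`/`project:` scope key format, and the choice of `project.` as the only inheritable prefix are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    scope: str        # e.g. "tenant:t1" or "project:p1" (hypothetical key format)
    capability: str   # e.g. "project.delete" or "resource.read"


# Assumption: only project-management capabilities are inherited by child
# projects; content capabilities such as "resource.*" are never inherited.
INHERITABLE_PREFIXES = ("project.",)


def allowed(grants, tenant_scope: str, project_scope: str, capability: str) -> bool:
    """Check a capability on a project, given grants at tenant and project scope."""
    for grant in grants:
        if grant.capability != capability:
            continue
        if grant.scope == project_scope:
            return True  # granted directly on the project
        if grant.scope == tenant_scope and capability.startswith(INHERITABLE_PREFIXES):
            return True  # management right inherited from the tenant
    return False
```

Under this sketch a tenant-scope `project.delete` grant works on every child project, but a tenant-scope `resource.read` grant does not expose project content; that requires a grant on the project itself.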

## Subject Model

Subjects are the principals or group-like identities that receive bindings.

Initial subject types:

1. user (principal type `human`)
2. service account (principal type `service_account`)
3. group (principal type `group`)

The platform should not assume a single global human username namespace as the main identity boundary.

Safer model:

1. immutable principal identity
2. tenant membership as the real product access boundary
3. project membership nested under the tenant scope

## Default Role Families

These are default product bundles, not the full permission grammar.

### Platform scope

1. `platform_admin`
2. `platform_ops`
3. `platform_viewer`
4. `platform_iam_admin`
5. `platform_billing_admin`

### Tenant scope

1. `tenant_owner`
2. `tenant_admin`
3. `tenant_ops`
4. `tenant_viewer`
5. `tenant_iam_admin`
6. `tenant_billing_admin`

### Project scope

1. `project_owner`
2. `project_admin`
3. `project_operator`
4. `project_member`
5. `project_viewer`

Important rule:

These defaults should be built from capability bundles and kept small. The platform should not expose an AWS-style explosion of role labels as the primary mental model.

## Capability Separation Rules

The model must support these separations:

1. read vs mutate
2. management rights vs content visibility
3. IAM authority vs billing authority vs ops authority
4. tenant-wide governance vs project-local authoring

Examples:

1. `platform_ops` may investigate incidents and use admin read surfaces without having full IAM mutation rights.
2. `tenant_admin` may manage tenant users and projects without automatically seeing all project content.
3. `tenant_billing_admin` may view or manage billing without holding general tenant IAM authority.
4. `project_admin` may manage project members and service accounts without tenant-level user governance.

## External System Identity Boundary

For tenant-owned infra systems such as Kubernetes or databases:

1. if the external system has its own SSO or IAM model, that remains the tenant's responsibility
2. GPUaaS may store an integration reference or delegated credential/configuration if needed
3. GPUaaS should not try to become the canonical IAM model for tenant-owned external systems

So:

1. platform IAM owns platform principals, memberships, and platform-managed identities
2. tenant-owned infra IAM stays external

## What Exists Today

### Already implemented or partially implemented

1. `users`
   - stores product users
   - includes `oidc_issuer` and `oidc_subject`
   - still has transitional `role` and `org_id` fields

2. `tenant_memberships`
   - tenant-scoped membership baseline exists

3. `project_memberships`
   - project-scoped membership baseline exists

4. `tenant_identity_providers`
   - tenant OIDC/SAML provider config exists in schema

5. `tenant_federation_domain_bindings`
   - tenant federation domain binding exists in schema

6. `auth_federation_states`
   - provider/org-bound auth flow state exists

7. `role_definitions`
8. `role_definition_versions`
9. `platform_role_bindings`
10. `tenant_role_bindings`
11. `project_role_bindings`
    - together, items 7 through 11 already give the platform the skeleton for a richer scoped role-binding model

12. `service_accounts`
    - project-scoped service-account model exists today
    - tenant-owned shared runtimes still need a separate delegated machine-identity model

13. scoped access-credential model
    - useful as a related platform primitive, but separate from IAM role design

### Existing documented direction

Relevant docs already move in this direction:

1. [Role_and_Policy_Lifecycle_Model.md](./Role_and_Policy_Lifecycle_Model.md)
2. [User_Onboarding_Model.md](./User_Onboarding_Model.md)
3. [ADR-008-tenant-project-ownership-baseline.md](./adrs/ADR-008-tenant-project-ownership-baseline.md)
4. [ADR-010-tenant-federation-sso-model.md](./adrs/ADR-010-tenant-federation-sso-model.md)

## Current Gaps / Mismatches

### 1. Global username uniqueness is still too strong

Current schema:

1. `users.username text not null unique` makes usernames globally unique
2. a partial MVP constraint on `tenant_memberships(user_id)` still limits each user to a single active tenant membership

Both constraints are too restrictive for the intended tenant-scoped identity model.

### 2. `users.role` is still transitional and too coarse

The current `users.role` field supports only two values:

1. `user`
2. `admin`

This is insufficient for:

1. platform read-only admin visibility
2. ops-only investigation
3. tenant IAM admin
4. tenant billing admin
5. project admin/operator separation

### 3. Role-binding model exists but is not authoritative

The schema supports richer role bindings, but most runtime behavior still depends on:

1. membership tables
2. coarse `users.role`
3. endpoint-specific assumptions

### 4. No first-class invitation model yet

IAM needs invitation and delegated onboarding as explicit objects/workflows.

### 5. No first-class group model yet

Groups are part of the target model (they are listed as a principal type) but are not yet implemented as platform IAM primitives.

### 6. No formal external identity binding object beyond current OIDC fields

Current `oidc_issuer` / `oidc_subject` fields work, but richer multi-provider identity binding will need a clearer model.

### 7. Audit visibility is not yet fully scope-aware

Scoped audit is required for:

1. platform admin
2. tenant admin
3. project admin
4. future cross-project sharing/grant visibility
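
A scope-aware audit read could start from events that carry the scope they were recorded at. The sketch below is illustrative only: `AuditEvent` and `events_for_scope` are hypothetical, and matching the exact scope (rather than letting a parent scope see child events) is an assumption chosen to stay consistent with the rule that parent-scope authority does not imply child-scope visibility.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AuditEvent:
    scope_type: str            # "platform", "tenant", or "project"
    scope_id: Optional[str]    # None for platform-scope events
    action: str                # e.g. "member.added" (hypothetical action name)


def events_for_scope(
    events: Iterable[AuditEvent], scope_type: str, scope_id: Optional[str] = None
) -> List[AuditEvent]:
    """Return only the events recorded at exactly the requested scope."""
    return [
        event for event in events
        if event.scope_type == scope_type and event.scope_id == scope_id
    ]
```

Whether a tenant or platform admin may additionally see events from child scopes is a policy question for the scoped audit model, not something this sketch settles.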

## What Must Be Built or Modified

### Phase 1: clarify authority and remove bad assumptions

1. document that platform DB, not Keycloak, is IAM authority
2. make role-binding/capability model the target authority in docs
3. keep Keycloak as auth/federation component only

### Phase 2: fix identity scope assumptions

1. remove or relax single-tenant active-membership constraint when multi-tenant user support is enabled
2. revisit global username uniqueness
3. separate principal identity from tenant membership more clearly in read/write paths

### Phase 3: make scoped role bindings real

1. move runtime authorization toward role bindings + capability evaluation
2. reduce `users.role` to compatibility/read-model only
3. introduce default role bundles for:
   - platform
   - tenant
   - project
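
One possible shape for the Phase 3 transition is an evaluator that prefers role bindings and consults the legacy `users.role` value only as a compatibility fallback. This is a sketch under assumptions: the bundle contents, the `LEGACY_ROLE_TO_BUNDLE` mapping, and the `allow_legacy_fallback` switch are hypothetical and not a description of current runtime behavior.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Hypothetical bundle contents, for illustration only.
ROLE_BUNDLES = {
    "project_viewer": ("project.read", "resource.read"),
    "platform_admin": ("*",),
}

# Hypothetical compatibility mapping from the coarse `users.role` values.
LEGACY_ROLE_TO_BUNDLE = {"admin": "platform_admin"}


def authorize(
    bound_roles: Iterable[str],
    legacy_role: str,
    capability: str,
    allow_legacy_fallback: bool = True,
) -> bool:
    """Evaluate a capability from role bindings, falling back to the legacy role."""
    roles = list(bound_roles)
    if not roles and allow_legacy_fallback and legacy_role in LEGACY_ROLE_TO_BUNDLE:
        # Only subjects with no bindings at all use the legacy mapping.
        roles.append(LEGACY_ROLE_TO_BUNDLE[legacy_role])
    return any(
        fnmatchcase(capability, pattern)
        for role in roles
        for pattern in ROLE_BUNDLES.get(role, ())
    )
```

Turning `allow_legacy_fallback` off would correspond to the point where `users.role` is purely a read-model field and no longer influences decisions.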

### Phase 4: add missing IAM primitives

1. invitations
2. groups
3. richer external identity bindings
4. scoped audit presentation
5. cross-project sharing/grants

## Design Constraints To Preserve

1. do not make Keycloak the canonical product user store
2. do not require global human-readable username uniqueness as the long-term tenant model
3. do not conflate admin page visibility with mutation authority
4. do not assume parent-scope admin implies child-scope content visibility
5. do not turn tenant-owned external infra IAM into platform-owned IAM

## Recommended Next Docs

This document should be followed by:

1. cross-project access/sharing model
2. scoped audit model
3. IAM API/resource contract slices
4. UX IA alignment for platform/tenant/project admin modes
5. delegated shared-runtime operator authz model for tenant-owned shared app runtimes