# CLI v2 Command Matrix

As of: March 10, 2026

## Purpose
Concrete command naming, flags, and conventions for the CLI v2 implementation.
This is the implementation reference — code should not invent names outside this matrix.

## Global Flags

These flags apply to every command. They are parsed before the subcommand.

| Flag | Short | Default | Description |
|---|---|---|---|
| `--output` | `-o` | `table` | Output format: `table`, `csv`, `json` |
| `--no-heading` | | `false` | Suppress header row (table/csv only) |
| `--no-input` | | `false` | Disable interactive prompts (agent mode) |
| `--project-id` | `-p` | from config | Override active project context |
| `--base-url` | | from config | Override API base URL |

## Exit Codes

| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Client error (bad input, missing required flag, config error) |
| 2 | Server error (5xx, network failure, timeout) |
| 3 | Auth error (401, 403, token expired, not authenticated) |
| 4 | Not found (404) |
| 5 | Conflict (409) |
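
The table above can be sketched as a single mapping function. This is a minimal illustration, not the SDK's actual code: the constant names, the function name `exitCodeForStatus`, and the convention that status `0` means "no HTTP response received" are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// Exit codes from the table above. The constant names are hypothetical.
const (
	ExitOK       = 0
	ExitClient   = 1
	ExitServer   = 2
	ExitAuth     = 3
	ExitNotFound = 4
	ExitConflict = 5
)

// exitCodeForStatus maps an HTTP status to a process exit code.
// Status 0 is an assumed convention for "no response received"
// (network failure or timeout), which the table classifies as a server error.
func exitCodeForStatus(status int) int {
	switch {
	case status == 0:
		return ExitServer
	case status >= 200 && status < 300:
		return ExitOK
	case status == http.StatusUnauthorized, status == http.StatusForbidden:
		return ExitAuth
	case status == http.StatusNotFound:
		return ExitNotFound
	case status == http.StatusConflict:
		return ExitConflict
	case status >= 500:
		return ExitServer
	default:
		// Assumption: every other status is treated as a client error.
		return ExitClient
	}
}

func main() {
	for _, s := range []int{200, 400, 401, 403, 404, 409, 503, 0} {
		fmt.Printf("%d -> %d\n", s, exitCodeForStatus(s))
	}
}
```

Keeping the mapping in one place lets the CLI and the MCP layer return the same code for the same failure.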

## Error Output Contract

All errors write to stderr. In `--output json` mode, errors are JSON:

```json
{"error": {"code": "user_not_found", "message": "...", "correlation_id": "..."}}
```

In table/csv mode, errors are human text:
```
ERROR: user not found (code=user_not_found, correlation_id=abc-123)
```
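
One possible shape for a formatter that honors both modes is sketched below. The type `cliError` and the function `formatError` are hypothetical names; only the field names and the two output shapes come from the contract above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// cliError mirrors the fields of the JSON error contract.
type cliError struct {
	Code          string `json:"code"`
	Message       string `json:"message"`
	CorrelationID string `json:"correlation_id"`
}

// formatError renders e as a JSON object when format is "json",
// and as a single human-readable line for any other format.
func formatError(format string, e cliError) (string, error) {
	if format == "json" {
		b, err := json.Marshal(map[string]cliError{"error": e})
		if err != nil {
			return "", err
		}
		return string(b), nil
	}
	return fmt.Sprintf("ERROR: %s (code=%s, correlation_id=%s)",
		e.Message, e.Code, e.CorrelationID), nil
}

func main() {
	e := cliError{Code: "user_not_found", Message: "user not found", CorrelationID: "abc-123"}
	for _, format := range []string{"table", "json"} {
		line, err := formatError(format, e)
		if err != nil {
			fmt.Fprintln(os.Stderr, "ERROR:", err)
			os.Exit(1)
		}
		// Errors always go to stderr, regardless of format.
		fmt.Fprintln(os.Stderr, line)
	}
}
```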

---

## Phase 1: Foundation (Cobra migration + fixes)

Existing commands, identical behavior, new structure.

### auth

| Command | Flags | Auth | Description |
|---|---|---|---|
| `auth login` | `[--provider]` `[--tenant-hint]` `[--identity-hint]` `[--no-browser]` `[--base-url]` | none | Browser OIDC PKCE login |
| `auth dev-login` | `--username` `--password` `[--base-url]` | none | Dev password login |
| `auth keycloak-login` | `--username` `--password` `[--base-url]` `[--kc-url]` `[--realm]` `[--client-id]` `[--client-secret]` | none | Dev Keycloak OIDC password grant |
| `auth whoami` | | user/SA | Show effective identity, tenant, project, roles |
| `auth logout` | | user | Clear local session |

### catalog

| Command | Flags | Auth | Description |
|---|---|---|---|
| `catalog list` | `[--output]` `[--no-heading]` | user | List GPU SKUs |

### billing

| Command | Flags | Auth | Description |
|---|---|---|---|
| `billing balance` | `[--output]` | user | Show current balance |
| `billing usage` | `[--app-instance-id]` `[--usage-source]` `[--from]` `[--to]` `[--project-id]` `[--output]` `[--no-heading]` | user | Query usage records |

### nodes

| Command | Flags | Auth | Description |
|---|---|---|---|
| `nodes list` | `[--status]` `[--output]` `[--no-heading]` | user | List available nodes |

### projects

| Command | Flags | Auth | Description |
|---|---|---|---|
| `projects list` | `[--output]` `[--no-heading]` | user | List accessible projects |
| `projects use` | `--id` | user | Set active project in config |
| `projects create` | `--name` `[--slug]` | user | Create new project |

### allocations

| Command | Flags | Auth | Description |
|---|---|---|---|
| `allocations list` | `[--status]` `[--project-id]` `[--output]` `[--no-heading]` | user | List allocations |
| `allocations get` | `--id` `[--project-id]` `[--output]` | user | Get allocation details |
| `allocations create` | `[--scheduler-type]` `[--node-id]` `[--project-id]` | user | Create allocation |
| `allocations release` | `--id` `[--project-id]` | user | Release allocation |
| `allocations connect` | `--id` `[--project-id]` | user | Mint terminal token and print connection info |

### context

| Command | Flags | Auth | Description |
|---|---|---|---|
| `context show` | `[--output]` | none | Show active config: base URL, project, tenant, identity type |
| `context set` | `--key` `--value` | none | Set config value (base-url, project-id, output) |

---

## Phase 2: Service Accounts + Auth

### auth (additions)

| Command | Flags | Auth | Description |
|---|---|---|---|
| `auth service-account-token` | `--service-account-id` `--key-id` `--client-secret` | none | Mint short-lived SA token, store in config |

### service-accounts

| Command | Flags | Auth | Description |
|---|---|---|---|
| `service-accounts list` | `[--project-id]` `[--output]` `[--no-heading]` | user | List project service accounts |
| `service-accounts create` | `--name` `--slug` `[--description]` `[--project-id]` | user | Create service account, output credential |
| `service-accounts get` | `--id` `[--project-id]` `[--output]` | user | Get service account details |
| `service-accounts rotate-key` | `--id` `[--project-id]` | user | Rotate credential key |
| `service-accounts disable` | `--id` `[--project-id]` | user | Disable service account |
| `service-accounts delete` | `--id` `[--project-id]` | user | Delete service account |

---

## Phase 3: Apps

### apps catalog

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps catalog list` | `[--output]` `[--no-heading]` | user/SA | List app catalog entries |
| `apps catalog versions` | `--app` | user/SA | List versions for an app slug |
| `apps catalog registry` | `[--output]` | user/SA | Show platform OCI registry info |

### apps entitlements

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps entitlements list` | `[--project-id]` `[--output]` `[--no-heading]` | user | List project app entitlements |
| `apps entitlements set` | `--app` `--enabled` `[--allowed-modes]` `[--allowed-scopes]` `[--project-id]` | user | Enable/configure app for project |

### apps instances

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps instances list` | `[--status]` `[--project-id]` `[--output]` `[--no-heading]` | user/SA | List app instances |
| `apps instances get` | `--id` `[--project-id]` `[--output]` | user/SA | Get instance details |
| `apps instances create` | `--app` `--version` `--name` `[--operating-mode]` `[--control-plane-scope]` `[--operator-sa-id]` `[--project-id]` | user/SA | Create app instance |
| `apps instances upgrade` | `--id` `--version` `[--project-id]` | user/SA | Upgrade instance to new version |
| `apps instances rollback` | `--id` `--version` `[--project-id]` | user/SA | Rollback instance to previous version |
| `apps instances decommission` | `--id` `[--project-id]` | user/SA | Decommission instance |

### apps deploy (alias)

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps deploy` | `--app` `--version` `--name` `[--operating-mode]` `[--control-plane-scope]` `[--operator-sa-id]` `[--project-id]` | user/SA | Alias for `apps instances create` |

### apps shared-runtimes

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps shared-runtimes list` | `--org-id` `[--status]` `[--output]` `[--no-heading]` | user/SA | List tenant-shared app runtimes |
| `apps shared-runtimes get` | `--org-id` `--id` `[--output]` | user/SA | Get tenant-shared runtime details |
| `apps shared-runtimes create` | `--org-id` `--app` `--version` `--name` `[--operating-mode]` `[--control-plane-scope]` `[--operator-identity-ref]` `[--attach-project]` | user/SA | Create tenant-shared runtime |
| `apps shared-runtimes delete` | `--org-id` `--id` `[--output]` | user/SA | Delete tenant-shared runtime |

### apps shared-runtimes attachments

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps shared-runtimes attachments list` | `--org-id` `--runtime-id` `[--output]` `[--no-heading]` | user/SA | List tenant-shared runtime attachments |
| `apps shared-runtimes attachments get` | `--org-id` `--runtime-id` `--attachment-id` `[--output]` | user/SA | Get tenant-shared runtime attachment |
| `apps shared-runtimes attachments create` | `--org-id` `--runtime-id` `--project-id` `[--allow-worker-contribution]` `[--allow-job-submission]` `[--output]` | user/SA | Create tenant-shared runtime attachment |
| `apps shared-runtimes attachments delete` | `--org-id` `--runtime-id` `--attachment-id` `[--output]` | user/SA | Delete tenant-shared runtime attachment |

### apps shared-runtimes workers

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps shared-runtimes workers list` | `--org-id` `--runtime-id` `[--output]` `[--no-heading]` | user/SA | List tenant-shared runtime workers |
| `apps shared-runtimes workers get` | `--org-id` `--runtime-id` `--worker-id` `[--output]` | user/SA | Get tenant-shared runtime worker details |

### apps shared-runtimes worker-operations

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps shared-runtimes worker-operations list` | `--org-id` `--runtime-id` `[--output]` `[--no-heading]` | user/SA | List tenant-shared runtime worker operations |
| `apps shared-runtimes worker-operations get` | `--org-id` `--runtime-id` `--operation-id` `[--output]` | user/SA | Get tenant-shared runtime worker operation details |
| `apps shared-runtimes worker-operations create` | `--org-id` `--runtime-id` `--action` `[--attachment-id]` `[--source-project-id]` `[--allocation-id]` `[--output]` | user/SA | Create tenant-shared runtime worker operation |

### apps artifacts

| Command | Flags | Auth | Description |
|---|---|---|---|
| `apps artifacts list` | `[--app]` `[--lifecycle-state]` `[--project-id]` `[--output]` `[--no-heading]` | user/SA | List project artifacts |
| `apps artifacts get` | `--id` `[--project-id]` `[--output]` | user/SA | Get artifact details |
| `apps artifacts register` | `--app` `--version` `--name` `--repository` `--digest` `[--tag]` `--media-type` `[--artifact-kind]` `[--source-type]` `[--source-uri]` `[--project-id]` | user/SA | Register artifact digest |
| `apps artifacts publish-intent` | `--app` `--version` `--name` `[--artifact-kind]` `[--source-type]` `[--channel]` `[--project-id]` `[--output]` | user/SA | Create publish intent, output repository + credential info |
| `apps artifacts promote` | `--id` `--channel` `[--target-environment]` `[--project-id]` | user/SA | Promote artifact to channel |
| `apps artifacts verify` | `--id` `[--project-id]` | user/SA | Mark artifact as verified |
| `apps artifacts revoke` | `--id` `[--project-id]` | user/SA | Revoke artifact trust |
| `apps artifacts deprecate` | `--id` `[--project-id]` | user/SA | Deprecate artifact |
| `apps artifacts retire` | `--id` `[--project-id]` | user/SA | Retire artifact |

### storage

Project-scoped storage operations. Required for the blob artifact workflow.

| Command | Flags | Auth | Description |
|---|---|---|---|
| `storage list` | `[--prefix]` `[--project-id]` `[--output]` `[--no-heading]` | user/SA | List objects in project storage |
| `storage upload` | `--file` `--path` `[--project-id]` | user/SA | Upload file to project storage |
| `storage download` | `--path` `--dest` `[--project-id]` | user/SA | Download file from project storage |
| `storage delete` | `--path` `[--project-id]` | user/SA | Delete object from project storage |

Table columns for `apps artifacts list` (the command is defined under `apps artifacts` above):

| Column | JSON field | Description |
|---|---|---|
| ARTIFACT_ID | `id` | UUID |
| APP | `app_slug` | App slug |
| VERSION | `app_version` | App version |
| NAME | `artifact_name` | Artifact name |
| KIND | `artifact_kind` | `oci` or `blob` |
| LIFECYCLE | `lifecycle_state` | `published`, `promoted`, `deprecated`, `retired` |
| TRUST | `trust_state` | `unverified`, `verified`, `failed_verification`, `revoked` |
| CHANNEL | `promoted_channel` | Promotion channel (if promoted) |
| DIGEST | `digest` | Truncated in table, full in json |
| REGISTERED | `created_at` | Timestamp |

---

## Phase 4: IAM + Ops + Introspection

### iam

| Command | Flags | Auth | Description |
|---|---|---|---|
| `iam members list` | `[--project-id]` `[--output]` `[--no-heading]` | user | List project/tenant members |
| `iam roles list` | `[--output]` `[--no-heading]` | user | List available roles |

### ops

All ops commands require `platform_ops` or `platform_superadmin` role.

| Command | Flags | Auth | Description |
|---|---|---|---|
| `ops fleet-health` | `[--output]` | admin | Show fleet health overview |
| `ops node-metrics` | `--node-id` `[--output]` | admin | Show node metrics |
| `ops trace` | `--trace-id` | admin | Open Grafana trace URL |
| `ops incident` | `--correlation-id` `[--output]` | admin | Lookup incident by correlation ID |
| `ops runbooks list` | `[--output]` `[--no-heading]` | admin | List runbooks |
| `ops runbooks show` | `--id` `[--output]` | admin | Show runbook detail |
| `ops allocations force-release` | `--id` | admin | Force-release stuck allocation |
| `ops users list` | `[--output]` `[--no-heading]` | admin | List platform users |
| `ops users get` | `--id` `[--output]` | admin | Get user details |
| `ops audit-log` | `[--actor]` `[--action]` `[--target-type]` `[--from]` `[--to]` `[--correlation-id]` `[--output]` `[--no-heading]` | admin | Query audit log |

### ops nodes (admin node lifecycle)

All `ops nodes` commands require `platform_ops` or `platform_superadmin` role.

| Command | Flags | Auth | Description |
|---|---|---|---|
| `ops nodes list` | `[--status]` `[--sku]` `[--output]` `[--no-heading]` | admin | List all nodes with status, SKU, occupancy |
| `ops nodes get` | `--id` `[--output]` | admin | Get full node detail including agent status and lifecycle timestamps |
| `ops nodes add` | `--host` `--sku` `--gpus-total` `[--region-code]` `[--onboarding-mode]` `[--probe]` `[--output]` | admin | Register a new node (status: registered) |
| `ops nodes bootstrap` | `--id` `[--output]` | admin | Generate enrollment token and bootstrap script for a registered node |
| `ops nodes probe` | `--id` `[--output]` | admin | Trigger a connectivity probe against a node |
| `ops nodes agent-lifecycle get` | `--id` `[--output]` | admin | Get desired/reported node-agent version, drift state, and latest lifecycle run |
| `ops nodes agent-lifecycle start` | `--id` `--mode` `--scenario` `[--target-version]` `--safety-policy` `[--reason]` `[--output]` | admin | Start a node-agent lifecycle run using `reimage` or `manual_install` |
| `ops nodes retire` | `--id` | admin | Retire a node (drains active allocations first; reversible) |
| `ops nodes reactivate` | `--id` | admin | Reactivate a retired node with the same identity |
| `ops nodes remove` | `--id` `[--force]` | admin | Permanently remove a retired node (irreversible) |

Node lifecycle state machine:
```
registered → bootstrap_issued → enrolling → active ⇄ offline
                                               ↓
                                           draining → retired → (remove)
                                               ↑
                                          quarantined
```

- `add` creates a node in `registered` state.
- `bootstrap` issues enrollment credentials and transitions to `bootstrap_issued`.
- The node agent self-enrolls → `enrolling` → `active`.
- `retire` triggers graceful drain of active allocations before transitioning to `retired`.
- `reactivate` moves a `retired` node back to `offline` with the same identity; heartbeat or probe must prove it `active` before scheduling.
- `remove` permanently deletes a `retired` node. Requires `--force` if the node has historical allocations.
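
The state machine can be expressed as a transition table, as in the sketch below. It is one reading of the diagram and bullets, not the control plane's implementation: the `quarantined → draining` edge is an assumption based on the arrow in the diagram, entry into `quarantined` is not modeled because the diagram does not show it, and `remove` is treated as deletion rather than a state. The names `transitions` and `canTransition` are hypothetical.

```go
package main

import "fmt"

// transitions lists the allowed next states for each node state.
var transitions = map[string][]string{
	"registered":       {"bootstrap_issued"},    // ops nodes bootstrap
	"bootstrap_issued": {"enrolling"},           // node agent self-enrolls
	"enrolling":        {"active"},
	"active":           {"offline", "draining"}, // ops nodes retire starts the drain
	"offline":          {"active"},              // heartbeat or probe succeeds
	"draining":         {"retired"},
	"retired":          {"offline"},             // ops nodes reactivate
	"quarantined":      {"draining"},            // assumption: inferred from the diagram
}

// canTransition reports whether a node may move directly from one state to another.
func canTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("registered", "bootstrap_issued"))
	fmt.Println(canTransition("retired", "offline"))
	fmt.Println(canTransition("retired", "active"))
}
```

Note that a reactivated node cannot jump straight to `active`; it has to pass through `offline` first, which matches the `reactivate` bullet.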

### ops releases (platform artifact discovery)

| Command | Flags | Auth | Description |
|---|---|---|---|
| `ops releases list` | `[--output]` `[--no-heading]` | admin | List platform release versions |
| `ops releases get` | `--version` `[--output]` | admin | Show release manifest with artifact digests |
| `ops releases download` | `--version` `--artifact` `[--dest]` | admin | Download a platform release artifact (cli, go-sdk, python-sdk, node-agent, node-agent-bootstrap) |

Backend direction for this surface:
1. These commands should use the control-plane release catalog APIs, not hardcoded registry environment variables.
2. `download` should call a control-plane download endpoint for small platform artifacts.
3. Generalized pull-intent APIs remain the underlying primitive for registry-native pulls and larger artifact classes.

### mcp

| Command | Flags | Auth | Description |
|---|---|---|---|
| `mcp serve` | `[--transport]` | from config | Start MCP tool server (stdio default) |
| `mcp list-tools` | `[--output]` | none | List available MCP tools and descriptions |

MCP tool names are derived from command paths: `allocations_list`, `apps_instances_create`, etc.
The MCP layer is a transport adapter — it exposes exactly what the CLI exposes, same auth, same errors.
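
A minimal sketch of the name derivation follows. The two examples in the text only show spaces becoming underscores; converting hyphens to underscores as well (for commands such as `service-accounts rotate-key`) is an assumption, and `toolName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// toolName derives an MCP tool name from a CLI command path by joining
// the path segments with underscores.
// Assumption: hyphens inside a segment are also replaced with underscores.
func toolName(commandPath string) string {
	name := strings.Join(strings.Fields(commandPath), "_")
	return strings.ReplaceAll(name, "-", "_")
}

func main() {
	for _, path := range []string{
		"allocations list",
		"apps instances create",
		"service-accounts rotate-key",
	} {
		fmt.Printf("%s -> %s\n", path, toolName(path))
	}
}
```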

### Introspection

| Command | Flags | Auth | Description |
|---|---|---|---|
| `schema` | `<resource>` | none | Show resource schema (from embedded OpenAPI) |
| `explain` | `<command>` | none | Show command help with examples |
| `api` | `<method>` `<path>` `[--data]` `[--output]` | user/SA | Raw API call with auth/project injection |

---

## Naming Conventions

### Flags
- Kebab-case: `--project-id`, `--artifact-kind`, `--lifecycle-state`
- Resource identifiers: `--id` for the primary resource, `--{resource}-id` for foreign references
- Boolean flags: `--enabled`, `--no-input`, `--no-heading` (no `=true/false` syntax needed)
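
The kebab-case rule could be enforced with a small lint check, for example in a unit test that walks every registered flag. The pattern below is an assumption about what counts as kebab-case here (lowercase alphanumeric words joined by single hyphens), and `isKebabFlag` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// kebabFlag matches long flags such as --project-id.
// Assumption: words are lowercase alphanumerics, start with a letter,
// and are separated by single hyphens.
var kebabFlag = regexp.MustCompile(`^--[a-z][a-z0-9]*(-[a-z0-9]+)*$`)

// isKebabFlag reports whether name is a long flag in kebab-case.
func isKebabFlag(name string) bool {
	return kebabFlag.MatchString(name)
}

func main() {
	for _, f := range []string{"--project-id", "--no-heading", "--projectId", "--project_id"} {
		fmt.Printf("%-14s %v\n", f, isKebabFlag(f))
	}
}
```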

### Verbs
- `list` — paginated collection read
- `get` — single resource read by ID
- `create` — create new resource
- `set` — idempotent upsert (entitlements)
- Trust actions: `verify`, `revoke`
- Lifecycle actions use domain verbs: `release`, `upgrade`, `rollback`, `decommission`, `promote`, `deprecate`, `retire`
- Connection: `connect` (curated workflow, not raw API mirror)

### Output
- `table`: human-readable, aligned columns, header row by default
- `csv`: machine-parseable, RFC 4180
- `json`: full API response payload, pretty-printed
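
The three formats can be served by one rendering function built on the standard library, as in the sketch below. It is an illustration under assumptions: `render` is a hypothetical name, the column padding is arbitrary, and the `json` branch builds objects from the rows purely for demonstration, whereas the real `json` mode returns the full API response payload.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
	"text/tabwriter"
)

// render formats rows as table, csv, or json.
// noHeading suppresses the header row for table and csv only.
func render(format string, header []string, rows [][]string, noHeading bool) (string, error) {
	var buf bytes.Buffer
	switch format {
	case "table":
		tw := tabwriter.NewWriter(&buf, 0, 0, 2, ' ', 0)
		if !noHeading {
			fmt.Fprintln(tw, strings.Join(header, "\t"))
		}
		for _, r := range rows {
			fmt.Fprintln(tw, strings.Join(r, "\t"))
		}
		if err := tw.Flush(); err != nil {
			return "", err
		}
	case "csv":
		cw := csv.NewWriter(&buf)
		cw.UseCRLF = true // RFC 4180 uses CRLF line endings
		if !noHeading {
			if err := cw.Write(header); err != nil {
				return "", err
			}
		}
		// WriteAll flushes the writer before returning.
		if err := cw.WriteAll(rows); err != nil {
			return "", err
		}
	case "json":
		// Demonstration only: the real CLI prints the API payload as-is.
		objs := make([]map[string]string, 0, len(rows))
		for _, r := range rows {
			obj := make(map[string]string, len(header))
			for i, h := range header {
				if i < len(r) {
					obj[h] = r[i]
				}
			}
			objs = append(objs, obj)
		}
		b, err := json.MarshalIndent(objs, "", "  ")
		if err != nil {
			return "", err
		}
		buf.Write(b)
		buf.WriteByte('\n')
	default:
		return "", fmt.Errorf("unsupported output format %q", format)
	}
	return buf.String(), nil
}

func main() {
	out, err := render("table", []string{"ID", "STATUS"}, [][]string{{"a1", "active"}}, false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```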

### Auth type column
- `none` — no token required
- `user` — human bearer token only
- `user/SA` — human or service-account token
- `admin` — requires platform_ops or platform_superadmin role

---

## Implementation Notes

### Architecture: SDK-first

Build the Go SDK (`pkg/sdk/`) first. CLI and MCP server are thin presentation layers on top.

```
pkg/sdk/                              # Typed Go client — all API logic lives here
├── client.go                         # NewClient(Config), HTTP transport, auth headers, project context
├── errors.go                         # APIError type, exit code mapping, correlation_id
├── auth.go                           # Login, Whoami, Logout, SAToken
├── catalog.go                        # ListSKUs
├── billing.go                        # Balance, Usage
├── nodes.go                          # ListNodes
├── projects.go                       # List, Create
├── allocations.go                    # List, Get, Create, Release, Connect
├── service_accounts.go               # List, Get, Create, RotateKey, Disable, Delete
├── apps_catalog.go                   # List, Versions, Registry
├── apps_entitlements.go              # List, Set
├── apps_instances.go                 # List, Get, Create, Upgrade, Rollback, Decommission
├── apps_artifacts.go                 # List, Get, Register, PublishIntent, Promote, Verify, Revoke, Deprecate, Retire
├── storage.go                        # List, Upload, Download, Delete
├── iam.go                            # Members, Roles
├── ops.go                            # FleetHealth, NodeMetrics, AuditLog, ForceRelease, Incident, Trace
├── ops_nodes.go                      # NodeAdd, NodeBootstrap, NodeProbe, NodeRetire, NodeReactivate, NodeRemove
├── ops_releases.go                   # ReleasesList, ReleasesGet, ReleasesDownload
└── config.go                         # Config file load/save/path

cmd/gpuaas-cli/                       # Thin Cobra presentation layer
├── main.go                           # Root command, global flags, SDK client init
├── cmd_auth.go                       # auth group
├── cmd_catalog.go                    # catalog group
├── cmd_billing.go                    # billing group
├── cmd_nodes.go                      # nodes group
├── cmd_projects.go                   # projects group
├── cmd_allocations.go                # allocations group
├── cmd_context.go                    # context show/set
├── cmd_service_accounts.go           # service-accounts group
├── cmd_apps_catalog.go               # apps catalog subgroup
├── cmd_apps_entitlements.go          # apps entitlements subgroup
├── cmd_apps_instances.go             # apps instances subgroup
├── cmd_apps_artifacts.go             # apps artifacts subgroup
├── cmd_storage.go                    # storage group
├── cmd_iam.go                        # iam group
├── cmd_ops.go                        # ops group (includes admin mutations)
├── cmd_ops_nodes.go                  # ops nodes subgroup (admin node lifecycle)
├── cmd_ops_releases.go               # ops releases subgroup (platform artifact discovery)
├── cmd_introspection.go              # schema, explain, api
├── mcp.go                            # mcp serve + list-tools
└── output.go                         # printTable, printCSV, printJSON, formatMoney
```

### SDK client pattern
```go
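// Build a client from explicit config; the CLI fills these values
// from global flags and the config file.
client := sdk.NewClient(sdk.Config{
    BaseURL:   "https://api.example.com",
    ProjectID: "uuid",
    Token:     "bearer-token",
})

// Each resource group hangs off the client; list filters are a typed options struct.
allocs, err := client.Allocations.List(ctx, &sdk.AllocationListOpts{Status: "active"})
if err != nil {
    // API failures unwrap to *sdk.APIError, which carries the error contract fields.
    var apiErr *sdk.APIError
    if errors.As(err, &apiErr) {
        // apiErr.Code, apiErr.Message, apiErr.CorrelationID
    }
    return err
}
_ = allocs // hand off to the output layer (table, csv, or json)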
```

Each CLI command is ~20 lines: parse flags → `sdk.Method(ctx, opts)` → format output.
Each MCP tool is auto-derived: expose SDK methods as tools, return JSON.
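
The "parse flags, call the SDK, format output" shape can be sketched as below. To stay dependency-free, the example uses the standard library `flag` package instead of Cobra, and a fake in-memory service instead of the real SDK. `Allocation`, `allocationLister`, `fakeLister`, and `runAllocationsList` are all hypothetical names for illustration.

```go
package main

import (
	"context"
	"flag"
	"fmt"
	"strings"
)

// Allocation and allocationLister are hypothetical stand-ins for SDK types.
type Allocation struct {
	ID     string
	Status string
}

type allocationLister interface {
	List(ctx context.Context, status string) ([]Allocation, error)
}

// fakeLister returns canned data so the sketch runs without a server.
type fakeLister struct{}

func (fakeLister) List(_ context.Context, status string) ([]Allocation, error) {
	all := []Allocation{{ID: "a1", Status: "active"}, {ID: "a2", Status: "released"}}
	if status == "" {
		return all, nil
	}
	var out []Allocation
	for _, a := range all {
		if a.Status == status {
			out = append(out, a)
		}
	}
	return out, nil
}

// runAllocationsList follows the three steps: parse flags, call the SDK, format output.
func runAllocationsList(svc allocationLister, args []string) (string, error) {
	fs := flag.NewFlagSet("allocations list", flag.ContinueOnError)
	status := fs.String("status", "", "filter by allocation status")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	allocs, err := svc.List(context.Background(), *status)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	for _, a := range allocs {
		fmt.Fprintf(&sb, "%s\t%s\n", a.ID, a.Status)
	}
	return sb.String(), nil
}

func main() {
	out, err := runAllocationsList(fakeLister{}, []string{"--status", "active"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Because the command depends only on an interface, the same function body can be exercised in tests with a fake and wired to the real client in production.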

### Idempotency keys
Every SDK mutation method automatically generates a UUID v4 idempotency key using `crypto/rand`.
Callers may override the key via options when they need deterministic retry keys.
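
A minimal sketch of key generation with a caller override follows. The UUID construction is the standard version 4 layout; the `mutationOpts` type, its `IdempotencyKey` field, and the function names are assumptions about how the override might be exposed.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newIdempotencyKey returns a random version 4 UUID in canonical text form.
func newIdempotencyKey() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// mutationOpts is a hypothetical options struct shared by mutation methods.
type mutationOpts struct {
	IdempotencyKey string // optional caller-supplied key for deterministic retries
}

// resolveKey prefers the caller's key and falls back to a fresh random one.
func resolveKey(opts mutationOpts) (string, error) {
	if opts.IdempotencyKey != "" {
		return opts.IdempotencyKey, nil
	}
	return newIdempotencyKey()
}

func main() {
	key, err := resolveKey(mutationOpts{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(key)
}
```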

### Agent mode (`--no-input`)
When set:
- Skip all confirmation prompts
- JSON output for errors (not just success)
- Include `correlation_id` in all error output
- Suppress decorative human text (e.g., "logged in as...")
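
Two of these rules can be sketched as small helpers. Both encode assumptions rather than settled behavior: `errorFormat` reads the second bullet as "agent mode forces JSON errors even when `--output` is `table` or `csv`", and `confirm` reads "skip" as "proceed as if the user had confirmed". The function names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// errorFormat picks the error rendering for the current invocation.
// Assumption: --no-input forces JSON errors regardless of --output.
func errorFormat(output string, noInput bool) string {
	if noInput || output == "json" {
		return "json"
	}
	return "text"
}

// confirm asks a yes/no question and returns the answer.
// Assumption: in agent mode the prompt is skipped and the action proceeds.
func confirm(in io.Reader, out io.Writer, noInput bool, question string) bool {
	if noInput {
		return true
	}
	fmt.Fprintf(out, "%s [y/N]: ", question)
	line, _ := bufio.NewReader(in).ReadString('\n')
	answer := strings.ToLower(strings.TrimSpace(line))
	return answer == "y" || answer == "yes"
}

func main() {
	fmt.Println(errorFormat("table", true))
	fmt.Println(confirm(os.Stdin, os.Stdout, true, "Release allocation?"))
}
```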
