# V3 User Workflow Demo Runbook

Date: 2026-04-30

This runbook covers the V3 user-flow demo in the kind environment. It is intentionally
demo-focused and should not be treated as a production ops procedure.

## Environment

- App: `https://gpuaas-kind-app.tailfe39f5.ts.net`
- API: `https://gpuaas-kind-api.tailfe39f5.ts.net`
- Auth: `https://gpuaas-kind-auth.tailfe39f5.ts.net`
- Login: `dev-admin` / `admin123`
- Project: `Default Project`
- V3 entry: `https://gpuaas-kind-app.tailfe39f5.ts.net/workloads`
- Persona note: use `dev-admin` for this demo because the prepared workloads are in
  `dev-admin`'s workspace. Switch the top-bar mode to `User` if you want the shell to
  frame the flow as an end-user runtime journey.
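
Before starting the walkthrough, it can help to confirm that the three hosts above respond at all. The following is a minimal sketch, not part of the platform tooling: the helper names `demo_urls` and `http_status` and the `RUN_CHECK` switch are hypothetical, and it assumes each host answers a plain GET on its root path. It only prints the HTTP status code per host and makes no claim about which code is expected.

```bash
# List the demo base URLs from the Environment section.
demo_urls() {
  printf '%s\n' \
    "https://gpuaas-kind-app.tailfe39f5.ts.net" \
    "https://gpuaas-kind-api.tailfe39f5.ts.net" \
    "https://gpuaas-kind-auth.tailfe39f5.ts.net"
}

# Print only the HTTP status code for a URL.
# curl reports 000 when the connection itself fails.
http_status() {
  curl -sS -o /dev/null -m 10 -w '%{http_code}' "$1" 2>/dev/null || true
}

# Hypothetical switch: set RUN_CHECK=1 to actually probe the hosts.
if [ "${RUN_CHECK:-0}" = "1" ]; then
  demo_urls | while read -r url; do
    printf '%s %s\n' "$(http_status "$url")" "$url"
  done
fi
```

Run it as `RUN_CHECK=1 bash check.sh` (the file name is arbitrary) from a machine that can reach the tailnet.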

## Demo Story

1. Open V3 workloads.
   - Route: `/workloads`
   - Show that workloads are the single runtime workbench for compute and app runtimes.

2. Open the active compute allocation.
   - Preferred allocation ID: `9d986c1f-933d-4e16-90ce-1674d6701bd5`
   - Preferred allocation name: `workercompute2`
   - Route: `/workloads/9d986c1f-933d-4e16-90ce-1674d6701bd5`
   - Backup allocation ID: `27037d6a-64e8-4f39-a8ef-cccea1a97daf`
   - Show the resource header, context strip, and tabs.

3. Show console access.
   - Open the `Connect` tab.
   - The embedded terminal mints a terminal token through the allocation API.
   - Use pop-out only if the embedded view needs more room.

4. Show metrics.
   - Open the `Metrics` tab.
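   - Live metrics are loaded from the allocation metrics endpoints. The paths below use the
     backup allocation ID; for the preferred allocation, substitute its ID from step 2:
     - `/api/v1/allocations/27037d6a-64e8-4f39-a8ef-cccea1a97daf/metrics`
     - `/api/v1/allocations/27037d6a-64e8-4f39-a8ef-cccea1a97daf/metrics/timeseries`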
   - This kind node is CPU-only for the current demo, so GPU rows may be empty.

5. Show app catalog selection.
   - Route: `/apps`
   - Use these examples:
     - Scheduler/orchestration: `Slurm Reference`
     - Scheduler/orchestration: `Self-managed Kubernetes (RKE2)`
     - OCI app: `JupyterLab`
     - OCI app: `vLLM OpenAI Server`

6. Show running scheduler examples.
   - Slurm instance: `67b3cc56-ab9e-42ac-ad17-b5ba10ab8c58`
   - RKE2 instance: `da70062d-1892-4066-ba5c-d96970357f61`
   - These are shown as app workloads through the V3 workload read model.

7. Show running Jupyter.
   - App instance ID: `da1f866e-93f0-4e90-9c10-23a9bb6a26d4`
   - Route: `/workloads/da1f866e-93f0-4e90-9c10-23a9bb6a26d4`
   - Open the `Connect` tab and use `Open notebook`.
   - The platform proxy launch context is:
     - endpoint: `web`
     - open path: `/lab`
     - verify strategy: `html_plus_asset`

8. Show vLLM as launch-ready.
   - Route: `/launch/app/vllm-openai`
   - Pick the active compute allocation if available.
   - Verified kind dependencies for launch readiness:
     - target allocation: `workercompute2`
     - SSH key: `v3-demo-kind-key`
     - service account: `V3 Demo App Operator`
     - endpoint visibility: `platform_proxy`
   - Safe defaults are pre-filled for the demo.
   - Current limitation: running vLLM live depends on the app runtime deploy path and
     available target capacity. Use JupyterLab as the live proxy/open example unless a
     second app runtime has already been staged.
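
The metrics endpoints named in step 4 can also be queried outside the UI. The sketch below builds both paths for a given allocation ID and fetches them. The function names `metrics_urls` and `fetch_metrics` and the `ALLOCATION_ID` variable are hypothetical, and it is an assumption that these endpoints accept the same `Authorization` and `X-Project-ID` headers as the V3 workload calls. The fetch runs only when `TOKEN` and `PROJECT_ID` are already exported, for example from the Fast Verification Commands.

```bash
# Build the two metrics URLs for an allocation (paths taken from step 4).
metrics_urls() {
  local base="$1" allocation_id="$2"
  printf '%s/api/v1/allocations/%s/metrics\n' "$base" "$allocation_id"
  printf '%s/api/v1/allocations/%s/metrics/timeseries\n' "$base" "$allocation_id"
}

# Fetch both endpoints and pretty-print the JSON.
# Assumption: the same auth headers as the V3 workload endpoints apply here.
fetch_metrics() {
  local base="$1" allocation_id="$2" token="$3" project_id="$4"
  metrics_urls "$base" "$allocation_id" | while read -r url; do
    echo "== $url"
    curl -fsS "$url" \
      -H "Authorization: Bearer $token" \
      -H "X-Project-ID: $project_id" | jq .
  done
}

# Runs only when TOKEN and PROJECT_ID are set; defaults to the preferred allocation.
if [ -n "${TOKEN:-}" ] && [ -n "${PROJECT_ID:-}" ]; then
  fetch_metrics "${API_BASE_URL:-https://gpuaas-kind-api.tailfe39f5.ts.net}" \
    "${ALLOCATION_ID:-9d986c1f-933d-4e16-90ce-1674d6701bd5}" "$TOKEN" "$PROJECT_ID"
fi
```

Set `ALLOCATION_ID` to the backup allocation ID to query that allocation instead.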

## Fast Verification Commands

```bash
# Base URLs and project ID for the kind demo environment. Requires curl and jq.
API_BASE_URL=https://gpuaas-kind-api.tailfe39f5.ts.net
KEYCLOAK_BASE_URL=https://gpuaas-kind-auth.tailfe39f5.ts.net
PROJECT_ID=41000000-0000-4000-8000-000000000002

# Mint an access token with the demo credentials (password grant, gpuaas realm).
TOKEN=$(curl -fsS -X POST "$KEYCLOAK_BASE_URL/realms/gpuaas/protocol/openid-connect/token" \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d grant_type=password \
  -d client_id=gpuaas-api \
  -d client_secret=dev-client-secret \
  -d username=dev-admin \
  -d password=admin123 | jq -r .access_token)

# List the active workloads in the project.
curl -fsS "$API_BASE_URL/api/v1/v3/workloads?status=active" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Project-ID: $PROJECT_ID" | jq '.items[] | {id, name, kind, status, primary_action, app_slug}'

# Fetch the preferred compute allocation and the metrics entry under its tabs.
curl -fsS "$API_BASE_URL/api/v1/v3/workloads/9d986c1f-933d-4e16-90ce-1674d6701bd5" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Project-ID: $PROJECT_ID" | jq '{id:.workload.id, status:.workload.status, metrics:.tabs.metrics}'

# List the running workloads reported for the JupyterLab app.
curl -fsS "$API_BASE_URL/api/v1/v3/apps/jupyterlab" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Project-ID: $PROJECT_ID" | jq '.running_workloads[] | {id, name, status, primary_action}'
```
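
To check every workload used in the demo story in one pass, the detail call above can be looped over the prepared IDs. This is a sketch under assumptions: the helper names `demo_workload_ids` and `check_workloads` are hypothetical, and it assumes the Slurm, RKE2, and Jupyter app instance IDs resolve through the same `/api/v1/v3/workloads/<id>` endpoint as the compute allocation, which this runbook does not confirm. It runs only when `TOKEN` and `PROJECT_ID` are already set.

```bash
# IDs copied from the Demo Story section: compute allocation, Slurm, RKE2, Jupyter.
demo_workload_ids() {
  printf '%s\n' \
    "9d986c1f-933d-4e16-90ce-1674d6701bd5" \
    "67b3cc56-ab9e-42ac-ad17-b5ba10ab8c58" \
    "da70062d-1892-4066-ba5c-d96970357f61" \
    "da1f866e-93f0-4e90-9c10-23a9bb6a26d4"
}

# Print id and status per workload, or a FAILED line when the request errors.
# Assumption: app instance IDs are valid keys for the workload detail endpoint.
check_workloads() {
  local base="$1" token="$2" project_id="$3" id body
  demo_workload_ids | while read -r id; do
    if body=$(curl -fsS "$base/api/v1/v3/workloads/$id" \
      -H "Authorization: Bearer $token" \
      -H "X-Project-ID: $project_id"); then
      printf '%s\n' "$body" | jq -c '{id: .workload.id, status: .workload.status}'
    else
      echo "FAILED $id"
    fi
  done
}

# Runs only when the variables from the block above are present.
if [ -n "${TOKEN:-}" ] && [ -n "${PROJECT_ID:-}" ]; then
  check_workloads "${API_BASE_URL:-https://gpuaas-kind-api.tailfe39f5.ts.net}" "$TOKEN" "$PROJECT_ID"
fi
```

A `FAILED` line for an app instance may simply mean the assumption about the endpoint does not hold, so confirm in the UI before treating it as a broken workload.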

## Known Demo Limits

- The active compute target is currently a CPU-only kind node, so GPU metrics are not expected.
- Only one launchable OCI app can run on the active compute allocation at a time.
- vLLM is launch-ready for the flow, but running it live requires decommissioning Jupyter
  or adding another active compute target.
- The storage flow currently models WEKAFS/managed storage concepts; S3 enablement on the
  external WEKA cluster is still under investigation.

## Last Verified

Verified on 2026-04-30 against kind:

- V3 workloads, compute detail, Jupyter detail, compute catalog, apps catalog, and
  Jupyter launch routes load without blocked/error states.
- Compute detail `Connect` and `Metrics` tabs render.
- Jupyter detail `Connect` tab renders.
- App detail pages now surface their running workloads for JupyterLab, Slurm, and RKE2.
- JupyterLab and vLLM app launch prechecks return `ready: true` when supplied with the
  active demo target, SSH key, storage bucket, and operator service account.
- App launch pages render for `jupyterlab`, `vllm-openai`, `slurm-reference`, and
  `rke2-self-managed`.
- Jupyter browser-session mint returns an open URL and the proxied `/lab` page returns
  `200 text/html` with the JupyterLab shell.
- Targeted local V3 E2E passed for workloads, compute/apps catalog, and launch wizards:
  `9 passed`.
