App SDK Overview (status: implemented)
The App SDK is the developer-facing product surface for building apps on GPUaaS. Its purpose is to let app teams declare runtime needs, publish trusted artifacts, operate app instances, and integrate with GPUaaS identity, billing, audit, and lifecycle controls without patching platform-core code.
Shipped Reference Proof
The SDK path already has real implementation proof, not only documentation:
| Proof point | What it shows |
|---|---|
| cmd/slurm-reference-controller/main.go | scheduler-grade runtime lifecycle and reconcile behavior on platform contracts |
| cmd/rke2-self-managed-controller/main.go | cluster bootstrap and member/runtime operations on the same app-platform path |
This does not mean every future app flow is equally well productized yet. It does mean the App SDK surface as a whole should no longer be described as if it were still conceptual.
Composition Model
The SDK is not “one helper library.” It is the boundary that lets app teams reuse platform authority without patching product-core behavior.
What GPUaaS Owns
- Identity, IAM, tenant and project hierarchy.
- App catalog, entitlements, app instances, and shared runtime resources.
- Allocation lifecycle, placement primitives, billing attribution, and audit.
- Credential custody and delivery through supported platform paths.
- Common UX shell, API contracts, and evidence/correlation surfaces.
What The App Team Owns
- Runtime-specific controller logic.
- Runtime bootstrap, reconcile, health, recovery, and teardown behavior.
- App-specific operational knowledge and failure handling.
- Manifest metadata, version metadata, and artifact package discipline.
Developer Mental Model
Catalog and entitlement
-> app is visible and allowed for a project
App instance or shared runtime
-> durable control-plane resource owned by a project or tenant
App-owned worker/operator
-> runtime-specific reconcile loop using public APIs
Runtime/data plane
-> Slurm, Ray, MLflow, model gateway, notebook, or another app runtime
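The third layer of this model, the app-owned worker that reconciles through public APIs, can be sketched in Go. This is a minimal illustration under stated assumptions, not the GPUaaS client: `PlatformAPI`, `Runtime`, `InstanceState`, and every method name below are hypothetical stand-ins for whatever the public API actually exposes.

```go
package main

import "fmt"

// InstanceState is a hypothetical, simplified view of an app instance
// as a public control-plane API might return it.
type InstanceState struct {
	ID      string
	Desired string // state requested through the control plane
	Actual  string // state last reported by the app-owned worker
}

// PlatformAPI is an assumed stand-in for the public GPUaaS API surface.
// The worker only talks to this interface, never to a database.
type PlatformAPI interface {
	ListInstances() ([]InstanceState, error)
	ReportStatus(id, actual string) error
}

// Runtime is the app-owned side: Slurm, Ray, a notebook server, and so on.
type Runtime interface {
	Converge(id, desired string) (actual string, err error)
}

// ReconcileOnce performs one pass: for each instance whose actual state
// differs from the desired state, it asks the runtime to converge and
// reports the result back. It returns how many instances were converged
// and the first error encountered, if any.
func ReconcileOnce(api PlatformAPI, rt Runtime) (int, error) {
	instances, err := api.ListInstances()
	if err != nil {
		return 0, fmt.Errorf("list instances: %w", err)
	}
	converged := 0
	var firstErr error
	for _, inst := range instances {
		if inst.Actual == inst.Desired {
			continue // nothing to do for this instance
		}
		actual, err := rt.Converge(inst.ID, inst.Desired)
		if err == nil {
			err = api.ReportStatus(inst.ID, actual)
		}
		if err != nil {
			if firstErr == nil {
				firstErr = fmt.Errorf("reconcile %s: %w", inst.ID, err)
			}
			continue // keep going; the next pass retries this instance
		}
		converged++
	}
	return converged, firstErr
}

// memoryAPI is an in-memory fake used only to make the sketch runnable.
type memoryAPI struct{ instances []InstanceState }

func (m *memoryAPI) ListInstances() ([]InstanceState, error) {
	out := make([]InstanceState, len(m.instances))
	copy(out, m.instances)
	return out, nil
}

func (m *memoryAPI) ReportStatus(id, actual string) error {
	for i := range m.instances {
		if m.instances[i].ID == id {
			m.instances[i].Actual = actual
			return nil
		}
	}
	return fmt.Errorf("unknown instance %q", id)
}

// echoRuntime pretends every convergence succeeds immediately.
type echoRuntime struct{}

func (echoRuntime) Converge(id, desired string) (string, error) { return desired, nil }

func main() {
	api := &memoryAPI{instances: []InstanceState{
		{ID: "inst-1", Desired: "running", Actual: "pending"},
		{ID: "inst-2", Desired: "running", Actual: "running"},
	}}
	n, err := ReconcileOnce(api, echoRuntime{})
	fmt.Println("converged:", n, "error:", err)
}
```

The point of the shape is the dependency direction: the worker depends on an interface that mirrors public contracts, so the same loop can run against a fake in tests and against the real API in production.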
Contract Rules
- Build against public APIs and committed contracts, not internal Go packages.
- Do not assume database access or undocumented routes.
- Express behavior through declared capabilities, endpoint types, auth pattern, lifecycle hooks, and manifest fields.
- Treat closed enums as platform commitments; add new values deliberately.
- Keep app-specific runtime intelligence outside platform core unless it becomes a reusable primitive.
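To make the "declared fields and closed enums" rules concrete, the following Go sketch validates a manifest against a fixed set of values. Everything here is an assumption for illustration: the `Manifest` shape, the field names, and the enum values (`http`, `tcp`, `platform-sso`, `token`) are hypothetical and are not taken from the actual GPUaaS manifest contract.

```go
package main

import "fmt"

// EndpointType and AuthPattern model closed enums: a value outside the
// declared set is rejected instead of being silently accepted.
type EndpointType string
type AuthPattern string

// Hypothetical enum values, chosen only for this sketch.
const (
	EndpointHTTP EndpointType = "http"
	EndpointTCP  EndpointType = "tcp"

	AuthPlatformSSO AuthPattern = "platform-sso"
	AuthToken       AuthPattern = "token"
)

var knownEndpointTypes = map[EndpointType]bool{EndpointHTTP: true, EndpointTCP: true}
var knownAuthPatterns = map[AuthPattern]bool{AuthPlatformSSO: true, AuthToken: true}

// Endpoint is one declared network endpoint of the app.
type Endpoint struct {
	Name string
	Type EndpointType
	Port int
}

// Manifest is a hypothetical, heavily reduced app manifest.
type Manifest struct {
	Name      string
	Version   string
	Auth      AuthPattern
	Endpoints []Endpoint
}

// Validate returns a human-readable list of problems; an empty result
// means the manifest only uses declared values.
func Validate(m Manifest) []string {
	var problems []string
	if m.Name == "" {
		problems = append(problems, "name is required")
	}
	if m.Version == "" {
		problems = append(problems, "version is required")
	}
	if !knownAuthPatterns[m.Auth] {
		problems = append(problems, fmt.Sprintf("auth pattern %q is not a declared value", m.Auth))
	}
	for _, ep := range m.Endpoints {
		if !knownEndpointTypes[ep.Type] {
			problems = append(problems, fmt.Sprintf("endpoint %q: type %q is not a declared value", ep.Name, ep.Type))
		}
		if ep.Port < 1 || ep.Port > 65535 {
			problems = append(problems, fmt.Sprintf("endpoint %q: port %d is out of range", ep.Name, ep.Port))
		}
	}
	return problems
}

func main() {
	m := Manifest{
		Name:    "example-notebook",
		Version: "0.1.0",
		Auth:    AuthPlatformSSO,
		Endpoints: []Endpoint{
			{Name: "ui", Type: EndpointHTTP, Port: 8888},
			{Name: "debug", Type: "carrier-pigeon", Port: 9000}, // not in the closed set
		},
	}
	for _, p := range Validate(m) {
		fmt.Println("manifest problem:", p)
	}
}
```

Adding a new endpoint type in this model means changing the constant set and the lookup map together, which is the deliberate step the closed-enum rule asks for.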
Change Classification
Every app-facing change should name its class before implementation.
| Class | Belongs in | Examples |
|---|---|---|
| Runtime fix | runtime/controller/backend owning layer | app route reconciliation, artifact selection bugs, node-task reconciliation, runtime readiness |
| Catalog or manifest change | App SDK and manifest contract | ports, health paths, route intent, auth mode, connect actions, launch defaults |
| SDK/developer contract change | SDK examples, validators, portal, smoke tests | service-account expectations, artifact promotion, launch/connect/decommission flows, developer-visible failure behavior |
The SDK should become the app developer contract plus its validation harness. The backend runtime implements that contract; seed data should not be the only place where app behavior is defined.
Use This With The Practical Onboarding Path
This page is the mental model. The operational sequence for getting a real app into GPUaaS lives on the Add a New App page. Use both together:
- this page to understand the boundary and structure;
- the onboarding page to execute manifest -> artifact -> service account -> catalog -> entitlement -> launch/connect/decommission.
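The onboarding sequence above is strictly ordered, and a small Go sketch can show what "execute in order and stop at the first failure" looks like. This is an illustration only: the step names mirror the sequence in the text (with launch, connect, and decommission split into separate steps), while `StepFunc` and `RunOnboarding` are hypothetical helpers, not part of any GPUaaS tooling.

```go
package main

import (
	"errors"
	"fmt"
)

// OnboardingOrder lists the steps in the order the text gives them.
var OnboardingOrder = []string{
	"manifest", "artifact", "service-account", "catalog",
	"entitlement", "launch", "connect", "decommission",
}

// StepFunc is a hypothetical hook an app team would supply per step.
type StepFunc func() error

// RunOnboarding runs each step in order and stops at the first missing
// or failing step. It returns the names of the steps that completed.
func RunOnboarding(steps map[string]StepFunc) ([]string, error) {
	var done []string
	for _, name := range OnboardingOrder {
		run, ok := steps[name]
		if !ok {
			return done, fmt.Errorf("step %q has no implementation", name)
		}
		if err := run(); err != nil {
			return done, fmt.Errorf("step %q failed: %w", name, err)
		}
		done = append(done, name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	steps := map[string]StepFunc{}
	for _, name := range OnboardingOrder {
		steps[name] = ok
	}
	// Simulate a failure partway through the sequence.
	steps["catalog"] = func() error { return errors.New("app not registered") }

	done, err := RunOnboarding(steps)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

In this run the first three steps complete and the sequence stops at `catalog`, so later steps such as launch never run against an app that is not yet registered.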
The first readiness artifact is a manifest/launch/connect matrix for supported apps such as vLLM, Headlamp, OpenClaw, Jupyter, and Slurm. It should show what the SDK can express today, what is still a backend compatibility bridge, and which examples need launch/connect/decommission smoke coverage.
Current Readiness Path
The current platform-foundation docs treat the App SDK as both a product surface and an internal developer platform capability. The readiness path is:
- use the App SDK readiness matrix to decide which app behaviors are contract-backed, example-backed, or still backend-compatibility bridges;
- use the executable product onboarding packet for product/app registration, ownership, evidence, release, and support expectations;
- use the registry and artifact trust docs for artifact type, trust state, signing/provenance, and promotion evidence;
- keep SDK examples tied to launch, connect, decommission, and runtime smoke evidence rather than seed data or backend-only assumptions.
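One way to picture the readiness matrix is as rows that record how each app behavior is backed and which smoke flows cover it. The Go sketch below assumes a hypothetical row shape and uses placeholder app names and values; it does not describe the real status of any supported app.

```go
package main

import "fmt"

// Backing records how an app behavior is supported today, using the
// three categories named in the readiness path.
type Backing string

const (
	ContractBacked Backing = "contract-backed"
	ExampleBacked  Backing = "example-backed"
	BackendBridge  Backing = "backend-compatibility-bridge"
)

// Row is a hypothetical readiness-matrix entry for one app behavior.
type Row struct {
	App      string
	Behavior string
	Backing  Backing
	Smoke    map[string]bool // flow name -> has smoke coverage
}

// smokeFlows are the flows the text says examples should cover.
var smokeFlows = []string{"launch", "connect", "decommission"}

// Gaps lists every row that is still a bridge or lacks smoke coverage
// for one of the expected flows.
func Gaps(rows []Row) []string {
	var gaps []string
	for _, r := range rows {
		if r.Backing == BackendBridge {
			gaps = append(gaps, fmt.Sprintf("%s/%s: still a backend compatibility bridge", r.App, r.Behavior))
		}
		for _, flow := range smokeFlows {
			if !r.Smoke[flow] { // a nil map reads as "no coverage"
				gaps = append(gaps, fmt.Sprintf("%s/%s: missing %s smoke coverage", r.App, r.Behavior, flow))
			}
		}
	}
	return gaps
}

func main() {
	// Placeholder data for illustration only.
	rows := []Row{
		{App: "example-app-a", Behavior: "route intent", Backing: ContractBacked,
			Smoke: map[string]bool{"launch": true, "connect": true, "decommission": true}},
		{App: "example-app-b", Behavior: "auth mode", Backing: BackendBridge,
			Smoke: map[string]bool{"launch": true}},
	}
	for _, g := range Gaps(rows) {
		fmt.Println(g)
	}
}
```

Keeping the matrix as data rather than prose makes it possible to fail a build when a behavior regresses from contract-backed to bridge, or when an example loses smoke coverage.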
What To Read Next
Start with the quickstart, then the manifest guide, then the artifact trust and promotion model. Scheduler or clustered apps should also read the external app team integration and reference workflow docs.