App SDK Proof
Status: implemented
This page exists to answer a direct review question:
Is the App SDK a real builder surface, or only a design direction?
The answer today: it is real. The strongest proof is not the manifest documentation but the two shipped reference controllers built on the same shared-platform path.
What This Proves
| Audience | What this page proves |
|---|---|
| App developers | the platform already supports non-trivial runtime/controller patterns |
| Architecture | the shared-platform composition model is implemented, not only proposed |
| Product | GPUaaS is already proving second-product / builder reuse |
| Engineering | runtime examples exist that can guide future app onboarding work |
The Two Reference Proofs
| Example | Proof in repo | What it demonstrates |
|---|---|---|
| Slurm reference controller | cmd/slurm-reference-controller/main.go | scheduler-grade cluster/member lifecycle, runtime reconcile, token/artifact mediation |
| RKE2 self-managed controller | cmd/rke2-self-managed-controller/main.go | cluster bootstrap, member join/drain, cluster lifecycle orchestration |
Shared Composition Model
The important point: the reference controllers do not bypass the platform. They run on the shared-platform path, which shows that the platform boundary can support real controllers.
Slurm As Proof
Slurm is useful because it is not a trivial demo workload. It proves:
- scheduler-shaped lifecycle management;
- multi-component reconcile behavior;
- runtime state and cluster/member transitions;
- integration with platform contracts rather than direct platform patching.
For a reviewer, Slurm answers: can GPUaaS support a serious runtime that looks more like infrastructure than a single container? Yes.
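The reconcile behavior listed above can be illustrated with a small sketch. This is not the code in cmd/slurm-reference-controller/main.go. It assumes a hypothetical controller that compares desired membership with observed member state and returns the actions needed to converge. Every name in it (`MemberState`, `Action`, `Reconcile`) is an assumption made for illustration.

```go
package main

import "sort"

// MemberState is a hypothetical lifecycle state for one cluster member.
type MemberState string

const (
	StatePending  MemberState = "pending"
	StateReady    MemberState = "ready"
	StateDraining MemberState = "draining"
)

// Action is one step a controller would take to converge on the desired state.
type Action struct {
	Member string
	Verb   string // "provision", "drain", or "remove"
}

// Reconcile diffs desired membership against observed state and returns
// the actions needed. It performs no I/O, so it is easy to test in isolation.
func Reconcile(desired []string, observed map[string]MemberState) []Action {
	want := make(map[string]bool, len(desired))
	for _, m := range desired {
		want[m] = true
	}

	var actions []Action

	// Desired members that are not observed yet must be provisioned.
	for _, m := range desired {
		if _, ok := observed[m]; !ok {
			actions = append(actions, Action{Member: m, Verb: "provision"})
		}
	}

	// Visit observed members in sorted order so the output is deterministic.
	names := make([]string, 0, len(observed))
	for m := range observed {
		names = append(names, m)
	}
	sort.Strings(names)

	// Observed members that are no longer desired are drained first,
	// then removed on a later pass once they are already draining.
	for _, m := range names {
		if want[m] {
			continue
		}
		if observed[m] == StateDraining {
			actions = append(actions, Action{Member: m, Verb: "remove"})
		} else {
			actions = append(actions, Action{Member: m, Verb: "drain"})
		}
	}
	return actions
}
```

Keeping the diff as a pure function means the transition rules stay in the controller and can be tested without a live cluster, which matches the point about integrating through platform contracts rather than patching the platform.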
RKE2 As Proof
RKE2 is useful because it proves a different class of problem:
- bootstrap a cluster-like runtime;
- join and drain members;
- manage server/agent roles;
- keep lifecycle logic in the controller instead of leaking it into platform core.
For a reviewer, RKE2 answers: can the platform support self-managed clustered runtime behavior instead of only catalog launches? Yes.
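As a rough illustration of the join, drain, and role concerns listed above, the sketch below plans a join order and a matching drain order. It is not taken from cmd/rke2-self-managed-controller/main.go. The types and functions (`Role`, `JoinStep`, `PlanJoin`, `DrainOrder`) and the ordering policy (first node bootstraps as a server, agents drain before servers) are assumptions chosen for the example.

```go
package main

// Role is the hypothetical role a node takes in the cluster.
type Role string

const (
	RoleServer Role = "server"
	RoleAgent  Role = "agent"
)

// JoinStep describes how one node enters the cluster.
type JoinStep struct {
	Node      string
	Role      Role
	Bootstrap bool // true only for the node that initializes the cluster
}

// PlanJoin assigns the first `servers` nodes the server role and the rest
// the agent role. The first node bootstraps the cluster.
func PlanJoin(nodes []string, servers int) []JoinStep {
	if servers < 1 {
		servers = 1 // a cluster needs at least one server
	}
	steps := make([]JoinStep, 0, len(nodes))
	for i, n := range nodes {
		role := RoleAgent
		if i < servers {
			role = RoleServer
		}
		steps = append(steps, JoinStep{Node: n, Role: role, Bootstrap: i == 0})
	}
	return steps
}

// DrainOrder returns nodes in teardown order: agents first, then servers,
// each group in reverse join order so the bootstrap node goes last.
func DrainOrder(steps []JoinStep) []string {
	order := make([]string, 0, len(steps))
	for _, role := range []Role{RoleAgent, RoleServer} {
		for i := len(steps) - 1; i >= 0; i-- {
			if steps[i].Role == role {
				order = append(order, steps[i].Node)
			}
		}
	}
	return order
}
```

Under these assumptions, `PlanJoin([]string{"a", "b", "c", "d"}, 2)` makes `a` the bootstrap server, `b` a second server, and `c` and `d` agents, and `DrainOrder` then yields `d, c, b, a`. The ordering logic lives entirely in controller code, which is the separation the last bullet describes.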
Ownership Split
| Owned by GPUaaS / shared platform | Owned by the app team / controller |
|---|---|
| IAM, service-account integration, scoped authority | runtime-specific reconcile logic |
| app catalog, instance surfaces, and shared contracts | bootstrap, health, repair, and teardown details |
| artifact trust and promotion path | runtime operational knowledge |
| routes/connect surface families | runtime-specific failure handling |
| billing, audit, evidence, and policy primitives | runtime-specific state decisions |
This split is the core App SDK claim. The platform owns reusable authority. The app team owns runtime-specific intelligence.
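One way to picture this split in code is as two interfaces with a thin coordinator between them. The sketch below is hypothetical and is not the App SDK's actual API: `PlatformServices`, `RuntimeController`, `EnsureRunning`, the `runtime:manage` scope string, and the in-memory stand-ins are all assumptions made for illustration.

```go
package main

import "errors"

// PlatformServices models capabilities owned by the shared platform
// (scoped authority and audit). Hypothetical interface.
type PlatformServices interface {
	IssueScopedToken(instanceID, scope string) (string, error)
	RecordAudit(instanceID, event string)
}

// RuntimeController models what the app team owns: runtime-specific
// bootstrap, health, and repair logic. Hypothetical interface.
type RuntimeController interface {
	Bootstrap(token string) error
	Healthy() bool
	Repair(token string) error
}

// EnsureRunning obtains authority from the platform, then lets the
// controller decide how to bootstrap and repair its own runtime.
func EnsureRunning(p PlatformServices, rc RuntimeController, instanceID string) error {
	token, err := p.IssueScopedToken(instanceID, "runtime:manage")
	if err != nil {
		return err
	}
	if err := rc.Bootstrap(token); err != nil {
		p.RecordAudit(instanceID, "bootstrap_failed")
		return err
	}
	if !rc.Healthy() {
		p.RecordAudit(instanceID, "repair_started")
		if err := rc.Repair(token); err != nil {
			return err
		}
		if !rc.Healthy() {
			return errors.New("runtime still unhealthy after repair")
		}
	}
	p.RecordAudit(instanceID, "running")
	return nil
}

// memPlatform is an in-memory stand-in that records audit events.
type memPlatform struct{ Events []string }

func (m *memPlatform) IssueScopedToken(instanceID, scope string) (string, error) {
	return "token-for-" + instanceID + "-" + scope, nil
}

func (m *memPlatform) RecordAudit(instanceID, event string) {
	m.Events = append(m.Events, event)
}

// memRuntime is an in-memory stand-in that becomes healthy once repaired.
type memRuntime struct{ healthy bool }

func (r *memRuntime) Bootstrap(token string) error { return nil }
func (r *memRuntime) Healthy() bool                { return r.healthy }
func (r *memRuntime) Repair(token string) error    { r.healthy = true; return nil }
```

In this sketch the controller never mints its own credentials or writes audit records directly, and the platform never needs to know what "healthy" means for a given runtime.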
What This Does Not Claim
This page should stay calibrated.
It does not claim that:
- every future app flow is already productized;
- every runtime family has the same polish as these references;
- self-service onboarding is complete for all app teams;
- every evidence path is equally mature.
It only claims what the code already proves: the shared-platform builder model works for serious controller-driven runtimes.
Developer Route From Here
If you are building on this path, read in this order:
Why This Matters To The Portal
Without this page, the portal can undersell the work and make the SDK look like conceptual scaffolding. That impression would be false. The right posture is:
- reference-controller proof is real;
- onboarding/self-service maturity can still be partial;
- both statements can be true at once.