# ADR-011: Terminal Node WebSocket Bridge

## Status

Proposed

## Date

2026-03-22

## Author

GPUaaS Core Team

## Context

The existing terminal node/control-plane transport uses NDJSON over internal HTTP
streaming. Production debugging in March 2026 showed that:

- request/response streaming on the ingress path buffers live duplex terminal traffic
- splitting upstream and downstream into separate HTTP endpoints does not remove the
  buffering as long as the traffic still traverses ingress
- a direct node-reachable path restores immediate stream behavior
- the current identity model for that direct path still depends on ingress/mTLS handoff
  assumptions

GPUaaS runs on bare metal, so there is no hypervisor console path for tenant shell access.
node-agent therefore remains the in-band owner of terminal execution on the node.

We need a terminal transport that:

- is truly full duplex
- does not depend on HTTP body streaming semantics through proxies
- preserves explicit node identity verification
- keeps the browser terminal contract stable

## Decision

Adopt a dual-WebSocket bridge design for the terminal runtime:

- browser ↔ terminal-gateway: the existing public WebSocket contract remains unchanged
- node-agent ↔ terminal-gateway: a new dedicated internal WebSocket over mTLS

The terminal-gateway becomes the terminal byte bridge between the two connections.
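As a rough illustration of the bridging role, the sketch below runs one frame pump per direction and tears both down when either side closes. It uses `asyncio.Queue` objects as stand-ins for the two WebSocket connections, so no particular WebSocket library is assumed. The names `pump` and `bridge` and the `None` close sentinel are hypothetical and are not taken from the gateway code.

```python
import asyncio


async def pump(src: asyncio.Queue, dst: asyncio.Queue) -> None:
    """Copy frames from src to dst until a None sentinel (peer closed) arrives."""
    while True:
        frame = await src.get()
        await dst.put(frame)  # the close sentinel is forwarded as well
        if frame is None:
            return


async def bridge(browser_in: asyncio.Queue, browser_out: asyncio.Queue,
                 node_in: asyncio.Queue, node_out: asyncio.Queue) -> None:
    """Run both directions concurrently; stop both when either side closes."""
    tasks = [
        asyncio.create_task(pump(browser_in, node_out)),  # keystrokes -> node
        asyncio.create_task(pump(node_in, browser_out)),  # PTY output -> browser
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    for task in done:
        task.result()  # re-raise any failure from the finished pump
```

Because each direction is its own task, a slow reader on one side does not stall traffic flowing the other way, which is the full-duplex property the Context section asks for.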

The API remains the session authority. It continues to:

- mint terminal tokens
- validate allocation ownership
- create terminal session bindings
- enqueue `terminal.open`
- record audit/session lifecycle events

The API is not the steady-state terminal data plane.
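The responsibilities listed above can be sketched as a single request-time flow. This is a minimal in-memory illustration, not the API's implementation: the `SessionAuthority` class, its fields, the HMAC token format, and the event names other than `terminal.open` are all assumptions made for the example. The real token model is defined in ADR-007.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class SessionAuthority:
    """Hypothetical in-memory stand-in for the API's terminal session role."""

    signing_key: bytes
    allocation_owner: dict  # allocation_id -> user_id (assumed lookup)
    bindings: dict = field(default_factory=dict)
    task_queue: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def _sign(self, payload: str) -> str:
        return hmac.new(self.signing_key, payload.encode(), hashlib.sha256).hexdigest()

    def open_terminal(self, user_id: str, allocation_id: str, node_id: str,
                      ttl_s: int = 60) -> str:
        # 1. Validate allocation ownership before anything else.
        if self.allocation_owner.get(allocation_id) != user_id:
            self.audit_log.append({"event": "terminal.denied", "user": user_id})
            raise PermissionError("caller does not own this allocation")
        # 2. Create the session binding the gateway will later look up.
        session_id = secrets.token_hex(8)
        self.bindings[session_id] = {"allocation": allocation_id, "node": node_id}
        # 3. Enqueue terminal.open for the node (delivery mechanism is assumed).
        self.task_queue.append({"type": "terminal.open", "session": session_id,
                                "node": node_id})
        # 4. Record the lifecycle event, then mint a short-lived token.
        self.audit_log.append({"event": "terminal.requested", "session": session_id})
        payload = f"{session_id}.{int(time.time()) + ttl_s}"
        return f"{payload}.{self._sign(payload)}"

    def verify_token(self, token: str) -> str:
        """Return the session id if the token is authentic and unexpired."""
        parts = token.split(".")
        if len(parts) != 3:
            raise ValueError("malformed token")
        session_id, expires, signature = parts
        expected = self._sign(f"{session_id}.{expires}")
        if not hmac.compare_digest(signature.encode(), expected.encode()):
            raise ValueError("bad signature")
        if int(expires) < time.time():
            raise ValueError("token expired")
        return session_id
```

Note that nothing in this flow touches terminal bytes: once the token and binding exist, the API is out of the live path.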

## Consequences

### Positive

- removes the dependence of terminal responsiveness on ingress request/response buffering behavior
- keeps the browser contract unchanged
- makes node identity explicit at the gateway TLS layer instead of via forwarded headers
- removes Redis pubsub from the live frame path
- gives the terminal path a clean transport boundary independent of lifecycle task polling
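To make the identity point concrete, the sketch below derives a node id from the verified client certificate rather than from a forwarded header. It assumes the certificate dict has the shape returned by Python's `ssl.SSLSocket.getpeercert()`. The DNS SAN naming scheme (`NODE_SAN_SUFFIX`) and both function names are hypothetical; the actual certificate layout is not specified in this ADR.

```python
NODE_SAN_SUFFIX = ".nodes.gpuaas.internal"  # hypothetical naming scheme


def node_id_from_peer_cert(peer_cert: dict, suffix: str = NODE_SAN_SUFFIX) -> str:
    """Derive the node id from a verified client certificate.

    peer_cert has the dict shape returned by ssl.SSLSocket.getpeercert().
    """
    for kind, value in peer_cert.get("subjectAltName", ()):
        if kind == "DNS" and value.endswith(suffix) and len(value) > len(suffix):
            return value[: -len(suffix)]
    raise PermissionError("client certificate carries no node identity")


def authorize_node(peer_cert: dict, bound_node_id: str) -> None:
    """Reject a node whose certificate identity differs from the session binding."""
    node_id = node_id_from_peer_cert(peer_cert)
    if node_id != bound_node_id:
        raise PermissionError(f"{node_id} is not bound to this session")
```

The identity here comes from material the TLS handshake has already validated, so an intermediary cannot alter it the way it could rewrite a header.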

### Costs

- terminal-gateway gains a second listener and more session ownership logic
- node-agent needs a new internal websocket client for terminal sessions
- Kubernetes/network exposure must provide a worker-node-routable terminal listener
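For a sense of what the second listener involves on the TLS side, the sketch below builds a server context that refuses any client without a certificate signed by the trusted CA. It uses only Python's standard `ssl` module. The function name and the file-path parameters are assumptions for illustration, and the paths are optional so the snippet can run without real key material.

```python
import ssl


def build_node_listener_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS server context that requires a client certificate (mTLS)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # handshake fails without a client cert
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that issues node certs
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)  # the gateway's own identity
    return ctx
```

Keeping this context on a separate listener from the public browser endpoint lets the two sides carry different trust roots and exposure rules.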

## Alternatives Considered

### Continue patching NDJSON-over-HTTP

Rejected:
- production debugging already showed that this approach is not robust
- too proxy-sensitive
- keeps transport and identity coupled to ingress behavior

### gRPC bidirectional streaming

Initially attractive and documented during the redesign analysis.

Rejected for v1:
- adds protobuf/codegen/runtime complexity for what is fundamentally a byte-stream bridge
- websocket is already a native part of the browser/gateway side of the system
- dedicated internal websocket is sufficient for the v1 transport goals

### API as terminal data-plane relay

Rejected:
- control-plane authority and live byte relay should remain separate concerns
- gateway is the correct place to own active websocket bridging

## Related Documents

- [ADR-005-terminal-gateway-isolation](./ADR-005-terminal-gateway-isolation.md)
- [ADR-007-terminal-access-auth-model](./ADR-007-terminal-access-auth-model.md)
- [Terminal WebSocket Bridge Architecture v1](../Terminal_WebSocket_Bridge_Architecture_v1.md)
- [Terminal WebSocket Bridge Implementation Plan v1](../Terminal_WebSocket_Bridge_Implementation_Plan_v1.md)
