Idempotency And Retries contracted
GPUaaS clients should assume networks fail, retries happen, and events may be delivered more than once. The platform contract separates REST mutation idempotency from event-consumer idempotency.
REST Mutations
Mutating REST calls should send the X-Idempotency-Key header unless the endpoint
explicitly documents an exception. SDK and CLI callers should generate one
random key per logical mutation and reuse that same key on every retry of the
same operation. A new key signals a new operation.
| Situation | Client behavior |
|---|---|
| Timeout before response | Retry with the same idempotency key. |
| 409 conflict | Treat as a completed or competing logical operation; fetch current state before retrying. |
| 5xx upstream dependency | Retry only when the operation is documented as retry-safe. |
| Validation error | Do not retry until the request shape is fixed. |
Terminal-token minting is intentionally single-use and should follow its endpoint contract rather than a generic retry wrapper.
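The table above can be sketched as a small client-side wrapper. This is a minimal illustration, not the SDK's implementation: the `send` transport callable, the `retry_safe` flag, and the backoff parameters are hypothetical names introduced here, and only the X-Idempotency-Key header comes from the contract. The key is generated once and reused on every attempt.

```python
import time
import uuid
from typing import Callable, Dict, Tuple

# Hypothetical transport: takes (headers, body), returns (status_code, response_body),
# and raises TimeoutError when no response arrives.
Transport = Callable[[Dict[str, str], dict], Tuple[int, dict]]


def mutate_with_retries(
    send: Transport,
    body: dict,
    *,
    retry_safe: bool = False,   # assumption: caller knows the endpoint is documented retry-safe
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> Tuple[int, dict]:
    """Send one logical mutation, reusing a single idempotency key across retries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # One key per logical mutation, generated before the first attempt.
    headers = {"X-Idempotency-Key": str(uuid.uuid4())}
    last_error: Exception = RuntimeError("no attempt made")

    for attempt in range(max_attempts):
        try:
            status, response = send(headers, body)
        except TimeoutError as exc:
            # Timeout before response: retry with the same key.
            last_error = exc
        else:
            if status >= 500 and retry_safe:
                # 5xx on a documented retry-safe operation: retry.
                last_error = RuntimeError(f"upstream returned {status}")
            else:
                # Success, 409, validation errors, and non-retry-safe 5xx
                # all go back to the caller without a retry. On 409 the
                # caller should fetch current state before deciding.
                return status, response
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))

    raise last_error
```

Injecting the transport keeps the retry policy separate from any particular HTTP library. Single-use endpoints such as terminal-token minting should not be routed through a wrapper like this.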
Event Consumers
Events carry an envelope with event_id, event_type, occurred_at,
version, correlation_id, and a typed payload. Consumers should be
idempotent: deduplicate on the idempotency key that the event catalog
specifies for each event, so that a redelivered event has no additional effect.
Operational expectations:
- Acknowledge only after durable processing.
- Expect retries with exponential backoff.
- Route poison events to DLQ for replay or investigation.
- Preserve correlation_id into logs and follow-on events.
- Treat breaking event payload changes as a new version with a dual-consume window.
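The first three expectations above can be sketched as a consumer loop. This is an illustrative sketch under stated assumptions: it deduplicates on event_id (the event catalog may specify a different idempotency key), uses in-memory structures where a real consumer would need a durable store, and the `IdempotentConsumer` class, `max_attempts` parameter, and acknowledge-by-return-value convention are hypothetical. Backoff between redeliveries is assumed to be handled by the broker.

```python
from typing import Callable, Dict, List, Set


class IdempotentConsumer:
    """Deduplicate events and route repeated failures to a dead-letter list."""

    def __init__(self, handler: Callable[[dict], None], max_attempts: int = 3):
        self.handler = handler
        self.max_attempts = max_attempts
        # In-memory stand-ins; a real consumer needs durable storage here.
        self.processed: Set[str] = set()
        self.attempts: Dict[str, int] = {}
        self.dlq: List[dict] = []

    def handle(self, event: dict) -> bool:
        """Return True to acknowledge the event, False to request redelivery."""
        # Assumption: event_id is the idempotency key for this event type.
        event_id = event["event_id"]

        if event_id in self.processed:
            # Duplicate delivery: acknowledge without reprocessing.
            return True

        try:
            self.handler(event)
        except Exception:
            count = self.attempts.get(event_id, 0) + 1
            self.attempts[event_id] = count
            if count >= self.max_attempts:
                # Poison event: park it for replay or investigation,
                # then acknowledge so it stops blocking the stream.
                self.dlq.append(event)
                return True
            return False

        # Record the event only after the handler has finished, so the
        # acknowledgement follows processing rather than preceding it.
        self.processed.add(event_id)
        return True
```

A duplicate delivery of the same event_id returns True without calling the handler again, and an event whose handler keeps failing is acknowledged only once it has been placed in the dead-letter list.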
Current Gap To Track
The production-readiness portfolio identifies uniform mutating-route wrapping as an enforcement gap. Until that gate is complete, API consumers should still follow the idempotency contract and report endpoints that do not expose the expected header behavior.