# Audit Tamper Evidence And WORM Retention v1

Status: architecture maturity path for `SEC-ARCH-AUDIT-TAMPER-EVIDENCE-001`

Owner: Platform Audit / Security Architecture

Last updated: 2026-06-05

## Purpose

Define the audit integrity path beyond ordinary append-oriented audit rows.

This document resolves the ambiguity left by older security material that
implied cryptographic or WORM audit immutability. GPUaaS currently has useful
audit discipline, but it must not claim tamper-proof audit evidence until
hash-chain, signing, external replication, retention, separation-of-duties, and
alerting controls exist and are producing release evidence.

## Current Posture

Current state:

1. `platform_audit_logs` records privileged and security-relevant actions with
   actor, action, target, result, correlation id, metadata, and timestamp.
2. `packages/platform/audit` exposes an append/query service and a legacy
   adapter that inserts audit rows through typed platform code.
3. Route and service code increasingly uses platform audit helpers rather than
   direct ad hoc inserts.
4. Platform evidence/status tables are append-oriented and can link audit,
   security, CI, UAT, and release artifacts to readiness gates.
5. `platform_audit_logs_append_only` blocks ordinary `UPDATE` and `DELETE`
   attempts against audit rows. Approved maintenance and integration-test
   cleanup must explicitly set `gpuaas.allow_audit_log_mutation` to
   `maintenance` or `test_cleanup`.

Current non-claims:

1. Ordinary Postgres audit rows are mutable by sufficiently privileged database
   operators who control the database and maintenance exception path.
2. There is no cryptographic record digest, batch hash chain, or signed audit
   checkpoint in the current schema.
3. There is no independent WORM/Object Lock retention store for audit batches.
4. There is no enforced separation proving that one operator cannot control the
   database, signing key, replication path, and retention policy.
5. There is no production gate proving audit replication freshness, hash-chain
   continuity, signing key custody, or WORM retention state.

Therefore the correct current claim is:

```text
GPUaaS has append-oriented audit records and release evidence hooks. It does
not yet provide cryptographic tamper evidence or WORM-retained audit evidence.
```

## Target Maturity Model

| Level | Name | Claim allowed | Required controls |
|---|---|---|---|
| L0 | Append-oriented audit | Privileged actions are recorded through platform audit paths. | Audit coverage, correlation ids, query/read model, no direct product-owned audit invention. |
| L1 | Append-only database guard | Database updates and deletes to audit rows are blocked for application roles, and privileged maintenance mutations raise alerts. | DB privileges, optional trigger/RLS guard, migration-safe exception path, CI/schema check. |
| L2 | Hash-chained audit batches | Audit batches are tamper-evident inside the primary system. | Canonical record digest, batch sequence, previous hash, batch root hash, verifier. |
| L3 | Signed external checkpoints | Audit batch roots are signed and replicated outside the primary DB. | Signing key custody, signer identity, external object/security-lake replication, lag alerts. |
| L4 | WORM retained audit evidence | Audit checkpoints and batch manifests are retained under WORM/Object Lock. | Object Lock retention profile, legal hold path, lifecycle policy, separation-of-duties. |
| L5 | Regulated audit custody | Audit evidence is independently verifiable for regulated customer profiles. | HSM/KMS/FIPS custody where required, compliance-mode WORM, independent verifier, formal retention and evidence review. |

Baseline production should target L2/L3 for security-relevant audit evidence.
Regulated profiles should target L4/L5 before making WORM or compliance-grade
immutability claims.
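
One way to read the table is as a cumulative ladder: a level is claimable only when it and every level below it have evidence. That reading is an assumption of this sketch, not something the table states outright, and `highest_claimable_level` is a hypothetical helper name.

```python
from typing import Iterable, Optional

# Levels in ascending order, as named in the maturity table.
LEVELS = ["L0", "L1", "L2", "L3", "L4", "L5"]


def highest_claimable_level(evidenced: Iterable[str]) -> Optional[str]:
    """Return the highest level whose controls, and all lower levels'
    controls, have evidence. Returns None when even L0 lacks evidence."""
    have = set(evidenced)
    claimable = None
    for level in LEVELS:
        if level not in have:
            break  # a gap stops the ladder; evidence above the gap does not count
        claimable = level
    return claimable
```

Under this reading, evidence for L0, L1, and L3 without L2 supports only an L1 claim.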

## Hash-Chained Batch Contract

Audit batching should not rewrite `platform_audit_logs`. It should create an
append-only checkpoint stream over existing rows.

Minimum batch fields:

| Field | Meaning |
|---|---|
| `batch_id` | Stable UUID for the audit batch. |
| `environment_profile` | `dev`, `demo`, `staging`, `platform-control`, `production`, or regulated profile. |
| `sequence_number` | Monotonic sequence per environment/profile and audit stream. |
| `first_audit_id`, `last_audit_id` | Inclusive audit row range covered by the batch. |
| `first_occurred_at`, `last_occurred_at` | Time window covered by the batch. |
| `record_count` | Count of audit rows covered. |
| `canonicalization_version` | Version of the canonical digest algorithm. |
| `record_digest_algorithm` | Initial value should be `sha256`. |
| `batch_root_hash` | Hash over canonical per-row digests and batch metadata. |
| `previous_batch_hash` | Prior batch root hash for the same stream, nullable only for genesis. |
| `signing_key_id` | Key identifier used to sign the batch root. |
| `signature` | Signature over sequence, previous hash, root hash, and environment. |
| `replication_uri` | External object/security-lake location for the batch manifest. |
| `worm_retention_until` | Retention deadline when WORM/Object Lock applies. |
| `verification_status` | `pass`, `fail`, `partial`, or `blocked`. |
| `created_at` | Batch creation timestamp. |
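
The chaining fields above can be illustrated with a minimal sketch. It assumes the per-row digests are already available as hex strings, and that the batch metadata bound into the root is the subset shown; the exact metadata set, the JSON header layout, and the newline separator are illustrative choices, not a decided format.

```python
import hashlib
import json
from typing import List, Optional


def batch_root_hash(
    record_digests: List[str],
    sequence_number: int,
    environment_profile: str,
    previous_batch_hash: Optional[str],
    canonicalization_version: str = "v1",
) -> str:
    """Hash batch metadata plus the ordered per-row digests (hex strings)."""
    header = json.dumps(
        {
            "canonicalization_version": canonicalization_version,
            "environment_profile": environment_profile,
            "previous_batch_hash": previous_batch_hash,  # None only for genesis
            "record_count": len(record_digests),
            "sequence_number": sequence_number,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    h = hashlib.sha256()
    h.update(header.encode("utf-8"))
    for digest in record_digests:
        h.update(b"\n")  # separator keeps digest boundaries unambiguous
        h.update(digest.encode("ascii"))
    return h.hexdigest()


# Usage: each batch binds the root hash of the batch before it.
rows = [hashlib.sha256(b"row-1").hexdigest(), hashlib.sha256(b"row-2").hexdigest()]
genesis = batch_root_hash(rows, 1, "staging", None)
second = batch_root_hash(rows, 2, "staging", genesis)
```

Because `previous_batch_hash` is part of the hashed header, rewriting an earlier batch changes every later root, which is what makes the stream tamper-evident rather than merely append-oriented.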

Canonicalization must define exactly which audit row fields are hashed and how
JSON metadata is normalized. The initial hash input should include at least:

1. audit row id;
2. org id;
3. actor user id;
4. actor service account id;
5. actor role;
6. action;
7. target type;
8. target id;
9. result;
10. correlation id;
11. canonical metadata JSON;
12. occurred_at.
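
A minimal sketch of a per-row digest over the twelve fields listed above. The column names in `CANONICAL_FIELDS_V1` are hypothetical stand-ins for the real schema, and the sketch assumes ids and `occurred_at` have already been rendered as strings (for example an RFC 3339 UTC timestamp). JSON metadata is normalized here by sorting keys and removing insignificant whitespace; the real canonicalization rules may differ.

```python
import hashlib
import json
from typing import Any, Mapping

# Field order is fixed by the canonicalization version, not by dict order.
# Column names are assumptions for illustration.
CANONICAL_FIELDS_V1 = (
    "id", "org_id", "actor_user_id", "actor_service_account_id",
    "actor_role", "action", "target_type", "target_id",
    "result", "correlation_id", "metadata", "occurred_at",
)


def record_digest(row: Mapping[str, Any]) -> str:
    """SHA-256 over a canonical JSON array of the listed fields."""
    values = [row.get(name) for name in CANONICAL_FIELDS_V1]  # absent -> null
    canonical = json.dumps(
        values,
        sort_keys=True,         # normalizes key order inside metadata objects
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,
        allow_nan=False,        # NaN/Infinity are not valid JSON; raise instead
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Encoding the fields as a positional array keeps the input unambiguous: a value cannot migrate from one field to its neighbour without changing the digest.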

## Signing Key Custody

Signing key custody is a platform security responsibility, not a database
responsibility.

Baseline production:

1. signer runs as a platform-controlled workload with a scoped key identity;
2. signing key is stored in the approved secrets/KMS custody path;
3. signer can sign batch roots but cannot mutate historical audit rows;
4. database operators can query audit rows but cannot use the signing key;
5. key access and signer configuration changes produce audit and status
   evidence.

Regulated profile:

1. signing key should move to HSM/KMS custody appropriate for the selected
   regulated profile;
2. key policy changes require dual control;
3. key usage logs are exported independently from application logs;
4. verification tooling can validate historical signatures without write access
   to the production database.
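
The signed payload (sequence, previous hash, root hash, and environment) can be sketched as follows. HMAC-SHA256 from the Python standard library is used purely as a stand-in so the example runs without dependencies. It is a symmetric construction, so it does not satisfy the custody goals above; a real signer would call an asymmetric KMS/HSM-backed key so that verifiers hold only public material. The function names and payload layout are assumptions.

```python
import hashlib
import hmac
import json
from typing import Optional


def signing_payload(
    sequence_number: int,
    previous_batch_hash: Optional[str],
    batch_root_hash: str,
    environment_profile: str,
) -> bytes:
    """Canonical bytes covering the four fields the signature must bind."""
    return json.dumps(
        {
            "batch_root_hash": batch_root_hash,
            "environment_profile": environment_profile,
            "previous_batch_hash": previous_batch_hash,
            "sequence_number": sequence_number,
        },
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")


def sign_batch(key: bytes, payload: bytes) -> str:
    # Stand-in only: replace with an asymmetric KMS/HSM signing call.
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_batch(key: bytes, payload: bytes, signature: str) -> bool:
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sign_batch(key, payload), signature)
```

Binding the environment into the payload prevents a checkpoint signed for `staging` from being replayed as `production` evidence.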

## External Replication And WORM Retention

External replication should write audit batch manifests and optional row-digest
files to a store that sits outside the primary database's failure domain and
outside the control of its operators.

Executable gate: `scripts/ci/audit_worm_replication_gate.sh`.
Operator usage and evidence payload requirements are documented in
`doc/operations/Audit_WORM_Replication_Gate_Runbook.md`.

Baseline production target:

1. replicate signed batch manifests to object storage or a security data lake;
2. record replication URI and digest in platform evidence;
3. alert when replication lag exceeds the environment threshold;
4. maintain a documented restore/verification procedure.

WORM/Object Lock target:

1. use retention-enabled bucket/container policy;
2. separate object retention administrators from database administrators and
   signer operators;
3. record retention mode, retention-until date, and legal-hold state as
   evidence;
4. alert on retention policy changes, failed object lock application, delete
   attempts, replication failures, and unexpected gaps in batch sequence.

Do not claim WORM retention from normal object storage versioning alone. The
claim requires retention policy evidence and separation-of-duties evidence.
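
A gate over retention evidence might look like the sketch below. The `RetentionEvidence` shape, the example mode names, and the mapping to `pass`/`partial`/`fail` are assumptions for illustration; the actual evidence payload is defined by the runbook referenced above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Set


@dataclass(frozen=True)
class RetentionEvidence:
    retention_mode: Optional[str]     # e.g. "governance" or "compliance"; None if unset
    retain_until: Optional[datetime]  # timezone-aware retention deadline
    legal_hold: bool
    retention_admins: Set[str]
    database_admins: Set[str]
    signer_operators: Set[str]


def worm_claim_status(ev: RetentionEvidence, now: datetime) -> str:
    """Return "pass", "partial", or "fail" for a WORM retention claim."""
    if ev.retention_mode is None or ev.retain_until is None:
        return "fail"  # versioning alone, or no retention policy evidence
    if ev.retain_until <= now and not ev.legal_hold:
        return "fail"  # retention has expired and no legal hold applies
    overlap = ev.retention_admins & (ev.database_admins | ev.signer_operators)
    if not ev.retention_admins or overlap:
        return "partial"  # policy exists, separation of duties not evidenced
    return "pass"


# Usage with hypothetical principals.
evidence = RetentionEvidence(
    retention_mode="compliance",
    retain_until=datetime(2027, 1, 1, tzinfo=timezone.utc),
    legal_hold=False,
    retention_admins={"retention-admin"},
    database_admins={"dba"},
    signer_operators={"signer-op"},
)
status = worm_claim_status(evidence, datetime(2026, 6, 5, tzinfo=timezone.utc))
```

A principal who appears in both the retention and database administrator sets downgrades the result to `partial`, reflecting that a retention policy without separation-of-duties evidence does not support the WORM claim.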

## Alerting And Status Evidence

Status/Ops should expose these audit integrity signals:

| Signal | Failing or partial condition |
|---|---|
| `audit.write.coverage` | privileged mutation lacks audit evidence or correlation id. |
| `audit.batch.freshness` | latest batch is older than the environment threshold. |
| `audit.batch.continuity` | sequence gap, missing previous hash, or verifier mismatch. |
| `audit.signer.health` | signer unavailable, wrong key id, or signature verification failure. |
| `audit.replication.lag` | signed batch not replicated within threshold. |
| `audit.worm.retention` | retention policy missing, expired, or not independently evidenced. |
| `audit.control.change` | signer, retention, replication, or DB privilege changes without approved evidence. |

These signals should feed platform evidence bundles and release gates. For
production-impacting changes, missing audit integrity evidence should be
reported as `partial` or `blocked`, depending on the environment profile and
the declared claim.
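
As an illustration of how `audit.batch.continuity` and `audit.batch.freshness` could be evaluated, the sketch below walks an ordered batch stream. The `Batch` shape and function names are assumptions, and the check covers only sequence gaps and broken hash links; recomputing root hashes from audit rows (the verifier mismatch case) is left out.

```python
from datetime import datetime, timedelta
from typing import List, Optional, TypedDict


class Batch(TypedDict):
    sequence_number: int
    batch_root_hash: str
    previous_batch_hash: Optional[str]
    created_at: datetime


def continuity_findings(batches: List[Batch]) -> List[str]:
    """Check an ordered stream for sequence gaps and broken hash links."""
    findings = []
    for prev, cur in zip(batches, batches[1:]):
        if cur["sequence_number"] != prev["sequence_number"] + 1:
            findings.append(f"sequence gap after {prev['sequence_number']}")
        if cur["previous_batch_hash"] != prev["batch_root_hash"]:
            findings.append(f"hash link mismatch at {cur['sequence_number']}")
    return findings


def is_fresh(batches: List[Batch], now: datetime, threshold: timedelta) -> bool:
    """True when the latest batch is no older than the environment threshold."""
    return bool(batches) and now - batches[-1]["created_at"] <= threshold
```

An empty findings list together with a fresh latest batch would map to `pass`; any finding would map to `fail` or `partial` according to the environment profile.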

## Implementation Sequence

### Phase A: Baseline Production Honesty

Output:

1. this design document;
2. current-state docs use append-oriented language only;
3. platform evidence/status references audit integrity as a distinct evidence
   input;
4. Fairway follow-up tasks exist for implementation work.

Allowed claim after Phase A:

```text
Audit rows are append-oriented and linked into release evidence, but
cryptographic tamper evidence and WORM retention are not yet implemented.
```

### Phase B: Append-Only Database Guard

Output:

1. database trigger guard that blocks application-role and normal-session
   update/delete to `platform_audit_logs`;
2. controlled maintenance exception path through
   `gpuaas.allow_audit_log_mutation = 'maintenance'`;
3. test cleanup exception path through
   `gpuaas.allow_audit_log_mutation = 'test_cleanup'`;
4. CI/schema smoke test proving the guard exists;
5. optional live Postgres smoke test proving ordinary update/delete attempts
   fail and the explicit cleanup exception works.

This reduces accidental mutation risk but still does not create cryptographic
immutability.

### Phase C: Hash-Chain Batch Worker

Output:

1. additive checkpoint table or external manifest writer;
2. canonical digest implementation and tests;
3. verifier command/script;
4. status evidence for freshness and continuity.

Allowed claim after Phase C:

```text
Audit evidence is tamper-evident through hash-chained batches inside the
platform control plane.
```

### Phase D: Signed External Replication

Output:

1. signing key custody integration;
2. signed batch manifests;
3. external replication target;
4. replication lag and signer health alerts;
5. release evidence attachment.

Allowed claim after Phase D:

```text
Audit batch checkpoints are signed and replicated outside the primary database.
```

### Phase E: WORM/Object Lock Retention

Output:

1. retention-enabled storage profile;
2. legal hold and lifecycle procedure;
3. separation-of-duties evidence;
4. retention and control-change alerts.

Allowed claim after Phase E:

```text
Signed audit checkpoints are retained under a documented WORM/Object Lock
profile for the selected environment.
```

### Phase F: Regulated Profile Hardening

Output:

1. HSM/KMS/FIPS custody decision and evidence;
2. independent verifier package;
3. regulated retention matrix;
4. compliance evidence review workflow.

This phase is required before FedRAMP/FIPS/HIPAA/PCI-regulated immutability
claims.

## Fairway Follow-Ups

| Task | Purpose |
|---|---|
| `SEC-ARCH-AUDIT-APPEND-ONLY-DB-GUARD-001` | Add DB/schema/CI guard evidence that application roles cannot update or delete audit rows. |
| `SEC-ARCH-AUDIT-HASH-CHAIN-BATCH-WORKER-001` | Implement additive hash-chained audit batch manifests, canonical digests, and verifier evidence. |
| `SEC-ARCH-AUDIT-WORM-REPLICATION-GATE-001` | Add signed external replication and WORM/Object Lock readiness gate evidence. |

## Review Requirements

This design requires security, architecture, backend, ops, and governance
review before it is used as a production-readiness claim source. The document is
safe to use immediately as a statement of current-state non-claims and as an
implementation roadmap.
