guildhouse-spire-plugins/SECURITY-AUDIT.md

# Security & Integrity Audit: guildhouse-spire-plugins

**Date:** 2026-02-18
**Commit:** eb9edf5
**Repository:** guildhouse-cooperative/guildhouse-spire-plugins
**Scope:** OIDC token handling, SSH certificate generation, Shellstream extensions, governance integration, key management, configuration/deployment, test fixtures, specification security review
**License:** Apache 2.0
**Build Status:** UNVERIFIABLE (Go toolchain not available on audit machine)

---

## Attacker Profiles

| ID | Profile | Capabilities |
|----|---------|-------------|
| A1 | Malicious Operator | Valid platform credentials, admin-level Keycloak access, can create intents and ceremonies, can configure SPIRE plugins |
| A2 | Network Attacker | Can intercept, replay, or DoS gRPC traffic between SPIRE Server and Quartermaster services; no valid credentials |
| A3 | Compromised Workload | Has a valid SPIFFE SVID, can call the Workload API, can present crafted OIDC tokens; confined to a single pod/namespace |
| A4 | Rogue Plugin | Binary substituted in the SPIRE plugin path (`/opt/spire/plugins/`); full access to plugin IPC channel |
| A5 | Insider with Merkle Access | Read/write access to NotaryService data; can query, replay, or forge merkle proofs and anchors |

## Severity Scale

| Level | Definition |
|-------|-----------|
| CRITICAL | Exploitable in current code/design; leads to authentication bypass, privilege escalation, or complete audit trail compromise |
| HIGH | Exploitable with moderate effort or upon implementation; leads to significant security degradation |
| MEDIUM | Requires specific conditions or affects defense-in-depth; leads to partial security degradation |
| LOW | Minor issue; limited exploitability or impact |
| INFORMATIONAL | Observation; no direct security impact but worth noting |
| DESIGN-NOTE | Architectural decision with security implications; not a vulnerability per se |

---

## Executive Summary

This audit identified **20 findings**: 3 CRITICAL, 6 HIGH, 5 MEDIUM, 2 LOW, 1 INFORMATIONAL, and 3 DESIGN-NOTE.

**Top 3 most impactful findings:**

1. **S-01 (CRITICAL):** `CertRequest` allows requester-controlled TTL and principals with no server-side enforcement of the spec's 1-hour maximum. A compromised workload can request arbitrarily long-lived certificates with unrestricted principal lists.

2. **S-02 (CRITICAL):** The `Verifier` interface does not contractually require audience validation, enabling token confusion attacks where an OIDC token intended for one service is accepted by another.

3. **S-03 (CRITICAL):** Merkle proofs embedded in SSH certificates are not cryptographically bound to the certificate content, allowing proof replay across certificates to forge governance provenance.

**Security posture:** The architectural design is sound — fail-closed governance authorization, short-lived certificates, domain-separated hashing, and JCS canonicalization are strong foundations. However, the interface designs in `pkg/oidc` and `pkg/sshcert` contain structural flaws that, if carried into implementation, would create exploitable vulnerabilities. The `pkg/shellstream` implementation is the most mature and has good validation, though it lacks input size limits on the decode path.

**Key architectural strengths:**
- GovernanceService fail-closed policy (spec Section 10.1) — credential operations cannot bypass authorization
- Short-lived SSH certificates (5m default, 1h max per spec) — limits blast radius of credential compromise
- Domain-separated SHA-256 hashing (`guildhouse.credential.v1:` prefix) — prevents cross-protocol collisions
- JCS (RFC 8785) canonicalization — ensures deterministic merkle leaf construction
- Append-only merkle chain with `previous_root` linkage — tamper-evident audit trail

---

## Trust Boundary Diagram

```
                                    TRUST BOUNDARY: Cluster Network
                                    ================================
 ┌─────────────┐                                                          ┌──────────────────────┐
 │  Workload   │──[Unix Socket]──▶┌──────────────┐                       │  GovernanceService   │
 │  (Pod)      │  Workload API    │  SPIRE Agent │──[gRPC/mTLS]──▶      │  (Quartermaster)     │
 │             │                  │              │  SPIRE Server API     │  :50051              │
 │  OIDC Token │                  │  Attestors:  │      │               └──────────────────────┘
 │  (injected) │                  │  - k8s_psat  │      │                         ▲
 └─────────────┘                  │  - guildhouse│      ▼                         │
       │                          │    _oidc     │┌──────────────┐                │ [gRPC/mTLS]
       │                          └──────────────┘│ SPIRE Server │                │
       │                                          │              │  ┌──────────────────────┐
       │ TRUST BOUNDARY:                          │  Plugins:    │  │  CeremonyService     │
       │ Pod Filesystem                           │  ─────────── │  │  (Bascule)           │
       │                                          │  KeyManager  │──│  :50052              │
       ▼                                          │  (substrate) │  └──────────────────────┘
 ┌─────────────┐                                  │              │            │
 │ token_path  │                                  │  Credential  │            │ [gRPC/mTLS]
 │ /var/run/   │                                  │  Composer    │            ▼
 │ secrets/    │                                  │  (ssh)       │  ┌──────────────────────┐
 │ oidc/token  │                                  │              │  │  NotaryService       │
 └─────────────┘                                  │  Notifier    │──│  (Quartermaster)     │
                                                  │  (governance)│  │  :50051              │
                                                  └──────────────┘  └──────────────────────┘
                                                         │
                                                         │ [SSH Certificate]
                                                         ▼
                                                  ┌──────────────┐
                                                  │  SSH Server   │
                                                  │  (validates   │
                                                  │   cert + ext) │
                                                  └──────────────┘

Trust Boundaries:
  TB-1: Workload ↔ SPIRE Agent       — Unix domain socket, kernel PID attestation
  TB-2: SPIRE Agent ↔ SPIRE Server   — gRPC with mTLS (X.509-SVIDs)
  TB-3: SPIRE Server ↔ Governance    — gRPC, mTLS required (TODO in code)
  TB-4: SPIRE Server ↔ Ceremony      — gRPC, mTLS required (TODO in code)
  TB-5: SPIRE Server ↔ Notary        — gRPC, mTLS required (TODO in code)
  TB-6: Workload ↔ SSH Server        — SSH protocol, certificate-based auth
  TB-7: Pod filesystem ↔ Workload    — OIDC token file, filesystem permissions
```

---

## Threat Analysis Matrix

| Finding | Severity | Attacker | Scope Area | Confidentiality | Integrity | Availability |
|---------|----------|----------|------------|-----------------|-----------|--------------|
| S-01 | CRITICAL | A1, A3 | SSH Cert Generation | Low | **High** | Low |
| S-02 | CRITICAL | A3 | OIDC Token Handling | **High** | **High** | Low |
| S-03 | CRITICAL | A5 | Spec Security | Low | **High** | Low |
| S-04 | HIGH | A2 | Spec Security | Low | **High** | Medium |
| S-05 | HIGH | A2, A3 | Shellstream Extensions | Low | Low | **High** |
| S-06 | HIGH | A1 | Governance Integration | Low | **High** | Low |
| S-07 | HIGH | A1 | Configuration | Low | Low | **High** |
| S-08 | HIGH | A1 | OIDC Token Handling | **High** | **High** | Low |
| S-09 | HIGH | A4 | Key Management | **High** | **High** | **High** |
| S-10 | MEDIUM | A1, A3 | Governance Integration | Low | Medium | Low |
| S-11 | MEDIUM | A3 | SSH Cert Generation | Low | Medium | Low |
| S-12 | MEDIUM | A1 | Configuration | Low | Low | Medium |
| S-13 | MEDIUM | A1, A5 | Shellstream Extensions | Low | Medium | Low |
| S-14 | MEDIUM | A1, A3 | Test Fixtures | Medium | Medium | Low |
| S-15 | DESIGN-NOTE | A2 | Spec Security | Low | Medium | Low |
| S-16 | LOW | A1 | Spec Security | Low | Medium | Low |
| S-17 | LOW | A5 | Spec Security | Low | Medium | Low |
| S-18 | INFORMATIONAL | A1 | Governance Integration | Low | Medium | Low |
| S-19 | DESIGN-NOTE | A2 | Key Management | Medium | Medium | Low |
| S-20 | DESIGN-NOTE | — | Governance Integration | Low | Low | Low |

---

## Findings

### CRITICAL

#### S-01: CertRequest Allows Requester-Controlled TTL and Principals

**Severity:** CRITICAL
**Attacker Profile:** A1 (Malicious Operator), A3 (Compromised Workload)
**Scope Area:** SSH Certificate Generation
**Location:** `pkg/sshcert/sshcert.go:18-29`

**Description:**

The `CertRequest` struct exposes two security-critical fields as requester-settable values:

```go
type CertRequest struct {
    SpiffeID     string
    Extensions   *shellstream.ShellstreamExtensions
    ValidSeconds uint64      // Requester-controlled
    Principals   []string    // Requester-controlled
}
```

The SSH-SVID specification (`specs/spiffe-ssh-svid.md`, Section 7.1, lines 390-396) mandates:
- Maximum TTL: 1 hour (MUST NOT exceed)
- Minimum TTL: 30 seconds (MUST NOT issue below)
- Default TTL: 5 minutes (RECOMMENDED)

The `Build()` method (line 47) performs no TTL bounds checking and no principal validation. The `Principals` field allows arbitrary SSH principals beyond the SPIFFE ID, which could grant access to hosts or accounts the workload is not authorized to reach.

**Attack Scenario:**
1. A3 obtains a valid SPIFFE SVID through legitimate attestation.
2. A3 crafts a `CertRequest` with `ValidSeconds: 31536000` (1 year) and `Principals: ["root", "admin", "deploy"]`.
3. The SSH Credential Composer builds a certificate with a 1-year lifetime and root/admin principals.
4. A3 uses this certificate for persistent, privileged SSH access across the infrastructure.

**Impact:** Privilege escalation (unrestricted principals), credential persistence beyond intended lifetime (up to max uint64 seconds). Violates the short-lived credential model that is the primary security guarantee of the SSH-SVID design.

**Recommendation:**
- Add `MaxValidSeconds`, `MinValidSeconds`, and `AllowedPrincipals` to the `Builder` config, enforced server-side.
- `Build()` must clamp `ValidSeconds` to `[MinValidSeconds, MaxValidSeconds]` and reject principals not in the allow list.
- The SPIFFE ID should be the *only* principal unless explicitly configured otherwise in the SPIRE registration entry.

---

#### S-02: Verifier Interface Lacks Audience Parameter

**Severity:** CRITICAL
**Attacker Profile:** A3 (Compromised Workload)
**Scope Area:** OIDC Token Handling
**Location:** `pkg/oidc/oidc.go:31-34`

**Description:**

The `Verifier` interface defines:

```go
type Verifier interface {
    Verify(ctx context.Context, rawToken string) (*Claims, error)
}
```

The `Config` struct (line 10-18) includes `Audience string`, but this field is never referenced in the `Verifier` interface contract. An implementation of `Verifier` has no obligation to check the audience claim. The `Claims` struct (line 22-28) includes `Audience []string` in the output, but this is informational — no enforcement occurs.

Additionally:
- No nonce parameter exists (replay protection)
- No JTI (JWT ID) tracking exists (token reuse detection)
- `Config.Audience` is not validated as required in `NewVerifier` (line 37-43)

**Attack Scenario:**
1. A3 is a workload in namespace `tenant-b` with a valid OIDC token whose `aud` claim is `["api-gateway"]`.
2. A3 presents this token to the SPIRE OIDC attestor, which calls `Verify(ctx, token)`.
3. Because audience validation is not part of the interface contract, the implementation may accept the token despite the audience mismatch.
4. A3 obtains SPIRE selectors (`oidc:sub:...`) and gains an SVID for a workload identity it should not have.

**Impact:** Token confusion / cross-service authentication bypass. A token issued for one relying party is accepted by another. This is a fundamental OIDC security control.

**Recommendation:**
- Change the interface to `Verify(ctx context.Context, rawToken string, expectedAudience string) (*Claims, error)` to make audience validation contractually required.
- Add `audience` as a required parameter that cannot be empty.
- Implement JTI tracking with a TTL-bounded cache to prevent token replay.

---

#### S-03: Merkle Proof Not Bound to Specific Credential Content

**Severity:** CRITICAL
**Attacker Profile:** A5 (Insider with Merkle Access)
**Scope Area:** Specification Security Review
**Location:** `specs/credential-governance.md:589-599`, `pkg/shellstream/shellstream.go:69-70`

**Description:**

The Shellstream `merkle-proof@guildhouse.dev` extension carries a binary inclusion proof, and `merkle-root@guildhouse.dev` carries the root hash. Together they prove that *some* leaf was included in the governance merkle tree. However, there is no cryptographic binding between the proof and the specific SSH certificate that carries it.

The merkle leaf is `SHA-256(JCS(MutationEnvelope))` (spec Section 7.4, line 360-361). The MutationEnvelope includes `payload_hash`, `actor_svid`, `tenant_id`, `event_type`, and `intent_id` — but it does NOT include the SSH certificate's serial number, public key fingerprint, or any certificate-specific identifier.

**Attack Scenario:**
1. A5 observes a legitimately issued certificate C1 with a valid merkle proof P1.
2. A5 extracts P1 and the corresponding merkle root R1.
3. A5 creates a new certificate C2 (possibly with different principals or TTL) and embeds P1 and R1 as Shellstream extensions.
4. A verifier calls `NotaryService.VerifyInclusion(R1, leaf_from_C1, P1)` — this succeeds because P1 proves inclusion of C1's leaf, which is still valid.
5. The verifier concludes C2 has valid governance provenance, but C2 was never actually governed.

**Impact:** Complete governance audit trail bypass. Certificates can be minted with fabricated governance provenance.

**Recommendation:**
- The MutationEnvelope MUST include a credential-binding field: the SHA-256 hash of the SSH certificate's public key (or serial number). This binds the merkle leaf to a specific certificate.
- The verifier MUST recompute the leaf hash from the certificate's own content and verify that the proof matches *this specific certificate*, not just any historical leaf.
- Add the credential binding field to `specs/credential-governance.md` Section 7.3 and the `MutationEnvelope` schema.

---

### HIGH

#### S-04: NotaryService Fail-Open Creates Unaudited Credential Window

**Severity:** HIGH
**Attacker Profile:** A2 (Network Attacker)
**Scope Area:** Specification Security Review
**Location:** `specs/credential-governance.md:661-671`

**Description:**

The specification defines that when the NotaryService is unreachable after a credential operation completes, the credential "MUST proceed" (line 665). This is a fail-open design for the audit channel. The spec requires local queueing and retry (lines 666-668), but the current codebase has no implementation of this queue.

A network attacker who can disrupt connectivity to the NotaryService (TB-5 in the trust boundary diagram) can cause all credential operations to proceed without merkle anchoring. The credentials are valid and usable, but invisible to the audit trail.

**Attack Scenario:**
1. A2 identifies the NotaryService endpoint (`notary.quartermaster.svc.cluster.local:50051` per deploy config).
2. A2 initiates a sustained DoS against the NotaryService or disrupts the network path.
3. A1 (colluding or independently) issues credentials during the outage.
4. Credentials are issued with valid governance authorization (GovernanceService is fail-closed and still reachable) but no audit trail.
5. When the NotaryService recovers, the local queue (if implemented) would eventually anchor the events — but if the queue is lost (pod restart), the events are permanently unaudited.

**Impact:** Audit gap during NotaryService outage. If combined with local queue loss, permanent audit gap for credential operations during the window.

**Recommendation:**
- Implement the local durable queue as specified (on-disk WAL or embedded database).
- Add a monitoring metric `governance_anchoring_pending_total` as specified in the spec (line 669).
- Consider adding a "maximum anchoring delay" after which credentials should be flagged as requiring re-verification.
- Document the durable queue implementation as a MUST in the implementation guide.

---

#### S-05: No Size Limit on Merkle-Proof Base64 Decode

**Severity:** HIGH
**Attacker Profile:** A2 (Network Attacker), A3 (Compromised Workload)
**Scope Area:** Shellstream Extensions
**Location:** `pkg/shellstream/shellstream.go:202-208`

**Description:**

The `Decode()` function performs base64 decoding on the `merkle-proof@guildhouse.dev` extension value with no size limit:

```go
if v, ok := extensions[ExtMerkleProof]; ok {
    proof, err := base64.StdEncoding.DecodeString(v)
    if err != nil {
        return nil, fmt.Errorf("shellstream: decode merkle-proof: %w", err)
    }
    ext.MerkleProof = proof
}
```

The Shellstream spec (Section 9.4, lines 439-445) recommends a 4 KB total extension size limit, but the code enforces nothing. A crafted SSH certificate with a multi-megabyte `merkle-proof` value causes unbounded memory allocation when any service calls `Decode()`.

The merkle proof should be at most 8 × 32 = 256 bytes (8 levels × 32-byte SHA-256 hashes) based on the 256-leaf depth limit (spec Section 6.8). In base64, this is ~344 characters.

**Attack Scenario:**
1. A3 crafts an SSH certificate with a `merkle-proof@guildhouse.dev` extension containing 100 MB of base64-encoded data.
2. Any service that decodes Shellstream extensions (SSH server middleware, audit verifier, monitoring) calls `Decode()`.
3. The base64 decode allocates ~75 MB of memory per call. Repeated calls exhaust server memory.

**Impact:** Denial of service against any component that processes Shellstream extensions.

**Recommendation:**
- Add a size check before decoding: `if len(v) > 512 { return error }` (344 base64 chars for max proof, with margin).
- Consider adding a total extension payload size check at the start of `Decode()` to enforce the 4 KB spec recommendation.

---

#### S-06: TOCTOU Between Intent Creation and Credential Issuance

**Severity:** HIGH
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** Governance Integration
**Location:** `specs/credential-governance.md:30`, `pkg/governance/governance.go:53-56`

**Description:**

The governance flow is:
1. `CreateIntent` → Accord policy evaluation → intent authorized or ceremony required
2. (Optional) Ceremony approval
3. `RedeemIntent` → SAT issued
4. Credential operation executed with SAT

Between step 1 and step 3, an arbitrary amount of time may pass (minutes for ceremony approval, up to `ttl_seconds` of the intent). During this window, Accord policy may change. The SAT issued at step 3 reflects the policy evaluation at step 1.

**Attack Scenario:**
1. A1 creates an intent for a credential operation that is currently classified as `Autonomous` (no approval needed).
2. A1 (or another admin) updates the Accord policy to reclassify this operation as `QuorumApproval`.
3. A1 redeems the intent. The GovernanceService issues a SAT based on the original `Autonomous` classification.
4. The credential is issued without the now-required quorum approval.

**Impact:** Policy bypass during policy transition windows. The window size equals `min(intent_ttl, ceremony_timeout)`.

**Recommendation:**
- Re-evaluate Accord policy at `RedeemIntent` time, not just at `CreateIntent` time. If the classification has escalated, the intent should transition to the new ceremony requirement.
- Alternatively, add a `policy_version` field to the intent and check at redemption that the policy version hasn't changed.
- Document this behavior explicitly in the spec's error handling section.

---

#### S-07: GovernanceEpochSeconds Defaults to Zero

**Severity:** HIGH
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** Configuration & Deployment
**Location:** `pkg/config/config.go:30`

**Description:**

```go
GovernanceEpochSeconds int `hcl:"governance_epoch_seconds"`
```

Go initializes `int` to `0`. The `Validate()` method (lines 34-38) only checks `TrustDomain`. If `governance_epoch_seconds` is omitted from configuration, it defaults to 0, which could mean:
- Division by zero in epoch boundary calculations
- Continuous anchoring on every event (resource exhaustion)
- No anchoring at all (implementation-dependent interpretation of epoch=0)

The field comment (line 27-29) documents a default of 300 seconds, but this default is not enforced in code.

**Attack Scenario:**
1. A1 deploys a SPIRE server with a config file that omits `governance_epoch_seconds`.
2. The plugin starts with epoch=0.
3. Depending on implementation: epoch boundary triggers on every event (DoS on NotaryService) or never triggers (no anchoring, audit gap).

**Impact:** Denial of service or silent audit trail failure due to misconfiguration.

**Recommendation:**
- In `Validate()`, check `GovernanceEpochSeconds > 0` and set a default of 300 if unset.
- Add validation for all required fields: `GovernanceAddr`, `ClusterID`, `GovernanceEpochSeconds`.
- Consider using a pointer type (`*int`) to distinguish "not set" from "set to 0".

---

#### S-08: OIDC Config Lacks URL Validation for Issuer

**Severity:** HIGH
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** OIDC Token Handling
**Location:** `pkg/oidc/oidc.go:37-40`

**Description:**

```go
func NewVerifier(cfg Config) (Verifier, error) {
    if cfg.Issuer == "" {
        return nil, fmt.Errorf("oidc: issuer is required")
    }
    // TODO: implement
}
```

The issuer field is validated only for non-emptiness. OIDC discovery (`/.well-known/openid-configuration`) fetches the JWKS endpoint from the issuer URL. If the issuer is a `file://` URL, `http://` URL (no TLS), or points to an attacker-controlled domain, the JWKS keys are attacker-controlled.

Additionally, `Config.Audience` (line 15) is not validated as required, and `Config.JWKSURL` (line 18) could override discovery with an arbitrary URL.

**Attack Scenario:**
1. A1 configures the OIDC attestor with `issuer: "http://attacker.example.com"`.
2. The attestor fetches `http://attacker.example.com/.well-known/openid-configuration`.
3. The attacker's OIDC discovery document points to their own JWKS endpoint.
4. The attacker can now sign any OIDC token that will be accepted by the attestor.
5. Workloads in the cluster are attested with attacker-controlled identities.

**Impact:** Complete OIDC authentication bypass via misconfigured or malicious issuer URL.

**Recommendation:**
- Validate that `Issuer` is a well-formed `https://` URL. Reject `http://`, `file://`, and non-URL strings.
- Validate that `Audience` is non-empty.
- If `JWKSURL` is set, validate it is also `https://`.
- Consider pinning the expected issuer to a known domain pattern.

---

#### S-09: No go-plugin Serving Pattern in Plugin Binaries

**Severity:** HIGH
**Attacker Profile:** A4 (Rogue Plugin)
**Scope Area:** Key Management
**Location:** `cmd/substrate-keymanager/main.go`, `cmd/ssh-credential-composer/main.go`, `cmd/governance-notifier/main.go`, `cmd/oidc-attestor/main.go` (all contain `func main() { os.Exit(1) }`)

**Description:**

SPIRE external plugins use HashiCorp's `go-plugin` framework, which includes a magic cookie handshake to verify that the plugin binary was launched by the expected host process. Without this handshake:
- Any binary placed at the plugin path will be loaded by SPIRE Server.
- There is no mutual authentication between SPIRE Server and the plugin binary.
- The plugin stub `plugin.go` files define struct types but have no `plugin.Serve()` call.

The deploy config (`deploy/spire-server-config.yaml`, lines 39, 50, 60) specifies absolute plugin paths under `/opt/spire/plugins/`. If an attacker can write to this path (container escape, misconfigured volume mount, supply chain compromise), they can substitute a rogue binary.

**Attack Scenario:**
1. A4 gains write access to `/opt/spire/plugins/` (e.g., via a writable `hostPath` volume mount in Kubernetes).
2. A4 replaces `substrate-keymanager` with a rogue binary that implements the KeyManager interface.
3. SPIRE Server loads the rogue binary on next restart.
4. The rogue KeyManager has access to all signing keys and can issue arbitrary SVIDs.

**Impact:** Complete CA key compromise, arbitrary SVID issuance, full trust domain takeover.

**Recommendation:**
- Implement the `go-plugin` serving pattern with the SPIRE-defined handshake config in all `cmd/*/main.go`.
- Use plugin binary checksums in the SPIRE server config (SPIRE supports `plugin_checksum`).
- Ensure plugin paths are on read-only filesystems in production.

---

### MEDIUM

#### S-10: RedeemResult Missing ExpiresAt for SAT TTL Enforcement

**Severity:** MEDIUM
**Attacker Profile:** A1 (Malicious Operator), A3 (Compromised Workload)
**Scope Area:** Governance Integration
**Location:** `pkg/governance/governance.go:31-36`

**Description:**

```go
type RedeemResult struct {
    Success bool
    SatHash []byte
    Status  string
    Error   string
}
```

The proto definition (`proto/quartermaster/v1/governance.proto:77-85`) includes `expires_at` in `SatToken`, but the Go `RedeemResult` struct does not surface this field. Code consuming `RedeemResult` has no way to check whether the SAT has expired.

**Attack Scenario:**
1. A credential composer redeems an intent and receives a SAT with a 5-minute TTL.
2. Due to processing delay or retry logic, the SAT expires before the credential is issued.
3. The composer cannot detect the expiration because `RedeemResult` lacks `ExpiresAt`.
4. The credential is issued with an expired SAT hash embedded in the `sat-hash@guildhouse.dev` extension.

**Impact:** Credentials issued with expired authorization tokens. Downstream verifiers that check SAT validity would reject these credentials.

**Recommendation:**
- Add `ExpiresAt time.Time` to `RedeemResult`.
- Check `time.Now().Before(result.ExpiresAt)` before proceeding with credential issuance.
- Add `SatBytes []byte` to enable downstream SAT verification.

---

#### S-11: Roles Decoded Without Validation in Decode Path

**Severity:** MEDIUM
**Attacker Profile:** A3 (Compromised Workload)
**Scope Area:** SSH Certificate Generation
**Location:** `pkg/shellstream/shellstream.go:159-161`

**Description:**

```go
if v, ok := extensions[ExtRoles]; ok && v != "" {
    ext.Roles = strings.Split(v, ",")
}
```

`strings.Split("admin,,engineer", ",")` produces `["admin", "", "engineer"]`. The empty string is silently included in the roles list. While `Validate()` (lines 247-254) catches this, `Decode()` and `Validate()` are separate functions. Any code path that calls `Decode()` without subsequently calling `Validate()` operates on unvalidated role data.

Similarly, `strings.Split("admin, engineer", ",")` produces `["admin", " engineer"]` — the leading space is preserved, which could cause authorization mismatches if roles are compared by exact string equality.

**Attack Scenario:**
1. A3 presents an SSH certificate with `roles@guildhouse.dev: "admin,,,"`.
2. A server-side authorization check calls `Decode()` but not `Validate()`.
3. The roles list is `["admin", "", "", ""]` — the empty strings may match wildcard or default role rules.

**Impact:** Authorization bypass if roles are consumed after `Decode()` without `Validate()`.

**Recommendation:**
- Filter empty strings and trim whitespace in `Decode()` itself, not just in `Validate()`.
- Alternatively, make `Decode()` call `Validate()` internally and return validated data only.
- Document that `Decode()` output MUST be validated before use.

---

#### S-12: Config.Validate() Incomplete — Only Checks TrustDomain

**Severity:** MEDIUM
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** Configuration & Deployment
**Location:** `pkg/config/config.go:34-38`

**Description:**

```go
func (c *PluginConfig) Validate() error {
    if c.TrustDomain == "" {
        return fmt.Errorf("config: trust_domain is required")
    }
    return nil
}
```

The `PluginConfig` struct has 6 fields (lines 10-30) but only `TrustDomain` is validated. Fields `GovernanceAddr`, `CeremonyAddr`, `NotaryAddr`, `ClusterID`, and `GovernanceEpochSeconds` can all be empty or zero without triggering a validation error.

**Impact:** Plugins start with incomplete configuration and fail at runtime when the first governance call is made, potentially after having already partially processed credentials.

**Recommendation:**
- Validate all required fields: `GovernanceAddr` (non-empty, valid address format), `ClusterID` (non-empty), `GovernanceEpochSeconds` (> 0).
- `CeremonyAddr` and `NotaryAddr` should be validated when the corresponding features are enabled.

---

#### S-13: GovernanceIntent Only Format-Validated, Not Existence-Validated

**Severity:** MEDIUM
**Attacker Profile:** A1 (Malicious Operator), A5 (Insider with Merkle Access)
**Scope Area:** Shellstream Extensions
**Location:** `pkg/shellstream/shellstream.go:331-335`

**Description:**

```go
if ext.GovernanceIntent != "" {
    if !isValidUUID(ext.GovernanceIntent) {
        return fmt.Errorf("shellstream: governance-intent is not a valid UUID: %q", ext.GovernanceIntent)
    }
}
```

The `governance-intent@guildhouse.dev` extension is validated as a well-formed UUID but there is no mechanism to verify the intent actually exists in the GovernanceService or that it corresponds to the current credential operation.

**Attack Scenario:**
1. A1 crafts a certificate with `governance-intent@guildhouse.dev: "00000000-0000-0000-0000-000000000000"`.
2. The UUID is syntactically valid, so `Validate()` passes.
3. A verifier sees the governance-intent field and assumes the credential went through governance.
4. But the intent ID is fabricated — no such intent exists in the GovernanceService.

**Impact:** False governance provenance. Certificates appear governed but were never actually subject to governance authorization.

**Recommendation:**
- The SSH Credential Composer (at issuance time) must set `governance-intent` from the actual `CreateIntentResponse.intent_id`, not from external input.
- Verifiers should call `GovernanceService.ListIntents` with the intent ID to confirm existence.
- Document that `governance-intent` is server-set and MUST NOT be accepted from external sources.

---

#### S-14: Sample OIDC Token Contains tenant_id Custom Claim

**Severity:** MEDIUM
**Attacker Profile:** A1 (Malicious Operator), A3 (Compromised Workload)
**Scope Area:** Test Fixtures
**Location:** `test/fixtures/sample-oidc-token.json:21`

**Description:**

```json
{
  "tenant_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

The OIDC token fixture includes `tenant_id` as a top-level custom claim. The deploy config references Keycloak (`https://keycloak.guildhouse.example.org/realms/platform`), which is self-managed. In Keycloak, custom claims are configurable via protocol mappers and, depending on configuration, may be user-editable (e.g., through a user attribute mapper).

If the OIDC attestor uses `tenant_id` from the token to determine tenant context (rather than deriving it server-side from the workload's namespace or registration entry), a user who can control their Keycloak attributes can claim any tenant.

**Attack Scenario:**
1. A3 is a workload in `tenant-alpha` that has access to modify its own Keycloak user profile.
2. A3 changes its `tenant_id` user attribute to `tenant-beta`'s UUID.
3. Keycloak issues a token with `"tenant_id": "<tenant-beta-uuid>"`.
4. The OIDC attestor extracts `tenant_id` from the token and provides selectors indicating `tenant-beta`.
5. A3 obtains an SVID with tenant-beta's context.

**Impact:** Tenant boundary bypass if `tenant_id` is sourced from the OIDC token.

**Recommendation:**
- Do NOT use `tenant_id` from OIDC tokens for authorization decisions. Derive tenant context from the workload's SPIFFE ID path or Kubernetes namespace.
- If `tenant_id` must come from OIDC, use a Keycloak hardcoded claim mapper (not user attribute mapper) and restrict mapper editing to admin-only.
- Document this trust assumption explicitly.

---

### LOW

#### S-15: DESIGN-NOTE — Governance-Optional Model (Fail-Closed Auth, Fail-Open Audit)

**Severity:** DESIGN-NOTE
**Attacker Profile:** A2 (Network Attacker)
**Scope Area:** Specification Security Review
**Location:** `specs/credential-governance.md:642-671`

**Description:**

The architecture uses two different failure modes:
- **GovernanceService unreachable** (Section 10.1, line 644): Fail-CLOSED. Credential operations MUST fail. Authorization is mandatory.
- **NotaryService unreachable** (Section 10.3, line 661): Fail-OPEN. Credential operations MUST proceed. Audit is best-effort.

This is a deliberate, documented design tradeoff. The rationale (line 671) is that a temporary audit gap is preferable to failing a legitimately authorized credential operation.

**Security Implication:** An attacker who can selectively disrupt NotaryService connectivity creates a window where credential operations proceed without audit trail entries. This is acceptable in the threat model if:
1. The local durable queue (spec line 666) is implemented correctly.
2. The queue survives pod restarts.
3. Monitoring alerts on `governance_anchoring_pending_total` (spec line 669).

**Recommendation:** No design change needed, but operators must be aware that NotaryService availability directly affects audit completeness. Document this prominently in the deployment guide.

---

#### S-16: Emergency Break-Glass Post-Hoc Approval Window Undefined

**Severity:** LOW
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** Specification Security Review
**Location:** `specs/credential-governance.md:535-543`

**Description:**

The Accord policy emergency section specifies:

```yaml
emergency:
  classification: EmergencyBreakGlass
  post_hoc_approval_window_hours: 24
```

The spec defines that emergency break-glass ceremonies allow credential operations to proceed before approval, with post-hoc approval required within 24 hours. However, the spec does not define:
- What happens if post-hoc approval is not obtained within the window.
- Whether the credential is automatically revoked.
- Whether subsequent uses of the credential are blocked.
- Whether an alert is generated.

**Impact:** Emergency credentials could persist indefinitely without retrospective approval. A malicious operator could use emergency break-glass repeatedly as a governance bypass mechanism.

**Recommendation:**
- Specify in `specs/credential-governance.md` that credentials issued via EmergencyBreakGlass MUST be automatically revoked if post-hoc approval is not obtained within the window.
- Add a required `escalation_action` field (e.g., `revoke`, `alert`, `suspend`) to the emergency policy schema.

---

#### S-17: Merkle Chain Fork Detection Not Specified

**Severity:** LOW
**Attacker Profile:** A5 (Insider with Merkle Access)
**Scope Area:** Specification Security Review
**Location:** `specs/credential-governance.md:624-630`

**Description:**

The audit chain uses `previous_root` linkage:

> "Each anchor's `previous_root` field MUST reference the `merkle_root` of the immediately preceding anchor." (line 625)

This detects history rewriting (changing an existing anchor) but does not prevent forks: two anchors could both claim the same `previous_root`, creating a split-brain audit trail. The spec does not define:
- A canonical sequence number for anchors.
- A consensus mechanism for anchor creation.
- Fork detection or resolution procedures.

The `CreateAnchorRequest` proto (`proto/quartermaster/v1/notary.proto:18-22`) includes `etcd_revision` which could serve as a sequence number, but its use is not specified (`0 means not set`).

**Impact:** If the NotaryService has multiple writers (e.g., multiple SPIRE servers in an HA deployment), concurrent anchor creation could fork the chain. Each fork is internally consistent but they diverge, making cross-fork audit queries unreliable.

**Recommendation:**
- Specify that `etcd_revision` (or an equivalent monotonic counter) MUST be set and MUST be unique per anchor.
- The NotaryService MUST reject `CreateAnchor` requests where `previous_root` does not match the current latest anchor's `merkle_root` (optimistic concurrency control).
- Document fork detection: auditors should verify that no two anchors share the same `previous_root`.

---

### INFORMATIONAL

#### S-18: CreateIntentRequest identity_claim Oneof Allows Unverifiable External Events

**Severity:** INFORMATIONAL
**Attacker Profile:** A1 (Malicious Operator)
**Scope Area:** Governance Integration
**Location:** `proto/quartermaster/v1/governance.proto:35-38`

**Description:**

```protobuf
oneof identity_claim {
    string oidc_token = 5;
    ExternalEventClaim external_event = 6;
}
```

The `ExternalEventClaim` message (lines 45-50) includes a `verification` string field with no defined verification protocol:

```protobuf
message ExternalEventClaim {
    string source = 1;
    string event_id = 2;
    string event_type = 3;
    string verification = 4;
}
```

The GovernanceService must decide whether to trust an external event claim, but there is no standard for what `verification` should contain (HMAC, signature, URL to verify, etc.).

**Impact:** If the GovernanceService accepts external event claims without strong verification, a malicious operator can forge identity claims to create intents on behalf of other actors.

**Recommendation:**
- Define a verification protocol for external event claims (e.g., HMAC with a shared secret, or a URL that returns verification status).
- Add a `verification_type` field to `ExternalEventClaim` to disambiguate verification methods.
- Consider requiring that external event claims always trigger at least `SingleApproval` ceremony classification.

---

### DESIGN-NOTE

#### S-19: No mTLS Requirement in gRPC Client Configuration

**Severity:** DESIGN-NOTE
**Attacker Profile:** A2 (Network Attacker)
**Scope Area:** Key Management
**Location:** `pkg/governance/governance.go:44-49`

**Description:**

```go
func NewClient(cfg Config) (*Client, error) {
    if cfg.GovernanceAddr == "" {
        return nil, fmt.Errorf("governance: governance address is required")
    }
    // TODO: implement — establish gRPC connections with mTLS
    return &Client{config: cfg}, nil
}
```

The `Config` struct (lines 10-20) has `GovernanceAddr`, `CeremonyAddr`, and `NotaryAddr` but no TLS configuration fields (certificate paths, CA bundle, TLS required flag). The TODO comment mentions mTLS, but when this is implemented, if mTLS is made *optional* rather than *required*, a network attacker can intercept governance traffic.

**Impact:** If mTLS is not enforced, all trust boundaries TB-3, TB-4, and TB-5 in the trust diagram are vulnerable to man-in-the-middle attacks.

**Recommendation:**
- Add `TLSCertPath`, `TLSKeyPath`, `TLSCAPath` to the governance `Config` struct.
- Make TLS mandatory — `NewClient` should fail if TLS configuration is missing.
- Use SPIFFE-aware TLS (SVID-based mTLS) for consistency with the SPIRE ecosystem.

---

#### S-20: Proto Files Lack Field Validation Annotations

**Severity:** DESIGN-NOTE
**Attacker Profile:** N/A
**Scope Area:** Governance Integration
**Location:** All proto files under `proto/`

**Description:**

None of the 4 proto files use field validation annotations (e.g., `buf/validate` or `protoc-gen-validate`). Fields with implicit constraints have no documentation of valid ranges:

| Proto File | Field | Expected Constraint |
|-----------|-------|-------------------|
| `governance.proto:40` | `ttl_seconds` | > 0, <= max_intent_ttl |
| `governance.proto:41` | `max_redemptions` | > 0, reasonable upper bound |
| `ceremony.proto:43` | `required_approvals` | > 0, <= pool_size |
| `ceremony.proto:44` | `ttl_hours` | > 0, reasonable upper bound |
| `notary.proto:21` | `etcd_revision` | >= 0 |

**Impact:** Generated client/server code has no built-in validation. All validation must happen in application code, increasing the risk of missing checks.

**Recommendation:**
- Add `buf/validate` annotations to proto fields with constraints.
- This provides both documentation and generated validation code.
- At minimum, add proto-level comments documenting valid ranges for each constrained field.

---

## Attack Trees

### Attack Tree 1: Forge SSH Certificate with Arbitrary Privileges

```
GOAL: Obtain SSH certificate with unauthorized privileges
├── 1. Exploit CertRequest TTL/Principal control [S-01] [CRITICAL]
│   ├── 1a. Compromised workload (A3) calls credential composer
│   │   ├── Set ValidSeconds = 31536000 (1 year)
│   │   ├── Set Principals = ["root", "deploy", "admin"]
│   │   └── RESULT: Long-lived cert with privileged principals
│   └── 1b. Malicious operator (A1) configures registration entry
│       ├── Set overly broad principal mapping
│       └── RESULT: Legitimate-looking but over-privileged certs
│
├── 2. Bypass OIDC attestation [S-02, S-08]
│   ├── 2a. Token confusion (A3) [S-02]
│   │   ├── Present token with wrong audience
│   │   ├── Verifier lacks audience check
│   │   └── RESULT: SVID for unintended workload identity
│   └── 2b. Malicious issuer (A1) [S-08]
│       ├── Configure issuer = "http://attacker.example.com"
│       ├── Serve attacker-controlled JWKS keys
│       └── RESULT: Arbitrary identity attestation
│
└── 3. Fabricate governance provenance [S-03, S-13]
    ├── 3a. Replay merkle proof (A5) [S-03]
    │   ├── Extract proof from legitimate cert C1
    │   ├── Embed in forged cert C2
    │   └── RESULT: C2 appears governed
    └── 3b. Fabricate intent ID (A1) [S-13]
        ├── Set governance-intent = random valid UUID
        ├── Passes format validation
        └── RESULT: Cert appears to have governance intent
```

### Attack Tree 2: Bypass Governance for Credential Issuance

```
GOAL: Issue credentials without proper governance authorization
├── 1. Exploit TOCTOU window [S-06] [HIGH]
│   ├── 1a. Create intent under permissive policy (A1)
│   │   ├── Intent classified as Autonomous
│   │   ├── Policy updated to require QuorumApproval
│   │   ├── Redeem intent (still uses old classification)
│   │   └── RESULT: Credential issued without required quorum
│   └── 1b. Race condition during policy deployment
│       ├── Multiple intents created during transition
│       └── RESULT: Batch of under-governed credentials
│
├── 2. Suppress audit trail [S-04, S-15]
│   ├── 2a. DoS NotaryService (A2) [S-04]
│   │   ├── NotaryService unreachable
│   │   ├── Credentials proceed (fail-open audit)
│   │   ├── Local queue lost on pod restart
│   │   └── RESULT: Permanently unaudited credential operations
│   └── 2b. Exploit anchoring delay
│       ├── Issue credentials faster than epoch boundary
│       ├── Exceed 256 leaves per epoch [spec depth limit]
│       └── RESULT: Leaves dropped or epoch overflow
│
└── 3. Exploit emergency break-glass [S-16]
    ├── Trigger emergency condition (A1)
    ├── Credential issued without pre-approval
    ├── Post-hoc approval window expires
    ├── No automatic revocation specified
    └── RESULT: Ungoverned credential persists indefinitely
```

### Attack Tree 3: Corrupt Audit Trail

```
GOAL: Compromise integrity of the governance audit trail
├── 1. Proof replay attack [S-03] [CRITICAL]
│   ├── 1a. Cross-credential replay (A5)
│   │   ├── Extract (merkle-root, merkle-proof) from cert C1
│   │   ├── Embed in crafted cert C2
│   │   ├── VerifyInclusion succeeds (proves C1's leaf, not C2's)
│   │   └── RESULT: C2 has false governance provenance
│   └── 1b. Cross-epoch replay
│       ├── Use proof from epoch N in cert issued at epoch N+1
│       ├── Governance-epoch mismatch may not be checked
│       └── RESULT: Stale governance state appears current
│
├── 2. Fork the merkle chain [S-17] [LOW]
│   ├── 2a. Concurrent anchor creation (A5)
│   │   ├── Two writers create anchors with same previous_root
│   │   ├── Chain forks into two valid branches
│   │   └── RESULT: Audit queries return inconsistent results
│   └── 2b. Reorder anchors
│       ├── etcd_revision = 0 (not set) per proto comment
│       ├── No sequence number to enforce ordering
│       └── RESULT: Anchors may be out of causal order
│
├── 3. DoS-induced audit gap [S-04, S-05]
│   ├── 3a. Overwhelm extension decoder (A3) [S-05]
│   │   ├── Craft cert with 100MB merkle-proof
│   │   ├── Audit verifier calls Decode()
│   │   ├── OOM kills verifier process
│   │   └── RESULT: Audit verification unavailable
│   └── 3b. NotaryService DoS (A2) [S-04]
│       ├── Sustained DoS during credential operations
│       └── RESULT: Audit gap for all operations during outage
│
└── 4. Fabricate governance metadata [S-13, S-18]
    ├── 4a. Fake intent ID [S-13]
    │   ├── governance-intent = random UUID
    │   └── RESULT: Cert appears governed when it wasn't
    └── 4b. Forge external event claim [S-18]
        ├── CreateIntent with fabricated ExternalEventClaim
        ├── GovernanceService may accept without strong verification
        └── RESULT: Intent created under false identity
```

---

## Scope Area Deep Dives

### 1. OIDC Token Handling

**Findings:** S-02 (CRITICAL), S-08 (HIGH), S-14 (MEDIUM)

**Files reviewed:** `pkg/oidc/oidc.go`, `cmd/oidc-attestor/plugin.go`, `test/fixtures/sample-oidc-token.json`

The OIDC subsystem has three compounding issues:
1. The `Verifier` interface (S-02) does not contractually require audience validation.
2. The issuer URL (S-08) is not validated for HTTPS.
3. The `tenant_id` claim (S-14) may be user-controllable in Keycloak.

Together, these create a layered authentication failure: even if one control is implemented correctly, the others provide bypass routes. The recommended fix order is: S-02 first (interface design, hardest to change later), then S-08 (configuration validation), then S-14 (trust model documentation).

### 2. SSH Certificate Generation

**Findings:** S-01 (CRITICAL), S-11 (MEDIUM)

**Files reviewed:** `pkg/sshcert/sshcert.go`, `cmd/ssh-credential-composer/plugin.go`, `specs/spiffe-ssh-svid.md`

The SSH certificate generation path has a fundamental interface design flaw (S-01) where the `CertRequest` struct allows callers to control security-critical fields. The SSH-SVID spec clearly defines TTL bounds (30s–1h) and principal derivation (from SPIFFE ID), but the code enforces neither. When this stub is implemented, the `Build()` method MUST enforce these constraints server-side.

### 3. Shellstream Extensions

**Findings:** S-05 (HIGH), S-13 (MEDIUM)

**Files reviewed:** `pkg/shellstream/shellstream.go`, `pkg/shellstream/shellstream_test.go`, `specs/shellstream-extensions.md`

The Shellstream implementation is the most mature code in the repository (360 lines, extensive tests). The primary security concern is the lack of size limits on the decode path (S-05), which is a straightforward DoS vector. The spec recommends a 4 KB total extension limit (Section 9.4, line 441-445), and the merkle proof has a known maximum size of ~256 bytes — both provide clear bounds for validation.

The governance-intent format-only validation (S-13) is a design issue: the extension value looks correct syntactically but could be fabricated. This is mitigated if the credential composer sets it server-side and verifiers cross-check with the GovernanceService.

### 4. Governance Integration

**Findings:** S-06 (HIGH), S-07 (HIGH), S-10 (MEDIUM), S-18 (INFORMATIONAL), S-20 (DESIGN-NOTE)

**Files reviewed:** `pkg/governance/governance.go`, `pkg/config/config.go`, `proto/quartermaster/v1/governance.proto`, `specs/credential-governance.md`

The governance integration has the widest attack surface. The TOCTOU window (S-06) is an inherent challenge in any async authorization system with separate create/redeem steps. The spec should either re-evaluate policy at redemption time or bound the intent TTL to minimize the window. The GovernanceEpochSeconds default-to-zero (S-07) is a dangerous misconfiguration vector. The `RedeemResult` missing `ExpiresAt` (S-10) prevents SAT TTL enforcement in consumer code.

### 5. Key Management

**Findings:** S-09 (HIGH), S-19 (DESIGN-NOTE)

**Files reviewed:** `cmd/substrate-keymanager/main.go`, `cmd/substrate-keymanager/plugin.go`, `pkg/governance/governance.go`

Key management is entirely scaffolded. The most significant risk is the absence of the `go-plugin` serving pattern (S-09), which means there is no binary trust verification between SPIRE Server and the plugin. The mTLS requirement (S-19) for gRPC connections to Quartermaster services is noted in TODOs but not yet structurally enforced.

### 6. Configuration & Deployment

**Findings:** S-07 (HIGH), S-08 (HIGH), S-12 (MEDIUM)

**Files reviewed:** `pkg/config/config.go`, `pkg/oidc/oidc.go`, `deploy/spire-server-config.yaml`, `deploy/spire-agent-config.yaml`

Configuration validation is the weakest link in the current codebase. The `PluginConfig.Validate()` method checks only one of six fields (S-12). The `oidc.NewVerifier` checks only issuer non-emptiness (S-08). The `GovernanceEpochSeconds` field defaults to zero with no validation (S-07). These gaps mean that misconfigured deployments will start successfully but fail in unpredictable ways at runtime.

### 7. Test Fixtures

**Findings:** S-14 (MEDIUM)

**Files reviewed:** `test/fixtures/sample-oidc-token.json`, `test/fixtures/sample-ssh-cert-extensions.json`

The OIDC token fixture includes a `tenant_id` custom claim that, in a self-managed Keycloak deployment, could be user-controllable. This is primarily a trust model documentation issue — the fixture itself is correct for testing, but the trust assumption about where `tenant_id` comes from must be made explicit in the implementation.

### 8. Specification Security Review

**Findings:** S-03 (CRITICAL), S-04 (HIGH), S-15 (DESIGN-NOTE), S-16 (LOW), S-17 (LOW)

**Files reviewed:** All 3 specs — `specs/spiffe-ssh-svid.md` (Section 8), `specs/shellstream-extensions.md` (Section 9), `specs/credential-governance.md` (Sections 9-10)

The specifications are well-written with dedicated security sections. The SSH-SVID spec (Section 8, 7 subsections) covers attestation trust, SPIFFE ID visibility, mTLS, socket protection, rate limiting, CA key compromise, and clock skew. The Shellstream spec (Section 9, 5 subsections) covers extension visibility, authorization boundary, parsing safety, size constraints, and replay/freshness. The Credential Governance spec (Section 10, 5 subsections) covers fail-closed/fail-open modes, ceremony timeout, and idempotency.

The critical gap is the merkle proof binding (S-03): the proof is not tied to the specific certificate. The other spec-level findings (S-16, S-17) are edge cases in the emergency and chain fork scenarios that should be addressed but are not urgent.

---

## Appendix A: Attacker Profile × Finding Matrix

Cells marked with severity initial: **C**=CRITICAL, **H**=HIGH, **M**=MEDIUM, **L**=LOW, **I**=INFO, **D**=DESIGN-NOTE. Empty = not exploitable by this profile.

| Finding | A1 Malicious Operator | A2 Network Attacker | A3 Compromised Workload | A4 Rogue Plugin | A5 Merkle Insider |
|---------|-----------------------|---------------------|-------------------------|-----------------|-------------------|
| S-01 | **C** | | **C** | | |
| S-02 | | | **C** | | |
| S-03 | | | | | **C** |
| S-04 | | **H** | | | |
| S-05 | | **H** | **H** | | |
| S-06 | **H** | | | | |
| S-07 | **H** | | | | |
| S-08 | **H** | | | | |
| S-09 | | | | **H** | |
| S-10 | **M** | | **M** | | |
| S-11 | | | **M** | | |
| S-12 | **M** | | | | |
| S-13 | **M** | | | | **M** |
| S-14 | **M** | | **M** | | |
| S-15 | | **D** | | | |
| S-16 | **L** | | | | |
| S-17 | | | | | **L** |
| S-18 | **I** | | | | |
| S-19 | | **D** | | | |
| S-20 | | | | | |

**Summary by attacker profile:**
- **A1 (Malicious Operator):** 9 findings exploitable. Highest concentration of risk.
- **A2 (Network Attacker):** 3 findings exploitable. Primary vector is service DoS.
- **A3 (Compromised Workload):** 5 findings exploitable. Primary vector is cert/token manipulation.
- **A4 (Rogue Plugin):** 1 finding exploitable. High impact but narrow vector.
- **A5 (Merkle Insider):** 3 findings exploitable. Primary vector is audit trail manipulation.

---

## Appendix B: Trust Boundary Inventory

| ID | Boundary | From | To | Protocol | Authentication | Encryption | Status |
|----|----------|------|----|----------|----------------|------------|--------|
| TB-1 | Workload API | Workload | SPIRE Agent | Unix Socket | Kernel PID attestation | N/A (local) | Specified in SSH-SVID spec Section 8.4 |
| TB-2 | SPIRE Node API | SPIRE Agent | SPIRE Server | gRPC | mTLS with X.509-SVIDs | TLS 1.2+ | Specified in SSH-SVID spec Section 8.3 |
| TB-3 | Governance API | SPIRE Server | GovernanceService | gRPC | mTLS (TODO) | TLS (TODO) | S-19: No TLS config in code |
| TB-4 | Ceremony API | SPIRE Server | CeremonyService | gRPC | mTLS (TODO) | TLS (TODO) | S-19: No TLS config in code |
| TB-5 | Notary API | SPIRE Server | NotaryService | gRPC | mTLS (TODO) | TLS (TODO) | S-19: No TLS config in code |
| TB-6 | SSH Auth | Workload | SSH Server | SSH | Certificate-based | SSH encryption | Spec-compliant (standard OpenSSH) |
| TB-7 | Token File | Pod filesystem | Workload | File I/O | Filesystem permissions | N/A | Specified in agent config: `/var/run/secrets/oidc/token` |
| TB-8 | Plugin IPC | SPIRE Server | Plugin Binary | go-plugin/gRPC | Magic cookie handshake (TODO) | Local | S-09: No handshake implemented |

---

## Appendix C: Specification Security Coverage

### SSH-SVID Specification (`specs/spiffe-ssh-svid.md`, Section 8)

| Topic | Covered | CWE Reference | Assessment |
|-------|---------|---------------|------------|
| Attestation trust | Section 8.1 | CWE-287 (Improper Authentication) | Thorough. Recommends k8s attestation, warns about PID-based. |
| Identity visibility | Section 8.2 | CWE-200 (Exposure of Sensitive Information) | Good. Notes SPIFFE ID in cleartext during SSH handshake. |
| Transport security | Section 8.3 | CWE-319 (Cleartext Transmission) | Good. Requires mTLS for Agent-Server. |
| Socket protection | Section 8.4 | CWE-732 (Incorrect Permission Assignment) | Good. Requires filesystem permissions + kernel attestation. |
| Rate limiting | Section 8.5 | CWE-770 (Allocation Without Limits) | Good. Recommends 60 issuances/min/SPIFFE-ID. |
| CA compromise | Section 8.6 | CWE-321 (Use of Hard-coded Cryptographic Key) | Good. Recommends HSM, monitoring, rotation. |
| Clock skew | Section 8.7 | CWE-613 (Insufficient Session Expiration) | Good. 60-second tolerance. |
| **Certificate TTL enforcement** | **NOT COVERED** | **CWE-613** | **GAP: Spec defines bounds (S-01) but no server-side enforcement guidance.** |
| **Principal restriction** | **NOT COVERED** | **CWE-269 (Improper Privilege Management)** | **GAP: Spec derives principal from SPIFFE ID but doesn't restrict additional principals.** |

### Shellstream Extensions Specification (`specs/shellstream-extensions.md`, Section 9)

| Topic | Covered | CWE Reference | Assessment |
|-------|---------|---------------|------------|
| Extension visibility | Section 9.1 | CWE-200 | Good. "MUST NOT embed secrets." |
| Authorization boundary | Section 9.2 | CWE-863 (Incorrect Authorization) | Excellent. "Merkle proofs provide auditability, not authorization." |
| Parsing safety | Section 9.3 | CWE-94 (Improper Control of Code Generation) | Good. "MUST NOT evaluate as code." |
| Size constraints | Section 9.4 | CWE-770 | Partial. Recommends 4 KB limit but uses SHOULD, not MUST. |
| Replay/freshness | Section 9.5 | CWE-294 (Authentication Bypass by Capture-replay) | Partial. Relies on cert validity period. No proof-to-cert binding. |
| **Proof binding** | **NOT COVERED** | **CWE-345 (Insufficient Verification of Data Authenticity)** | **GAP: S-03 — proof is not bound to specific certificate content.** |

### Credential Governance Specification (`specs/credential-governance.md`, Sections 9-10)

| Topic | Covered | CWE Reference | Assessment |
|-------|---------|---------------|------------|
| Merkle anchoring | Section 9.1 | CWE-354 (Improper Validation of Integrity Check) | Good. Detailed merkle construction with domain separation. |
| Certificate-embedded audit | Section 9.2 | CWE-778 (Insufficient Logging) | Good. Defines required extensions. |
| Audit queries | Section 9.3 | CWE-778 | Good. Three verification patterns defined. |
| Chain continuity | Section 9.4 | CWE-354 | Good. Append-only with `previous_root` linkage. |
| Fail-closed auth | Section 10.1 | CWE-636 (Not Failing Securely) | Excellent. GovernanceService unreachable = operation fails. |
| Ceremony timeout | Section 10.2 | CWE-613 | Good. Timeout = denial. |
| Fail-open audit | Section 10.3 | CWE-778 | Acceptable tradeoff, well-documented rationale. |
| Policy missing | Section 10.4 | CWE-636 | Good. Default to SingleApproval. |
| **TOCTOU** | **NOT COVERED** | **CWE-367 (TOCTOU Race Condition)** | **GAP: S-06 — policy re-evaluation at redemption not specified.** |
| **Chain fork** | **NOT COVERED** | **CWE-362 (Concurrent Execution Using Shared Resource)** | **GAP: S-17 — no fork prevention mechanism.** |
| **Break-glass expiry** | **PARTIAL** | **CWE-613** | **GAP: S-16 — post-hoc approval expiry action undefined.** |

---

*End of security audit report.*