guildhouse-spire-plugins/SECURITY-AUDIT.md
Tyler King a58d548518 feat: network-policy extension, governance lifecycle, audit remediation
- Network-policy SPIRE plugin extension
- Governance event notification with merkle anchoring
- Shellstream specs for consent channels + HFL embedded ABI
- All 17 audit findings from AUDIT.md remediated
- SSH credential composer + substrate key manager updates
- Test coverage for config + sshcert packages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:54:46 -04:00

58 KiB
Raw Blame History

Security & Integrity Audit: guildhouse-spire-plugins

Date: 2026-02-18 Commit: eb9edf5 Repository: guildhouse-cooperative/guildhouse-spire-plugins Scope: OIDC token handling, SSH certificate generation, Shellstream extensions, governance integration, key management, configuration/deployment, test fixtures, specification security review License: Apache 2.0 Build Status: UNVERIFIABLE (Go toolchain not available on audit machine)


Attacker Profiles

ID Profile Capabilities
A1 Malicious Operator Valid platform credentials, admin-level Keycloak access, can create intents and ceremonies, can configure SPIRE plugins
A2 Network Attacker Can intercept, replay, or DoS gRPC traffic between SPIRE Server and Quartermaster services; no valid credentials
A3 Compromised Workload Has a valid SPIFFE SVID, can call the Workload API, can present crafted OIDC tokens; confined to a single pod/namespace
A4 Rogue Plugin Binary substituted in the SPIRE plugin path (/opt/spire/plugins/); full access to plugin IPC channel
A5 Insider with Merkle Access Read/write access to NotaryService data; can query, replay, or forge merkle proofs and anchors

Severity Scale

Level Definition
CRITICAL Exploitable in current code/design; leads to authentication bypass, privilege escalation, or complete audit trail compromise
HIGH Exploitable with moderate effort or upon implementation; leads to significant security degradation
MEDIUM Requires specific conditions or affects defense-in-depth; leads to partial security degradation
LOW Minor issue; limited exploitability or impact
INFORMATIONAL Observation; no direct security impact but worth noting
DESIGN-NOTE Architectural decision with security implications; not a vulnerability per se

Executive Summary

This audit identified 20 findings: 3 CRITICAL, 6 HIGH, 5 MEDIUM, 2 LOW, 1 INFORMATIONAL, and 3 DESIGN-NOTE.

Top 3 most impactful findings:

  1. S-01 (CRITICAL): CertRequest allows requester-controlled TTL and principals with no server-side enforcement of the spec's 1-hour maximum. A compromised workload can request arbitrarily long-lived certificates with unrestricted principal lists.

  2. S-02 (CRITICAL): The Verifier interface does not contractually require audience validation, enabling token confusion attacks where an OIDC token intended for one service is accepted by another.

  3. S-03 (CRITICAL): Merkle proofs embedded in SSH certificates are not cryptographically bound to the certificate content, allowing proof replay across certificates to forge governance provenance.

Security posture: The architectural design is sound — fail-closed governance authorization, short-lived certificates, domain-separated hashing, and JCS canonicalization are strong foundations. However, the interface designs in pkg/oidc and pkg/sshcert contain structural flaws that, if carried into implementation, would create exploitable vulnerabilities. The pkg/shellstream implementation is the most mature and has good validation, though it lacks input size limits on the decode path.

Key architectural strengths:

  • GovernanceService fail-closed policy (spec Section 10.1) — credential operations cannot bypass authorization
  • Short-lived SSH certificates (5m default, 1h max per spec) — limits blast radius of credential compromise
  • Domain-separated SHA-256 hashing (guildhouse.credential.v1: prefix) — prevents cross-protocol collisions
  • JCS (RFC 8785) canonicalization — ensures deterministic merkle leaf construction
  • Append-only merkle chain with previous_root linkage — tamper-evident audit trail

Trust Boundary Diagram

                                    TRUST BOUNDARY: Cluster Network
                                    ================================
 ┌─────────────┐                                                          ┌──────────────────────┐
 │  Workload   │──[Unix Socket]──▶┌──────────────┐                       │  GovernanceService   │
 │  (Pod)      │  Workload API    │  SPIRE Agent │──[gRPC/mTLS]──▶      │  (Quartermaster)     │
 │             │                  │              │  SPIRE Server API     │  :50051              │
 │  OIDC Token │                  │  Attestors:  │      │               └──────────────────────┘
 │  (injected) │                  │  - k8s_psat  │      │                         ▲
 └─────────────┘                  │  - guildhouse│      ▼                         │
       │                          │    _oidc     │┌──────────────┐                │ [gRPC/mTLS]
       │                          └──────────────┘│ SPIRE Server │                │
       │                                          │              │  ┌──────────────────────┐
       │ TRUST BOUNDARY:                          │  Plugins:    │  │  CeremonyService     │
       │ Pod Filesystem                           │  ─────────── │  │  (Bascule)           │
       │                                          │  KeyManager  │──│  :50052              │
       ▼                                          │  (substrate) │  └──────────────────────┘
 ┌─────────────┐                                  │              │            │
 │ token_path  │                                  │  Credential  │            │ [gRPC/mTLS]
 │ /var/run/   │                                  │  Composer    │            ▼
 │ secrets/    │                                  │  (ssh)       │  ┌──────────────────────┐
 │ oidc/token  │                                  │              │  │  NotaryService       │
 └─────────────┘                                  │  Notifier    │──│  (Quartermaster)     │
                                                  │  (governance)│  │  :50051              │
                                                  └──────────────┘  └──────────────────────┘
                                                         │
                                                         │ [SSH Certificate]
                                                         ▼
                                                  ┌──────────────┐
                                                  │  SSH Server   │
                                                  │  (validates   │
                                                  │   cert + ext) │
                                                  └──────────────┘

Trust Boundaries:
  TB-1: Workload ↔ SPIRE Agent       — Unix domain socket, kernel PID attestation
  TB-2: SPIRE Agent ↔ SPIRE Server   — gRPC with mTLS (X.509-SVIDs)
  TB-3: SPIRE Server ↔ Governance    — gRPC, mTLS required (TODO in code)
  TB-4: SPIRE Server ↔ Ceremony      — gRPC, mTLS required (TODO in code)
  TB-5: SPIRE Server ↔ Notary        — gRPC, mTLS required (TODO in code)
  TB-6: Workload ↔ SSH Server        — SSH protocol, certificate-based auth
  TB-7: Pod filesystem ↔ Workload    — OIDC token file, filesystem permissions

Threat Analysis Matrix

Finding Severity Attacker Scope Area Confidentiality Integrity Availability
S-01 CRITICAL A1, A3 SSH Cert Generation Low High Low
S-02 CRITICAL A3 OIDC Token Handling High High Low
S-03 CRITICAL A5 Spec Security Low High Low
S-04 HIGH A2 Spec Security Low High Medium
S-05 HIGH A2, A3 Shellstream Extensions Low Low High
S-06 HIGH A1 Governance Integration Low High Low
S-07 HIGH A1 Configuration Low Low High
S-08 HIGH A1 OIDC Token Handling High High Low
S-09 HIGH A4 Key Management High High High
S-10 MEDIUM A1, A3 Governance Integration Low Medium Low
S-11 MEDIUM A3 SSH Cert Generation Low Medium Low
S-12 MEDIUM A1 Configuration Low Low Medium
S-13 MEDIUM A1, A5 Shellstream Extensions Low Medium Low
S-14 MEDIUM A1, A3 Test Fixtures Medium Medium Low
S-15 DESIGN-NOTE A2 Spec Security Low Medium Low
S-16 LOW A1 Spec Security Low Medium Low
S-17 LOW A5 Spec Security Low Medium Low
S-18 INFORMATIONAL A1 Governance Integration Low Medium Low
S-19 DESIGN-NOTE A2 Key Management Medium Medium Low
S-20 DESIGN-NOTE Governance Integration Low Low Low

Findings

CRITICAL

S-01: CertRequest Allows Requester-Controlled TTL and Principals

Severity: CRITICAL Attacker Profile: A1 (Malicious Operator), A3 (Compromised Workload) Scope Area: SSH Certificate Generation Location: pkg/sshcert/sshcert.go:18-29

Description:

The CertRequest struct exposes two security-critical fields as requester-settable values:

type CertRequest struct {
    SpiffeID     string
    Extensions   *shellstream.ShellstreamExtensions
    ValidSeconds uint64      // Requester-controlled
    Principals   []string    // Requester-controlled
}

The SSH-SVID specification (specs/spiffe-ssh-svid.md, Section 7.1, lines 390-396) mandates:

  • Maximum TTL: 1 hour (MUST NOT exceed)
  • Minimum TTL: 30 seconds (MUST NOT issue below)
  • Default TTL: 5 minutes (RECOMMENDED)

The Build() method (line 47) performs no TTL bounds checking and no principal validation. The Principals field allows arbitrary SSH principals beyond the SPIFFE ID, which could grant access to hosts or accounts the workload is not authorized to reach.

Attack Scenario:

  1. A3 obtains a valid SPIFFE SVID through legitimate attestation.
  2. A3 crafts a CertRequest with ValidSeconds: 31536000 (1 year) and Principals: ["root", "admin", "deploy"].
  3. The SSH Credential Composer builds a certificate with a 1-year lifetime and root/admin principals.
  4. A3 uses this certificate for persistent, privileged SSH access across the infrastructure.

Impact: Privilege escalation (unrestricted principals), credential persistence beyond intended lifetime (up to max uint64 seconds). Violates the short-lived credential model that is the primary security guarantee of the SSH-SVID design.

Recommendation:

  • Add MaxValidSeconds, MinValidSeconds, and AllowedPrincipals to the Builder config, enforced server-side.
  • Build() must clamp ValidSeconds to [MinValidSeconds, MaxValidSeconds] and reject principals not in the allow list.
  • The SPIFFE ID should be the only principal unless explicitly configured otherwise in the SPIRE registration entry.

S-02: Verifier Interface Lacks Audience Parameter

Severity: CRITICAL Attacker Profile: A3 (Compromised Workload) Scope Area: OIDC Token Handling Location: pkg/oidc/oidc.go:31-34

Description:

The Verifier interface defines:

type Verifier interface {
    Verify(ctx context.Context, rawToken string) (*Claims, error)
}

The Config struct (line 10-18) includes Audience string, but this field is never referenced in the Verifier interface contract. An implementation of Verifier has no obligation to check the audience claim. The Claims struct (line 22-28) includes Audience []string in the output, but this is informational — no enforcement occurs.

Additionally:

  • No nonce parameter exists (replay protection)
  • No JTI (JWT ID) tracking exists (token reuse detection)
  • Config.Audience is not validated as required in NewVerifier (line 37-43)

Attack Scenario:

  1. A3 is a workload in namespace tenant-b with a valid OIDC token whose aud claim is ["api-gateway"].
  2. A3 presents this token to the SPIRE OIDC attestor, which calls Verify(ctx, token).
  3. Because audience validation is not part of the interface contract, the implementation may accept the token despite the audience mismatch.
  4. A3 obtains SPIRE selectors (oidc:sub:...) and gains an SVID for a workload identity it should not have.

Impact: Token confusion / cross-service authentication bypass. A token issued for one relying party is accepted by another. This is a fundamental OIDC security control.

Recommendation:

  • Change the interface to Verify(ctx context.Context, rawToken string, expectedAudience string) (*Claims, error) to make audience validation contractually required.
  • Add audience as a required parameter that cannot be empty.
  • Implement JTI tracking with a TTL-bounded cache to prevent token replay.

S-03: Merkle Proof Not Bound to Specific Credential Content

Severity: CRITICAL Attacker Profile: A5 (Insider with Merkle Access) Scope Area: Specification Security Review Location: specs/credential-governance.md:589-599, pkg/shellstream/shellstream.go:69-70

Description:

The Shellstream merkle-proof@guildhouse.dev extension carries a binary inclusion proof, and merkle-root@guildhouse.dev carries the root hash. Together they prove that some leaf was included in the governance merkle tree. However, there is no cryptographic binding between the proof and the specific SSH certificate that carries it.

The merkle leaf is SHA-256(JCS(MutationEnvelope)) (spec Section 7.4, line 360-361). The MutationEnvelope includes payload_hash, actor_svid, tenant_id, event_type, and intent_id — but it does NOT include the SSH certificate's serial number, public key fingerprint, or any certificate-specific identifier.

Attack Scenario:

  1. A5 observes a legitimately issued certificate C1 with a valid merkle proof P1.
  2. A5 extracts P1 and the corresponding merkle root R1.
  3. A5 creates a new certificate C2 (possibly with different principals or TTL) and embeds P1 and R1 as Shellstream extensions.
  4. A verifier calls NotaryService.VerifyInclusion(R1, leaf_from_C1, P1) — this succeeds because P1 proves inclusion of C1's leaf, which is still valid.
  5. The verifier concludes C2 has valid governance provenance, but C2 was never actually governed.

Impact: Complete governance audit trail bypass. Certificates can be minted with fabricated governance provenance.

Recommendation:

  • The MutationEnvelope MUST include a credential-binding field: the SHA-256 hash of the SSH certificate's public key (or serial number). This binds the merkle leaf to a specific certificate.
  • The verifier MUST recompute the leaf hash from the certificate's own content and verify that the proof matches this specific certificate, not just any historical leaf.
  • Add the credential binding field to specs/credential-governance.md Section 7.3 and the MutationEnvelope schema.

HIGH

S-04: NotaryService Fail-Open Creates Unaudited Credential Window

Severity: HIGH Attacker Profile: A2 (Network Attacker) Scope Area: Specification Security Review Location: specs/credential-governance.md:661-671

Description:

The specification defines that when the NotaryService is unreachable after a credential operation completes, the credential "MUST proceed" (line 665). This is a fail-open design for the audit channel. The spec requires local queueing and retry (lines 666-668), but the current codebase has no implementation of this queue.

A network attacker who can disrupt connectivity to the NotaryService (TB-5 in the trust boundary diagram) can cause all credential operations to proceed without merkle anchoring. The credentials are valid and usable, but invisible to the audit trail.

Attack Scenario:

  1. A2 identifies the NotaryService endpoint (notary.quartermaster.svc.cluster.local:50051 per deploy config).
  2. A2 initiates a sustained DoS against the NotaryService or disrupts the network path.
  3. A1 (colluding or independently) issues credentials during the outage.
  4. Credentials are issued with valid governance authorization (GovernanceService is fail-closed and still reachable) but no audit trail.
  5. When the NotaryService recovers, the local queue (if implemented) would eventually anchor the events — but if the queue is lost (pod restart), the events are permanently unaudited.

Impact: Audit gap during NotaryService outage. If combined with local queue loss, permanent audit gap for credential operations during the window.

Recommendation:

  • Implement the local durable queue as specified (on-disk WAL or embedded database).
  • Add a monitoring metric governance_anchoring_pending_total as specified in the spec (line 669).
  • Consider adding a "maximum anchoring delay" after which credentials should be flagged as requiring re-verification.
  • Document the durable queue implementation as a MUST in the implementation guide.

S-05: No Size Limit on Merkle-Proof Base64 Decode

Severity: HIGH Attacker Profile: A2 (Network Attacker), A3 (Compromised Workload) Scope Area: Shellstream Extensions Location: pkg/shellstream/shellstream.go:202-208

Description:

The Decode() function performs base64 decoding on the merkle-proof@guildhouse.dev extension value with no size limit:

if v, ok := extensions[ExtMerkleProof]; ok {
    proof, err := base64.StdEncoding.DecodeString(v)
    if err != nil {
        return nil, fmt.Errorf("shellstream: decode merkle-proof: %w", err)
    }
    ext.MerkleProof = proof
}

The Shellstream spec (Section 9.4, lines 439-445) recommends a 4 KB total extension size limit, but the code enforces nothing. A crafted SSH certificate with a multi-megabyte merkle-proof value causes unbounded memory allocation when any service calls Decode().

The merkle proof should be at most 8 × 32 = 256 bytes (8 levels × 32-byte SHA-256 hashes) based on the 256-leaf depth limit (spec Section 6.8). In base64, this is ~344 characters.

Attack Scenario:

  1. A3 crafts an SSH certificate with a merkle-proof@guildhouse.dev extension containing 100 MB of base64-encoded data.
  2. Any service that decodes Shellstream extensions (SSH server middleware, audit verifier, monitoring) calls Decode().
  3. The base64 decode allocates ~75 MB of memory per call. Repeated calls exhaust server memory.

Impact: Denial of service against any component that processes Shellstream extensions.

Recommendation:

  • Add a size check before decoding: if len(v) > 512 { return error } (344 base64 chars for max proof, with margin).
  • Consider adding a total extension payload size check at the start of Decode() to enforce the 4 KB spec recommendation.

S-06: TOCTOU Between Intent Creation and Credential Issuance

Severity: HIGH Attacker Profile: A1 (Malicious Operator) Scope Area: Governance Integration Location: specs/credential-governance.md:30, pkg/governance/governance.go:53-56

Description:

The governance flow is:

  1. CreateIntent → Accord policy evaluation → intent authorized or ceremony required
  2. (Optional) Ceremony approval
  3. RedeemIntent → SAT issued
  4. Credential operation executed with SAT

Between step 1 and step 3, an arbitrary amount of time may pass (minutes for ceremony approval, up to ttl_seconds of the intent). During this window, Accord policy may change. The SAT issued at step 3 reflects the policy evaluation at step 1.

Attack Scenario:

  1. A1 creates an intent for a credential operation that is currently classified as Autonomous (no approval needed).
  2. A1 (or another admin) updates the Accord policy to reclassify this operation as QuorumApproval.
  3. A1 redeems the intent. The GovernanceService issues a SAT based on the original Autonomous classification.
  4. The credential is issued without the now-required quorum approval.

Impact: Policy bypass during policy transition windows. The window size equals min(intent_ttl, ceremony_timeout).

Recommendation:

  • Re-evaluate Accord policy at RedeemIntent time, not just at CreateIntent time. If the classification has escalated, the intent should transition to the new ceremony requirement.
  • Alternatively, add a policy_version field to the intent and check at redemption that the policy version hasn't changed.
  • Document this behavior explicitly in the spec's error handling section.

S-07: GovernanceEpochSeconds Defaults to Zero

Severity: HIGH Attacker Profile: A1 (Malicious Operator) Scope Area: Configuration & Deployment Location: pkg/config/config.go:30

Description:

GovernanceEpochSeconds int `hcl:"governance_epoch_seconds"`

Go initializes int to 0. The Validate() method (lines 34-38) only checks TrustDomain. If governance_epoch_seconds is omitted from configuration, it defaults to 0, which could mean:

  • Division by zero in epoch boundary calculations
  • Continuous anchoring on every event (resource exhaustion)
  • No anchoring at all (implementation-dependent interpretation of epoch=0)

The field comment (line 27-29) documents a default of 300 seconds, but this default is not enforced in code.

Attack Scenario:

  1. A1 deploys a SPIRE server with a config file that omits governance_epoch_seconds.
  2. The plugin starts with epoch=0.
  3. Depending on implementation: epoch boundary triggers on every event (DoS on NotaryService) or never triggers (no anchoring, audit gap).

Impact: Denial of service or silent audit trail failure due to misconfiguration.

Recommendation:

  • In Validate(), check GovernanceEpochSeconds > 0 and set a default of 300 if unset.
  • Add validation for all required fields: GovernanceAddr, ClusterID, GovernanceEpochSeconds.
  • Consider using a pointer type (*int) to distinguish "not set" from "set to 0".

S-08: OIDC Config Lacks URL Validation for Issuer

Severity: HIGH Attacker Profile: A1 (Malicious Operator) Scope Area: OIDC Token Handling Location: pkg/oidc/oidc.go:37-40

Description:

func NewVerifier(cfg Config) (Verifier, error) {
    if cfg.Issuer == "" {
        return nil, fmt.Errorf("oidc: issuer is required")
    }
    // TODO: implement
}

The issuer field is validated only for non-emptiness. OIDC discovery (/.well-known/openid-configuration) fetches the JWKS endpoint from the issuer URL. If the issuer is a file:// URL, http:// URL (no TLS), or points to an attacker-controlled domain, the JWKS keys are attacker-controlled.

Additionally, Config.Audience (line 15) is not validated as required, and Config.JWKSURL (line 18) could override discovery with an arbitrary URL.

Attack Scenario:

  1. A1 configures the OIDC attestor with issuer: "http://attacker.example.com".
  2. The attestor fetches http://attacker.example.com/.well-known/openid-configuration.
  3. The attacker's OIDC discovery document points to their own JWKS endpoint.
  4. The attacker can now sign any OIDC token that will be accepted by the attestor.
  5. Workloads in the cluster are attested with attacker-controlled identities.

Impact: Complete OIDC authentication bypass via misconfigured or malicious issuer URL.

Recommendation:

  • Validate that Issuer is a well-formed https:// URL. Reject http://, file://, and non-URL strings.
  • Validate that Audience is non-empty.
  • If JWKSURL is set, validate it is also https://.
  • Consider pinning the expected issuer to a known domain pattern.

S-09: No go-plugin Serving Pattern in Plugin Binaries

Severity: HIGH Attacker Profile: A4 (Rogue Plugin) Scope Area: Key Management Location: cmd/substrate-keymanager/main.go, cmd/ssh-credential-composer/main.go, cmd/governance-notifier/main.go, cmd/oidc-attestor/main.go (all contain func main() { os.Exit(1) })

Description:

SPIRE external plugins use HashiCorp's go-plugin framework, which includes a magic cookie handshake to verify that the plugin binary was launched by the expected host process. Without this handshake:

  • Any binary placed at the plugin path will be loaded by SPIRE Server.
  • There is no mutual authentication between SPIRE Server and the plugin binary.
  • The plugin stub plugin.go files define struct types but have no plugin.Serve() call.

The deploy config (deploy/spire-server-config.yaml, lines 39, 50, 60) specifies absolute plugin paths under /opt/spire/plugins/. If an attacker can write to this path (container escape, misconfigured volume mount, supply chain compromise), they can substitute a rogue binary.

Attack Scenario:

  1. A4 gains write access to /opt/spire/plugins/ (e.g., via a writable hostPath volume mount in Kubernetes).
  2. A4 replaces substrate-keymanager with a rogue binary that implements the KeyManager interface.
  3. SPIRE Server loads the rogue binary on next restart.
  4. The rogue KeyManager has access to all signing keys and can issue arbitrary SVIDs.

Impact: Complete CA key compromise, arbitrary SVID issuance, full trust domain takeover.

Recommendation:

  • Implement the go-plugin serving pattern with the SPIRE-defined handshake config in all cmd/*/main.go.
  • Use plugin binary checksums in the SPIRE server config (SPIRE supports plugin_checksum).
  • Ensure plugin paths are on read-only filesystems in production.

MEDIUM

S-10: RedeemResult Missing ExpiresAt for SAT TTL Enforcement

Severity: MEDIUM Attacker Profile: A1 (Malicious Operator), A3 (Compromised Workload) Scope Area: Governance Integration Location: pkg/governance/governance.go:31-36

Description:

type RedeemResult struct {
    Success bool
    SatHash []byte
    Status  string
    Error   string
}

The proto definition (proto/quartermaster/v1/governance.proto:77-85) includes expires_at in SatToken, but the Go RedeemResult struct does not surface this field. Code consuming RedeemResult has no way to check whether the SAT has expired.

Attack Scenario:

  1. A credential composer redeems an intent and receives a SAT with a 5-minute TTL.
  2. Due to processing delay or retry logic, the SAT expires before the credential is issued.
  3. The composer cannot detect the expiration because RedeemResult lacks ExpiresAt.
  4. The credential is issued with an expired SAT hash embedded in the sat-hash@guildhouse.dev extension.

Impact: Credentials issued with expired authorization tokens. Downstream verifiers that check SAT validity would reject these credentials.

Recommendation:

  • Add ExpiresAt time.Time to RedeemResult.
  • Check time.Now().Before(result.ExpiresAt) before proceeding with credential issuance.
  • Add SatBytes []byte to enable downstream SAT verification.

S-11: Roles Decoded Without Validation in Decode Path

Severity: MEDIUM Attacker Profile: A3 (Compromised Workload) Scope Area: SSH Certificate Generation Location: pkg/shellstream/shellstream.go:159-161

Description:

if v, ok := extensions[ExtRoles]; ok && v != "" {
    ext.Roles = strings.Split(v, ",")
}

strings.Split("admin,,engineer", ",") produces ["admin", "", "engineer"]. The empty string is silently included in the roles list. While Validate() (lines 247-254) catches this, Decode() and Validate() are separate functions. Any code path that calls Decode() without subsequently calling Validate() operates on unvalidated role data.

Similarly, strings.Split("admin, engineer", ",") produces ["admin", " engineer"] — the leading space is preserved, which could cause authorization mismatches if roles are compared by exact string equality.

Attack Scenario:

  1. A3 presents an SSH certificate with roles@guildhouse.dev: "admin,,,".
  2. A server-side authorization check calls Decode() but not Validate().
  3. The roles list is ["admin", "", "", ""] — the empty strings may match wildcard or default role rules.

Impact: Authorization bypass if roles are consumed after Decode() without Validate().

Recommendation:

  • Filter empty strings and trim whitespace in Decode() itself, not just in Validate().
  • Alternatively, make Decode() call Validate() internally and return validated data only.
  • Document that Decode() output MUST be validated before use.

S-12: Config.Validate() Incomplete — Only Checks TrustDomain

Severity: MEDIUM Attacker Profile: A1 (Malicious Operator) Scope Area: Configuration & Deployment Location: pkg/config/config.go:34-38

Description:

func (c *PluginConfig) Validate() error {
    if c.TrustDomain == "" {
        return fmt.Errorf("config: trust_domain is required")
    }
    return nil
}

The PluginConfig struct has 6 fields (lines 10-30) but only TrustDomain is validated. Fields GovernanceAddr, CeremonyAddr, NotaryAddr, ClusterID, and GovernanceEpochSeconds can all be empty or zero without triggering a validation error.

Impact: Plugins start with incomplete configuration and fail at runtime when the first governance call is made, potentially after having already partially processed credentials.

Recommendation:

  • Validate all required fields: GovernanceAddr (non-empty, valid address format), ClusterID (non-empty), GovernanceEpochSeconds (> 0).
  • CeremonyAddr and NotaryAddr should be validated when the corresponding features are enabled.

S-13: GovernanceIntent Only Format-Validated, Not Existence-Validated

Severity: MEDIUM Attacker Profile: A1 (Malicious Operator), A5 (Insider with Merkle Access) Scope Area: Shellstream Extensions Location: pkg/shellstream/shellstream.go:331-335

Description:

if ext.GovernanceIntent != "" {
    if !isValidUUID(ext.GovernanceIntent) {
        return fmt.Errorf("shellstream: governance-intent is not a valid UUID: %q", ext.GovernanceIntent)
    }
}

The governance-intent@guildhouse.dev extension is validated as a well-formed UUID but there is no mechanism to verify the intent actually exists in the GovernanceService or that it corresponds to the current credential operation.

Attack Scenario:

  1. A1 crafts a certificate with governance-intent@guildhouse.dev: "00000000-0000-0000-0000-000000000000".
  2. The UUID is syntactically valid, so Validate() passes.
  3. A verifier sees the governance-intent field and assumes the credential went through governance.
  4. But the intent ID is fabricated — no such intent exists in the GovernanceService.

Impact: False governance provenance. Certificates appear governed but were never actually subject to governance authorization.

Recommendation:

  • The SSH Credential Composer (at issuance time) must set governance-intent from the actual CreateIntentResponse.intent_id, not from external input.
  • Verifiers should call GovernanceService.ListIntents with the intent ID to confirm existence.
  • Document that governance-intent is server-set and MUST NOT be accepted from external sources.

S-14: Sample OIDC Token Contains tenant_id Custom Claim

Severity: MEDIUM Attacker Profile: A1 (Malicious Operator), A3 (Compromised Workload) Scope Area: Test Fixtures Location: test/fixtures/sample-oidc-token.json:21

Description:

{
  "tenant_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

The OIDC token fixture includes tenant_id as a top-level custom claim. The deploy config references Keycloak (https://keycloak.guildhouse.example.org/realms/platform), which is self-managed. In Keycloak, custom claims are configurable via protocol mappers and, depending on configuration, may be user-editable (e.g., through a user attribute mapper).

If the OIDC attestor uses tenant_id from the token to determine tenant context (rather than deriving it server-side from the workload's namespace or registration entry), a user who can control their Keycloak attributes can claim any tenant.

Attack Scenario:

  1. A3 is a workload in tenant-alpha that has access to modify its own Keycloak user profile.
  2. A3 changes its tenant_id user attribute to tenant-beta's UUID.
  3. Keycloak issues a token with "tenant_id": "<tenant-beta-uuid>".
  4. The OIDC attestor extracts tenant_id from the token and provides selectors indicating tenant-beta.
  5. A3 obtains an SVID with tenant-beta's context.

Impact: Tenant boundary bypass if tenant_id is sourced from the OIDC token.

Recommendation:

  • Do NOT use tenant_id from OIDC tokens for authorization decisions. Derive tenant context from the workload's SPIFFE ID path or Kubernetes namespace.
  • If tenant_id must come from OIDC, use a Keycloak hardcoded claim mapper (not user attribute mapper) and restrict mapper editing to admin-only.
  • Document this trust assumption explicitly.

LOW

S-15: DESIGN-NOTE — Governance-Optional Model (Fail-Closed Auth, Fail-Open Audit)

Severity: DESIGN-NOTE Attacker Profile: A2 (Network Attacker) Scope Area: Specification Security Review Location: specs/credential-governance.md:642-671

Description:

The architecture uses two different failure modes:

  • GovernanceService unreachable (Section 10.1, line 644): Fail-CLOSED. Credential operations MUST fail. Authorization is mandatory.
  • NotaryService unreachable (Section 10.3, line 661): Fail-OPEN. Credential operations MUST proceed. Audit is best-effort.

This is a deliberate, documented design tradeoff. The rationale (line 671) is that a temporary audit gap is preferable to failing a legitimately authorized credential operation.

Security Implication: An attacker who can selectively disrupt NotaryService connectivity creates a window where credential operations proceed without audit trail entries. This is acceptable in the threat model if:

  1. The local durable queue (spec line 666) is implemented correctly.
  2. The queue survives pod restarts.
  3. Monitoring alerts on governance_anchoring_pending_total (spec line 669).

Recommendation: No design change needed, but operators must be aware that NotaryService availability directly affects audit completeness. Document this prominently in the deployment guide.


S-16: Emergency Break-Glass Post-Hoc Approval Window Undefined

Severity: LOW Attacker Profile: A1 (Malicious Operator) Scope Area: Specification Security Review Location: specs/credential-governance.md:535-543

Description:

The Accord policy emergency section specifies:

emergency:
  classification: EmergencyBreakGlass
  post_hoc_approval_window_hours: 24

The spec defines that emergency break-glass ceremonies allow credential operations to proceed before approval, with post-hoc approval required within 24 hours. However, the spec does not define:

  • What happens if post-hoc approval is not obtained within the window.
  • Whether the credential is automatically revoked.
  • Whether subsequent uses of the credential are blocked.
  • Whether an alert is generated.

Impact: Emergency credentials could persist indefinitely without retrospective approval. A malicious operator could use emergency break-glass repeatedly as a governance bypass mechanism.

Recommendation:

  • Specify in specs/credential-governance.md that credentials issued via EmergencyBreakGlass MUST be automatically revoked if post-hoc approval is not obtained within the window.
  • Add a required escalation_action field (e.g., revoke, alert, suspend) to the emergency policy schema.

S-17: Merkle Chain Fork Detection Not Specified

Severity: LOW Attacker Profile: A5 (Insider with Merkle Access) Scope Area: Specification Security Review Location: specs/credential-governance.md:624-630

Description:

The audit chain uses previous_root linkage:

"Each anchor's previous_root field MUST reference the merkle_root of the immediately preceding anchor." (line 625)

This detects history rewriting (changing an existing anchor) but does not prevent forks: two anchors could both claim the same previous_root, creating a split-brain audit trail. The spec does not define:

  • A canonical sequence number for anchors.
  • A consensus mechanism for anchor creation.
  • Fork detection or resolution procedures.

The CreateAnchorRequest proto (proto/quartermaster/v1/notary.proto:18-22) includes etcd_revision which could serve as a sequence number, but its use is not specified (0 means not set).

Impact: If the NotaryService has multiple writers (e.g., multiple SPIRE servers in an HA deployment), concurrent anchor creation could fork the chain. Each fork is internally consistent but they diverge, making cross-fork audit queries unreliable.

Recommendation:

  • Specify that etcd_revision (or an equivalent monotonic counter) MUST be set and MUST be unique per anchor.
  • The NotaryService MUST reject CreateAnchor requests where previous_root does not match the current latest anchor's merkle_root (optimistic concurrency control).
  • Document fork detection: auditors should verify that no two anchors share the same previous_root.

INFORMATIONAL

S-18: CreateIntentRequest identity_claim Oneof Allows Unverifiable External Events

Severity: INFORMATIONAL Attacker Profile: A1 (Malicious Operator) Scope Area: Governance Integration Location: proto/quartermaster/v1/governance.proto:35-38

Description:

oneof identity_claim {
    string oidc_token = 5;
    ExternalEventClaim external_event = 6;
}

The ExternalEventClaim message (lines 45-50) includes a verification string field with no defined verification protocol:

message ExternalEventClaim {
    string source = 1;
    string event_id = 2;
    string event_type = 3;
    string verification = 4;
}

The GovernanceService must decide whether to trust an external event claim, but there is no standard for what verification should contain (HMAC, signature, URL to verify, etc.).

Impact: If the GovernanceService accepts external event claims without strong verification, a malicious operator can forge identity claims to create intents on behalf of other actors.

Recommendation:

  • Define a verification protocol for external event claims (e.g., HMAC with a shared secret, or a URL that returns verification status).
  • Add a verification_type field to ExternalEventClaim to disambiguate verification methods.
  • Consider requiring that external event claims always trigger at least SingleApproval ceremony classification.

DESIGN-NOTE

S-19: No mTLS Requirement in gRPC Client Configuration

Severity: DESIGN-NOTE Attacker Profile: A2 (Network Attacker) Scope Area: Key Management Location: pkg/governance/governance.go:44-49

Description:

func NewClient(cfg Config) (*Client, error) {
    if cfg.GovernanceAddr == "" {
        return nil, fmt.Errorf("governance: governance address is required")
    }
    // TODO: implement — establish gRPC connections with mTLS
    return &Client{config: cfg}, nil
}

The Config struct (lines 10-20) has GovernanceAddr, CeremonyAddr, and NotaryAddr but no TLS configuration fields (certificate paths, CA bundle, TLS required flag). The TODO comment mentions mTLS, but when this is implemented, if mTLS is made optional rather than required, a network attacker can intercept governance traffic.

Impact: If mTLS is not enforced, all trust boundaries TB-3, TB-4, and TB-5 in the trust diagram are vulnerable to man-in-the-middle attacks.

Recommendation:

  • Add TLSCertPath, TLSKeyPath, TLSCAPath to the governance Config struct.
  • Make TLS mandatory — NewClient should fail if TLS configuration is missing.
  • Use SPIFFE-aware TLS (SVID-based mTLS) for consistency with the SPIRE ecosystem.

S-20: Proto Files Lack Field Validation Annotations

Severity: DESIGN-NOTE Attacker Profile: N/A Scope Area: Governance Integration Location: All proto files under proto/

Description:

None of the 4 proto files use field validation annotations (e.g., buf/validate or protoc-gen-validate). Fields with implicit constraints have no documentation of valid ranges:

Proto File Field Expected Constraint
governance.proto:40 ttl_seconds > 0, <= max_intent_ttl
governance.proto:41 max_redemptions > 0, reasonable upper bound
ceremony.proto:43 required_approvals > 0, <= pool_size
ceremony.proto:44 ttl_hours > 0, reasonable upper bound
notary.proto:21 etcd_revision >= 0

Impact: Generated client/server code has no built-in validation. All validation must happen in application code, increasing the risk of missing checks.

Recommendation:

  • Add buf/validate annotations to proto fields with constraints.
  • This provides both documentation and generated validation code.
  • At minimum, add proto-level comments documenting valid ranges for each constrained field.

Attack Trees

Attack Tree 1: Forge SSH Certificate with Arbitrary Privileges

GOAL: Obtain SSH certificate with unauthorized privileges
├── 1. Exploit CertRequest TTL/Principal control [S-01] [CRITICAL]
│   ├── 1a. Compromised workload (A3) calls credential composer
│   │   ├── Set ValidSeconds = 31536000 (1 year)
│   │   ├── Set Principals = ["root", "deploy", "admin"]
│   │   └── RESULT: Long-lived cert with privileged principals
│   └── 1b. Malicious operator (A1) configures registration entry
│       ├── Set overly broad principal mapping
│       └── RESULT: Legitimate-looking but over-privileged certs
│
├── 2. Bypass OIDC attestation [S-02, S-08]
│   ├── 2a. Token confusion (A3) [S-02]
│   │   ├── Present token with wrong audience
│   │   ├── Verifier lacks audience check
│   │   └── RESULT: SVID for unintended workload identity
│   └── 2b. Malicious issuer (A1) [S-08]
│       ├── Configure issuer = "http://attacker.example.com"
│       ├── Serve attacker-controlled JWKS keys
│       └── RESULT: Arbitrary identity attestation
│
└── 3. Fabricate governance provenance [S-03, S-13]
    ├── 3a. Replay merkle proof (A5) [S-03]
    │   ├── Extract proof from legitimate cert C1
    │   ├── Embed in forged cert C2
    │   └── RESULT: C2 appears governed
    └── 3b. Fabricate intent ID (A1) [S-13]
        ├── Set governance-intent = random valid UUID
        ├── Passes format validation
        └── RESULT: Cert appears to have governance intent

Attack Tree 2: Bypass Governance for Credential Issuance

GOAL: Issue credentials without proper governance authorization
├── 1. Exploit TOCTOU window [S-06] [HIGH]
│   ├── 1a. Create intent under permissive policy (A1)
│   │   ├── Intent classified as Autonomous
│   │   ├── Policy updated to require QuorumApproval
│   │   ├── Redeem intent (still uses old classification)
│   │   └── RESULT: Credential issued without required quorum
│   └── 1b. Race condition during policy deployment
│       ├── Multiple intents created during transition
│       └── RESULT: Batch of under-governed credentials
│
├── 2. Suppress audit trail [S-04, S-15]
│   ├── 2a. DoS NotaryService (A2) [S-04]
│   │   ├── NotaryService unreachable
│   │   ├── Credentials proceed (fail-open audit)
│   │   ├── Local queue lost on pod restart
│   │   └── RESULT: Permanently unaudited credential operations
│   └── 2b. Exploit anchoring delay
│       ├── Issue credentials faster than epoch boundary
│       ├── Exceed 256 leaves per epoch [spec depth limit]
│       └── RESULT: Leaves dropped or epoch overflow
│
└── 3. Exploit emergency break-glass [S-16]
    ├── Trigger emergency condition (A1)
    ├── Credential issued without pre-approval
    ├── Post-hoc approval window expires
    ├── No automatic revocation specified
    └── RESULT: Ungoverned credential persists indefinitely

Attack Tree 3: Corrupt Audit Trail

GOAL: Compromise integrity of the governance audit trail
├── 1. Proof replay attack [S-03] [CRITICAL]
│   ├── 1a. Cross-credential replay (A5)
│   │   ├── Extract (merkle-root, merkle-proof) from cert C1
│   │   ├── Embed in crafted cert C2
│   │   ├── VerifyInclusion succeeds (proves C1's leaf, not C2's)
│   │   └── RESULT: C2 has false governance provenance
│   └── 1b. Cross-epoch replay
│       ├── Use proof from epoch N in cert issued at epoch N+1
│       ├── Governance-epoch mismatch may not be checked
│       └── RESULT: Stale governance state appears current
│
├── 2. Fork the merkle chain [S-17] [LOW]
│   ├── 2a. Concurrent anchor creation (A5)
│   │   ├── Two writers create anchors with same previous_root
│   │   ├── Chain forks into two valid branches
│   │   └── RESULT: Audit queries return inconsistent results
│   └── 2b. Reorder anchors
│       ├── etcd_revision = 0 (not set) per proto comment
│       ├── No sequence number to enforce ordering
│       └── RESULT: Anchors may be out of causal order
│
├── 3. DoS-induced audit gap [S-04, S-05]
│   ├── 3a. Overwhelm extension decoder (A3) [S-05]
│   │   ├── Craft cert with 100MB merkle-proof
│   │   ├── Audit verifier calls Decode()
│   │   ├── OOM kills verifier process
│   │   └── RESULT: Audit verification unavailable
│   └── 3b. NotaryService DoS (A2) [S-04]
│       ├── Sustained DoS during credential operations
│       └── RESULT: Audit gap for all operations during outage
│
└── 4. Fabricate governance metadata [S-13, S-18]
    ├── 4a. Fake intent ID [S-13]
    │   ├── governance-intent = random UUID
    │   └── RESULT: Cert appears governed when it wasn't
    └── 4b. Forge external event claim [S-18]
        ├── CreateIntent with fabricated ExternalEventClaim
        ├── GovernanceService may accept without strong verification
        └── RESULT: Intent created under false identity

Scope Area Deep Dives

1. OIDC Token Handling

Findings: S-02 (CRITICAL), S-08 (HIGH), S-14 (MEDIUM)

Files reviewed: pkg/oidc/oidc.go, cmd/oidc-attestor/plugin.go, test/fixtures/sample-oidc-token.json

The OIDC subsystem has three compounding issues:

  1. The Verifier interface (S-02) does not contractually require audience validation.
  2. The issuer URL (S-08) is not validated for HTTPS.
  3. The tenant_id claim (S-14) may be user-controllable in Keycloak.

Together, these create a layered authentication failure: even if one control is implemented correctly, the others provide bypass routes. The recommended fix order is: S-02 first (interface design, hardest to change later), then S-08 (configuration validation), then S-14 (trust model documentation).

2. SSH Certificate Generation

Findings: S-01 (CRITICAL), S-11 (MEDIUM)

Files reviewed: pkg/sshcert/sshcert.go, cmd/ssh-credential-composer/plugin.go, specs/spiffe-ssh-svid.md

The SSH certificate generation path has a fundamental interface design flaw (S-01) where the CertRequest struct allows callers to control security-critical fields. The SSH-SVID spec clearly defines TTL bounds (30s1h) and principal derivation (from SPIFFE ID), but the code enforces neither. When this stub is implemented, the Build() method MUST enforce these constraints server-side.

3. Shellstream Extensions

Findings: S-05 (HIGH), S-13 (MEDIUM)

Files reviewed: pkg/shellstream/shellstream.go, pkg/shellstream/shellstream_test.go, specs/shellstream-extensions.md

The Shellstream implementation is the most mature code in the repository (360 lines, extensive tests). The primary security concern is the lack of size limits on the decode path (S-05), which is a straightforward DoS vector. The spec recommends a 4 KB total extension limit (Section 9.4, line 441-445), and the merkle proof has a known maximum size of ~256 bytes — both provide clear bounds for validation.

The governance-intent format-only validation (S-13) is a design issue: the extension value looks correct syntactically but could be fabricated. This is mitigated if the credential composer sets it server-side and verifiers cross-check with the GovernanceService.

4. Governance Integration

Findings: S-06 (HIGH), S-07 (HIGH), S-10 (MEDIUM), S-18 (INFORMATIONAL), S-20 (DESIGN-NOTE)

Files reviewed: pkg/governance/governance.go, pkg/config/config.go, proto/quartermaster/v1/governance.proto, specs/credential-governance.md

The governance integration has the widest attack surface. The TOCTOU window (S-06) is an inherent challenge in any async authorization system with separate create/redeem steps. The spec should either re-evaluate policy at redemption time or bound the intent TTL to minimize the window. The GovernanceEpochSeconds default-to-zero (S-07) is a dangerous misconfiguration vector. The RedeemResult missing ExpiresAt (S-10) prevents SAT TTL enforcement in consumer code.

5. Key Management

Findings: S-09 (HIGH), S-19 (DESIGN-NOTE)

Files reviewed: cmd/substrate-keymanager/main.go, cmd/substrate-keymanager/plugin.go, pkg/governance/governance.go

Key management is entirely scaffolded. The most significant risk is the absence of the go-plugin serving pattern (S-09), which means there is no binary trust verification between SPIRE Server and the plugin. The mTLS requirement (S-19) for gRPC connections to Quartermaster services is noted in TODOs but not yet structurally enforced.

6. Configuration & Deployment

Findings: S-07 (HIGH), S-08 (HIGH), S-12 (MEDIUM)

Files reviewed: pkg/config/config.go, pkg/oidc/oidc.go, deploy/spire-server-config.yaml, deploy/spire-agent-config.yaml

Configuration validation is the weakest link in the current codebase. The PluginConfig.Validate() method checks only one of six fields (S-12). The oidc.NewVerifier checks only issuer non-emptiness (S-08). The GovernanceEpochSeconds field defaults to zero with no validation (S-07). These gaps mean that misconfigured deployments will start successfully but fail in unpredictable ways at runtime.

7. Test Fixtures

Findings: S-14 (MEDIUM)

Files reviewed: test/fixtures/sample-oidc-token.json, test/fixtures/sample-ssh-cert-extensions.json

The OIDC token fixture includes a tenant_id custom claim that, in a self-managed Keycloak deployment, could be user-controllable. This is primarily a trust model documentation issue — the fixture itself is correct for testing, but the trust assumption about where tenant_id comes from must be made explicit in the implementation.

8. Specification Security Review

Findings: S-03 (CRITICAL), S-04 (HIGH), S-15 (DESIGN-NOTE), S-16 (LOW), S-17 (LOW)

Files reviewed: All 3 specs — specs/spiffe-ssh-svid.md (Section 8), specs/shellstream-extensions.md (Section 9), specs/credential-governance.md (Sections 9-10)

The specifications are well-written with dedicated security sections. The SSH-SVID spec (Section 8, 7 subsections) covers attestation trust, SPIFFE ID visibility, mTLS, socket protection, rate limiting, CA key compromise, and clock skew. The Shellstream spec (Section 9, 5 subsections) covers extension visibility, authorization boundary, parsing safety, size constraints, and replay/freshness. The Credential Governance spec (Section 10, 5 subsections) covers fail-closed/fail-open modes, ceremony timeout, and idempotency.

The critical gap is the merkle proof binding (S-03): the proof is not tied to the specific certificate. The other spec-level findings (S-16, S-17) are edge cases in the emergency and chain fork scenarios that should be addressed but are not urgent.


Appendix A: Attacker Profile × Finding Matrix

Cells marked with severity initial: C=CRITICAL, H=HIGH, M=MEDIUM, L=LOW, I=INFO, D=DESIGN-NOTE. Empty = not exploitable by this profile.

Finding A1 Malicious Operator A2 Network Attacker A3 Compromised Workload A4 Rogue Plugin A5 Merkle Insider
S-01 C C
S-02 C
S-03 C
S-04 H
S-05 H H
S-06 H
S-07 H
S-08 H
S-09 H
S-10 M M
S-11 M
S-12 M
S-13 M M
S-14 M M
S-15 D
S-16 L
S-17 L
S-18 I
S-19 D
S-20

Summary by attacker profile:

  • A1 (Malicious Operator): 9 findings exploitable. Highest concentration of risk.
  • A2 (Network Attacker): 3 findings exploitable. Primary vector is service DoS.
  • A3 (Compromised Workload): 5 findings exploitable. Primary vector is cert/token manipulation.
  • A4 (Rogue Plugin): 1 finding exploitable. High impact but narrow vector.
  • A5 (Merkle Insider): 3 findings exploitable. Primary vector is audit trail manipulation.

Appendix B: Trust Boundary Inventory

ID Boundary From To Protocol Authentication Encryption Status
TB-1 Workload API Workload SPIRE Agent Unix Socket Kernel PID attestation N/A (local) Specified in SSH-SVID spec Section 8.4
TB-2 SPIRE Node API SPIRE Agent SPIRE Server gRPC mTLS with X.509-SVIDs TLS 1.2+ Specified in SSH-SVID spec Section 8.3
TB-3 Governance API SPIRE Server GovernanceService gRPC mTLS (TODO) TLS (TODO) S-19: No TLS config in code
TB-4 Ceremony API SPIRE Server CeremonyService gRPC mTLS (TODO) TLS (TODO) S-19: No TLS config in code
TB-5 Notary API SPIRE Server NotaryService gRPC mTLS (TODO) TLS (TODO) S-19: No TLS config in code
TB-6 SSH Auth Workload SSH Server SSH Certificate-based SSH encryption Spec-compliant (standard OpenSSH)
TB-7 Token File Pod filesystem Workload File I/O Filesystem permissions N/A Specified in agent config: /var/run/secrets/oidc/token
TB-8 Plugin IPC SPIRE Server Plugin Binary go-plugin/gRPC Magic cookie handshake (TODO) Local S-09: No handshake implemented

Appendix C: Specification Security Coverage

SSH-SVID Specification (specs/spiffe-ssh-svid.md, Section 8)

Topic Covered CWE Reference Assessment
Attestation trust Section 8.1 CWE-287 (Improper Authentication) Thorough. Recommends k8s attestation, warns about PID-based.
Identity visibility Section 8.2 CWE-200 (Exposure of Sensitive Information) Good. Notes SPIFFE ID in cleartext during SSH handshake.
Transport security Section 8.3 CWE-319 (Cleartext Transmission) Good. Requires mTLS for Agent-Server.
Socket protection Section 8.4 CWE-732 (Incorrect Permission Assignment) Good. Requires filesystem permissions + kernel attestation.
Rate limiting Section 8.5 CWE-770 (Allocation Without Limits) Good. Recommends 60 issuances/min/SPIFFE-ID.
CA compromise Section 8.6 CWE-321 (Use of Hard-coded Cryptographic Key) Good. Recommends HSM, monitoring, rotation.
Clock skew Section 8.7 CWE-613 (Insufficient Session Expiration) Good. 60-second tolerance.
Certificate TTL enforcement NOT COVERED CWE-613 GAP: Spec defines bounds (S-01) but no server-side enforcement guidance.
Principal restriction NOT COVERED CWE-269 (Improper Privilege Management) GAP: Spec derives principal from SPIFFE ID but doesn't restrict additional principals.

Shellstream Extensions Specification (specs/shellstream-extensions.md, Section 9)

Topic Covered CWE Reference Assessment
Extension visibility Section 9.1 CWE-200 Good. "MUST NOT embed secrets."
Authorization boundary Section 9.2 CWE-863 (Incorrect Authorization) Excellent. "Merkle proofs provide auditability, not authorization."
Parsing safety Section 9.3 CWE-94 (Improper Control of Code Generation) Good. "MUST NOT evaluate as code."
Size constraints Section 9.4 CWE-770 Partial. Recommends 4 KB limit but uses SHOULD, not MUST.
Replay/freshness Section 9.5 CWE-294 (Authentication Bypass by Capture-replay) Partial. Relies on cert validity period. No proof-to-cert binding.
Proof binding NOT COVERED CWE-345 (Insufficient Verification of Data Authenticity) GAP: S-03 — proof is not bound to specific certificate content.

Credential Governance Specification (specs/credential-governance.md, Sections 9-10)

Topic Covered CWE Reference Assessment
Merkle anchoring Section 9.1 CWE-354 (Improper Validation of Integrity Check) Good. Detailed merkle construction with domain separation.
Certificate-embedded audit Section 9.2 CWE-778 (Insufficient Logging) Good. Defines required extensions.
Audit queries Section 9.3 CWE-778 Good. Three verification patterns defined.
Chain continuity Section 9.4 CWE-354 Good. Append-only with previous_root linkage.
Fail-closed auth Section 10.1 CWE-636 (Not Failing Securely) Excellent. GovernanceService unreachable = operation fails.
Ceremony timeout Section 10.2 CWE-613 Good. Timeout = denial.
Fail-open audit Section 10.3 CWE-778 Acceptable tradeoff, well-documented rationale.
Policy missing Section 10.4 CWE-636 Good. Default to SingleApproval.
TOCTOU NOT COVERED CWE-367 (TOCTOU Race Condition) GAP: S-06 — policy re-evaluation at redemption not specified.
Chain fork NOT COVERED CWE-362 (Concurrent Execution Using Shared Resource) GAP: S-17 — no fork prevention mechanism.
Break-glass expiry PARTIAL CWE-613 GAP: S-16 — post-hoc approval expiry action undefined.

End of security audit report.