Critical fixes:
- F-01: SatScope array form support (single pointer → slice with polymorphic JSON)
- F-02: Add governance-intent@guildhouse.dev as 10th Shellstream extension
- F-06: Replace os.Exit(1) stubs with go-plugin Serve() boilerplate in all cmd/
- F-13: Validate SatScope.ResourcePattern is non-empty

High priority:
- F-03: Add normative Accord policy syntax note to credential-governance.md §8.2
- F-04: Replace OID XXXXX placeholder with explicit PEN reference and IANA TODO
- F-05: Document CredentialComposer hook mapping in spec and plugin-types.md
- F-07/F-08: Commit CI pipeline (.github/workflows/ci.yaml)
- F-09: Add hashicorp/go-plugin v1.6.3 to go.mod

Medium priority:
- F-10: Wire sample-ssh-cert-extensions.json fixture into shellstream tests
- F-11: Cross-reference merkle proof depth limit (256 leaves) in governance spec
- F-12: Add YAML format clarification headers to deploy configs
- F-14: Expand README with project status, docs links, and quick-start

Low priority:
- F-15: Standardize "SSH SVID" → "SSH-SVID" terminology across docs
- F-16: Add GovernanceEpochSeconds to PluginConfig and deploy configs
- F-17: Add troubleshooting section to deployment.md, error handling to OIDC docs

Global: Rename all extension keys from @guildhouse.io to @guildhouse.dev

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Credential Governance Specification

**Version:** 0.1.0-draft
**Date:** 2026-02-18
**Authors:** Guildhouse Cooperative

---

## 1. Abstract

This specification defines the integration between SPIRE credential lifecycle events and the Guildhouse governance framework. Credential operations (issuance, rotation, revocation) are modeled as governed mutations, subject to Accord policy classification, optional ceremony approval, and merkle-anchored audit recording. The goal is to bring every credential operation under a unified governance model that provides authorization control, multi-stakeholder approval where required, and an immutable, verifiable audit trail.

## 2. Status

**Draft specification.** This document is a working draft and is subject to change. It has not yet been ratified by the Guildhouse governance body. Normative requirements use the key words defined in RFC 2119.

### 2.1 Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

## 3. Terminology

**Credential Event**
A discrete lifecycle operation on a credential resource: issuance, rotation, or revocation. Each credential event is the atomic unit of governance for this specification.

**Governed Mutation**
Any state change within the Guildhouse platform that is subject to intent authorization, policy evaluation, and audit recording. A credential event becomes a governed mutation when it enters the governance flow.

**MutationIntent**
An authorization request registered with the GovernanceService (`quartermaster.v1.GovernanceService.CreateIntent`). Represents a declared intention to perform a mutation. An intent transitions through states: `pending` -> `ceremony_pending` | `authorized` -> `redeemed` | `expired` | `denied`.

**MutationEnvelope**
The universal wrapper for governed mutations. Contains the domain-separated hash of the canonicalized event payload, actor identity, timestamp, and tenant context. Serves as the merkle leaf input for audit anchoring.

**SAT (Substrate Attestation Token)**
The authorization token issued upon successful intent redemption (`quartermaster.v1.SatToken`). Carries `SatScopeMsg` entries specifying `registry_type`, `verbs`, and `resource_pattern`. The SAT is the bearer credential that authorizes the actual credential operation to proceed.

**Accord Policy**
A declarative policy evaluated by the Accord policy engine that classifies mutations and determines their ceremony requirements. Accord policies map `(registry_type, verb, artifact_scope)` tuples to ceremony classification levels.

**Ceremony**
A multi-stakeholder approval flow managed by the CeremonyService (`bascule.v1.CeremonyService`). Ceremonies require one or more identified approvers to authorize a mutation before the corresponding intent becomes redeemable.

**Merkle Anchor**
An immutable record created by the NotaryService (`quartermaster.v1.NotaryService.CreateAnchor`) that commits a batch of mutation envelope hashes into a merkle tree. Each anchor contains a `merkle_root`, `previous_root`, and the tree's leaf data.

**Governance Epoch**
A time-bounded interval during which merkle leaves accumulate before being committed in an anchor. The epoch boundary triggers anchor creation. Epoch duration is deployment-configurable.

**Trust Domain**
A SPIFFE trust domain (e.g., `spiffe://guildhouse.io`) that defines the boundary of identity authority. Cross-trust-domain credential operations carry elevated governance requirements.

**Credential Lifecycle**
The complete sequence of states a credential passes through: creation (issuance), active use, rotation (replacement), and termination (revocation or expiry). Each state transition that involves an active operation maps to a credential event.

## 4. Introduction

Traditional credential management -- certificates, SSH keys, database passwords, API tokens -- treats lifecycle operations as administrative actions executed by privileged operators with no formal governance beyond access control. In a multi-tenant, multi-stakeholder environment, credential operations carry security implications that extend beyond the immediate operator:

- An SSH certificate minted for a workload in Tenant A may grant access to shared infrastructure.
- A database credential rotation affects downstream services that depend on the credential.
- An emergency revocation of a CA signing key impacts every workload in the trust domain.

These operations require varying levels of scrutiny. Routine certificate issuance for short-lived workload identities needs no human approval, while CA key rotation demands multi-stakeholder consensus.

This specification bridges SPIRE's credential lifecycle operations with the Guildhouse governance model by:

1. Defining a canonical schema for each credential event type.
2. Mapping credential events to governed mutations via the GovernanceService intent lifecycle.
3. Leveraging Accord policy to classify events into ceremony tiers.
4. Anchoring every credential event in the NotaryService merkle tree for immutable audit.

The result is that every credential operation -- from the most routine certificate mint to an emergency CA revocation -- passes through a uniform governance pipeline with policy-appropriate controls and a verifiable audit trail.

## 5. Credential Events

Each credential lifecycle operation maps to a typed credential event. Events are the input to the governance pipeline and the payload of the resulting MutationEnvelope.

### 5.1 Issue

An issue event occurs when a new credential is created. Examples include: an SSH certificate minted by the ssh-credential-composer, a database credential provisioned by the substrate-keymanager, or an OIDC token issued by the oidc-attestor.

**Canonical Fields:**

| Field                | Type     | Required | Description                                                   |
|----------------------|----------|----------|---------------------------------------------------------------|
| `event_type`         | string   | REQUIRED | Fixed value: `"issue"`                                        |
| `credential_type`    | string   | REQUIRED | Type identifier (e.g., `"ssh_user_cert"`, `"db_password"`, `"x509_svid"`) |
| `subject_spiffe_id`  | string   | REQUIRED | SPIFFE ID of the workload receiving the credential            |
| `tenant_id`          | string   | REQUIRED | UUID of the owning tenant                                     |
| `scope`              | string   | REQUIRED | Scope descriptor (e.g., host pattern, database name)          |
| `requestor_identity` | string   | REQUIRED | SPIFFE ID or OIDC subject of the entity requesting issuance   |
| `credential_id`      | string   | REQUIRED | Unique identifier assigned to the new credential              |
| `ttl_seconds`        | uint32   | REQUIRED | Requested time-to-live for the credential                     |
| `metadata`           | object   | OPTIONAL | Additional type-specific metadata (e.g., key algorithm, extensions) |

**Example (JCS-canonicalized):**

```json
{"credential_id":"cred-a1b2c3","credential_type":"ssh_user_cert","event_type":"issue","metadata":{"extensions":["permit-pty"],"key_algorithm":"ed25519"},"requestor_identity":"spiffe://guildhouse.io/ns/platform/sa/operator","scope":"*.staging.internal","subject_spiffe_id":"spiffe://guildhouse.io/ns/tenant-acme/sa/web-server","tenant_id":"f47ac10b-58cc-4372-a567-0e02b2c3d479","ttl_seconds":3600}
```

### 5.2 Rotate

A rotate event occurs when an existing credential is replaced with a new one. Rotation may be scheduled (automated lifecycle), manual (operator-initiated), or emergency (compromise response).

**Canonical Fields:**

| Field                  | Type     | Required | Description                                                     |
|------------------------|----------|----------|-----------------------------------------------------------------|
| `event_type`           | string   | REQUIRED | Fixed value: `"rotate"`                                         |
| `old_credential_id`    | string   | REQUIRED | Identifier of the credential being replaced                     |
| `new_credential_type`  | string   | REQUIRED | Type identifier for the replacement credential                  |
| `subject_spiffe_id`    | string   | REQUIRED | SPIFFE ID of the workload whose credential is rotating          |
| `tenant_id`            | string   | REQUIRED | UUID of the owning tenant                                       |
| `rotation_reason`      | string   | REQUIRED | One of: `"scheduled"`, `"manual"`, `"compromised"`              |
| `requestor_identity`   | string   | REQUIRED | SPIFFE ID or OIDC subject of the entity requesting rotation     |
| `new_credential_id`    | string   | REQUIRED | Unique identifier assigned to the replacement credential        |
| `metadata`             | object   | OPTIONAL | Additional type-specific metadata                               |

**Example (JCS-canonicalized):**

```json
{"event_type":"rotate","metadata":{"key_algorithm":"ed25519"},"new_credential_id":"cred-d4e5f6","new_credential_type":"ssh_user_cert","old_credential_id":"cred-a1b2c3","requestor_identity":"spiffe://guildhouse.io/ns/platform/sa/rotation-controller","rotation_reason":"scheduled","subject_spiffe_id":"spiffe://guildhouse.io/ns/tenant-acme/sa/web-server","tenant_id":"f47ac10b-58cc-4372-a567-0e02b2c3d479"}
```

### 5.3 Revoke

A revoke event occurs when a credential is invalidated before its natural expiry. Revocation is an irreversible operation within a governance epoch.

**Canonical Fields:**

| Field                  | Type     | Required | Description                                                     |
|------------------------|----------|----------|-----------------------------------------------------------------|
| `event_type`           | string   | REQUIRED | Fixed value: `"revoke"`                                         |
| `credential_id`        | string   | REQUIRED | Identifier of the credential being revoked                      |
| `credential_type`      | string   | REQUIRED | Type identifier of the credential being revoked                 |
| `subject_spiffe_id`    | string   | REQUIRED | SPIFFE ID of the workload whose credential is being revoked     |
| `tenant_id`            | string   | REQUIRED | UUID of the owning tenant                                       |
| `revocation_reason`    | string   | REQUIRED | Human-readable reason for revocation                            |
| `requestor_identity`   | string   | REQUIRED | SPIFFE ID or OIDC subject of the entity requesting revocation   |
| `metadata`             | object   | OPTIONAL | Additional context (e.g., incident ID, CVE reference)           |

**Example (JCS-canonicalized):**

```json
{"credential_id":"cred-a1b2c3","credential_type":"ssh_user_cert","event_type":"revoke","metadata":{"incident_id":"INC-2026-0042"},"requestor_identity":"spiffe://guildhouse.io/ns/platform/sa/security-responder","revocation_reason":"Private key compromised per INC-2026-0042","subject_spiffe_id":"spiffe://guildhouse.io/ns/tenant-acme/sa/web-server","tenant_id":"f47ac10b-58cc-4372-a567-0e02b2c3d479"}
```

### 5.4 Schema Validation

All credential event payloads MUST conform to the canonical field definitions above. Implementations MUST reject events with missing REQUIRED fields. Implementations MUST ignore unknown fields during canonicalization but SHOULD preserve them in non-canonical storage for forward compatibility.
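
The validation rules above can be sketched in Python as follows. This is a minimal illustration, assuming the event arrives as an already-decoded JSON object (`dict`). The field sets are transcribed from Sections 5.1 to 5.3; the names `REQUIRED_FIELDS`, `validate_event`, and `canonical_view` are hypothetical and not part of any Guildhouse API.

```python
# Required fields per event type, transcribed from Sections 5.1-5.3.
REQUIRED_FIELDS = {
    "issue": {"event_type", "credential_type", "subject_spiffe_id", "tenant_id",
              "scope", "requestor_identity", "credential_id", "ttl_seconds"},
    "rotate": {"event_type", "old_credential_id", "new_credential_type",
               "subject_spiffe_id", "tenant_id", "rotation_reason",
               "requestor_identity", "new_credential_id"},
    "revoke": {"event_type", "credential_id", "credential_type",
               "subject_spiffe_id", "tenant_id", "revocation_reason",
               "requestor_identity"},
}
OPTIONAL_FIELDS = {"metadata"}


def validate_event(event: dict) -> None:
    """Raise ValueError for an unknown event type or a missing REQUIRED field."""
    required = REQUIRED_FIELDS.get(event.get("event_type"))
    if required is None:
        raise ValueError(f"unknown event_type: {event.get('event_type')!r}")
    missing = sorted(required - set(event))
    if missing:
        raise ValueError(f"missing REQUIRED fields: {missing}")


def canonical_view(event: dict) -> dict:
    """Return only the canonical fields; unknown fields are ignored here
    (hypothetical helper: the caller keeps the original event for storage)."""
    validate_event(event)
    allowed = REQUIRED_FIELDS[event["event_type"]] | OPTIONAL_FIELDS
    return {key: value for key, value in event.items() if key in allowed}
```

Keeping validation separate from the canonical view lets a caller store the full event, unknown fields included, while hashing only the canonical subset.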

## 6. Governance Integration Flow

This section defines the end-to-end flow by which a credential lifecycle event becomes a governed mutation.

### 6.1 Flow Overview

```
  Credential Lifecycle Event
              |
              v
+----------------------------+
| governance-notifier plugin |
|     (intercepts event)     |
+----------------------------+
              |
 Step 2: CreateIntentRequest
              |
              v
+----------------------------+
|     GovernanceService      |
|   (evaluates Accord policy)|
+----------------------------+
       /              \
  no ceremony    ceremony required
   required     (ceremony_id returned)
      |                  |
      |          Step 5: Ceremony
      |                  |
      |                  v
      |    +----------------------------+
      |    |      CeremonyService       |
      |    |      (approval flow)       |
      |    +----------------------------+
      |          /               \
      |      approved          denied
      |         |                 |
      |         |                 v
      |         |         Operation blocked.
      |         |         Intent denied.
      v         v
  Step 6: RedeemIntent
              |
              v
+----------------------------+
|         SAT issued         |
|   Credential op proceeds   |
+----------------------------+
              |
   Step 7: MutationEnvelope
              |
              v
+----------------------------+
|       NotaryService        |
|     (merkle anchoring)     |
+----------------------------+
              |
              v
    Audit record committed.
```

### 6.2 Step 1: Event Interception

The `governance-notifier` plugin (running as a SPIRE Server plugin or sidecar) intercepts credential lifecycle events at the point of origin. The plugin MUST intercept the event before the credential operation is executed. The event is the trigger for the governance flow; the credential operation is deferred until governance authorization completes.

For SSH certificate issuance, the interception point is the `ssh-credential-composer` plugin's `MintCredential` path. For credential rotation, it is the rotation controller's scheduling loop. For revocation, it is the revocation API endpoint.

### 6.3 Step 2: Intent Creation

The governance-notifier plugin constructs a `CreateIntentRequest` with the following field mappings:

| `CreateIntentRequest` field | Value                                                                 |
|-----------------------------|-----------------------------------------------------------------------|
| `registry_type`             | `"credential"`                                                        |
| `verb`                      | One of: `"issue"`, `"rotate"`, `"revoke"`                             |
| `artifact_scope`            | JCS-canonicalized JSON of the credential event payload (Section 5)    |
| `tenant_id`                 | The `tenant_id` from the credential event                             |
| `identity_claim`            | `oidc_token` from OIDC attestation, or `external_event` for automated operations |
| `ttl_seconds`               | Plugin-configurable; SHOULD default to 300 seconds                    |
| `max_redemptions`           | `1` (credential events are single-use)                                |
| `idempotency_key`           | `SHA-256(registry_type + ":" + verb + ":" + credential_id)`           |

The plugin MUST set `max_redemptions` to `1` for all credential events. Credential operations are not idempotent at the infrastructure level (a second SSH certificate mint produces a distinct certificate), and the intent MUST NOT be reusable.

The `idempotency_key` ensures that duplicate event deliveries (e.g., from retry logic) map to the same intent rather than creating parallel authorization flows.
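
The field mapping above can be sketched as a plain dictionary builder. This is an illustration only: the request is modeled as a `dict` rather than the generated `quartermaster.v1` message type, the hex encoding of the idempotency key is an assumption (the table specifies only the SHA-256 input), and `credential_id` is passed in explicitly because the specification does not state which identifier a rotate event contributes. The function names are hypothetical.

```python
import hashlib

VERBS = ("issue", "rotate", "revoke")


def idempotency_key(registry_type: str, verb: str, credential_id: str) -> str:
    """SHA-256 over "registry_type:verb:credential_id".
    Hex encoding is an assumption; the spec defines only the hash input."""
    material = f"{registry_type}:{verb}:{credential_id}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def build_create_intent_request(event: dict, artifact_scope: str,
                                credential_id: str, identity_claim: dict,
                                ttl_seconds: int = 300) -> dict:
    """Assemble the Step 2 request fields for one credential event."""
    verb = event["event_type"]
    if verb not in VERBS:
        raise ValueError(f"unsupported verb: {verb!r}")
    return {
        "registry_type": "credential",
        "verb": verb,
        "artifact_scope": artifact_scope,   # JCS bytes of the event, as text
        "tenant_id": event["tenant_id"],
        "identity_claim": identity_claim,
        "ttl_seconds": ttl_seconds,         # SHOULD default to 300
        "max_redemptions": 1,               # MUST be 1 for credential events
        "idempotency_key": idempotency_key("credential", verb, credential_id),
    }
```

Because the key depends only on the registry type, verb, and credential identifier, a retried delivery of the same event yields the same key and therefore resolves to the same intent.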

### 6.4 Step 3: Accord Policy Evaluation

The GovernanceService evaluates the intent against the Accord policy engine. The policy receives the `(registry_type, verb, artifact_scope)` tuple and returns a classification that determines the ceremony requirement. Policy evaluation is synchronous with the `CreateIntent` call.

If the Accord policy denies the operation outright (e.g., a forbidden credential type for the tenant), the `CreateIntentResponse` MUST have `denied = true` and `denial_reason` populated. The plugin MUST abort the credential operation.

### 6.5 Step 4a: No Ceremony Required (Autonomous / SelfGrant)

If the Accord policy classifies the event as `Autonomous` or `SelfGrant`, the `CreateIntentResponse` returns with `ceremony_id` empty and the intent in `authorized` status. The plugin proceeds directly to Step 6 (RedeemIntent).

For `SelfGrant` classification, the GovernanceService records the requestor's identity as the implicit approver. No external approval action is required, but the self-grant is recorded in the audit trail.

### 6.6 Step 4b: Ceremony Required

If the Accord policy classifies the event as `SingleApproval`, `QuorumApproval`, or `EmergencyBreakGlass`, the `CreateIntentResponse` returns with a non-empty `ceremony_id`. For `SingleApproval` and `QuorumApproval`, the intent status is `ceremony_pending` and the intent cannot be redeemed until the ceremony resolves. For `EmergencyBreakGlass`, the operation proceeds immediately and the ceremony is approved post hoc (Section 8.1.5).
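
Steps 3 through 4b reduce to a three-way branch on the `CreateIntentResponse`. The sketch below models the response as a `dict` carrying the `denied`, `denial_reason`, and `ceremony_id` fields named above; the `NextStep` enum and `next_step` function are hypothetical names. It does not model the post-hoc ceremony of `EmergencyBreakGlass` (Section 8.1.5).

```python
from enum import Enum


class NextStep(Enum):
    ABORT = "abort"                    # policy denied the operation outright
    AWAIT_CEREMONY = "await_ceremony"  # Step 5: wait for approvers
    REDEEM = "redeem"                  # Step 6: intent already authorized


def next_step(response: dict) -> NextStep:
    """Decide what the plugin does after CreateIntent returns."""
    if response.get("denied"):
        return NextStep.ABORT
    if response.get("ceremony_id"):
        return NextStep.AWAIT_CEREMONY
    return NextStep.REDEEM
```

Checking `denied` first means a denial always wins, even if a response were to carry a `ceremony_id` as well.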

### 6.7 Step 5: Ceremony Approval Flow

When a ceremony is required, the following sub-flow executes:

1. The GovernanceService has already called `CeremonyService.CreateCeremony` internally during intent creation.
2. The governance-notifier plugin polls or streams the ceremony status via `CeremonyService.GetCeremony` using the `ceremony_id`.
3. Eligible approvers are notified through the platform's notification system (out of scope for this specification).
4. Approvers call `CeremonyService.ApproveCeremony` (or `DenyCeremony`) with their authenticated identity.
5. When the ceremony reaches its approval threshold:
   - **Approved:** The GovernanceService transitions the intent from `ceremony_pending` to `authorized`. The plugin proceeds to Step 6.
   - **Denied:** The GovernanceService transitions the intent to `denied`. The plugin MUST abort the credential operation and return an error to the caller.

The plugin MUST implement a configurable timeout for ceremony polling. If the ceremony does not resolve within the timeout, the plugin MUST treat this as a denial and abort the credential operation. The RECOMMENDED default timeout is 600 seconds (10 minutes) for interactive operations and 3600 seconds (1 hour) for operations that can tolerate asynchronous approval.
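
A polling loop with the timeout behavior described above might look like the following sketch. The `get_status` callable stands in for a `CeremonyService.GetCeremony` call, and the status strings `"approved"`, `"denied"`, and `"pending"` are assumptions, as is the 2-second poll interval. The clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable


def await_ceremony(get_status: Callable[[], str],
                   timeout_seconds: float = 600.0,
                   poll_interval_seconds: float = 2.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True only if the ceremony is approved before the deadline."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status == "approved":
            return True
        if status == "denied":
            return False
        if clock() >= deadline:
            return False  # an unresolved ceremony is treated as a denial
        sleep(poll_interval_seconds)
```

Using a monotonic clock keeps the deadline stable if the wall clock is adjusted while the plugin is waiting.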

### 6.8 Step 6: Intent Redemption

The plugin calls `GovernanceService.RedeemIntent` with the `intent_id` received in Step 2.

On success, the `RedeemIntentResponse` contains a `SatToken` with:

- `bearer_svid`: The SPIFFE ID authorized to perform the credential operation.
- `scopes`: A `SatScopeMsg` entry with `registry_type = "credential"`, `verbs` containing the event's verb (`"issue"`, `"rotate"`, or `"revoke"`), and `resource_pattern` matching the credential event scope.
- `issued_at` / `expires_at`: The SAT validity window.
- `sat_hash`: The cryptographic binding of the SAT contents.

The plugin MUST verify that the SAT's `bearer_svid` matches the plugin's own SVID. The plugin MUST NOT proceed with the credential operation if the SAT has expired (`expires_at < now()`).
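
The two mandatory checks can be sketched as below. The SAT is modeled as a `dict` with `bearer_svid` and a timezone-aware `expires_at`; the real `quartermaster.v1.SatToken` encoding of timestamps is not specified here, so that representation is an assumption, and `check_sat` is a hypothetical name.

```python
from datetime import datetime, timezone
from typing import Optional


def check_sat(sat: dict, own_svid: str, now: Optional[datetime] = None) -> None:
    """Raise PermissionError unless the SAT is bound to this plugin and unexpired."""
    now = now or datetime.now(timezone.utc)
    if sat["bearer_svid"] != own_svid:
        raise PermissionError("SAT bearer_svid does not match the plugin SVID")
    if sat["expires_at"] < now:
        raise PermissionError("SAT has expired")
```

The check should run immediately before the credential operation, since a SAT that was valid when redeemed may expire while the plugin is still preparing the operation.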

### 6.9 Step 7: MutationEnvelope Construction and Anchoring

After the credential operation completes successfully, the plugin constructs a MutationEnvelope (Section 7) and submits it to the NotaryService for merkle anchoring.

The plugin MUST construct the envelope after the credential operation succeeds, not before. The envelope records the fact that the operation occurred, not merely that it was authorized.

The plugin calls `NotaryService.CreateAnchor` with the envelope's merkle leaf hash. If the NotaryService batches leaves within governance epochs, the leaf is queued for the next anchor. The plugin SHOULD NOT block the credential operation on anchor confirmation (Section 10 defines the retry semantics for unreachable NotaryService).

## 7. MutationEnvelope Construction

This section defines the deterministic process for constructing a MutationEnvelope from a credential event.

### 7.1 Payload Canonicalization

The credential event payload (as defined in Section 5) MUST be serialized using [RFC 8785 JSON Canonicalization Scheme (JCS)](https://www.rfc-editor.org/rfc/rfc8785). JCS produces deterministic JSON output with the following properties:

- Object keys are sorted lexicographically by Unicode code point.
- No insignificant whitespace.
- Numbers are serialized in their shortest form without trailing zeros.
- Strings use minimal escape sequences.

Implementations MUST use a JCS-compliant serializer. Hand-rolled JSON serialization is not acceptable, as subtle differences (e.g., Unicode normalization, number formatting) break hash determinism.

The JCS output is the `jcs_bytes` value used in subsequent steps.

### 7.2 Domain Separation

The payload hash uses domain separation to prevent cross-protocol hash collisions. The hash input is constructed as:

```
payload_hash = SHA-256("guildhouse.credential.v1:" || jcs_bytes)
```

Where:

- `"guildhouse.credential.v1:"` is the domain separation prefix, encoded as UTF-8 bytes. The trailing colon is part of the prefix.
- `||` denotes byte concatenation.
- `jcs_bytes` is the JCS-canonicalized event payload from Section 7.1.

The domain prefix `guildhouse.credential.v1` is specific to this specification. Other governed mutation types (e.g., registry mutations, configuration changes) use their own domain prefixes. This ensures that a credential event payload cannot collide with a registry mutation payload even if they happen to contain identical JSON.

The `payload_hash` is encoded as lowercase hexadecimal for inclusion in the envelope.
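
The construction can be illustrated in Python. Note the limits of this sketch: `json.dumps` with sorted keys and compact separators reproduces JCS output only for payloads made of strings, integers, lists, and objects with ASCII keys, which covers the examples in Section 5. It is not a JCS-compliant serializer (it differs on floating-point formatting and on key ordering outside the Basic Multilingual Plane), so a conforming implementation needs a real RFC 8785 library. The names `jcs_approx` and `payload_hash` are hypothetical.

```python
import hashlib
import json

# The trailing colon is part of the prefix.
DOMAIN_PREFIX = b"guildhouse.credential.v1:"


def jcs_approx(obj) -> bytes:
    """Illustration only: approximates RFC 8785 for simple payloads."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def payload_hash(event: dict) -> str:
    """Lowercase hex SHA-256 over the domain prefix followed by jcs_bytes."""
    return hashlib.sha256(DOMAIN_PREFIX + jcs_approx(event)).hexdigest()
```

Hashing the same event under a different prefix yields an unrelated digest, which is the property that keeps credential payloads from colliding with other mutation types.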

### 7.3 Envelope Fields

The MutationEnvelope is a JSON object with the following fields:

| Field          | Type   | Description                                                  |
|----------------|--------|--------------------------------------------------------------|
| `domain`       | string | Fixed value: `"guildhouse.credential.v1"`                    |
| `payload_hash` | string | Lowercase hex-encoded SHA-256 from Section 7.2               |
| `timestamp`    | string | RFC 3339 timestamp with UTC offset (`Z` suffix)              |
| `actor_svid`   | string | SPIFFE ID of the entity that performed the credential operation |
| `tenant_id`    | string | UUID of the owning tenant                                    |
| `event_type`   | string | One of: `"issue"`, `"rotate"`, `"revoke"`                    |
| `intent_id`    | string | The intent ID from the governance flow                       |
| `sat_hash`     | string | Lowercase hex-encoded hash of the SAT that authorized the operation |

**Example:**

```json
{
  "domain": "guildhouse.credential.v1",
  "payload_hash": "a3f2b8c1d4e5f67890abcdef1234567890abcdef1234567890abcdef12345678",
  "timestamp": "2026-02-18T14:30:00Z",
  "actor_svid": "spiffe://guildhouse.io/ns/platform/sa/ssh-credential-composer",
  "tenant_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "event_type": "issue",
  "intent_id": "intent-x7y8z9",
  "sat_hash": "b4c3d2e1f0a9876543210fedcba9876543210fedcba9876543210fedcba98765"
}
```

### 7.4 Merkle Leaf Construction

The merkle leaf is derived from the envelope itself:

1. The envelope JSON object (Section 7.3) is serialized using JCS (RFC 8785).
2. The JCS output is hashed: `leaf_hash = SHA-256(jcs_envelope_bytes)`.
3. The `leaf_hash` is submitted to `NotaryService.CreateAnchor`.

Note: The merkle leaf hash does NOT use domain separation. Domain separation is applied at the payload level (Section 7.2). The envelope is already scoped by its `domain` field, and the leaf is always interpreted in the context of the governance merkle tree.
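
The three steps above can be sketched as follows. As with any `json.dumps`-based example, the serializer here only approximates JCS for envelopes made of ASCII-keyed strings, which is the case for the fields in Section 7.3; a conforming implementation needs an RFC 8785 library. Returning the raw 32-byte digest is an assumption, since the encoding expected by `NotaryService.CreateAnchor` is not stated here, and `leaf_hash` is a hypothetical function name.

```python
import hashlib
import json


def leaf_hash(envelope: dict) -> bytes:
    """SHA-256 over the canonicalized envelope, with no domain prefix."""
    jcs_envelope_bytes = json.dumps(
        envelope, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(jcs_envelope_bytes).digest()
```

Because canonicalization sorts the keys, the leaf hash does not depend on the order in which the envelope fields were assembled.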

### 7.5 Determinism Guarantee

Given identical inputs (same credential event payload, timestamp, actor, tenant, intent ID, and SAT hash), the MutationEnvelope construction MUST produce identical output bytes and identical leaf hashes. Implementations MUST NOT include non-deterministic data (e.g., random nonces, system-local timestamps with varying precision) in the envelope.

The `timestamp` field MUST be truncated to whole seconds (no fractional seconds) to ensure cross-implementation consistency.
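
A minimal sketch of the timestamp rule, assuming the caller holds a timezone-aware `datetime`. The function name `envelope_timestamp` is hypothetical. Fractional seconds are truncated rather than rounded, and any offset is converted to UTC before formatting with the `Z` suffix.

```python
from datetime import datetime, timezone


def envelope_timestamp(moment: datetime) -> str:
    """Format as RFC 3339 in UTC with whole seconds and a 'Z' suffix."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; pass an aware value")
    utc = moment.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Rejecting naive values avoids silently interpreting a local time as UTC, which would produce different envelopes on hosts in different time zones.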

## 8. Ceremony Classification

The Accord policy engine classifies credential events into ceremony tiers based on the event type, credential type, scope, and tenant context. This section defines the standard classification levels and their intended use.

### 8.1 Classification Levels

#### 8.1.1 Autonomous

No human approval is required. The GovernanceService authorizes the intent immediately upon policy evaluation.

**Intended use:**
- Routine short-lived SSH certificate issuance (TTL <= 8 hours)
- Scheduled credential rotation with `rotation_reason = "scheduled"`
- X.509 SVID minting for registered workloads

**Rationale:** These operations are high-frequency, low-risk, and automated. Requiring human approval would create an operational bottleneck without meaningful security benefit.

#### 8.1.2 SelfGrant

The requestor implicitly approves their own operation. No external approval is required, but the self-grant is recorded distinctly in the audit trail.

**Intended use:**
- Manual credential rotation (`rotation_reason = "manual"`) for services the requestor owns
- Database credential provisioning for the requestor's own tenant
- Long-lived SSH certificate issuance (TTL > 8 hours, <= 30 days)

**Rationale:** The requestor has legitimate authority over these operations, but the explicit self-grant classification enables audit differentiation from fully automated operations.

#### 8.1.3 SingleApproval

One approver (distinct from the requestor) MUST authorize the operation via the CeremonyService.

**Intended use:**
- Cross-tenant credential access (credential scope spans multiple tenants)
- Non-standard certificate parameters (unusual extensions, elevated privileges)
- SSH certificate issuance with TTL > 30 days

**Rationale:** These operations have elevated risk or cross-boundary implications that warrant a second pair of eyes.

#### 8.1.4 QuorumApproval

Multiple approvers MUST authorize the operation. The quorum size is policy-configurable (default: 2 of 3).

**Intended use:**
- CA signing key rotation
- Emergency credential revocation of high-value credentials
- Cross-trust-domain credential operations
- Bulk credential revocation (> 10 credentials in a single operation)

**Rationale:** These operations have platform-wide or cross-domain impact. Multi-stakeholder consensus prevents unilateral action.

#### 8.1.5 EmergencyBreakGlass

The operation proceeds immediately without prior approval. A ceremony is created post-hoc and MUST be approved within a configured window (default: 24 hours). If post-hoc approval is not obtained, an escalation alert is generated.

**Intended use:**
- Active compromise response requiring immediate credential revocation
- Emergency CA key rotation during a security incident
- Time-critical credential operations where approval delay would cause greater harm than the operation itself

**Rationale:** Security incidents cannot wait for approval flows. The break-glass mechanism balances immediate response with accountability by requiring after-the-fact justification.

### 8.2 Policy Syntax

> **Normative Note:** The Accord policy syntax defined in this section is the authoritative reference for credential governance policy evaluation until a standalone Accord Policy specification (`specs/accord-policy.md`) is published. Implementations MUST accept policies conforming to this syntax. Future revisions of this specification will replace this section with a normative reference to the standalone spec.

The following Accord policy file defines credential governance rules:

```yaml
# accord-policy/credential-governance.yaml
apiVersion: accord.guildhouse.io/v1
kind: CredentialGovernancePolicy
metadata:
  name: default-credential-policy
  tenant: "*"  # applies to all tenants unless overridden

rules:
  # Routine SSH certificate issuance -- no approval needed
  - match:
      registry_type: credential
      verb: issue
      credential_type: ssh_user_cert
      conditions:
        ttl_seconds_lte: 28800  # <= 8 hours
    classification: Autonomous

  # Long-lived SSH certificates -- self-grant
  - match:
      registry_type: credential
      verb: issue
      credential_type: ssh_user_cert
      conditions:
        ttl_seconds_gt: 28800
        ttl_seconds_lte: 2592000  # <= 30 days
    classification: SelfGrant

  # Very long-lived SSH certificates -- requires approval
  - match:
      registry_type: credential
      verb: issue
      credential_type: ssh_user_cert
      conditions:
        ttl_seconds_gt: 2592000
    classification: SingleApproval

  # Scheduled rotation -- no approval
  - match:
      registry_type: credential
      verb: rotate
      rotation_reason: scheduled
    classification: Autonomous

  # Manual rotation -- self-grant
  - match:
      registry_type: credential
      verb: rotate
      rotation_reason: manual
    classification: SelfGrant

  # Compromise-driven rotation -- quorum
  - match:
      registry_type: credential
      verb: rotate
      rotation_reason: compromised
    classification: QuorumApproval
    quorum:
      required: 2
      pool_size: 3

  # Standard revocation -- single approval
  - match:
      registry_type: credential
      verb: revoke
    classification: SingleApproval

  # Cross-trust-domain operations -- quorum
  - match:
      registry_type: credential
      conditions:
        cross_trust_domain: true
    classification: QuorumApproval
    quorum:
      required: 2
      pool_size: 3

  # X.509 SVID minting -- autonomous
  - match:
      registry_type: credential
      verb: issue
      credential_type: x509_svid
    classification: Autonomous

  # Database credential provisioning -- self-grant
  - match:
      registry_type: credential
      verb: issue
      credential_type: db_password
    classification: SelfGrant

defaults:
  # If no rule matches, require single approval (fail-safe)
  classification: SingleApproval
  ceremony_timeout_seconds: 600

emergency:
  # Break-glass configuration
  classification: EmergencyBreakGlass
  post_hoc_approval_window_hours: 24
  escalation_channel: platform-security
  trigger_conditions:
    - revocation_reason_contains: "compromise"
    - revocation_reason_contains: "incident"
    - metadata_contains_key: "incident_id"
```

#### Policy Schema Summary

An Accord policy document MUST contain the following top-level keys:

- `apiVersion` (string, REQUIRED): MUST be `accord.guildhouse.io/v1`.
- `kind` (string, REQUIRED): MUST be `CredentialGovernancePolicy`.
- `metadata` (object, REQUIRED): MUST include `name` (string) and `tenant` (string, `"*"` for wildcard).
- `rules` (array, REQUIRED): Ordered list of rule objects. Each rule MUST contain `match` (object) and `classification` (string, one of: `Autonomous`, `SelfGrant`, `SingleApproval`, `QuorumApproval`). Rules MAY include `quorum` (object with `required` and `pool_size` integers) when classification is `QuorumApproval`.
- `defaults` (object, REQUIRED): MUST include `classification` (string). MAY include `ceremony_timeout_seconds` (integer).
- `emergency` (object, OPTIONAL): Configuration for `EmergencyBreakGlass` ceremonies. MAY include `classification` (string, `EmergencyBreakGlass`), `post_hoc_approval_window_hours` (integer), `escalation_channel` (string), and `trigger_conditions` (array of match expressions).

The `match` object supports: `registry_type`, `verb`, `credential_type`, `rotation_reason` (string equality), and `conditions` (object with comparison operators suffixed to field names, e.g., `ttl_seconds_lte`, `ttl_seconds_gt`, `cross_trust_domain`).

### 8.3 Policy Precedence

When multiple rules match a credential event, the following precedence applies:

1. Tenant-specific policies override the wildcard (`"*"`) tenant policy.
2. More specific `match` criteria (more fields specified) take precedence over less specific rules.
3. If two rules have equal specificity, the rule appearing later in the file takes precedence.
4. If no rule matches, the `defaults.classification` applies. Implementations MUST default to `SingleApproval` if no default is configured (Section 10.4).
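
Precedence steps 2 through 4 can be sketched in Go as below. This is a minimal illustration, not the Accord engine: the `Rule` type and `classify` function are hypothetical names, `conditions` are assumed to be flattened into the same field map, and tenant selection (step 1) is assumed to have already picked the policy whose rules are passed in.

```go
package main

import "fmt"

// Rule is a simplified, hypothetical view of one Accord rule.
// Match holds only the fields the rule specifies (e.g. "verb": "revoke").
type Rule struct {
	Match          map[string]string
	Classification string
}

// matches reports whether every field the rule specifies equals the event's value.
func (r Rule) matches(event map[string]string) bool {
	for field, want := range r.Match {
		if event[field] != want {
			return false
		}
	}
	return true
}

// classify picks the most specific matching rule; a later rule wins a tie,
// and the default (or SingleApproval) applies when nothing matches.
func classify(rules []Rule, defaultClass string, event map[string]string) string {
	best := -1
	for i, r := range rules {
		if !r.matches(event) {
			continue
		}
		// ">=" lets a later rule of equal specificity replace an earlier one.
		if best == -1 || len(r.Match) >= len(rules[best].Match) {
			best = i
		}
	}
	if best >= 0 {
		return rules[best].Classification
	}
	if defaultClass == "" {
		return "SingleApproval" // fail-safe when no default is configured
	}
	return defaultClass
}

func main() {
	rules := []Rule{
		{Match: map[string]string{"registry_type": "credential", "verb": "issue"}, Classification: "SingleApproval"},
		{Match: map[string]string{"registry_type": "credential", "verb": "issue", "credential_type": "x509_svid"}, Classification: "Autonomous"},
	}
	event := map[string]string{"registry_type": "credential", "verb": "issue", "credential_type": "x509_svid"}
	fmt.Println(classify(rules, "SingleApproval", event))
}
```

Counting specified fields is only one possible reading of "more specific"; an implementation that weights `conditions` differently would need a different comparison.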

### 8.4 Emergency Override

A credential event MAY be elevated to `EmergencyBreakGlass` classification if it matches the `emergency.trigger_conditions`. Emergency trigger evaluation runs before normal rule matching, and a resulting emergency classification overrides the normal rule match.

Implementations MUST log emergency break-glass activations at a severity level of `WARN` or higher.
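
The evaluation order can be sketched as follows. The `Trigger` and `Event` types are hypothetical, and two points are assumptions rather than statements of this specification: that any single matching trigger is sufficient, and that substring matching is case-sensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// Trigger mirrors one entry of emergency.trigger_conditions (hypothetical type).
// Exactly one field is expected to be set per entry.
type Trigger struct {
	RevocationReasonContains string
	MetadataContainsKey      string
}

// Event carries only the fields the triggers inspect (hypothetical type).
type Event struct {
	RevocationReason string
	Metadata         map[string]string
}

// isEmergency reports whether any trigger matches (assumed any-of semantics).
func isEmergency(triggers []Trigger, ev Event) bool {
	for _, t := range triggers {
		if t.RevocationReasonContains != "" &&
			strings.Contains(ev.RevocationReason, t.RevocationReasonContains) {
			return true
		}
		if t.MetadataContainsKey != "" {
			if _, ok := ev.Metadata[t.MetadataContainsKey]; ok {
				return true
			}
		}
	}
	return false
}

// classifyWithEmergency checks triggers first and only then falls back to
// the normal rule matcher supplied by the caller.
func classifyWithEmergency(triggers []Trigger, ev Event, normal func(Event) string) string {
	if isEmergency(triggers, ev) {
		return "EmergencyBreakGlass"
	}
	return normal(ev)
}

func main() {
	triggers := []Trigger{
		{RevocationReasonContains: "compromise"},
		{RevocationReasonContains: "incident"},
		{MetadataContainsKey: "incident_id"},
	}
	normal := func(Event) string { return "SingleApproval" }
	ev := Event{RevocationReason: "suspected key compromise"}
	fmt.Println(classifyWithEmergency(triggers, ev, normal))
}
```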

## 9. Audit Trail

### 9.1 Merkle Anchoring

Every credential event that completes the governance flow (Steps 1-7) produces a merkle leaf in the governance tree. The leaf hash is derived from the MutationEnvelope as defined in Section 7.4.

Leaves accumulate within a governance epoch. At each epoch boundary, the NotaryService creates an anchor containing:

- `merkle_root`: The root hash of the merkle tree for this epoch.
- `previous_root`: The merkle root of the preceding anchor, forming an append-only chain.
- `leaf_count`: The number of leaves in this epoch.
- `epoch_start` / `epoch_end`: The time boundaries of the epoch.

> **Depth Limit:** The Shellstream `merkle-proof@guildhouse.dev` extension encodes proof direction bits as a single byte, limiting merkle tree depth to 8 (maximum 256 leaves per epoch). Deployments MUST configure epoch duration such that the number of credential events per epoch does not exceed 256. See `specs/shellstream-extensions.md` Section 6.8 for the proof encoding format.

### 9.2 Certificate-Embedded Audit References

For SSH certificates issued through the governance flow, the following certificate extensions MUST be embedded:

| Extension                          | Value                                    |
|------------------------------------|------------------------------------------|
| `merkle-root@guildhouse.dev`       | Hex-encoded merkle root at issuance time |
| `merkle-proof@guildhouse.dev`      | Base64-encoded merkle inclusion proof    |
| `governance-intent@guildhouse.dev` | The `intent_id` from the governance flow |

These extensions enable verification starting from the certificate alone: a verifier can extract the merkle root and proof, then call `NotaryService.VerifyInclusion` to confirm that the credential issuance event was recorded in the governance audit trail.

For X.509 SVIDs, the equivalent data SHOULD be embedded in a custom X.509 extension under the Guildhouse IANA Private Enterprise Number (PEN) OID arc. The OID structure is `1.3.6.1.4.1.<PEN>.1.1`, where `<PEN>` is the Guildhouse Cooperative PEN assigned by IANA. Implementations MUST NOT use this extension until the PEN is registered and this placeholder is replaced with the assigned number.

> **TODO:** Register an IANA PEN for Guildhouse Cooperative at https://www.iana.org/assignments/enterprise-numbers/ and replace `<PEN>` with the assigned number.

### 9.3 Audit Queries

The following audit queries MUST be supported:

**Verify a credential's governance record:**
1. Extract `merkle-root@guildhouse.dev` and `merkle-proof@guildhouse.dev` from the certificate.
2. Call `NotaryService.VerifyInclusion(merkle_root, leaf_hash, proof)`, where `leaf_hash` is recomputed from the MutationEnvelope as defined in Section 7.4.
3. If inclusion is verified, the credential issuance event is confirmed to exist in the governance audit trail.

**Retrieve a credential's governance history:**
1. Query `GovernanceService.ListIntents` with `tenant_id` and `status_filter`.
2. Cross-reference intent records with NotaryService anchors by `intent_id`.
3. Reconstruct the complete governance chain for the credential lifecycle.

**Verify audit chain integrity:**
1. Call `NotaryService.GetLatestAnchor` to obtain the current anchor.
2. Walk the `previous_root` chain backward to verify continuity.
3. For each anchor, verify that the `merkle_root` is consistent with the known leaves.

### 9.4 Audit Chain Continuity

Each anchor's `previous_root` field MUST reference the `merkle_root` of the immediately preceding anchor. This forms an append-only linked list of audit states.

The first anchor in the chain (genesis anchor) MUST have `previous_root` set to the zero hash (`0x0000...0000`, 32 zero bytes).

Implementations MUST reject any anchor where `previous_root` does not match the `merkle_root` of the stored preceding anchor. This prevents history rewriting.

### 9.5 Retention

Anchors are immutable once created. Implementations MUST NOT delete or modify existing anchors.

Leaf data (the MutationEnvelope content, as opposed to the leaf hash) retention is policy-dependent. Implementations SHOULD retain leaf data for at least the lifetime of the longest-lived credential type in the deployment. Implementations MAY archive leaf data to cold storage after a configurable retention period, provided the leaf hashes remain available for inclusion verification.

## 10. Error Handling

This section defines the behavior when components in the governance flow are unavailable or return errors.

### 10.1 GovernanceService Unreachable

If the governance-notifier plugin cannot reach the GovernanceService (network error, timeout, or non-retryable gRPC status), the credential operation MUST fail. This is a **fail-closed** policy.

**Rationale:** Credential operations without governance authorization bypass the entire security model. Allowing ungoverned credential operations, even temporarily, creates an audit gap and potential for abuse.

The plugin SHOULD retry with exponential backoff (initial delay 100ms, max delay 10s, max retries 5) before declaring the GovernanceService unreachable. The plugin MUST surface the failure to the caller with a clear error message indicating governance unavailability.

### 10.2 Ceremony Timeout

If a ceremony does not reach approval or denial within the configured timeout:

1. The intent remains in `ceremony_pending` status.
2. The plugin MUST treat the timeout as a denial and abort the credential operation.
3. The plugin SHOULD call `GovernanceService.RevokeIntent` to clean up the pending intent.
4. The plugin MUST log the timeout event at `WARN` severity.

The intent's own `ttl_seconds` provides an additional backstop: even if the plugin fails to revoke the intent, the GovernanceService MUST expire intents that exceed their TTL.

### 10.3 NotaryService Unreachable

If the NotaryService is unreachable after the credential operation has completed:

1. The credential operation MUST NOT be failed retroactively. The credential has already been issued, rotated, or revoked; failing at this point would leave the system in an inconsistent state.
2. The MutationEnvelope MUST be persisted to local durable storage (e.g., an on-disk queue).
3. The plugin MUST retry anchoring with exponential backoff until the NotaryService becomes available.
4. The plugin MUST flag the credential as "anchoring-pending" in any status reporting.
5. The plugin SHOULD emit a metric (`governance_anchoring_pending_total`) for monitoring.

**Rationale:** Unlike the GovernanceService (which provides authorization), the NotaryService provides audit recording. A temporary audit gap is preferable to failing a legitimately authorized credential operation. The retry mechanism ensures eventual consistency of the audit trail.

### 10.4 Accord Policy Missing

If the Accord policy engine has no matching rule for a credential event's `(registry_type, verb, credential_type)` tuple and no `defaults.classification` is configured (Section 8.3), the GovernanceService MUST default to `SingleApproval` classification. This is a **fail-safe** policy.

**Rationale:** An unconfigured credential type is more likely an oversight than an intentionally ungoverned operation. Requiring at least single approval ensures that novel credential types receive human review.

### 10.5 Duplicate Intent (Idempotency Key Collision)

If a `CreateIntentRequest` is received with an `idempotency_key` that matches an existing intent:

1. If the existing intent is in `authorized` or `ceremony_pending` status, the GovernanceService MUST return the existing intent's `intent_id` and `ceremony_id` (if applicable). A new intent MUST NOT be created.
2. If the existing intent is in `redeemed`, `expired`, or `denied` status, the GovernanceService MUST create a new intent (the idempotency window has passed).

This behavior ensures that retry logic in the governance-notifier plugin does not create parallel authorization flows for the same credential event.
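
The two cases above reduce to a small decision on the existing intent's status. The in-memory `Store` below is a hypothetical stand-in for the GovernanceService's intent storage; it is not safe for concurrent use, and new intents are assumed to start in `ceremony_pending` purely for the demonstration.

```go
package main

import "fmt"

// Intent is a hypothetical, minimal intent record.
type Intent struct {
	ID     string
	Status string
}

// Store maps idempotency keys to the most recent intent created for them.
type Store struct {
	byKey map[string]*Intent
	next  int
}

func NewStore() *Store { return &Store{byKey: map[string]*Intent{}} }

// CreateIntent returns the live intent for key if one exists; otherwise it
// creates a new intent. created reports which of the two happened.
func (s *Store) CreateIntent(key string) (intent *Intent, created bool) {
	if existing, ok := s.byKey[key]; ok {
		switch existing.Status {
		case "authorized", "ceremony_pending":
			return existing, false // case 1: reuse, never create a parallel flow
		}
		// case 2: redeemed, expired, or denied; fall through and create.
	}
	s.next++
	in := &Intent{ID: fmt.Sprintf("intent-%d", s.next), Status: "ceremony_pending"}
	s.byKey[key] = in
	return in, true
}

func main() {
	s := NewStore()
	first, _ := s.CreateIntent("credential:issue:cred-42")
	retry, created := s.CreateIntent("credential:issue:cred-42")
	fmt.Println(first.ID == retry.ID, created)
}
```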

## 11. Security Considerations

### 11.1 Plugin Trust Boundary

The governance-notifier plugin runs inside SPIRE Server's trust boundary (as a server plugin) or as a colocated sidecar with access to the SPIRE Server's Unix domain socket. Its gRPC calls to the GovernanceService MUST use mTLS with the SPIRE Server's own SVID as the client certificate.

The plugin MUST NOT accept or relay credentials from external sources. The identity claim in the `CreateIntentRequest` MUST be derived from SPIRE's own attestation data, not from caller-supplied headers or tokens.

### 11.2 Hash Determinism and Idempotency

MutationEnvelope hashes are deterministic: given the same credential event payload, timestamp, and actor, the same hash is always produced. This property enables:

- **Deduplication:** Duplicate events produce identical hashes, which can be detected.
- **Verification:** A verifier can independently reconstruct the expected hash from the event data and compare it to the anchored leaf.

Implementations MUST use a JCS library that passes the [RFC 8785 test vectors](https://www.rfc-editor.org/rfc/rfc8785#appendix-B). Non-conformant JCS implementations will produce different hashes, breaking verification.

### 11.3 Append-Only Audit Trail

The merkle tree anchored by the NotaryService is append-only. Once a leaf hash is included in an anchor, it cannot be removed or modified without invalidating the merkle root (and all subsequent roots in the chain).

Implementations MUST NOT provide an API to delete anchors or modify leaf data. The only write operations are `CreateAnchor` (append) and the internal leaf accumulation within an epoch.

### 11.4 Ceremony Identity Verification

Approvers participating in a ceremony MUST be authenticated via OIDC or SPIFFE identity. Self-asserted identity (e.g., a username/password without federated verification) MUST NOT be accepted for ceremony approval.

The CeremonyService MUST verify that the approver's identity is distinct from the requestor's identity for `SingleApproval` and `QuorumApproval` classifications. Self-approval MUST be rejected for these tiers. (For `SelfGrant`, the requestor is the implicit approver by definition.)

### 11.5 Time-of-Check / Time-of-Use (TOCTOU)

The SAT issued upon intent redemption has a bounded lifetime (`issued_at` to `expires_at`). The credential operation MUST complete within the SAT lifetime. If the SAT expires before the credential operation completes, the operation MUST be aborted.

The RECOMMENDED SAT TTL for credential operations is 60 seconds. This window is sufficient for the credential operation itself (certificate signing, database credential provisioning) but short enough to limit the window of vulnerability if the SAT is intercepted.

Implementations MUST check `expires_at` immediately before executing the credential operation, not only at the time the SAT is received.

### 11.6 Replay Protection

Each MutationIntent has the following replay protection mechanisms:

- **`max_redemptions`:** Set to `1` for all credential events (Section 6.3). The intent can only be redeemed once.
- **`idempotency_key`:** Derived from the credential event's identity (`registry_type + verb + credential_id`). Duplicate submissions within the intent TTL window return the existing intent rather than creating a new one (Section 10.5).
- **`ttl_seconds`:** The intent expires after its TTL, preventing stale intents from being redeemed after the operational context has changed.

The combination of single-use redemption, idempotency keying, and TTL expiration provides defense-in-depth against replay attacks.

### 11.7 Confidentiality of Credential Event Data

The `artifact_scope` field in `CreateIntentRequest` contains the JCS-canonicalized credential event payload, which may include sensitive information (e.g., SPIFFE IDs, tenant identifiers, credential scopes). The gRPC channel between the governance-notifier plugin and the GovernanceService MUST be encrypted (mTLS, per Section 11.1).

The MutationEnvelope stored in the merkle tree contains only the `payload_hash`, not the raw payload. The raw credential event data SHOULD be stored separately with appropriate access controls, linked to the envelope by `intent_id`.

## 12. References

- **SPIFFE Specification:** [https://spiffe.io/docs/latest/spiffe-about/overview/](https://spiffe.io/docs/latest/spiffe-about/overview/) -- The Secure Production Identity Framework for Everyone.
- **RFC 8785:** [https://www.rfc-editor.org/rfc/rfc8785](https://www.rfc-editor.org/rfc/rfc8785) -- JSON Canonicalization Scheme (JCS).
- **RFC 3339:** [https://www.rfc-editor.org/rfc/rfc3339](https://www.rfc-editor.org/rfc/rfc3339) -- Date and Time on the Internet: Timestamps.
- **RFC 2119:** [https://www.rfc-editor.org/rfc/rfc2119](https://www.rfc-editor.org/rfc/rfc2119) -- Key words for use in RFCs to Indicate Requirement Levels.
- **GovernanceService proto:** `quartermaster/v1/governance.proto` -- `CreateIntent`, `RedeemIntent`, `RevokeIntent`, `ListIntents`.
- **CeremonyService proto:** `bascule/v1/ceremony.proto` -- `CreateCeremony`, `ApproveCeremony`, `DenyCeremony`, `GetCeremony`.
- **NotaryService proto:** `quartermaster/v1/notary.proto` -- `CreateAnchor`, `GetLatestAnchor`, `VerifyInclusion`.
- **Accord Policy Specification:** `specs/accord-policy.md` (forthcoming) -- Declarative policy classification for governed mutations. Until published, the normative policy syntax is defined in Section 8.2 of this document.

---

*End of specification.*