# HFL–Kedge Network Integration

How Host Function Layer WASM module Shellstream sessions interact with Kedge's CNI plugin and DaemonSet.

> **Cross-references:**
>
> - Substrate [`docs/shell-primitive.md`](../../substrate/docs/shell-primitive.md): BPF map structures, eBPF hooks, shell lifecycle.
> - Substrate [`docs/hfl-spec.md`](../../substrate/docs/hfl-spec.md): WASM module grant attenuation, host function categories.
> - Kedge [`internal/cni/plugin.go`](../internal/cni/plugin.go): CNI `CmdAdd` flow.
> - Kedge [`internal/cni/policy.go`](../internal/cni/policy.go): SVID-scoped policy evaluation.
> - Kedge [`internal/quartermaster/session_transit.go`](../internal/quartermaster/session_transit.go): Session transit artifact type.

---

## 1. Overview

Kedge is the network data plane for the Guildhouse infrastructure. It operates as both a Kubernetes CNI plugin (creating the `net1` secondary interface via Multus) and a DaemonSet managing WireGuard tunnels, VLAN interfaces, and Shellstream session termination. The Host Function Layer (HFL) is a WASM-based runtime embedded in Bascule that exposes structured host functions (Quartermaster queries, telemetry reads, infrastructure state, YANG operations) to application workloads running inside attested shells.

When an HFL WASM module calls a host function that requires network access, the resulting Shellstream session traverses Kedge's network stack. The HFL does not interact with Kedge directly; it opens Shellstream connections that transit Kedge's `net1` interface like any other pod traffic. What makes HFL sessions distinct is the **three-layer enforcement model** that governs them:

1. **Kernel layer (eBPF shell):** The `shell_map` BPF map restricts which destinations and modes the pod's cgroup can reach. Kedge's CNI plugin reads `allowed_mode` from this map at interface creation time. The eBPF `CGROUP_SOCK_ADDR` hook enforces `allowed_targets` at connection time. These checks happen in the kernel; no userspace daemon is in the hot path.

2. **Application layer (HFL grants):** Before a host function opens any Shellstream session, the HFL runtime validates the WASM module's `ModuleGrant` against the requested operation. If the grant does not cover the operation (wrong capabilities, wrong registry, wrong mode), the request is rejected before it reaches the network.

3. **Cryptographic layer (SAT attenuation):** The per-request `RequestToken` carried in the Shellstream frame header is validated independently by the receiving service (Quartermaster, Bascule, Prometheus). Each layer operates without trusting the others.

These three layers compose but do not depend on each other. A compromised HFL runtime cannot bypass the eBPF shell. A compromised eBPF program cannot forge SAT tokens. This is defense in depth through independent enforcement points, not redundant checks in a single trust domain.

---

## 2. Shared BPF Map Architecture

Kedge and the shell primitive share state through `shell_map`, a `BPF_MAP_TYPE_HASH` map keyed by `cgroup_id`. There is no API intermediary, no sidecar, no IPC channel. Both the eBPF enforcement programs and Kedge's CNI plugin read from the same map.

### Map definition

```c
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, __u64);              /* cgroup_id */
    __type(value, struct shell_state);
    __uint(max_entries, 1024);
} shell_map SEC(".maps");
```

### `shell_state` layout (212 bytes)

```c
struct shell_state {
    __u8  sat_hash[32];                       /* SHA-256 of the bound SAT */
    __u32 capabilities;                       /* Capability bitmask */
    __u32 allowed_mode;                       /* Overlay/underlay mode bitmask */
    __u32 num_allowed_targets;                /* Count of populated entries */
    struct shell_target allowed_targets[16];  /* Destination whitelist */
    __u64 accord_id;                          /* Hash of the governing accord */
    __u64 session_counter;                    /* Monotonic counter for correlation IDs */
    __u64 created_at;                         /* Nanosecond timestamp (CLOCK_MONOTONIC) */
    __u64 last_updated;                       /* Nanosecond timestamp (CLOCK_MONOTONIC) */
    __u32 flags;                              /* Shell state flags */
    __u32 _padding;                           /* Alignment to 8-byte boundary */
};
```

```c
struct shell_target {
    __be32 addr;      /* IPv4 destination (network byte order) */
    __be16 port;      /* Destination port (0 = any port) */
    __u8   protocol;  /* IPPROTO_TCP (6), IPPROTO_UDP (17), 0 = any */
    __u8   flags;     /* Per-target flags */
};
```

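
One way to sanity-check the layout from Go, where Kedge reads the map, is to mirror both structs and compare sizes. The sketch below is illustrative only: the Go type and field names are hypothetical and not taken from Kedge's source. `binary.Size` reports the size of the fields with no compiler-inserted padding, which comes to the 212 bytes in the heading above. A C compiler using natural alignment would additionally place 4 bytes of padding before `accord_id` (offset 172 is not 8-byte aligned), giving a `sizeof` of 216 unless the struct is declared packed, so the definition in `shell-primitive.md` is worth checking before relying on either number.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ShellTarget mirrors struct shell_target (hypothetical Go naming).
type ShellTarget struct {
	Addr     uint32 // IPv4 destination, network byte order
	Port     uint16 // destination port, network byte order (0 = any)
	Protocol uint8  // 6 = TCP, 17 = UDP, 0 = any
	Flags    uint8  // per-target flags
}

// ShellState mirrors struct shell_state field by field (hypothetical Go naming).
type ShellState struct {
	SATHash           [32]byte
	Capabilities      uint32
	AllowedMode       uint32
	NumAllowedTargets uint32
	AllowedTargets    [16]ShellTarget
	AccordID          uint64
	SessionCounter    uint64
	CreatedAt         uint64
	LastUpdated       uint64
	Flags             uint32
	Padding           uint32
}

// targetSize is the encoded size of one shell_target entry.
func targetSize() int { return binary.Size(ShellTarget{}) }

// packedSize is the encoded size of shell_state with no alignment padding.
func packedSize() int { return binary.Size(ShellState{}) }

func main() {
	fmt.Println("shell_target bytes:", targetSize())
	fmt.Println("shell_state bytes (no padding):", packedSize())
}
```

If the kernel-side value size turns out to be 216, the Go mirror would need an explicit 4-byte filler field after `AllowedTargets` so that both sides agree.
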
### Fields relevant to Kedge

| Field | Kedge usage |
|-------|-------------|
| `allowed_mode` | CNI plugin reads this at `CmdAdd` time to decide which route types to program on `net1`. `OVERLAY` (0x1) → WireGuard routes. `UNDERLAY` (0x2) → VLAN bridge routes. Both bits → both route sets. |
| `allowed_targets` | eBPF `CGROUP_SOCK_ADDR` hook validates outbound connections against this whitelist at connect time. Kedge does not enforce targets; the kernel does. |
| `capabilities` | `READ` (0x1), `PROPOSE` (0x2), `MUTATE` (0x4), `ADMIN` (0x8). Kedge uses this when building `SessionTransitArtifact` records. The DaemonSet's health endpoint exposes capability distribution as Prometheus metrics. |
| `flags` | `ACTIVE` (0x1), `FROZEN` (0x2), `DRAINING` (0x4). DaemonSet watches flag transitions. When a shell enters `FROZEN`, the DaemonSet stops accepting new Shellstream handshakes for that session. When `DRAINING`, it tears down associated tunnel state. |
| `sat_hash` | Included in every `SessionTransitArtifact` for Quartermaster governance chain linkage. |
| `accord_id` | Used by the DaemonSet to look up the local accord policy that governs capability grants for incoming Shellstream sessions. |

### Access pattern

The BPF map is pinned at `/sys/fs/bpf/shell_map`. Kedge accesses it from Go via `cilium/ebpf` map operations:

- **CNI plugin** (`CmdAdd`): Looks up the pod's `cgroup_id`, reads `allowed_mode` and `flags`. This is a single `BPF_MAP_LOOKUP_ELEM` command issued through the `bpf()` syscall: one hash lookup (O(1) on average), with no IPC or daemon round trip. If no entry exists (shell is `UNBOUND`), the CNI plugin installs no routes on `net1`, effectively isolating the pod.
- **DaemonSet**: Subscribes to `shell_map` changes via the `shell_events` ring buffer (see `ShellProgrammer.subscribe_events()` in shell-primitive.md §7.1). Reacts to `bind_shell`, `freeze_shell`, `drain_shell`, and `destroy_shell` events.

---

## 3. Routing HFL Shellstream Sessions Through Kedge

An HFL-originated Shellstream session follows the same packet path as any other pod egress through `net1`. The difference is that three independent checks have already passed before the first SYN reaches the wire. This section traces the four common HFL session types.

### 3.1 Overlay: HFL → Quartermaster (cross-cluster query)

A WASM module calls `query-records` on the `session-transit` registry hosted by a remote Quartermaster instance.

```
WASM module
  │ host.quartermaster.query_records("session-transit", filter, limit)
  ▼
HFL Runtime
  │ 1. Validate ModuleGrant: capabilities includes READ,
  │    registries includes "session-transit",
  │    allowed_mode includes OVERLAY
  │ 2. Construct RequestToken (single_use=true, expires=30s)
  │ 3. Open TCP connection to Quartermaster endpoint
  ▼
eBPF CGROUP_SOCK_ADDR hook
  │ 4. Look up cgroup_id in shell_map
  │ 5. Check flags: ACTIVE (not FROZEN/DRAINING)
  │ 6. Check destination against allowed_targets[0..N]
  │ 7. Permit or deny at kernel level
  ▼
net1 interface (Kedge CNI)
  │ 8. Packet hits overlay route → forwarded to wg0
  ▼
WireGuard tunnel (wg0)
  │ 9. Encrypted transit to remote cluster
  ▼
Remote Kedge DaemonSet (Shellstream listener)
  │ 10. 3-way handshake: ATTEST-INIT / ATTEST-VERIFY / ATTEST-CONFIRM
  │ 11. Validate SAT, evaluate capability against local accord
  │ 12. Record SessionTransitArtifact
  ▼
Quartermaster
  │ 13. Validate RequestToken independently
  │ 14. Execute query, return results
```

Steps 1–2 are HFL enforcement (application layer). Steps 4–7 are eBPF enforcement (kernel layer). Step 8 depends on Kedge having programmed overlay routes at `CmdAdd` time, which only happens if `shell_state.allowed_mode & OVERLAY != 0`. Steps 10–13 are cryptographic enforcement (SAT + RequestToken).

### 3.2 Underlay: HFL → Bascule SDK dispatch (device read)

A WASM module calls `get-device-state` for a FortiGate on the local VLAN.

```
WASM module
  │ host.infrastructure.get_device_state("fortigate.transit.local")
  ▼
HFL Runtime
  │ 1. Validate ModuleGrant: capabilities includes READ,
  │    allowed_mode includes UNDERLAY
  │ 2. Construct RequestToken
  │ 3. Open Shellstream to Bascule (may be local or cross-cluster)
  ▼
eBPF CGROUP_SOCK_ADDR hook
  │ 4. Validate against shell_map (same as §3.1)
  ▼
net1 interface (Kedge CNI)
  │ 5. Packet hits underlay route → forwarded via VLAN bridge
  ▼
VLAN bridge (br-mgmt / tagged interface)
  │ 6. Frame reaches managed device on infrastructure VLAN
  ▼
Bascule SDK dispatch
  │ 7. Validate RequestToken
  │ 8. Execute read-only SDK call (e.g., fortiosapi.get)
  │ 9. Return device state to WASM module
```

The underlay path differs at step 5: traffic exits through the VLAN bridge rather than the WireGuard tunnel. The CNI plugin only programs underlay routes if `shell_state.allowed_mode & UNDERLAY != 0`.

### 3.3 Local: HFL → YANG compiler (no Shellstream)

YANG host functions (`validate-instance`, `compile-dry-run`, `diff-state`) call the Python YANG compiler within the same container. No Shellstream session is opened and no traffic traverses `net1`. The HFL runtime invokes the compiler as a local subprocess. Kedge is not involved.

This is worth documenting because it is the exception: HFL host functions do not always produce network traffic. The YANG category is purely local computation.

### 3.4 Local: HFL → Prometheus (telemetry query)

Telemetry queries (`query-prometheus`, `query-timescale`) may target a Prometheus instance on the same cluster. When the destination is local (same node or same cluster overlay), the path is:

```
WASM module → HFL Runtime → eBPF check → net1 → overlay route → wg0 → local Prometheus
```

Even local-cluster queries transit `net1` and the WireGuard tunnel because Kedge treats all overlay traffic uniformly. There is no "local bypass" optimization; this simplifies the enforcement model by ensuring every session, regardless of locality, passes through the same BPF map checks.

---

## 4. CNI Plugin: `shell_state`-Aware Route Programming

The CNI plugin's `CmdAdd` handler programs routes on the pod's `net1` interface based on `shell_state.allowed_mode`. This is the critical integration point: Kedge decides at pod creation time what network paths are available.

### Current `CmdAdd` flow (from `internal/cni/plugin.go`)

```
ParseNetConf → loadMeshTopology → createVethPair → attachOverlayRoutes
    → attachUnderlayRoutes → programPodRoutes → applySVIDPolicy
```

### Extended flow with `shell_state` integration

```
ParseNetConf → loadMeshTopology → createVethPair
    → lookupShellState(cgroup_id)
    → if shell_state is UNBOUND: return (no routes, pod is isolated)
    → if flags & ACTIVE == 0: return (shell not active, pod is isolated)
    → if allowed_mode & OVERLAY: attachOverlayRoutes
    → if allowed_mode & UNDERLAY: attachUnderlayRoutes
    → programPodRoutes (only for modes granted above)
    → applySVIDPolicy (SVID check is orthogonal to shell_state)
```

### `lookupShellState` pseudocode

```go
import (
	"errors"
	"fmt"

	"github.com/cilium/ebpf"
)

// lookupShellState returns the shell_state for a cgroup, or (nil, nil) when
// the cgroup has no entry (UNBOUND). Callers must fail closed on a nil state
// and on a non-nil error alike.
func lookupShellState(cgroupID uint64) (*ShellState, error) {
	m, err := ebpf.LoadPinnedMap("/sys/fs/bpf/shell_map", nil)
	if err != nil {
		return nil, fmt.Errorf("shell_map not available: %w", err)
	}
	defer m.Close()

	var state ShellState
	if err := m.Lookup(cgroupID, &state); err != nil {
		if errors.Is(err, ebpf.ErrKeyNotExist) {
			return nil, nil // No entry = UNBOUND. Fail closed: no routes.
		}
		return nil, fmt.Errorf("shell_map lookup: %w", err)
	}
	return &state, nil
}
```

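
The decision that follows the lookup in the extended flow (and that the route programming rules in the next subsection tabulate) is a pair of bit tests. The following is a minimal sketch of that decision, not the code in `internal/cni/plugin.go`: `RoutePlan`, `planRoutes`, and the constant names are hypothetical, and the trimmed `ShellState` carries only the two fields the decision needs.

```go
package main

import "fmt"

// Bit values as given in the "Fields relevant to Kedge" table.
const (
	ModeOverlay  uint32 = 0x1
	ModeUnderlay uint32 = 0x2
	FlagActive   uint32 = 0x1
)

// ShellState is trimmed to the fields this decision reads.
type ShellState struct {
	AllowedMode uint32
	Flags       uint32
}

// RoutePlan says which route sets CmdAdd should program on net1.
type RoutePlan struct {
	Overlay  bool
	Underlay bool
}

// planRoutes fails closed: a missing (UNBOUND) or non-ACTIVE shell gets no routes.
func planRoutes(state *ShellState) RoutePlan {
	if state == nil || state.Flags&FlagActive == 0 {
		return RoutePlan{}
	}
	return RoutePlan{
		Overlay:  state.AllowedMode&ModeOverlay != 0,
		Underlay: state.AllowedMode&ModeUnderlay != 0,
	}
}

func main() {
	cases := []*ShellState{
		nil, // UNBOUND
		{AllowedMode: ModeOverlay, Flags: FlagActive},
		{AllowedMode: ModeUnderlay, Flags: FlagActive},
		{AllowedMode: ModeOverlay | ModeUnderlay, Flags: FlagActive},
	}
	for _, s := range cases {
		fmt.Printf("%+v\n", planRoutes(s))
	}
}
```

Returning a plain value rather than programming routes directly keeps the fail-closed branch trivially testable without a network namespace.
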
### Route programming rules

| `allowed_mode` | Overlay routes | Underlay routes | Effect |
|-----------------|---------------|-----------------|--------|
| `0x0` (none) | No | No | Pod is network-isolated on `net1`. |
| `0x1` (OVERLAY) | Yes | No | Pod can reach remote clusters via WireGuard. Cannot reach infrastructure VLANs. |
| `0x2` (UNDERLAY) | No | Yes | Pod can reach infrastructure devices on local VLANs. Cannot reach remote clusters. |
| `0x3` (both) | Yes | Yes | Full access. Typical for admin-tier shells. |

### Fail-closed behavior

If `shell_map` is not pinned (e.g., the shell primitive is not yet loaded), the CNI plugin treats this as equivalent to `UNBOUND`: no routes are programmed. This is intentional: a pod without an active shell binding has no business reaching Kedge-managed networks.

If `shell_map` exists but contains no entry for the pod's `cgroup_id`, the same fail-closed behavior applies. The CNI plugin does not fall back to permissive routing.

---

## 5. DaemonSet: `shell_state` Watch and Lifecycle Coordination

The DaemonSet reacts to shell lifecycle transitions by adjusting tunnel state, Shellstream session handling, and telemetry.

### Event subscription

The DaemonSet calls `ShellProgrammer.subscribe_events()` (implemented via the `shell_events` BPF ring buffer) to receive shell state transitions. In Go, this is accessed through the `cilium/ebpf` ring buffer reader:

```go
func watchShellEvents(shellEventsMap *ebpf.Map) error {
	reader, err := ringbuf.NewReader(shellEventsMap)
	if err != nil {
		return fmt.Errorf("open shell_events ring buffer: %w", err)
	}
	defer reader.Close()
	for {
		record, err := reader.Read() // blocks until the next event
		if err != nil {
			return err // ringbuf.ErrClosed once the reader is closed on shutdown
		}
		event := parseShellEvent(record.RawSample)
		switch event.Type {
		case EventBindShell:    handleBind(event)
		case EventFreezeShell:  handleFreeze(event)
		case EventDrainShell:   handleDrain(event)
		case EventDestroyShell: handleDestroy(event)
		}
	}
}
```

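
The handlers dispatched in the loop above need some per-shell bookkeeping to decide whether a handshake may proceed. As a rough model of the lifecycle rules tabulated in the next subsection, the sketch below tracks one phase per `cgroup_id`. All names here (`Tracker`, `Apply`, `AcceptsHandshake`, `KeepsEstablished`) are hypothetical, the type is not safe for concurrent use, and the implicit unfreeze described under SAT rotation is deliberately not modeled.

```go
package main

import "fmt"

// EventType enumerates the four shell lifecycle events named in this document.
type EventType int

const (
	EventBindShell EventType = iota
	EventFreezeShell
	EventDrainShell
	EventDestroyShell
)

type phase int

const (
	phaseActive phase = iota
	phaseFrozen
	phaseDraining
)

// Tracker holds the last known lifecycle phase for each bound cgroup.
type Tracker struct {
	shells map[uint64]phase
}

func NewTracker() *Tracker {
	return &Tracker{shells: make(map[uint64]phase)}
}

// Apply folds one lifecycle event into the tracked state.
// Freeze and drain events for unknown cgroups are ignored.
func (t *Tracker) Apply(cgroupID uint64, ev EventType) {
	_, known := t.shells[cgroupID]
	switch ev {
	case EventBindShell:
		t.shells[cgroupID] = phaseActive
	case EventFreezeShell:
		if known {
			t.shells[cgroupID] = phaseFrozen
		}
	case EventDrainShell:
		if known {
			t.shells[cgroupID] = phaseDraining
		}
	case EventDestroyShell:
		delete(t.shells, cgroupID)
	}
}

// AcceptsHandshake reports whether new Shellstream handshakes are allowed.
func (t *Tracker) AcceptsHandshake(cgroupID uint64) bool {
	p, ok := t.shells[cgroupID]
	return ok && p == phaseActive
}

// KeepsEstablished reports whether already-established sessions may continue.
func (t *Tracker) KeepsEstablished(cgroupID uint64) bool {
	p, ok := t.shells[cgroupID]
	return ok && p != phaseDraining
}

func main() {
	t := NewTracker()
	t.Apply(42, EventBindShell)
	t.Apply(42, EventFreezeShell)
	fmt.Println(t.AcceptsHandshake(42), t.KeepsEstablished(42)) // false true
}
```
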
### Lifecycle event handling

| Event | DaemonSet action |
|-------|-----------------|
| `bind_shell` | Register the shell's `cgroup_id` in local session tracking. If the shell has `allowed_mode & OVERLAY`, ensure the associated WireGuard peers are configured. Log the binding with `sat_hash` and `accord_id`. |
| `freeze_shell` | Stop accepting new Shellstream handshakes for sessions associated with this `cgroup_id`. Existing established sessions continue (the shell is being rotated, not destroyed). Increment `kedge_shell_freeze_total` counter. |
| `drain_shell` | Stop accepting new handshakes AND begin tearing down established Shellstream sessions for this `cgroup_id`. The mesh manager marks associated peers as draining. Increment `kedge_shell_drain_total` counter. |
| `destroy_shell` | Remove all local state for this `cgroup_id`. Clean up any tunnel state that was exclusively serving this shell. Emit final telemetry. Increment `kedge_shell_destroy_total` counter. |

### SAT rotation coordination

When Bascule rotates a SAT, it freezes the shell, updates `sat_hash`, then unfreezes. The DaemonSet sees two events: `freeze_shell` followed by an `update_shell` (which carries the new `sat_hash`), then an implicit unfreeze (the `FROZEN` flag is cleared). During the freeze window, no new Shellstream handshakes are accepted, but existing sessions continue. This prevents a window where the old SAT is invalid but the new one hasn't propagated; the freeze ensures atomicity from the network's perspective.

---

## 6. Quartermaster Records for HFL Sessions

HFL-originated sessions produce the same `SessionTransitArtifact` and `NetworkMutationArtifact` records as any other Shellstream session. However, HFL sessions carry additional provenance that should be captured for auditability.

### Extended `SessionTransitArtifact`

The existing artifact (from `internal/quartermaster/session_transit.go`) is extended with optional HFL-specific fields:

```go
type SessionTransitArtifact struct {
	// ... existing fields (session_id, sat_hash, source/dest cluster, etc.)

	// HFL provenance (optional; zero values when session is not HFL-originated).
	HFLModuleName       string `json:"hfl_module_name,omitempty"`
	HFLFunctionName     string `json:"hfl_function_name,omitempty"`
	HFLGrantHash        []byte `json:"hfl_grant_hash,omitempty"`
	HFLRequestTokenHash []byte `json:"hfl_request_token_hash,omitempty"`
}
```

| Field | Source | Purpose |
|-------|--------|---------|
| `hfl_module_name` | Shellstream frame header, set by HFL runtime | Identifies which WASM module initiated the session (e.g., `"telemetry-query"`). |
| `hfl_function_name` | Shellstream frame header, set by HFL runtime | Identifies which host function was called (e.g., `"query_flow_records"`). |
| `hfl_grant_hash` | SHA-256 of the `ModuleGrant` | Links the session back to the specific grant that authorized it. Enables audit queries like "show all sessions authorized by grant X". |
| `hfl_request_token_hash` | SHA-256 of the `RequestToken` | Links the session to the single-use per-request token. Since tokens are single-use, this provides a 1:1 mapping from token to session. |

These fields are populated by the Kedge DaemonSet's Shellstream listener when the `ATTEST-INIT` message includes HFL provenance headers. Non-HFL sessions (direct Shellstream connections) leave these fields empty.

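
Because the HFL fields carry `omitempty`, a non-HFL session serializes exactly as it did before the extension. The sketch below illustrates that with Go's `encoding/json`. It rests on a few assumptions: `SessionID` stands in for the elided existing fields, the grant bytes are made up, and `encoding/json` output is not RFC 8785 canonical form, so this shows only the field-omission behavior and not how `CanonicalBytes()` is implemented.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// SessionTransitArtifact is a cut-down stand-in for the real artifact type.
type SessionTransitArtifact struct {
	SessionID string `json:"session_id"` // placeholder for the existing fields

	HFLModuleName       string `json:"hfl_module_name,omitempty"`
	HFLFunctionName     string `json:"hfl_function_name,omitempty"`
	HFLGrantHash        []byte `json:"hfl_grant_hash,omitempty"`
	HFLRequestTokenHash []byte `json:"hfl_request_token_hash,omitempty"`
}

// encode marshals the artifact; a struct of strings and byte slices cannot fail.
func encode(a SessionTransitArtifact) string {
	b, err := json.Marshal(a)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	plain := SessionTransitArtifact{SessionID: "s-1"}
	fmt.Println(encode(plain)) // {"session_id":"s-1"}

	grant := sha256.Sum256([]byte("example grant bytes")) // illustrative input only
	hfl := plain
	hfl.HFLModuleName = "telemetry-query"
	hfl.HFLFunctionName = "query_flow_records"
	hfl.HFLGrantHash = grant[:]
	fmt.Println(encode(hfl)) // HFL keys present; byte slices appear as base64
}
```
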
### `NetworkMutationArtifact`: no HFL extension needed

Network mutations are dispatched through Bascule SDK, not through HFL host functions. The HFL can _propose_ mutations (via `submit-proposal` with `PROPOSE` capability), but the actual execution goes through Bascule's ceremony workflow. By the time Kedge records a `NetworkMutationArtifact`, the mutation has already been approved through a ceremony and dispatched by Bascule; the HFL origin is captured in the ceremony's proposal chain, not in the mutation artifact itself.

### Canonical serialization

The `CanonicalBytes()` method on `SessionTransitArtifact` must include the HFL fields when present. The RFC 8785 (JCS) canonical form ensures deterministic hashing regardless of field ordering. Fields with zero values (`omitempty`) are excluded from the canonical form to maintain backward compatibility with non-HFL session records.

---

## 7. Mode Authorization: End-to-End Flow

This section traces the complete authorization chain for an HFL-originated overlay session, showing how the three enforcement layers compose.

### Scenario

A WASM module `"proposal-builder"` calls `compile-dry-run` (local, no network; see §3.3), then calls `submit-proposal` to Quartermaster (overlay), then an operator approves the ceremony and Bascule dispatches the mutation (underlay). We trace the overlay leg (proposal submission) end to end.

### Step 1: SAT issuance (before Kedge)

The Vigil ceremony issues a SAT with:

- `capabilities`: `READ | PROPOSE` (0x3)
- `allowed_mode`: `OVERLAY` (0x1); this module does not need underlay access
- `scope`: registries `["network-mutation", "session-transit"]`

Bascule calls `bind_shell(cgroup_id, shell_state)` with these values. The `shell_map` entry is created.

### Step 2: Grant attenuation (HFL)

The HFL runtime derives a `ModuleGrant` for `"proposal-builder"`:

```
ModuleGrant {
    module_hash:     sha256("proposal-builder.wasm"),
    parent_sat_hash: shell_state.sat_hash,
    capabilities:    PROPOSE (0x2),           // attenuated from READ|PROPOSE
    registries:      ["network-mutation"],    // attenuated from full list
    allowed_mode:    OVERLAY (0x1),           // same as SAT (cannot widen)
    expires_at:      min(sat.expires, now + 1h),
}
```

Attenuation can only narrow, never widen: every grant must satisfy `grant.capabilities & sat.capabilities == grant.capabilities`, so a grant may equal the SAT's scope (as `allowed_mode` does here) but can never exceed it. The grant cannot add `MUTATE` or `UNDERLAY` that the SAT does not carry.

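
The subset rule above extends naturally from the capability bitmask to `allowed_mode` and the registry list. The following sketch shows one way to express it, under stated assumptions: `Scope` and `isAttenuation` are hypothetical names, registries are compared as exact strings, and the `expires_at` clamp from the grant is left out.

```go
package main

import "fmt"

// Scope holds the fields that a SAT and a ModuleGrant have in common.
type Scope struct {
	Capabilities uint32
	AllowedMode  uint32
	Registries   []string
}

// isAttenuation reports whether grant stays within the bounds of sat:
// no capability bit, mode bit, or registry that sat lacks.
func isAttenuation(grant, sat Scope) bool {
	if grant.Capabilities&sat.Capabilities != grant.Capabilities {
		return false
	}
	if grant.AllowedMode&sat.AllowedMode != grant.AllowedMode {
		return false
	}
	allowed := make(map[string]bool, len(sat.Registries))
	for _, r := range sat.Registries {
		allowed[r] = true
	}
	for _, r := range grant.Registries {
		if !allowed[r] {
			return false
		}
	}
	return true
}

func main() {
	// Values from Steps 1 and 2 of the scenario.
	sat := Scope{
		Capabilities: 0x3, // READ | PROPOSE
		AllowedMode:  0x1, // OVERLAY
		Registries:   []string{"network-mutation", "session-transit"},
	}
	grant := Scope{
		Capabilities: 0x2, // PROPOSE
		AllowedMode:  0x1, // OVERLAY
		Registries:   []string{"network-mutation"},
	}
	fmt.Println(isAttenuation(grant, sat)) // true

	widened := grant
	widened.Capabilities |= 0x4 // try to add MUTATE
	fmt.Println(isAttenuation(widened, sat)) // false
}
```
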
### Step 3: Request token (HFL, per-call)

When the module calls `submit-proposal`, the HFL constructs:

```
RequestToken {
    grant_hash:      sha256(ModuleGrant),
    operation:       "submit-proposal",
    parameters_hash: sha256(serialized_proposal),
    single_use:      true,
    expires_at:      now + 30s,
}
```

### Step 4: eBPF enforcement (kernel)

The `connect()` syscall triggers the `CGROUP_SOCK_ADDR` hook:

1. Look up `cgroup_id` in `shell_map` → finds the `shell_state`.
2. Check `flags`: `ACTIVE` is set, `FROZEN` is not → proceed.
3. Check destination IP/port against `allowed_targets[0..num_allowed_targets]` → match found.
4. Permit the connection.

If the destination were on an underlay VLAN (and `allowed_mode` only has `OVERLAY`), this check would still pass at the eBPF level, because `allowed_targets` is an explicit whitelist, not mode-based. The mode enforcement happens at the routing level (step 5).

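
The whitelist match in item 3 can be modeled in userspace to make the wildcard rules concrete. This is only a Go model of the logic, not the eBPF program (which would be written in C with a bounded loop over `num_allowed_targets`). It assumes exact address matching and applies the `0 = any` rule that the `shell_target` comments give for port and protocol; `Target`, `permitted`, and `mustAddr` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/netip"
)

const (
	ProtoTCP uint8 = 6
	ProtoUDP uint8 = 17
)

// Target is a host-order model of one shell_target entry.
type Target struct {
	Addr     netip.Addr
	Port     uint16 // 0 matches any port
	Protocol uint8  // 0 matches any protocol
}

// mustAddr parses an IP literal and panics on malformed input.
func mustAddr(s string) netip.Addr { return netip.MustParseAddr(s) }

// permitted reports whether (dst, port, proto) matches any whitelist entry.
func permitted(targets []Target, dst netip.Addr, port uint16, proto uint8) bool {
	for _, t := range targets {
		if t.Addr != dst {
			continue
		}
		if t.Port != 0 && t.Port != port {
			continue
		}
		if t.Protocol != 0 && t.Protocol != proto {
			continue
		}
		return true
	}
	return false
}

func main() {
	targets := []Target{
		{Addr: mustAddr("10.0.0.5"), Port: 443, Protocol: ProtoTCP},
		{Addr: mustAddr("10.0.0.9")}, // any port, any protocol
	}
	fmt.Println(permitted(targets, mustAddr("10.0.0.5"), 443, ProtoTCP))  // true
	fmt.Println(permitted(targets, mustAddr("10.0.0.5"), 80, ProtoTCP))   // false
	fmt.Println(permitted(targets, mustAddr("10.0.0.9"), 9090, ProtoUDP)) // true
}
```

Note that nothing in the match consults `allowed_mode`, which is the point made in the paragraph above: mode is enforced by route presence, not by the hook.
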
### Step 5: Kedge route enforcement (CNI)

The packet exits the pod via `net1`. Because the CNI plugin read `allowed_mode = OVERLAY` at `CmdAdd` time, only overlay (WireGuard) routes are programmed. The destination Quartermaster endpoint resolves to an overlay route through `wg0`. If the destination were on an underlay VLAN, there would be no route; the packet would be dropped by the kernel's routing table. This is Kedge's enforcement: not a firewall rule, but the absence of a route.

### Step 6: WireGuard transit (Kedge mesh)

The packet traverses the WireGuard tunnel to the remote cluster. Kedge's mesh manager maintains the peer configuration and monitors tunnel health.

### Step 7: Remote Shellstream handshake (Kedge DaemonSet)

The remote Kedge DaemonSet receives the connection on its Shellstream listener:

1. Read `ATTEST-INIT` (includes SAT token, HFL provenance headers).
2. Validate SAT against local SPIRE trust bundle.
3. Evaluate capability request against local accord policy (see `internal/shellstream/capability.go`).
4. Send `ATTEST-VERIFY` with granted capabilities.
5. Read `ATTEST-CONFIRM`.
6. Record `SessionTransitArtifact` with HFL fields populated.

### Step 8: Quartermaster validation (independent)

Quartermaster receives the `submit-proposal` RPC with the `RequestToken` in metadata:

1. Verify `RequestToken.signature` against the HFL runtime's signing key.
2. Check `single_use`: mark token as consumed.
3. Check `expires_at`: reject if expired.
4. Verify `grant_hash` chains back to a valid SAT.
5. Check that the grant's `capabilities` include `PROPOSE`.
6. Accept the proposal for ceremony processing.

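
Items 2 and 3 of the list above amount to a replay guard with expiry. The sketch below is one possible shape for it and makes several assumptions that the source does not confirm: tokens are identified by a 32-byte digest, time is passed in as Unix seconds, and the `TokenLedger` type and its method names are hypothetical. Signature verification and the grant-chain checks (items 1, 4, and 5) are out of scope here. Expiry is checked before consumption so that expired tokens never occupy a ledger slot.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	ErrExpired  = errors.New("request token expired")
	ErrReplayed = errors.New("request token already consumed")
)

// RequestToken carries only the fields the replay guard needs.
type RequestToken struct {
	ID            [32]byte // assumed: a digest that uniquely identifies the token
	ExpiresAtUnix int64
}

// TokenLedger remembers consumed tokens until they expire.
type TokenLedger struct {
	mu       sync.Mutex
	consumed map[[32]byte]int64 // token ID -> expiry (Unix seconds)
}

func NewTokenLedger() *TokenLedger {
	return &TokenLedger{consumed: make(map[[32]byte]int64)}
}

// Consume accepts a token at most once and only before its expiry.
func (l *TokenLedger) Consume(tok RequestToken, nowUnix int64) error {
	if nowUnix >= tok.ExpiresAtUnix {
		return ErrExpired
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	// Entries past their expiry can no longer be replayed, so drop them.
	for id, exp := range l.consumed {
		if nowUnix >= exp {
			delete(l.consumed, id)
		}
	}
	if _, seen := l.consumed[tok.ID]; seen {
		return ErrReplayed
	}
	l.consumed[tok.ID] = tok.ExpiresAtUnix
	return nil
}

func main() {
	ledger := NewTokenLedger()
	now := time.Now().Unix()
	tok := RequestToken{ID: [32]byte{1}, ExpiresAtUnix: now + 30}
	fmt.Println(ledger.Consume(tok, now))   // <nil>
	fmt.Println(ledger.Consume(tok, now+1)) // request token already consumed
}
```

Because entries are pruned once they expire, the ledger's size is bounded by the number of tokens issued within one 30-second validity window.
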
### Enforcement summary

| Layer | What it checks | Where it runs | Failure mode |
|-------|---------------|---------------|-------------|
| HFL grant | Module has appropriate capability and registry scope | Userspace (HFL runtime, same pod) | Request rejected before network |
| eBPF shell | Destination is in `allowed_targets`, shell is `ACTIVE` | Kernel (`CGROUP_SOCK_ADDR` hook) | `connect()` fails (`EPERM` by default for a rejecting cgroup hook) |
| Kedge routes | `allowed_mode` permits the network path type | Kernel (routing table, programmed by CNI) | Packet dropped (no route to host) |
| Shellstream handshake | SAT is valid, capabilities match local accord | Userspace (remote Kedge DaemonSet) | Handshake rejected at `ATTEST-VERIFY` |
| Service token | `RequestToken` is valid, single-use, unexpired | Userspace (Quartermaster/Bascule) | RPC rejected with `PermissionDenied` |

Five independent checks, three distinct trust domains (pod kernel, pod userspace, remote service), two cryptographic validations (SAT, RequestToken). No single point of compromise grants unauthorized access.