feat: gsh governed shell — design exploration

DESIGN.md: complete architecture exploration for gsh, the GCAP governed shell binary.

Two modes:
  Machine: headless JSON I/O, GSAP AC consumption, CR posting, exit code governance mapping. Auto-detected: no TTY → machine mode. What SK plugin and Logic Apps need now.
  Human: interactive, [governed] prompt, inline elevation, session-level AC. What Sam needs for daily ops.

Architecture: gsh binary + libgsh library.
  common/: AC validation, CR posting, Chronicle env.
  machine/: headless executor.
  human/: reedline shell, prompt, interceptor.

6 open design questions documented.
MVP: machine mode first (~200 lines Rust).
Before building: resolve Q2 (session vs per-command AC) and Q6 (full shell vs bash wrapper).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 23:20:27 -04:00
commit 3a2ed1ed42
parent 6833d34e68
1 changed file with 232 additions and 0 deletions
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -0,0 +1,232 @@
+# gsh — GCAP Governed Shell
+
+## Design Exploration
+
+**Status:** Exploration (pre-build)
+**Spec:** GCAP-SPEC-SHELLBOUND-BROKER-0001 (Layer 3)
+**Language:** Rust
+**License:** Apache 2.0
+
+---
+
+## What gsh is
+
+gsh is the binary that makes GCAP tangible. Everything else in the stack — Chronicle, GSAP brokers, connectors, functions, specs — is infrastructure for the moment an operator types into gsh or an AI agent invokes it.
+
+Two modes. Same governance logic. Different UX.
+
+---
+
+## Mode 1: Machine (headless, GSAP-governed)
+
+**Invoked by:** `gsh --exec "command"` or piped JSON on stdin
+**Who uses it:** AI agents (SK plugin), CI/CD, Logic Apps, bxnet-ops
+**Auto-detected:** No TTY → machine mode automatically
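
The auto-detection rule above can be sketched as a small pure function. This is a minimal sketch, not the gsh implementation: the names `Mode`, `detect_mode`, and `detect_mode_from_env` are hypothetical, and it assumes that an explicit `--exec` argument also selects machine mode even when a TTY is attached.

```rust
use std::io::{self, IsTerminal};

/// The two modes named in the design.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Mode {
    Machine,
    Human,
}

/// Pure decision function, so the rule can be tested without a real terminal.
/// `exec_arg` is the value passed to `--exec`, if any.
/// `stdin_is_tty` says whether stdin is attached to a terminal.
/// ASSUMPTION: `--exec` forces machine mode regardless of the TTY.
pub fn detect_mode(exec_arg: Option<&str>, stdin_is_tty: bool) -> Mode {
    if exec_arg.is_some() || !stdin_is_tty {
        Mode::Machine
    } else {
        Mode::Human
    }
}

/// Convenience wrapper that asks the OS whether stdin is a terminal.
pub fn detect_mode_from_env(exec_arg: Option<&str>) -> Mode {
    detect_mode(exec_arg, io::stdin().is_terminal())
}
```

Keeping the decision separate from the `is_terminal()` call means the rule itself can be unit-tested in CI, where no TTY is ever attached.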
+
+### Protocol
+
+```
+1. Read GSAP_AC from env var or --ac flag.
+   The AC was issued by the broker BEFORE gsh was invoked.
+   gsh does not request ACs. The caller does.
+   This separation is intentional:
+     SK plugin requests AC → gsh consumes it → gsh posts CR.
+
+2. Validate AC:
+   R-22: single-use (filesystem registry)
+   R-23: corpus_entry_cid matches installed version
+   R-24: parameters_cid matches input
+
+3. exec(sh, -c, command)
+   Capture stdout, stderr, exit code.
+   Read Chronicle env vars from eBPF companion:
+     CHRONICLE_SESSION_ID
+     CHRONICLE_BEHAVIOR_STATE
+     CHRONICLE_MERKLE_ROOT
+
+4. Post CR to broker.
+   outcome: completed (exit 0) | failed (exit != 0)
+   Include Chronicle evidence from env.
+
+5. Print JSON to stdout:
+   {
+     "success": true,
+     "exit_code": 0,
+     "output": "...",
+     "lineage_cid": "sha256:...",
+     "session_url": "https://broker/governance/session/{ctx}/"
+   }
+
+6. Exit with mapped code:
+   0 = success (CR: completed)
+   1 = execution failure (CR: failed)
+   2 = authorization failure (no valid AC)
+   3 = governance violation (CR: violated)
+   4 = broker unavailable (self-auth fallback)
+```
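
One way the R-22 single-use check in step 2 could work with a filesystem registry is to create one marker file per AC and rely on exclusive file creation. The sketch below is an illustration under assumptions: `claim_ac`, `Claim`, and the registry directory layout are hypothetical, and the AC identifier is assumed to be filename-safe (for example a hex digest).

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind};
use std::path::Path;

/// Result of trying to consume an AC.
#[derive(Debug, PartialEq, Eq)]
pub enum Claim {
    FirstUse,
    AlreadyUsed,
}

/// Mark `ac_id` as used by creating a marker file in `registry_dir`.
/// `create_new(true)` fails with `AlreadyExists` when the file is present,
/// so two concurrent processes cannot both claim the same AC.
/// ASSUMPTION: `ac_id` contains no path separators.
pub fn claim_ac(registry_dir: &Path, ac_id: &str) -> io::Result<Claim> {
    fs::create_dir_all(registry_dir)?;
    match OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(registry_dir.join(ac_id))
    {
        Ok(_) => Ok(Claim::FirstUse),
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Ok(Claim::AlreadyUsed),
        Err(e) => Err(e),
    }
}
```

A caller would treat `AlreadyUsed` as an authorization failure (exit code 2 in the table above). Other I/O errors are returned to the caller rather than treated as a pass.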
+
+### Why machine mode first
+
+The SK plugin needs it now. Logic Apps needs it now. CI/CD needs it now. Sam needs human mode — but Sam already has bxnet-ops for daily playbook runs. Machine mode is the unblocked path to the AI agent demo.
+
+---
+
+## Mode 2: Human (interactive, TTY-attached)
+
+**Invoked by:** `gsh` (no arguments, TTY detected)
+**Who uses it:** Sam, operators, developers
+**Feel:** bash with governance awareness
+
+### The prompt
+
+```
+[governed] sam@ffc.guildhouse.dev:~$
+```
+
+Governance states:
+- `[governed]` — Active accord. Chronicle live. Green.
+- `[elevated]` — PIM/elevation active. Yellow.
+- `[ungoverned]` — No broker configured. Yellow warning.
+- `[violated]` — eBPF detected violation. Red.
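
The four states map naturally onto an enum that drives the prompt string. This is a minimal sketch with hypothetical names (`GovState`, `render_prompt`). The ANSI color codes (32 green, 33 yellow, 31 red) are an assumption about how the colors listed above would be rendered.

```rust
/// Governance states shown in the prompt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GovState {
    Governed,
    Elevated,
    Ungoverned,
    Violated,
}

impl GovState {
    /// Text shown between the square brackets.
    pub fn label(self) -> &'static str {
        match self {
            GovState::Governed => "governed",
            GovState::Elevated => "elevated",
            GovState::Ungoverned => "ungoverned",
            GovState::Violated => "violated",
        }
    }

    /// ANSI SGR foreground color (ASSUMPTION: 32 green, 33 yellow, 31 red).
    pub fn ansi_color(self) -> u8 {
        match self {
            GovState::Governed => 32,
            GovState::Elevated | GovState::Ungoverned => 33,
            GovState::Violated => 31,
        }
    }
}

/// Build the prompt, optionally wrapping the state tag in color codes.
pub fn render_prompt(state: GovState, user: &str, host: &str, cwd: &str, color: bool) -> String {
    let tag = format!("[{}]", state.label());
    let tag = if color {
        format!("\x1b[{}m{}\x1b[0m", state.ansi_color(), tag)
    } else {
        tag
    };
    format!("{} {}@{}:{}$ ", tag, user, host, cwd)
}
```

With `color` set to `false`, `render_prompt(GovState::Governed, "sam", "ffc.guildhouse.dev", "~", false)` yields the prompt shown above followed by a trailing space.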
+
+### Session lifecycle
+
+```
+1. Shell starts. Reads GSAP_BROKER_URL.
+2. If broker: authenticate, establish session accord, Chronicle SESSION_STARTED.
+3. If no broker: yellow banner, ungoverned mode, still works.
+4. Operator types commands.
+5. Some commands need elevation → inline prompt.
+6. On exit: Chronicle SESSION_ENDED.
+```
+
+### Command categorization (not every command is governed)
+
+**Free commands** — ls, cat, echo, grep, cd, pwd.
+No governance overhead. No Chronicle. Standard POSIX.
+
+**Observed commands** — file writes, network connections matching declared endpoints.
+Chronicle records but does not gate. Passive lineage.
+
+**Governed commands** — playbook patterns, infrastructure mutations, privileged ops.
+AC required before exec. CR posted after. Full GSAP cycle.
+
+The determination is made by the session's accord template. The accord IS the shell policy.
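
To make the three categories concrete, here is a rough sketch of a categorizer driven by a policy object. The `AccordPolicy` shape is hypothetical: the real accord template format is not defined in this document, and Q1 below is still open. The sketch matches only on the program name of the first word, so it does not handle `sudo`, environment assignments, or pipelines.

```rust
/// The three categories described above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Category {
    Free,
    Observed,
    Governed,
}

/// HYPOTHETICAL policy shape: plain lists of program names.
pub struct AccordPolicy {
    pub governed: Vec<String>,
    pub observed: Vec<String>,
}

/// Categorize a command line by its program name.
/// Governed is checked first so the most restrictive match wins.
pub fn categorize(policy: &AccordPolicy, command_line: &str) -> Category {
    let program = match command_line.split_whitespace().next() {
        Some(p) => p,
        None => return Category::Free, // empty input: nothing to govern
    };
    // Compare on the basename so `/usr/bin/curl` matches `curl`.
    let name = program.rsplit('/').next().unwrap_or(program);
    if policy.governed.iter().any(|g| g == name) {
        Category::Governed
    } else if policy.observed.iter().any(|o| o == name) {
        Category::Observed
    } else {
        Category::Free
    }
}
```

Defaulting unknown commands to `Free` mirrors the permissive reading of the text. A strict accord might instead default to `Governed`, which would be a one-line change in the final branch.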
+
+---
+
+## Architecture
+
+```
+gsh (binary crate)
+├── main.rs          — mode detection, CLI
+├── machine/
+│   ├── mod.rs       — machine mode entry
+│   ├── ac.rs        — AC validation (R-22/23/24)
+│   ├── exec.rs      — command execution + capture
+│   └── cr.rs        — CR posting
+├── human/
+│   ├── mod.rs       — human mode entry
+│   ├── shell.rs     — reedline loop
+│   ├── prompt.rs    — [governed] prompt
+│   └── intercept.rs — command categorization
+└── common/
+    ├── mod.rs
+    ├── chronicle.rs  — eBPF env reader
+    ├── accord.rs     — policy enforcement
+    └── gsap.rs       — AC/CR types + HTTP client
+
+libgsh (library crate) — extracted later
+  The governance logic. No UX. No I/O.
+  Designed for extraction to a reusable library.
+```
+
+### Relationship to bxnet-ops
+
+bxnet-ops has `gsap_client.rs` with AC validation and CR posting. gsh extracts this into a standalone binary. The `gsap_client` module becomes `common/gsap.rs` in gsh. bxnet-ops can eventually depend on libgsh instead of maintaining its own GSAP client.
+
+---
+
+## Open Design Questions
+
+### Q1: Command categorization strategy
+How does the shell determine which commands are governed?
+- Option A: Regex patterns in the accord (e.g. `ansible-playbook.*` → governed).
+- Option B: Binary whitelist in corpus map (only corpus binaries are governed).
+- Option C: Heuristic based on capability_mask + endpoint declarations.
+**Leaning:** B for strict mode, C for permissive mode. Accord chooses.
+
+### Q2: Session-level vs per-command AC
+Does the operator get one AC for the whole session, or one per governed command?
+- Session AC: faster UX, one auth for N commands. Risk: over-authorization.
+- Per-command AC: precise, each command separately authorized. Risk: slow for interactive.
+**Leaning:** Session AC for human mode (bounded by the session TTL). Per-command AC for machine mode (the caller provides an AC per invocation). Different modes, different answers.
+
+### Q3: Pipeline governance
+When the operator runs `cmd1 | cmd2 | cmd3`, which is governed?
+- Option A: The pipeline as a whole (one AC for the pipeline).
+- Option B: Each command individually (three ACs).
+- Option C: Only the first command (the rest inherit the session).
+**Leaning:** C. The pipeline is one operation. The first command is the intent.
+
+### Q4: Accord policy storage
+Where does gsh read the accord policy?
+- Option A: From the AC itself (the broker embeds it).
+- Option B: From a local policy file (fetched at session start).
+- Option C: From a K8s ConfigMap / environment variable.
+**Leaning:** A. The AC already carries accord_template. gsh resolves the template to a policy at session start. The broker is authoritative for what the accord means.
+
+### Q5: Long-running command AC expiry
+What happens if a command takes 2 hours but the AC expires after 30 minutes?
+- Option A: AC expiry kills the command.
+- Option B: AC expiry flags but doesn't kill (CR records the overage).
+- Option C: gsh requests AC extension from broker mid-execution.
+**Leaning:** B. Don't kill running infra operations. Record the overage. The CR is honest.
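
Under Option B, the only computation gsh needs is how far past expiry the command ran. A minimal sketch follows. The function name `ac_overage` is hypothetical, and how the overage would be encoded in the CR is not specified here.

```rust
use std::time::{Duration, SystemTime};

/// Time the command kept running after the AC expired.
/// Returns `None` when the command finished at or before expiry.
pub fn ac_overage(expires_at: SystemTime, finished_at: SystemTime) -> Option<Duration> {
    match finished_at.duration_since(expires_at) {
        // Finished strictly after expiry: report the difference.
        Ok(d) if !d.is_zero() => Some(d),
        // Finished exactly at expiry, or before it (duration_since returns Err).
        _ => None,
    }
}
```

For the example in the question (expiry at 30 minutes, completion at 2 hours), the recorded overage is 90 minutes.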
+
+### Q6: Full shell vs bash wrapper
+Is gsh a full shell implementation or a wrapper around bash?
+- Full shell: custom parser, built-in command set. Maximum control. Massive scope.
+- Bash wrapper: exec(bash) with eBPF + GSAP wrapping. Minimal scope. Less control.
+**Leaning:** Bash wrapper for MVP. Full shell is a multi-year project. The governance logic is the value, not the shell parser. bash is the shell. gsh is the governance layer.
+
+---
+
+## MVP Scope (machine mode first)
+
+~200 lines of Rust. One week.
+
+```bash
+gsh --exec "ansible-playbook site.yml"
+```
+
+1. Read `GSAP_AC` from environment (JSON or base64).
+2. Validate AC (corpus CID, params CID, single-use).
+3. `exec(sh, -c, command)`.
+4. Capture stdout/stderr/exit code.
+5. Post CR to broker.
+6. Print JSON to stdout.
+7. Exit with mapped code.
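
Steps 3, 4, and 7 can be sketched with the standard library alone. This is an illustration under assumptions, not the MVP code: `Outcome`, `Captured`, `run_captured`, and `outcome_of` are hypothetical names, the exit codes follow the table in the Protocol section, and a POSIX `sh` is assumed to be on the `PATH`.

```rust
use std::io;
use std::process::Command;

/// Governance outcomes and their process exit codes (from the Protocol table).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Outcome {
    Completed,
    Failed,
    Unauthorized,
    Violated,
    BrokerUnavailable,
}

impl Outcome {
    pub fn exit_code(self) -> i32 {
        match self {
            Outcome::Completed => 0,
            Outcome::Failed => 1,
            Outcome::Unauthorized => 2,
            Outcome::Violated => 3,
            Outcome::BrokerUnavailable => 4,
        }
    }
}

/// Captured result of one command.
pub struct Captured {
    pub stdout: String,
    pub stderr: String,
    /// `None` when the child was terminated by a signal.
    pub exit_code: Option<i32>,
}

/// Run `command` through `sh -c` and capture stdout, stderr, and exit code.
pub fn run_captured(command: &str) -> io::Result<Captured> {
    let out = Command::new("sh").arg("-c").arg(command).output()?;
    Ok(Captured {
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
        exit_code: out.status.code(),
    })
}

/// Map the child's exit status onto the CR outcome: exit 0 is completed,
/// anything else (including death by signal) is failed.
pub fn outcome_of(captured: &Captured) -> Outcome {
    if captured.exit_code == Some(0) {
        Outcome::Completed
    } else {
        Outcome::Failed
    }
}
```

Note that the child's own exit code (for example 7) is reported in the JSON, while the gsh process exits with the mapped governance code (1 for any failure).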
+
+Human mode builds on top. Same governance logic. Different UX layer.
+
+---
+
+## The billing drain connection
+
+```
+gsh --exec → CR posted → lineage_cid in JSON
+  ↓
+SK plugin reads lineage_cid
+  ↓
+Chronicle: GSAP_CR_RECEIVED
+  ↓
+FunctionRuntime.dispatch("GSAP_CR_RECEIVED")
+  ↓
+BillingProcessor.handle()
+  ↓
+Invoice line item with Chronicle CID
+  ↓
+Auditor: the invoice IS the governance proof
+```
+
+This is the complete chain from operator keystroke to auditable invoice.