DESIGN.md: complete architecture exploration for gsh,
the GCAP governed shell binary.
Two modes:
Machine: headless JSON I/O, GSAP AC consumption,
CR posting, exit code governance mapping.
Auto-detected: no TTY → machine mode.
What SK plugin and Logic Apps need now.
Human: interactive, [governed] prompt,
inline elevation, session-level AC.
What Sam needs for daily ops.
Architecture: gsh binary + libgsh library.
common/: AC validation, CR posting, Chronicle env.
machine/: headless executor.
human/: reedline shell, prompt, interceptor.
6 open design questions documented.
MVP: machine mode first (~200 lines Rust).
Before building: resolve Q2 (session vs per-command AC)
and Q6 (full shell vs bash wrapper).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gsh — GCAP Governed Shell
Design Exploration
Status: Exploration (pre-build)
Spec: GCAP-SPEC-SHELLBOUND-BROKER-0001 (Layer 3)
Language: Rust
License: Apache 2.0
What gsh is
gsh is the binary that makes GCAP tangible. Everything else in the stack — Chronicle, GSAP brokers, connectors, functions, specs — is infrastructure for the moment an operator types into gsh or an AI agent invokes it.
Two modes. Same governance logic. Different UX.
Mode 1: Machine (headless, GSAP-governed)
Invoked by: gsh --exec "command" or piped JSON on stdin
Who uses it: AI agents (SK plugin), CI/CD, Logic Apps, bxnet-ops
Auto-detected: No TTY → machine mode automatically
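The auto-detection rule can be sketched as a small pure function plus a thin wrapper that reads the real process state. This is a minimal sketch, not the planned implementation: the names `Mode`, `select_mode`, and `detect_mode` are hypothetical, and checking stdin (rather than stdout) for a terminal is an assumption the document does not settle.

```rust
use std::io::IsTerminal;

/// The two modes described in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Machine,
    Human,
}

/// Pure decision rule, so it can be tested without a real terminal.
/// `has_exec_flag` stands for `--exec` being present on the command line.
pub fn select_mode(has_exec_flag: bool, stdin_is_tty: bool) -> Mode {
    if has_exec_flag || !stdin_is_tty {
        Mode::Machine
    } else {
        Mode::Human
    }
}

/// Reads argv and the stdin terminal state of the current process.
/// Assumption: stdin is the stream that decides "TTY attached".
pub fn detect_mode() -> Mode {
    let has_exec_flag = std::env::args().any(|a| a == "--exec");
    select_mode(has_exec_flag, std::io::stdin().is_terminal())
}
```

Keeping the rule separate from the I/O means piped JSON on stdin, `--exec`, and an interactive terminal can each be covered by a unit test.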
Protocol
1. Read GSAP_AC from env var or --ac flag.
   The AC was issued by the broker BEFORE gsh was invoked.
   gsh does not request ACs. The caller does.
   This separation is intentional:
   SK plugin requests AC → gsh consumes it → gsh posts CR.

2. Validate AC:
   R-22: single-use (filesystem registry)
   R-23: corpus_entry_cid matches installed version
   R-24: parameters_cid matches input

3. exec(sh, -c, command)
   Capture stdout, stderr, exit code.
   Read Chronicle env vars from eBPF companion:
     CHRONICLE_SESSION_ID
     CHRONICLE_BEHAVIOR_STATE
     CHRONICLE_MERKLE_ROOT

4. Post CR to broker.
   outcome: completed (exit 0) | failed (exit != 0)
   Include Chronicle evidence from env.

5. Print JSON to stdout:
   {
     "success": true,
     "exit_code": 0,
     "output": "...",
     "lineage_cid": "sha256:...",
     "session_url": "https://broker/governance/session/{ctx}/"
   }

6. Exit with mapped code:
   0 = success (CR: completed)
   1 = execution failure (CR: failed)
   2 = authorization failure (no valid AC)
   3 = governance violation (CR: violated)
   4 = broker unavailable (self-auth fallback)
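The exit-code table in step 6 could be captured in one enum so that the process exit code and the CR outcome string never drift apart. A minimal sketch: the `Outcome` type and its method names are hypothetical. The protocol names no CR outcome for codes 2 and 4, so the sketch returns `None` for them rather than guessing.

```rust
/// Outcomes that the protocol maps to process exit codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Completed,
    Failed,
    AuthorizationFailure,
    GovernanceViolation,
    BrokerUnavailable,
}

impl Outcome {
    /// Step 4: a child exit code of 0 is "completed", anything else "failed".
    pub fn from_child_exit(code: i32) -> Outcome {
        if code == 0 {
            Outcome::Completed
        } else {
            Outcome::Failed
        }
    }

    /// Step 6: the exit code gsh itself returns to its caller.
    pub fn exit_code(self) -> i32 {
        match self {
            Outcome::Completed => 0,
            Outcome::Failed => 1,
            Outcome::AuthorizationFailure => 2,
            Outcome::GovernanceViolation => 3,
            Outcome::BrokerUnavailable => 4,
        }
    }

    /// CR outcome string where the table names one; `None` otherwise.
    pub fn cr_outcome(self) -> Option<&'static str> {
        match self {
            Outcome::Completed => Some("completed"),
            Outcome::Failed => Some("failed"),
            Outcome::GovernanceViolation => Some("violated"),
            Outcome::AuthorizationFailure | Outcome::BrokerUnavailable => None,
        }
    }
}
```

Note that the child's own exit code (for example 127) is collapsed to 1 here. The original value still travels in the `exit_code` field of the JSON output.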
Why machine mode first
The SK plugin needs it now. Logic Apps needs it now. CI/CD needs it now. Sam needs human mode — but Sam already has bxnet-ops for daily playbook runs. Machine mode is the unblocked path to the AI agent demo.
Mode 2: Human (interactive, TTY-attached)
Invoked by: gsh (no arguments, TTY detected)
Who uses it: Sam, operators, developers
Feel: bash with governance awareness
The prompt
[governed] sam@ffc.guildhouse.dev:~$
Governance states:
- [governed]: Active accord. Chronicle live. Green.
- [elevated]: PIM/elevation active. Yellow.
- [ungoverned]: No broker configured. Yellow warning.
- [violated]: eBPF detected violation. Red.
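One way to render these states is a small enum that owns both the label and the colour. This is a sketch under assumptions: `GovernanceState` and `render_prompt` are hypothetical names, the colours use standard ANSI SGR codes (32 green, 33 yellow, 31 red), and integration with reedline's prompt API is not shown.

```rust
/// The four prompt states listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GovernanceState {
    Governed,
    Elevated,
    Ungoverned,
    Violated,
}

impl GovernanceState {
    /// Text shown between the square brackets.
    pub fn label(self) -> &'static str {
        match self {
            GovernanceState::Governed => "governed",
            GovernanceState::Elevated => "elevated",
            GovernanceState::Ungoverned => "ungoverned",
            GovernanceState::Violated => "violated",
        }
    }

    /// ANSI SGR colour sequence for the state (assumed colour mapping).
    pub fn ansi_color(self) -> &'static str {
        match self {
            GovernanceState::Governed => "\x1b[32m",
            GovernanceState::Elevated | GovernanceState::Ungoverned => "\x1b[33m",
            GovernanceState::Violated => "\x1b[31m",
        }
    }
}

/// Builds a prompt such as `[governed] sam@ffc.guildhouse.dev:~$ `,
/// colouring only the bracketed state and resetting afterwards.
pub fn render_prompt(state: GovernanceState, user: &str, host: &str, cwd: &str) -> String {
    format!(
        "{}[{}]\x1b[0m {}@{}:{}$ ",
        state.ansi_color(),
        state.label(),
        user,
        host,
        cwd
    )
}
```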
Session lifecycle
1. Shell starts. Reads GSAP_BROKER_URL.
2. If broker: authenticate, establish session accord, Chronicle SESSION_STARTED.
3. If no broker: yellow banner, ungoverned mode, still works.
4. Operator types commands.
5. Some commands need elevation → inline prompt.
6. On exit: Chronicle SESSION_ENDED.
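Steps 1 to 3 reduce to a single branch on `GSAP_BROKER_URL`. The sketch below models only that branch. `SessionStart`, `plan_session_start`, and `plan_from_env` are hypothetical names, and treating an empty or whitespace-only value as "no broker" is an assumption. Authentication and the Chronicle SESSION_STARTED event are out of scope here.

```rust
/// What the shell should do at startup, before the first prompt.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionStart {
    /// Broker configured: authenticate and establish a session accord.
    Governed { broker_url: String },
    /// No broker: show the yellow banner and run ungoverned.
    Ungoverned,
}

/// Pure decision on the (optional) broker URL value.
/// Assumption: blank values count as "not configured".
pub fn plan_session_start(broker_url: Option<&str>) -> SessionStart {
    match broker_url.map(str::trim) {
        Some(url) if !url.is_empty() => SessionStart::Governed {
            broker_url: url.to_string(),
        },
        _ => SessionStart::Ungoverned,
    }
}

/// Reads GSAP_BROKER_URL from the process environment.
pub fn plan_from_env() -> SessionStart {
    let value = std::env::var("GSAP_BROKER_URL").ok();
    plan_session_start(value.as_deref())
}
```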
Command categorization (not every command is governed)
Free commands — ls, cat, echo, grep, cd, pwd. No governance overhead. No Chronicle. Standard POSIX.
Observed commands — file writes, network connections matching declared endpoints. Chronicle records but does not gate. Passive lineage.
Governed commands — playbook patterns, infrastructure mutations, privileged ops. AC required before exec. CR posted after. Full GSAP cycle.
The determination is made by the session's accord template. The accord IS the shell policy.
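The three categories might be modelled as follows. This is a deliberately simplified sketch: `AccordPolicy` is a hypothetical stand-in for whatever the accord template resolves to, matching is by program name only, and defaulting unknown programs to Observed is an assumption. In the design above, Observed depends on runtime behaviour such as file writes and network connections, which a name-based check cannot see.

```rust
use std::collections::HashSet;

/// The three command categories described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Free,
    Observed,
    Governed,
}

/// Hypothetical, simplified policy resolved from an accord template.
pub struct AccordPolicy {
    pub free: HashSet<String>,
    pub governed: HashSet<String>,
}

impl AccordPolicy {
    /// Example policy using the free commands named in this document.
    pub fn example() -> Self {
        let to_set = |names: &[&str]| names.iter().map(|s| s.to_string()).collect();
        AccordPolicy {
            free: to_set(&["ls", "cat", "echo", "grep", "cd", "pwd"]),
            governed: to_set(&["ansible-playbook"]),
        }
    }

    /// Categorizes a command line by its first whitespace-separated token.
    /// Governed wins over Free if a name appears in both sets.
    pub fn categorize(&self, command_line: &str) -> Category {
        let program = command_line.split_whitespace().next().unwrap_or("");
        if self.governed.contains(program) {
            Category::Governed
        } else if program.is_empty() || self.free.contains(program) {
            Category::Free
        } else {
            Category::Observed
        }
    }
}
```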
Architecture
gsh (binary crate)
├── main.rs              - mode detection, CLI
├── machine/
│   ├── mod.rs           - machine mode entry
│   ├── ac.rs            - AC validation (R-22/23/24)
│   ├── exec.rs          - command execution + capture
│   └── cr.rs            - CR posting
├── human/
│   ├── mod.rs           - human mode entry
│   ├── shell.rs         - reedline loop
│   ├── prompt.rs        - [governed] prompt
│   └── intercept.rs     - command categorization
└── common/
    ├── mod.rs
    ├── chronicle.rs     - eBPF env reader
    ├── accord.rs        - policy enforcement
    └── gsap.rs          - AC/CR types + HTTP client
libgsh (library crate) — extracted later
The governance logic. No UX. No I/O.
Designed for extraction to a reusable library.
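As an illustration of what `machine/ac.rs` might contain, the sketch below checks R-23 and R-24 by comparing CIDs and enforces R-22 by atomically creating a marker file per AC. Everything except the field names `corpus_entry_cid` and `parameters_cid` is an assumption: the `Ac` struct, the `id` field, the one-file-per-AC registry layout, and the choice to claim the AC only after the other checks pass. The sketch also assumes `id` is safe to use as a file name.

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical minimal view of an AC; only the two CID field names
/// come from this document.
pub struct Ac {
    pub id: String,
    pub corpus_entry_cid: String,
    pub parameters_cid: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AcError {
    AlreadyUsed,
    CorpusMismatch,
    ParametersMismatch,
    Registry(String),
}

pub fn validate_ac(
    ac: &Ac,
    installed_corpus_cid: &str,
    input_parameters_cid: &str,
    registry_dir: &Path,
) -> Result<(), AcError> {
    // R-23: the AC must name the corpus entry that is actually installed.
    if ac.corpus_entry_cid != installed_corpus_cid {
        return Err(AcError::CorpusMismatch);
    }
    // R-24: the AC must be bound to exactly these parameters.
    if ac.parameters_cid != input_parameters_cid {
        return Err(AcError::ParametersMismatch);
    }
    // R-22: single use. `create_new` fails if the marker already exists,
    // so two concurrent invocations cannot both claim the same AC.
    let marker = registry_dir.join(&ac.id);
    match OpenOptions::new().write(true).create_new(true).open(&marker) {
        Ok(_) => Ok(()),
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Err(AcError::AlreadyUsed),
        Err(e) => Err(AcError::Registry(e.to_string())),
    }
}
```

Claiming the AC last means a mismatched AC is not burned by a failed validation. Whether that is the desired behaviour is a policy decision this document does not make.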
Relationship to bxnet-ops
bxnet-ops has gsap_client.rs with AC validation and CR posting. gsh extracts this into a standalone binary. The gsap_client module becomes common/gsap.rs in gsh. bxnet-ops can eventually depend on libgsh instead of maintaining its own GSAP client.
Open Design Questions
Q1: Command categorization strategy
How does the shell determine which commands are governed?
- Option A: Regex patterns in the accord (e.g. ansible-playbook.* → governed).
- Option B: Binary whitelist in corpus map (only corpus binaries are governed).
- Option C: Heuristic based on capability_mask + endpoint declarations.
Leaning: B for strict mode, C for permissive mode. Accord chooses.
Q2: Session-level vs per-command AC
Does the operator get one AC for the whole session, or one per governed command?
- Session AC: faster UX, one auth for N commands. Risk: over-authorization.
- Per-command AC: precise, each command separately authorized. Risk: slow for interactive.
Leaning: Session AC for human mode (bound by session TTL). Per-command for machine mode (caller provides AC per invocation). Different modes, different answers.
Q3: Pipeline governance
When the operator runs cmd1 | cmd2 | cmd3, which is governed?
- Option A: The pipeline as a whole (one AC for the pipeline).
- Option B: Each command individually (three ACs).
- Option C: Only the first command (the rest inherit the session).
Leaning: C. The pipeline is one operation. The first command is the intent.
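If option C is adopted, gsh only needs the first stage of a pipeline to decide whether an AC is required. The sketch below is naive and its helper names are hypothetical. It splits on every `|` character, so it assumes no `|` appears inside quotes or escapes. A real implementation would need the shell's own parser, or bash's if Q6 resolves to the wrapper.

```rust
/// Returns the first stage of a pipeline, trimmed.
/// Naive: does not understand quoting, so `echo "a|b"` is split wrongly.
pub fn first_stage(pipeline: &str) -> &str {
    pipeline.split('|').next().unwrap_or("").trim()
}

/// Returns the program name of the first stage, i.e. the command whose
/// category would govern the whole pipeline under option C.
pub fn governing_program(pipeline: &str) -> Option<&str> {
    first_stage(pipeline).split_whitespace().next()
}
```

For `cmd1 --flag | cmd2 | cmd3` this yields `cmd1`, and the remaining stages would inherit the session.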
Q4: Accord policy storage
Where does gsh read the accord policy?
- Option A: From the AC itself (the broker embeds it).
- Option B: From a local policy file (fetched at session start).
- Option C: From a K8s ConfigMap / environment variable.
Leaning: A. The AC already carries accord_template. gsh resolves the template to a policy at session start. The broker is authoritative for what the accord means.
Q5: Long-running command AC expiry
If a command takes 2 hours but the AC expires in 30 minutes?
- Option A: AC expiry kills the command.
- Option B: AC expiry flags but doesn't kill (CR records the overage).
- Option C: gsh requests AC extension from broker mid-execution.
Leaning: B. Don't kill running infra operations. Record the overage. The CR is honest.
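Under option B, the only computation needed is how far the command ran past the AC's expiry. A minimal sketch, with a hypothetical function name and with both arguments measured from the moment the command started:

```rust
use std::time::Duration;

/// How long a command ran beyond its AC, if at all.
/// `ac_remaining_at_start`: AC lifetime left when the command began.
/// `runtime`: how long the command actually ran.
/// Returns `None` when the command finished within the AC's lifetime.
pub fn overage(ac_remaining_at_start: Duration, runtime: Duration) -> Option<Duration> {
    runtime
        .checked_sub(ac_remaining_at_start)
        .filter(|extra| !extra.is_zero())
}
```

For the example in the question (a 2-hour command against an AC with 30 minutes left), this returns 90 minutes. That is the value the CR would record instead of the command being killed.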
Q6: Full shell vs bash wrapper
Is gsh a full shell implementation or a wrapper around bash?
- Full shell: custom parser, built-in command set. Maximum control. Massive scope.
- Bash wrapper: exec(bash) with eBPF + GSAP wrapping. Minimal scope. Less control.
Leaning: Bash wrapper for MVP. Full shell is a multi-year project. The governance logic is the value, not the shell parser. bash is the shell. gsh is the governance layer.
MVP Scope (machine mode first)
~200 lines of Rust. One week.
gsh --exec "ansible-playbook site.yml"
- Read GSAP_AC from environment (JSON or base64).
- Validate AC (corpus CID, params CID, single-use).
- exec(sh, -c, command).
- Capture stdout/stderr/exit code.
- Post CR to broker.
- Print JSON to stdout.
- Exit with mapped code.
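The execute, capture, and print steps of the MVP can be sketched with the standard library alone. The names `ExecResult`, `run_shell`, `json_string`, and `result_json` are hypothetical. Mapping `output` to stdout only and using -1 when the child is killed by a signal are assumptions. `lineage_cid` and `session_url` are passed in because they come from the broker's response to the CR, which is not modelled here.

```rust
use std::process::Command;

/// Captured result of one `sh -c` invocation.
pub struct ExecResult {
    pub exit_code: i32,
    pub stdout: String,
    pub stderr: String,
}

/// Runs `sh -c <command>` and captures stdout, stderr, and the exit code.
pub fn run_shell(command: &str) -> std::io::Result<ExecResult> {
    let output = Command::new("sh").arg("-c").arg(command).output()?;
    Ok(ExecResult {
        // `code()` is None when the child was terminated by a signal;
        // -1 is a placeholder chosen for this sketch.
        exit_code: output.status.code().unwrap_or(-1),
        stdout: String::from_utf8_lossy(&output.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&output.stderr).into_owned(),
    })
}

/// Minimal JSON string encoder (quotes, backslashes, control characters).
pub fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds the JSON object printed to stdout in machine mode.
pub fn result_json(r: &ExecResult, lineage_cid: &str, session_url: &str) -> String {
    format!(
        "{{\"success\":{},\"exit_code\":{},\"output\":{},\"lineage_cid\":{},\"session_url\":{}}}",
        r.exit_code == 0,
        r.exit_code,
        json_string(&r.stdout),
        json_string(lineage_cid),
        json_string(session_url),
    )
}
```

A real build would likely use a JSON crate instead of the hand-written encoder. It is inlined here only to keep the sketch dependency-free.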
Human mode builds on top. Same governance logic. Different UX layer.
The billing drain connection
gsh --exec → CR posted → lineage_cid in JSON
↓
SK plugin reads lineage_cid
↓
Chronicle: GSAP_CR_RECEIVED
↓
FunctionRuntime.dispatch("GSAP_CR_RECEIVED")
↓
BillingProcessor.handle()
↓
Invoice line item with Chronicle CID
↓
Auditor: the invoice IS the governance proof
This is the complete chain from operator keystroke to auditable invoice.