gsh/DESIGN.md
Tyler J King 3a2ed1ed42 feat: gsh governed shell — design exploration
DESIGN.md: complete architecture exploration for gsh,
the GCAP governed shell binary.

Two modes:
  Machine: headless JSON I/O, GSAP AC consumption,
    CR posting, exit code governance mapping.
    Auto-detected: no TTY → machine mode.
    What SK plugin and Logic Apps need now.
  Human: interactive, [governed] prompt,
    inline elevation, session-level AC.
    What Sam needs for daily ops.

Architecture: gsh binary + libgsh library.
  common/: AC validation, CR posting, Chronicle env.
  machine/: headless executor.
  human/: reedline shell, prompt, interceptor.

6 open design questions documented.
MVP: machine mode first (~200 lines Rust).

Before building: resolve Q2 (session vs per-command AC)
and Q6 (full shell vs bash wrapper).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 23:20:27 -04:00


gsh — GCAP Governed Shell

Design Exploration

Status: Exploration (pre-build)
Spec: GCAP-SPEC-SHELLBOUND-BROKER-0001 (Layer 3)
Language: Rust
License: Apache 2.0


What gsh is

gsh is the binary that makes GCAP tangible. Everything else in the stack — Chronicle, GSAP brokers, connectors, functions, specs — is infrastructure for the moment an operator types into gsh or an AI agent invokes it.

Two modes. Same governance logic. Different UX.


Mode 1: Machine (headless, GSAP-governed)

Invoked by: gsh --exec "command" or piped JSON on stdin
Who uses it: AI agents (SK plugin), CI/CD, Logic Apps, bxnet-ops
Auto-detected: No TTY → machine mode automatically
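A minimal sketch of the mode detection described above, using only the Rust standard library. It assumes "No TTY" means stdin is not a terminal and that the flag is passed as a separate `--exec` argument; `Mode`, `detect_mode`, and `detect_mode_from_process` are hypothetical names, not part of any existing gsh code.

```rust
use std::io::IsTerminal;

/// The two modes named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Machine,
    Human,
}

/// Pure decision function, kept free of I/O so it can be unit-tested.
/// Assumption: --exec always selects machine mode, even on a TTY.
pub fn detect_mode(has_exec_flag: bool, stdin_is_tty: bool) -> Mode {
    if has_exec_flag || !stdin_is_tty {
        Mode::Machine
    } else {
        Mode::Human
    }
}

/// Thin wrapper that inspects the real process state.
/// Assumption: the TTY check is made on stdin.
pub fn detect_mode_from_process() -> Mode {
    let has_exec_flag = std::env::args().any(|a| a == "--exec");
    detect_mode(has_exec_flag, std::io::stdin().is_terminal())
}
```

Keeping the decision separate from the process inspection means main.rs could test the mode table without faking a terminal.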

Protocol

1. Read GSAP_AC from env var or --ac flag.
   The AC was issued by the broker BEFORE gsh was invoked.
   gsh does not request ACs. The caller does.
   This separation is intentional:
     SK plugin requests AC → gsh consumes it → gsh posts CR.

2. Validate AC:
   R-22: single-use (filesystem registry)
   R-23: corpus_entry_cid matches installed version
   R-24: parameters_cid matches input

3. exec(sh, -c, command)
   Capture stdout, stderr, exit code.
   Read Chronicle env vars from eBPF companion:
     CHRONICLE_SESSION_ID
     CHRONICLE_BEHAVIOR_STATE
     CHRONICLE_MERKLE_ROOT

4. Post CR to broker.
   outcome: completed (exit 0) | failed (exit != 0) | violated (governance violation, see exit code 3)
   Include Chronicle evidence from env.

5. Print JSON to stdout:
   {
     "success": true,
     "exit_code": 0,
     "output": "...",
     "lineage_cid": "sha256:...",
     "session_url": "https://broker/governance/session/{ctx}/"
   }

6. Exit with mapped code:
   0 = success (CR: completed)
   1 = execution failure (CR: failed)
   2 = authorization failure (no valid AC)
   3 = governance violation (CR: violated)
   4 = broker unavailable (self-auth fallback)

Why machine mode first

The SK plugin needs it now. Logic Apps needs it now. CI/CD needs it now. Sam needs human mode — but Sam already has bxnet-ops for daily playbook runs. Machine mode is the unblocked path to the AI agent demo.


Mode 2: Human (interactive, TTY-attached)

Invoked by: gsh (no arguments, TTY detected)
Who uses it: Sam, operators, developers
Feel: bash with governance awareness

The prompt

[governed] sam@ffc.guildhouse.dev:~$

Governance states:

  • [governed] — Active accord. Chronicle live. Green.
  • [elevated] — PIM/elevation active. Yellow.
  • [ungoverned] — No broker configured. Yellow warning.
  • [violated] — eBPF detected violation. Red.
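The four states map naturally onto an enum. The sketch below renders the prompt shown above, using standard ANSI SGR colour codes (32 green, 33 yellow, 31 red). `GovernanceState` and `render_prompt` are hypothetical names, and the `color` switch is an assumption added so the output can be compared as plain text.

```rust
/// Governance states shown in the prompt tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GovernanceState {
    Governed,
    Elevated,
    Ungoverned,
    Violated,
}

impl GovernanceState {
    /// Text inside the square brackets.
    pub fn label(self) -> &'static str {
        match self {
            GovernanceState::Governed => "governed",
            GovernanceState::Elevated => "elevated",
            GovernanceState::Ungoverned => "ungoverned",
            GovernanceState::Violated => "violated",
        }
    }

    /// ANSI SGR foreground colour: green, yellow, yellow, red.
    pub fn ansi_color(self) -> u8 {
        match self {
            GovernanceState::Governed => 32,
            GovernanceState::Elevated | GovernanceState::Ungoverned => 33,
            GovernanceState::Violated => 31,
        }
    }
}

/// Build the prompt string, e.g. "[governed] sam@host:~$ ".
pub fn render_prompt(
    state: GovernanceState,
    user: &str,
    host: &str,
    cwd: &str,
    color: bool,
) -> String {
    let tag = format!("[{}]", state.label());
    let tag = if color {
        // Colour only the tag, then reset attributes.
        format!("\x1b[{}m{}\x1b[0m", state.ansi_color(), tag)
    } else {
        tag
    };
    format!("{tag} {user}@{host}:{cwd}$ ")
}
```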

Session lifecycle

1. Shell starts. Reads GSAP_BROKER_URL.
2. If broker: authenticate, establish session accord, Chronicle SESSION_STARTED.
3. If no broker: yellow banner, ungoverned mode, still works.
4. Operator types commands.
5. Some commands need elevation → inline prompt.
6. On exit: Chronicle SESSION_ENDED.

Command categorization (not every command is governed)

Free commands — ls, cat, echo, grep, cd, pwd. No governance overhead. No Chronicle. Standard POSIX.

Observed commands — file writes, network connections matching declared endpoints. Chronicle records but does not gate. Passive lineage.

Governed commands — playbook patterns, infrastructure mutations, privileged ops. AC required before exec. CR posted after. Full GSAP cycle.

The determination is made by the session's accord template. The accord IS the shell policy.
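One way the three categories could be resolved from a policy, as a sketch only. `AccordPolicy` and its two lists are hypothetical (the accord template format is not defined here), matching is by program name only, and defaulting unknown commands to Observed is an assumption chosen as the conservative middle ground.

```rust
/// The three command categories described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Free,
    Observed,
    Governed,
}

/// Hypothetical policy resolved from the session's accord template.
pub struct AccordPolicy {
    pub free: Vec<String>,
    pub governed: Vec<String>,
}

/// Categorize a command line by its program name.
/// Limitations: ignores quoting, env-var prefixes, and wrappers like sudo.
pub fn categorize(policy: &AccordPolicy, command_line: &str) -> Category {
    let program = match command_line.split_whitespace().next() {
        Some(p) => p,
        None => return Category::Free, // empty line: nothing to run
    };
    // Strip any directory prefix: /usr/bin/ls -> ls.
    let name = program.rsplit('/').next().unwrap_or(program);

    // Governed is checked first so it wins if a name appears in both lists.
    if policy.governed.iter().any(|g| g == name) {
        Category::Governed
    } else if policy.free.iter().any(|f| f == name) {
        Category::Free
    } else {
        // Assumption: anything unlisted is recorded but not gated.
        Category::Observed
    }
}
```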


Architecture

gsh (binary crate)
├── main.rs          — mode detection, CLI
├── machine/
│   ├── mod.rs       — machine mode entry
│   ├── ac.rs        — AC validation (R-22/23/24)
│   ├── exec.rs      — command execution + capture
│   └── cr.rs        — CR posting
├── human/
│   ├── mod.rs       — human mode entry
│   ├── shell.rs     — reedline loop
│   ├── prompt.rs    — [governed] prompt
│   └── intercept.rs — command categorization
└── common/
    ├── mod.rs
    ├── chronicle.rs  — eBPF env reader
    ├── accord.rs     — policy enforcement
    └── gsap.rs       — AC/CR types + HTTP client

libgsh (library crate) — extracted later
  The governance logic. No UX. No I/O.
  Designed for extraction to a reusable library.
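As an illustration of what common/chronicle.rs might contain, the sketch below reads the three Chronicle variables named in the protocol. `ChronicleEvidence` and its methods are hypothetical names. Taking the lookup as a closure is a design assumption that keeps the reader testable without touching the real environment, and treating an empty value as absent is also an assumption.

```rust
/// Chronicle evidence attached to a CR. Each field is optional because
/// the eBPF companion may not be running.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct ChronicleEvidence {
    pub session_id: Option<String>,
    pub behavior_state: Option<String>,
    pub merkle_root: Option<String>,
}

impl ChronicleEvidence {
    /// Build evidence from any key lookup (real env, test map, ...).
    pub fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        // Assumption: an empty value is treated the same as a missing one.
        let get = |key: &str| lookup(key).filter(|v| !v.is_empty());
        ChronicleEvidence {
            session_id: get("CHRONICLE_SESSION_ID"),
            behavior_state: get("CHRONICLE_BEHAVIOR_STATE"),
            merkle_root: get("CHRONICLE_MERKLE_ROOT"),
        }
    }

    /// Read from the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }

    /// True when all three values are present.
    pub fn is_complete(&self) -> bool {
        self.session_id.is_some()
            && self.behavior_state.is_some()
            && self.merkle_root.is_some()
    }
}
```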

Relationship to bxnet-ops

bxnet-ops has gsap_client.rs with AC validation and CR posting. gsh extracts this into a standalone binary. The gsap_client module becomes common/gsap.rs in gsh. bxnet-ops can eventually depend on libgsh instead of maintaining its own GSAP client.


Open Design Questions

Q1: Command categorization strategy

How does the shell determine which commands are governed?

  • Option A: Regex patterns in the accord (e.g. ansible-playbook.* → governed).
  • Option B: Binary whitelist in corpus map (only corpus binaries are governed).
  • Option C: Heuristic based on capability_mask + endpoint declarations.

Leaning: B for strict mode, C for permissive mode. Accord chooses.

Q2: Session-level vs per-command AC

Does the operator get one AC for the whole session, or one per governed command?

  • Session AC: faster UX, one auth for N commands. Risk: over-authorization.
  • Per-command AC: precise, each command separately authorized. Risk: slow for interactive.

Leaning: Session AC for human mode (bound by session TTL). Per-command for machine mode (caller provides AC per invocation). Different modes, different answers.

Q3: Pipeline governance

When the operator runs cmd1 | cmd2 | cmd3, which is governed?

  • Option A: The pipeline as a whole (one AC for the pipeline).
  • Option B: Each command individually (three ACs).
  • Option C: Only the first command (the rest inherit the session).

Leaning: C. The pipeline is one operation. The first command is the intent.

Q4: Accord policy storage

Where does gsh read the accord policy?

  • Option A: From the AC itself (the broker embeds it).
  • Option B: From a local policy file (fetched at session start).
  • Option C: From a K8s ConfigMap / environment variable.

Leaning: A. The AC already carries accord_template. gsh resolves the template to a policy at session start. The broker is authoritative for what the accord means.

Q5: Long-running command AC expiry

If a command takes 2 hours but the AC expires in 30 minutes?

  • Option A: AC expiry kills the command.
  • Option B: AC expiry flags but doesn't kill (CR records the overage).
  • Option C: gsh requests AC extension from broker mid-execution.

Leaning: B. Don't kill running infra operations. Record the overage. The CR is honest.
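Under Option B, the only extra work is computing how far the command ran past the AC expiry so the CR can record it. A minimal sketch, assuming both timestamps are available as `SystemTime` values; `overage` is a hypothetical helper name.

```rust
use std::time::{Duration, SystemTime};

/// How long a command ran past its AC expiry.
/// Returns None when the command finished at or before the expiry.
pub fn overage(expires_at: SystemTime, finished_at: SystemTime) -> Option<Duration> {
    finished_at
        .duration_since(expires_at) // Err if finished_at is earlier
        .ok()
        .filter(|d| !d.is_zero()) // finishing exactly at expiry is not an overage
}
```

With the example in the question (a 2-hour command against a 30-minute AC), the recorded overage would be 90 minutes, and the command is left running throughout.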

Q6: Full shell vs bash wrapper

Is gsh a full shell implementation or a wrapper around bash?

  • Full shell: custom parser, built-in command set. Maximum control. Massive scope.
  • Bash wrapper: exec(bash) with eBPF + GSAP wrapping. Minimal scope. Less control.

Leaning: Bash wrapper for MVP. Full shell is a multi-year project. The governance logic is the value, not the shell parser. bash is the shell. gsh is the governance layer.

MVP Scope (machine mode first)

~200 lines of Rust. One week.

gsh --exec "ansible-playbook site.yml"
  1. Read GSAP_AC from environment (JSON or base64).
  2. Validate AC (corpus CID, params CID, single-use).
  3. exec(sh, -c, command).
  4. Capture stdout/stderr/exit code.
  5. Post CR to broker.
  6. Print JSON to stdout.
  7. Exit with mapped code.
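Step 2 is the part with the most subtlety. The sketch below shows one possible shape for the three checks (R-22/23/24) using a filesystem registry, where an AC is claimed by atomically creating a marker file. Everything here is an assumption for illustration: the `Ac` fields beyond the two CIDs named in the protocol, the `id` used as a filename (assumed filename-safe), the `AcError` variants, and the choice to check the CIDs before claiming so a mismatched AC is not burned.

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical subset of AC fields needed for validation.
pub struct Ac {
    pub id: String, // assumed unique and safe to use as a filename
    pub corpus_entry_cid: String,
    pub parameters_cid: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AcError {
    AlreadyUsed,        // R-22
    CorpusMismatch,     // R-23
    ParametersMismatch, // R-24
    Registry(String),   // registry directory unusable
}

/// Validate an AC and claim it. On Ok the AC is marked as used.
pub fn validate_ac(
    ac: &Ac,
    installed_corpus_cid: &str,
    input_parameters_cid: &str,
    registry_dir: &Path,
) -> Result<(), AcError> {
    // R-23: the AC must be bound to the installed corpus version.
    if ac.corpus_entry_cid != installed_corpus_cid {
        return Err(AcError::CorpusMismatch);
    }
    // R-24: the AC must be bound to the parameters being run.
    if ac.parameters_cid != input_parameters_cid {
        return Err(AcError::ParametersMismatch);
    }
    // R-22: create_new fails if the marker already exists, so two
    // concurrent gsh processes cannot both claim the same AC.
    let marker = registry_dir.join(&ac.id);
    match OpenOptions::new().write(true).create_new(true).open(&marker) {
        Ok(_) => Ok(()),
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Err(AcError::AlreadyUsed),
        Err(e) => Err(AcError::Registry(e.to_string())),
    }
}
```

Any Err from this function would map to exit code 2 (authorization failure) in step 7.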

Human mode builds on top. Same governance logic. Different UX layer.


The billing drain connection

gsh --exec → CR posted → lineage_cid in JSON
  ↓
SK plugin reads lineage_cid
  ↓
Chronicle: GSAP_CR_RECEIVED
  ↓
FunctionRuntime.dispatch("GSAP_CR_RECEIVED")
  ↓
BillingProcessor.handle()
  ↓
Invoice line item with Chronicle CID
  ↓
Auditor: the invoice IS the governance proof

This is the complete chain from operator keystroke to auditable invoice.
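The middle of the chain (dispatch to a billing handler) can be pictured with a small event registry. This is a toy model, not the actual FunctionRuntime or BillingProcessor: the `Event` and `InvoiceLine` fields, the `register` method, and the handler signature are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical Chronicle event carrying the CR's lineage CID.
pub struct Event {
    pub kind: String,
    pub lineage_cid: String,
}

/// Hypothetical invoice line that cites the Chronicle CID as its proof.
#[derive(Debug, PartialEq, Eq)]
pub struct InvoiceLine {
    pub description: String,
    pub chronicle_cid: String,
}

type Handler = Box<dyn Fn(&Event) -> Option<InvoiceLine>>;

/// Toy runtime: routes events to handlers registered for their kind.
#[derive(Default)]
pub struct FunctionRuntime {
    handlers: HashMap<String, Vec<Handler>>,
}

impl FunctionRuntime {
    pub fn register(&mut self, kind: &str, handler: Handler) {
        self.handlers.entry(kind.to_string()).or_default().push(handler);
    }

    /// Run every handler for this event kind and collect invoice lines.
    pub fn dispatch(&self, event: &Event) -> Vec<InvoiceLine> {
        self.handlers
            .get(&event.kind)
            .into_iter()
            .flatten()
            .filter_map(|handler| handler(event))
            .collect()
    }
}

pub struct BillingProcessor;

impl BillingProcessor {
    /// Turn a received CR into one invoice line keyed by its CID.
    pub fn handle(event: &Event) -> Option<InvoiceLine> {
        if event.kind != "GSAP_CR_RECEIVED" {
            return None;
        }
        Some(InvoiceLine {
            description: "governed execution".to_string(),
            chronicle_cid: event.lineage_cid.clone(),
        })
    }
}
```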