21 commits

Author SHA256 Message Date
f810537581 feat(libgsh): Phase 0 — typed Did on AcPrincipal
`AcPrincipal.did: Option<String>` → `Option<guildhouse_did::Did>`.
The AuthorizationContext now carries a W3C-canonical typed DID;
malformed DIDs fail at deserialize time rather than propagating
into the corpus_check / session state.

SessionState.principal stays a String — it can also hold a Unix
username in ungoverned mode, so a typed Did would force
Option<Did> there and complicate the chain. The render at
SessionState::from_ac now goes Did → as_str() instead of cloning
the legacy String. Behaviour at the audit-leaf level is
unchanged when the AC carries a valid `did:web:...` payload.
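
A minimal sketch of the parse-at-the-boundary idea described above. The `Did` newtype here is a hypothetical stand-in for `guildhouse_did::Did` (whose real API is not shown in this log), the shape check is a simplification of the W3C DID syntax, and `render_principal` is an illustrative name for the Did-to-String render.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for `guildhouse_did::Did`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Did(String);

impl FromStr for Did {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Simplified check: "did:" + lowercase method + ":" + non-empty id.
        let rest = s
            .strip_prefix("did:")
            .ok_or_else(|| format!("not a DID: {:?}", s))?;
        let (method, id) = rest
            .split_once(':')
            .ok_or_else(|| format!("missing method-specific id: {:?}", s))?;
        let method_ok = !method.is_empty()
            && method.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit());
        if !method_ok || id.is_empty() {
            return Err(format!("malformed DID: {:?}", s));
        }
        Ok(Did(s.to_owned()))
    }
}

impl Did {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Render the session principal as a plain String: the DID when the AC
/// carries one, otherwise the Unix username (ungoverned mode).
pub fn render_principal(did: Option<&Did>, unix_user: &str) -> String {
    match did {
        Some(d) => d.as_str().to_owned(),
        None => unix_user.to_owned(),
    }
}
```

With this shape a malformed value is rejected where it is parsed, and everything downstream only ever sees a `Did` that passed the check.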

Phase 0 of DESIGN-DID-INTEGRATION-2026-04-29 §5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tyler J King <tking@guildhouse.dev>
2026-05-01 06:28:19 -04:00
91f027ae61 libgsh: complete scenario coverage for corpus_check execution paths
Adds the ReadFailed scenario (binary path resolves to a directory so
exists() succeeds but read() fails) and a scenarios coverage map at the
top of the test module. The map links each test to the audit fix
scenarios:

- valid CID, content matches: Allowed
- valid CID at admission, tampered content at execution: ContentMismatch
- missing binary where directory exists: Denied (sanity preserved)
- binary present but unreadable: ReadFailed (fail-closed)

Plus the existing sentinels for ungoverned-CID and corpus-not-mounted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tyler J King <tking@guildhouse.dev>
2026-04-25 03:18:56 -04:00
13b393a7f1 libgsh: verify corpus binary content before allowing execution
corpus_check() previously returned Allowed as soon as it found a file by
name in the corpus directory keyed by CID. The CID acted as a directory
label, not a content commitment. An attacker with write access to the
corpus directory could plant a malicious binary under a legitimate CID
and it would execute with that CID's authorization.

This change hashes the binary at the resolved path and compares to the
CID its directory is named for. Mismatches return a new ContentMismatch
variant; unreadable binaries return ReadFailed. Both are execution-denied
states — main.rs handles each explicitly with exit code 3 (previously
used only for Denied).
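
The check described above can be sketched roughly as follows. The four outcome names come from this commit; the function name `corpus_check_sketch`, its signature, and the caller-supplied `digest` closure (a stand-in for whatever hash actually produces the CID) are assumptions for illustration.

```rust
use std::fs;
use std::path::Path;

/// Outcome names follow the commit message; the rest is illustrative.
#[derive(Debug, PartialEq, Eq)]
pub enum CorpusOutcome {
    Allowed,
    Denied,
    ContentMismatch,
    ReadFailed,
}

/// Check `<base>/<cid>/<command>`. `digest` maps file bytes to the same
/// string form as the CID (assumed; the real hash is not shown here).
pub fn corpus_check_sketch(
    base: &Path,
    cid: &str,
    command: &str,
    digest: impl Fn(&[u8]) -> String,
) -> CorpusOutcome {
    let path = base.join(cid).join(command);
    if !path.exists() {
        // Binary absent from the corpus directory: denied.
        return CorpusOutcome::Denied;
    }
    let bytes = match fs::read(&path) {
        Ok(b) => b,
        // Present but unreadable (for example, a directory): fail closed.
        Err(_) => return CorpusOutcome::ReadFailed,
    };
    if digest(&bytes) == cid {
        CorpusOutcome::Allowed
    } else {
        // Content and directory name disagree: the tamper signal.
        CorpusOutcome::ContentMismatch
    }
}
```

Every variant other than `Allowed` is an execution-denied state, so a caller would map all three to the governance-violation exit code.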

Both error classes emit Chronicle-shaped structured tracing events
(target: "chronicle") with stable event_type constants from
libgsh::chronicle_events. The field shape matches what substrate-chronicle's
post-io_uring emission API is expected to require; migration to direct
Chronicle emission becomes a mechanical translation once that API
stabilizes.

The tamper signal is that the binary and its directory name disagree.
This closes the execution-path half of the CID-content verification
audit fix — admission (corpus-operator) rejects CID forgery before the
enforcement ConfigMap is written; execution (libgsh) rejects any tamper
that landed after admission. Defense in depth across both layers.

Kernel-layer CID verification (the third layer, where eBPF LSM hooks
authorize by binary name via FNV-1a hash of comm) is explicit backlog,
deferred to Bifrost where in-kernel hashing or a ring-buffer userspace
verifier can be evaluated properly.
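
For reference, FNV-1a itself is small enough to sketch here. The commit does not say which width the eBPF hook uses, so the 64-bit variant below is an assumption, and `fnv1a_64` is an illustrative name.

```rust
/// 64-bit FNV-1a over raw bytes (width assumed, see lead-in).
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}
```

Because the input is only the process name, two different binaries that share a name hash identically; that is why this layer authorizes by name and is not a content commitment.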

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tyler J King <tking@guildhouse.dev>
2026-04-25 03:02:37 -04:00
Tyler J King
d0b9ca0e6a feat: detect Windows Entra/local principal in WSL2
Session principal resolution chain:
  GSH_PRINCIPAL → BASCULE_DISPLAY_NAME → derive from DID → whoami()
  GSH_DID → BASCULE_USER_DID → whoami()
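
The two chains above can be sketched as ordered fallbacks. The function names, the injected `env` and `derive` closures, and the choice to treat an empty variable as unset are assumptions; only the variable names and their order come from this commit.

```rust
/// First non-empty value among `keys`, looked up through `env`.
fn first_set(env: &dyn Fn(&str) -> Option<String>, keys: &[&str]) -> Option<String> {
    keys.iter().filter_map(|&k| env(k)).find(|v| !v.is_empty())
}

/// DID chain: GSH_DID, then BASCULE_USER_DID, then the whoami() result.
pub fn resolve_did(env: &dyn Fn(&str) -> Option<String>, whoami: &str) -> String {
    first_set(env, &["GSH_DID", "BASCULE_USER_DID"]).unwrap_or_else(|| whoami.to_owned())
}

/// Principal chain: GSH_PRINCIPAL, then BASCULE_DISPLAY_NAME, then a name
/// derived from the DID (via the caller-supplied `derive`), then whoami().
pub fn resolve_principal(
    env: &dyn Fn(&str) -> Option<String>,
    derive: &dyn Fn(&str) -> Option<String>,
    whoami: &str,
) -> String {
    first_set(env, &["GSH_PRINCIPAL", "BASCULE_DISPLAY_NAME"])
        .or_else(|| first_set(env, &["GSH_DID", "BASCULE_USER_DID"]).and_then(|did| derive(&did)))
        .unwrap_or_else(|| whoami.to_owned())
}
```

In a real binary `env` could be `&|k| std::env::var(k).ok()`; injecting it keeps the chain testable without touching the process environment.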

.gshrc Windows identity detection:
  Entra-joined: whoami /upn → tking@guildhouse.dev → DID
  Domain-joined: USERNAME@USERDNSDOMAIN → DID
  Local: USERNAME only (no DID)
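
A sketch of the three detection cases above, written in Rust for illustration even though the commit implements this in .gshrc. The UPN-to-DID mapping mirrors the `did:web:<domain>/user/<name>` shape seen elsewhere in this log and is an assumption, as are the lowercasing inside the DID and all names below.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct WindowsIdentity {
    pub principal: String,
    pub did: Option<String>,
}

/// Assumed mapping: name@domain -> did:web:<domain>/user/<name>.
fn upn_to_did(upn: &str) -> Option<String> {
    let (name, domain) = upn.split_once('@')?;
    if name.is_empty() || domain.is_empty() {
        return None;
    }
    Some(format!("did:web:{}/user/{}", domain.to_lowercase(), name.to_lowercase()))
}

/// `upn` is the output of `whoami /upn` if any; `username` and `dns_domain`
/// correspond to USERNAME and USERDNSDOMAIN.
pub fn detect_identity(upn: Option<&str>, username: &str, dns_domain: Option<&str>) -> WindowsIdentity {
    // Entra-joined: a UPN is available.
    if let Some(upn) = upn.filter(|u| u.contains('@')) {
        return WindowsIdentity { principal: upn.to_owned(), did: upn_to_did(upn) };
    }
    // Domain-joined: build USERNAME@USERDNSDOMAIN.
    if let Some(domain) = dns_domain.filter(|d| !d.is_empty()) {
        let upn = format!("{}@{}", username, domain);
        let did = upn_to_did(&upn);
        return WindowsIdentity { principal: upn, did };
    }
    // Local account: username only, no DID.
    WindowsIdentity { principal: username.to_owned(), did: None }
}
```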

Governed sessions (Bascule) override with authenticated identity.
Non-WSL2 environments fall back silently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 14:15:05 -04:00
Tyler J King
b363d1da3b feat: Substrate WSL2 distro builder (Fedora 41)
scripts/build-substrate-wsl2.sh — builds a custom Fedora WSL2 distro
with gsh as the default governed shell for the operator user.

Image contents (337MB):
  Fedora 41 + systemd
  gsh as login shell (/usr/local/bin/gsh)
  bascule-proxy for governed cluster connections
  kubectl + helm with corpus symlinks
  SSH aliases: dev.gsh, stg.gsh
  WSL2 config: systemd=true, default user=operator

Build: docker builds Fedora rootfs, exports as tar
Import: wsl --import substrate-gsh C:\WSL\substrate-gsh substrate-gsh.tar
Boot: wsl -d substrate-gsh → governed shell prompt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:59:03 -04:00
Tyler J King
02bcd58c99 feat: display DEFCON posture in banner + prompt
Reads BASCULE_DEFCON_LEVEL from env. At DEFCON <5:
  Banner: DEFCON level + label (RESTRICTED/CRITICAL/LOCKDOWN) + reason
  Prompt: [restricted] at DEFCON 3, [DEFCON] at ≤2

DEFCON 5 (peacetime): no DEFCON line in banner, normal prompt.
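
The prompt rule above, as a small sketch. Treating a missing or unparsable BASCULE_DEFCON_LEVEL as 5, and showing the normal `[governed]` tag at DEFCON 4, are assumptions; the commit only specifies levels 3 and ≤2 for the prompt. Function names are illustrative.

```rust
/// Parse the raw env value; anything outside 1..=5 falls back to 5 (assumed).
pub fn parse_defcon(raw: Option<&str>) -> u8 {
    raw.and_then(|s| s.trim().parse::<u8>().ok())
        .filter(|n| (1..=5).contains(n))
        .unwrap_or(5)
}

/// Prompt tag for a given DEFCON level.
pub fn prompt_tag(level: u8) -> &'static str {
    match level {
        0..=2 => "[DEFCON]",
        3 => "[restricted]",
        // Levels 4 and 5: normal prompt (level 4 is an assumption).
        _ => "[governed]",
    }
}

/// The banner carries a DEFCON line only below level 5.
pub fn banner_shows_defcon(level: u8) -> bool {
    level < 5
}
```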

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:10:17 -04:00
Tyler J King
3c4042ce8e feat: WSL2 jumphost image builder
scripts/build-wsl2-image.sh — idempotent setup for governed jumphost.

Installs: gsh, kubectl, helm (all to ~/.local/bin, no sudo needed)
Configures: corpus directory, SSH aliases (dev.gsh, stg.gsh),
  .gshrc environment defaults
Export: --export flag prints wsl --export/import commands

No sudo required for gsh/corpus/config setup. System packages
(curl, git, etc.) prompt for manual install if sudo unavailable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:04:38 -04:00
Tyler J King
231bed5f92 feat: display name in banner + prompt
Banner shows human-readable principal and DID on separate lines:
  Principal: tking@guildhouse.dev
  DID:       did:web:guildhouse.dev/user/tking

Prompt uses short name: [governed] tking@gsh

Reads BASCULE_DISPLAY_NAME env. Fallback: parse DID to name@domain.
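
A sketch of the DID fallback and the short prompt name, assuming DIDs of the `did:web:<domain>/user/<name>` shape shown in the banner example. Function names are illustrative.

```rust
/// Parse `did:web:<domain>/user/<name>` into `<name>@<domain>`.
pub fn display_name_from_did(did: &str) -> Option<String> {
    let rest = did.strip_prefix("did:web:")?;
    let (domain, name) = rest.split_once("/user/")?;
    if domain.is_empty() || name.is_empty() {
        return None;
    }
    Some(format!("{}@{}", name, domain))
}

/// Short prompt form: the part before '@', followed by "@gsh".
pub fn prompt_name(display: &str) -> String {
    let short = display.split('@').next().unwrap_or(display);
    format!("[governed] {}@gsh", short)
}
```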

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:18:13 -04:00
Tyler J King
ff16b5642e feat: re-enable session lifecycle CRs with session_end outcome
Broker now supports session-scoped ACs that stay active across
multiple CRs. Session start posts 'completed' CR, session end
posts 'session_end' CR which consumes the AC.
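
A toy model of the session-scoped AC behaviour described above: the AC stays active across 'completed' CRs and is consumed by the 'session_end' CR. This is an in-memory illustration with hypothetical names, not the broker's implementation.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Completed,
    SessionEnd,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CrError {
    UnknownOrConsumedAc,
}

/// Toy broker that tracks which session ACs are still active.
#[derive(Default)]
pub struct ToyBroker {
    active: HashSet<String>,
}

impl ToyBroker {
    pub fn issue_session_ac(&mut self, ac_id: &str) {
        self.active.insert(ac_id.to_owned());
    }

    /// Accept a CR against an active AC; only `SessionEnd` consumes it.
    pub fn post_cr(&mut self, ac_id: &str, outcome: Outcome) -> Result<(), CrError> {
        if !self.active.contains(ac_id) {
            return Err(CrError::UnknownOrConsumedAc);
        }
        if outcome == Outcome::SessionEnd {
            self.active.remove(ac_id);
        }
        Ok(())
    }
}
```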

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 02:05:58 -04:00
Tyler J King
740fcdb3b5 fix: skip session lifecycle CRs, fix CR evidence schema
Session start/end CRs used invalid outcome values (session_started,
session_ended) not in broker's Outcome enum, causing 422. Also, broker
consumes AC on first CR, blocking subsequent per-command CRs.

Skipped session lifecycle CRs until session-scoped AC model is
implemented. Per-command CRs still post on governed command completion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:15:34 -04:00
Tyler J King
e7bc2ee2b4 fix: align CR format with broker CompleteRequest schema
- Add session_id field to CrEvidence (broker expects it)
- Change merkle_root to Option<String> (null vs empty string)
- Change events to Vec<serde_json::Value> (broker expects list[dict])
- Fixes 422 Unprocessable Entity on CR posting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:11:22 -04:00
Tyler J King
5f7f9c0ff7 feat: configurable corpus base dir + Bascule dev config
- corpus_check_with_base(): accepts explicit base directory
- corpus_check(): still defaults to /opt/substrate/corpus
- Improved corpus test with actual Allowed/Denied assertions
- Updated bascule-dev.toml with [gsap] section and shell_command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:46:27 -04:00
Tyler J King
fcc7758249 feat: dev Bascule + dual-cluster connectivity complete
Phase 4b: local dev Bascule for Docker Desktop K8s access.

Dev Bascule:
  Binary: substrate/target/release/bascule (14MB)
  Config: ~/.config/bascule/bascule-dev.toml
    Permissive auth, direct dispatch, localhost:2223
  Keys: ~/.config/bascule/keys/dev_{host,ca}_key
  Startup: scripts/start-dev-bascule.sh

Dual-cluster connectivity verified:
  ssh dev.gsh '!whoami'
    → session created, did:web:guildhouse.dev/user/tyler ✓
  ssh stg.gsh '!whoami'
    → session created, did:web:guildhouse.dev/user/tyler ✓

Topology:
  WSL2 → dev.gsh  (localhost:2223, permissive)
  WSL2 → stg.gsh  (178.104.110.197:30222, Hetzner)
  WSL2 → prod.gsh (178.104.110.197:30222, Hetzner)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:37:52 -04:00
Tyler J King
0adcf12e78 feat: Phase 4 — Bascule dual-cluster connectivity
Hetzner Bascule: already deployed (pod 756dccc486-wwg78, 5d uptime).
  Exposed via NodePort 30222 on all worker nodes.
  SSH responds: russh_0.46.0, session created, DID resolved.

Connectivity verified from WSL2:
  ssh stg.gsh '!whoami'
  → session: 019d4fd5-..., did: did:web:guildhouse.dev/user/tyler
  → tier: ReadOnly, roles: ["operator"]

Config files:
  config/bascule-dev.toml    — permissive auth, localhost:2223
  config/bascule-hetzner.toml — reference for Hetzner NodePort endpoints

bascule-proxy built and installed (~/.local/bin/).
  Config at ~/.config/bascule/config.toml
  Hosts: dev (localhost:2223), stg/prod (178.104.110.197:30222)

SSH config: stg.gsh and prod.gsh aliases configured.

The full chain: WSL2 → SSH → Bascule (Hetzner) → session + DID.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:14:51 -04:00
Tyler J King
63a6c0c520 feat: gsh human mode — interactive governed shell with reedline
Phase 3 / Sprint 2 finish line.

Human mode: reedline REPL with governed prompt.
  [governed] tyler@gsh:~$

Mode detection:
  --exec "cmd"              → machine mode (unchanged)
  --ungoverned --exec "cmd" → ungoverned machine (unchanged)
  (no --exec, TTY attached) → human mode (NEW)
  (no --exec, no TTY)       → error
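
The detection table above, sketched as a pure function. The `Mode` type and `detect_mode` signature are illustrative; how `--ungoverned` interacts with human mode is not specified in this commit, so the sketch ignores the flag there.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    Machine { governed: bool },
    Human,
}

/// `exec` is the --exec argument if given; `tty_attached` says whether
/// stdin is a terminal.
pub fn detect_mode(exec: Option<&str>, ungoverned: bool, tty_attached: bool) -> Result<Mode, String> {
    match (exec, tty_attached) {
        // --exec always selects machine mode, governed unless --ungoverned.
        (Some(_), _) => Ok(Mode::Machine { governed: !ungoverned }),
        // No --exec with a TTY: interactive human mode.
        (None, true) => Ok(Mode::Human),
        // No --exec and no TTY: nothing sensible to do.
        (None, false) => Err("no --exec given and no TTY attached".to_owned()),
    }
}
```

In a real binary `tty_attached` could come from `std::io::IsTerminal` (`std::io::stdin().is_terminal()`, Rust 1.70 or later); passing it in keeps the decision testable.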

Command classification per-keystroke (libgsh/classifier.rs):
  Free:       ls, cat, grep, echo, cd, git, ssh, curl — zero overhead
  Governed:   binaries in corpus dir — via org-ops wrapper, CR posted
  Ungoverned: not in corpus but on PATH — warn + execute
  Denied:     in corpus manifest but binary removed (killswitch active)
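
A sketch of the four-way classification above. The free-command list is taken from this commit; the `Corpus` struct, the precedence of Free over the corpus lookup, and the omission of a PATH lookup for the Ungoverned case are simplifying assumptions.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Class {
    Free,
    Governed,
    Ungoverned,
    Denied,
}

const FREE: &[&str] = &["ls", "cat", "grep", "echo", "cd", "git", "ssh", "curl"];

/// Hypothetical view of the corpus for one CID.
pub struct Corpus<'a> {
    /// Names the corpus manifest declares.
    pub manifest: HashSet<&'a str>,
    /// Binaries currently present in the corpus directory.
    pub present: HashSet<&'a str>,
}

/// Classify by the first word of the line; None for an empty line.
pub fn classify(line: &str, corpus: &Corpus) -> Option<Class> {
    let name = line.split_whitespace().next()?;
    let class = if FREE.contains(&name) {
        Class::Free
    } else if corpus.present.contains(name) {
        Class::Governed
    } else if corpus.manifest.contains(name) {
        // Declared but removed from the directory: killswitch active.
        Class::Denied
    } else {
        // PATH lookup omitted in this sketch.
        Class::Ungoverned
    };
    Some(class)
}
```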

Session lifecycle:
  Start:  validate AC, post SESSION_STARTED CR, print banner
  Active: classify each command, governed ops post lightweight CRs
  End:    print summary (governed/free/denied/ungoverned), post SESSION_ENDED CR

Banner: principal, corpus, session ID, expiry, risk level
Prompt coloring from risk level:
  Baseline/Standard: green [governed]
  Elevated:          yellow [elevated]
  High/Critical:     red [HIGH]
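
The colour mapping above as a sketch. The risk levels and tags come from this commit; the use of standard ANSI SGR colour codes (32 green, 33 yellow, 31 red) and the names below are assumptions about how the colouring is rendered.

```rust
#[derive(Debug, Clone, Copy)]
pub enum RiskLevel {
    Baseline,
    Standard,
    Elevated,
    High,
    Critical,
}

/// Coloured prompt prefix for a risk level.
pub fn prompt_prefix(risk: RiskLevel) -> String {
    let (code, tag) = match risk {
        RiskLevel::Baseline | RiskLevel::Standard => (32, "[governed]"),
        RiskLevel::Elevated => (33, "[elevated]"),
        RiskLevel::High | RiskLevel::Critical => (31, "[HIGH]"),
    };
    // Wrap the tag in the colour code and reset afterwards.
    format!("\x1b[{}m{}\x1b[0m", code, tag)
}
```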

New modules:
  libgsh/classifier.rs — command classification against corpus (4 tests)
  libgsh/session.rs    — session state tracking
  gsh/human.rs         — reedline REPL, prompt, banner, summary

Machine mode: zero changes (regression tested).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:44:34 -04:00
Tyler J King
919d8accde refactor: extract libgsh from monolith
Phase 2 of the WSL2 jumphost build.

Workspace: gsh/ (binary) + libgsh/ (library).

libgsh modules:
  ac.rs       — AC validation (R-22 single-use, R-23 corpus match, expiry)
  cr.rs       — CR construction + broker posting + inline AC request
  corpus.rs   — Corpus directory gate (killswitch)
  config.rs   — GshConfig from environment
  registry.rs — Filesystem-based consumed AC registry

gsh/src/main.rs: CLI only (~170 lines).
  Clap args, mode detection, calls libgsh, formats output.

11 unit tests in libgsh:
  ac: valid AC, expired, corpus mismatch, replay, missing context_id
  cr: broker URL formatting
  corpus: ungoverned skip, missing dir, command name extraction
  registry: consume and check
  config: default corpus_cid

Zero behavior change. Same JSON output, same exit codes,
same flags, same env vars, same broker interaction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:31:50 -04:00
Tyler J King
af11a797ee feat: per-session AC consumption + corpus gate + exit codes
Phase 1 of the WSL2 jumphost build.

Three execution models:
  1. Pre-issued AC: GSAP_AC='...' gsh --exec "cmd"
     Caller provides AC. gsh validates (R-22/23/24), executes, posts CR.
     For: Bascule, SK plugin, CI/CD.

  2. Inline AC request: GSAP_BROKER_URL=... gsh --exec "cmd"
     Backward compatible fallback.

  3. Ungoverned: gsh --ungoverned --exec "cmd"
     No AC, no CR, no corpus check. Dev mode.

AC validation (validate_pre_issued_ac):
  R-22: Single-use — filesystem registry at ~/.gsh/consumed/{context_id}
  R-23: Corpus match — AC corpus_entry_cid vs GSAP_CORPUS_CID env
  R-24: (parameters_cid field parsed, verification at broker)
  Expiry check — AC expires_at vs now
  Replay detection — consumed context_ids rejected
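
A sketch of the expiry, corpus-match, and single-use checks above, using one marker file per consumed context_id. The struct, the error enum, the check order, and `expires_at` as Unix seconds are assumptions; R-24 is left out because the commit says that verification happens at the broker.

```rust
use std::fs::{self, OpenOptions};
use std::io::ErrorKind;
use std::path::Path;

pub struct Ac<'a> {
    pub context_id: &'a str,
    pub corpus_entry_cid: &'a str,
    /// Expiry as Unix seconds (assumed representation).
    pub expires_at: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AcError {
    Expired,
    CorpusMismatch,
    Replay,
    Registry(String),
}

/// Validate the AC and mark it consumed under `consumed_dir`.
pub fn validate_and_consume(ac: &Ac, expected_cid: &str, now: u64, consumed_dir: &Path) -> Result<(), AcError> {
    if now >= ac.expires_at {
        return Err(AcError::Expired);
    }
    if ac.corpus_entry_cid != expected_cid {
        return Err(AcError::CorpusMismatch);
    }
    fs::create_dir_all(consumed_dir).map_err(|e| AcError::Registry(e.to_string()))?;
    // create_new fails if the marker already exists, so the existence check
    // and the consume step happen in one filesystem operation.
    let marker = consumed_dir.join(ac.context_id);
    match OpenOptions::new().write(true).create_new(true).open(marker) {
        Ok(_) => Ok(()),
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Err(AcError::Replay),
        Err(e) => Err(AcError::Registry(e.to_string())),
    }
}
```

The sketch assumes context_id is safe to use as a file name; a real registry would need to validate or encode it first.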

Corpus directory gate (corpus_check):
  /opt/substrate/corpus/{cid}/{command_name}
  If binary missing from corpus dir → denied (exit 3)
  The live killswitch: remove binary from corpus dir to revoke

Exit codes aligned with DESIGN.md:
  0 = success, 1 = exec failure, 2 = auth failure,
  3 = governance violation, 125 = gsh internal error
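
The exit-code table above, expressed as an enum so call sites name the condition instead of a bare number. The type and variant names are illustrative; only the numeric values come from this commit.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GshExit {
    Success,
    ExecFailure,
    AuthFailure,
    GovernanceViolation,
    InternalError,
}

impl GshExit {
    /// Numeric process exit code for this condition.
    pub fn code(self) -> i32 {
        match self {
            GshExit::Success => 0,
            GshExit::ExecFailure => 1,
            GshExit::AuthFailure => 2,
            GshExit::GovernanceViolation => 3,
            GshExit::InternalError => 125,
        }
    }
}
```

A caller would end with something like `std::process::exit(GshExit::AuthFailure.code())`.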

JSON output: new fields ac_mode ("pre-issued"|"inline"|"session"|"ungoverned"), corpus_cid
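
One way the ac_mode value could be derived from the flag and environment. The four strings come from the line above, GSAP_AC from the pre-issued model, and GSAP_SESSION_AC from the session-mode commit; the precedence between them and treating an empty variable as unset are assumptions.

```rust
/// Pick the ac_mode string from the --ungoverned flag and the environment.
pub fn ac_mode(ungoverned_flag: bool, env: &dyn Fn(&str) -> Option<String>) -> &'static str {
    // A variable counts as set only when it is present and non-empty.
    let set = |key: &str| env(key).map_or(false, |v| !v.is_empty());
    if ungoverned_flag {
        "ungoverned"
    } else if set("GSAP_SESSION_AC") {
        "session"
    } else if set("GSAP_AC") {
        "pre-issued"
    } else {
        // No AC supplied: request one inline from the broker.
        "inline"
    }
}
```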

Tested against live fastapi-gsap broker:
  Inline AC: backward compat ✓
  Pre-issued AC from broker: validated + CR posted ✓
  Expired AC: exit 2 ✓
  Replay detection: exit 2 ✓
  Ungoverned mode: no governance overhead ✓

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:07:45 -04:00
Tyler J King
2f9401d3c4 feat: session mode — one AC for N commands
Per-invocation AC is the primitive for single governed ops.
Session mode is for scripts, pipelines, and interactive shells.

Per-invocation (unchanged):
  gsh --exec "cmd"  →  1 AC + 1 CR per command

Session mode (new):
  eval "$(gsh session-start --scope shell:session)"
  gsh --exec "cmd1"  # reuses session AC
  gsh --exec "cmd2"
  eval "$(gsh session-end)"

Detection: GSAP_SESSION_AC in environment.
Subcommands: session-start, session-end, session-status

Known gap: broker currently marks AC consumed after first CR.
Session commands 2+ get 404 on CR. This is a broker-side fix
(needs session AC type). gsh handles it gracefully.

Tested against live fastapi-gsap Spoke on Hetzner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 19:17:36 -04:00
Tyler J King
eab034f0cc feat: gsh machine mode — first governed shell execution
~200 lines of Rust. Every command: AC → exec → CR → CID.

Usage:
  gsh --exec "echo hello"
  gsh --exec "hcloud server list" --json
  gsh --exec "ansible-playbook site.yml" --dry-run

Flow:
  1. SHA-256 hash the command
  2. POST /governance/authorize/ → AC ID
  3. exec(sh, -c, command) → capture stdout/stderr/exit
  4. POST /governance/complete/ → receipt + Chronicle CID
  5. Print stdout (passthrough) or JSON (structured)
  6. Exit with command's exit code
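
The flow above can be sketched with the broker behind a trait. The trait, `StubBroker`, and `run_governed` are hypothetical; the stub replaces the two HTTP calls, the command hash is passed in precomputed because the standard library has no SHA-256, and mapping a signal-terminated command to 125 is an assumption.

```rust
use std::process::Command;

pub trait Broker {
    fn authorize(&mut self, command_hash: &str) -> Result<String, String>;
    fn complete(&mut self, ac_id: &str, exit_code: i32) -> Result<String, String>;
}

/// In-memory stand-in for the broker; it only records completions.
#[derive(Default)]
pub struct StubBroker {
    pub completed: Vec<(String, i32)>,
}

impl Broker for StubBroker {
    fn authorize(&mut self, command_hash: &str) -> Result<String, String> {
        Ok(format!("ac-for-{}", command_hash))
    }
    fn complete(&mut self, ac_id: &str, exit_code: i32) -> Result<String, String> {
        self.completed.push((ac_id.to_owned(), exit_code));
        Ok(format!("cr-{}", self.completed.len()))
    }
}

pub struct RunResult {
    pub ac_id: String,
    pub receipt: Option<String>,
    pub stdout: String,
    pub exit_code: i32,
}

pub fn run_governed(broker: &mut dyn Broker, command: &str, command_hash: &str, dry_run: bool) -> Result<RunResult, String> {
    // Step 2: authorize before anything runs.
    let ac_id = broker.authorize(command_hash)?;
    if dry_run {
        // Dry run stops after the AC: no exec, no CR.
        return Ok(RunResult { ac_id, receipt: None, stdout: String::new(), exit_code: 0 });
    }
    // Step 3: run through sh -c and capture output and exit status.
    let output = Command::new("sh").arg("-c").arg(command).output().map_err(|e| e.to_string())?;
    let exit_code = output.status.code().unwrap_or(125);
    // Step 4: report completion and keep the receipt.
    let receipt = broker.complete(&ac_id, exit_code)?;
    let stdout = String::from_utf8_lossy(&output.stdout).into_owned();
    Ok(RunResult { ac_id, receipt: Some(receipt), stdout, exit_code })
}
```

The caller would then print `stdout` or a JSON rendering of the result and exit with `exit_code`, matching steps 5 and 6.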

Environment:
  GSAP_BROKER_URL   http://fastapi-gsap:8000
  GSAP_AGENT_DID    did:web:bxnet.../agent/platform-ops
  GSAP_TOKEN        Bearer token (optional)
  GSAP_CORPUS_CID   sha256:{image_digest} (optional)

Tested against live fastapi-gsap Spoke broker on Hetzner:
  dry-run: AC only ✓
  live exec: stdout passthrough + CID ✓
  JSON mode: ac_id + cr_id + chronicle_cid ✓
  exit code: 42 passed through ✓

The command_hash in the AC request means the broker knows
WHAT will be executed before authorizing. Not just "was
this agent allowed" but "was this exact command authorized."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 19:01:22 -04:00
Tyler J King
3a2ed1ed42 feat: gsh governed shell — design exploration
DESIGN.md: complete architecture exploration for gsh,
the GCAP governed shell binary.

Two modes:
  Machine: headless JSON I/O, GSAP AC consumption,
    CR posting, exit code governance mapping.
    Auto-detected: no TTY → machine mode.
    What SK plugin and Logic Apps need now.
  Human: interactive, [governed] prompt,
    inline elevation, session-level AC.
    What Sam needs for daily ops.

Architecture: gsh binary + libgsh library.
  common/: AC validation, CR posting, Chronicle env.
  machine/: headless executor.
  human/: reedline shell, prompt, interceptor.

6 open design questions documented.
MVP: machine mode first (~200 lines Rust).

Before building: resolve Q2 (session vs per-command AC)
and Q6 (full shell vs bash wrapper).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 23:20:27 -04:00
6833d34e68 Initial commit
2026-03-31 03:15:52 +00:00