Initial commit: Kedge network automation platform

Go-based network automation with YANG models, gRPC, Ansible,
Terraform, and Kubernetes integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Tyler King 2026-02-26 12:09:30 -05:00
commit 6058e62348
109 changed files with 5152 additions and 0 deletions

8
.claude/settings.json Normal file
View file

@ -0,0 +1,8 @@
{
"permissions": {
"allow": [
"Read(//home/tking/projects/guildhouse/**)",
"Read(//home/tking/projects/guildhouse/services/guildhouse-proto/**)"
]
}
}

63
.gitignore vendored Normal file
View file

@ -0,0 +1,63 @@
# Binaries
bin/
kedge-cni
kedge-daemon
# Go
*.exe
*.exe~
*.dll
*.so
*.dylib
*.test
*.out
vendor/
# Proto generated code
api/quartermaster/v1/*.pb.go
# Python
__pycache__/
*.py[cod]
*$py.class
*.egg-info/
dist/
build/
.eggs/
*.egg
.venv/
venv/
yang/.venv/
# Secrets — NEVER commit
*.key
*.pem
vault_pass*
.env
.env.*
credentials.json
kubeconfig
*.kubeconfig
# Terraform
terraform/.terraform/
terraform/**/.terraform/
*.tfstate
*.tfstate.*
*.tfplan
.terraform.lock.hcl
# Ansible
*.retry
# IDE
.idea/
.vscode/
*.swp
*.swo
*~
.DS_Store
# Coverage
coverage.out
coverage.html

206
CLAUDE.md Normal file
View file

@ -0,0 +1,206 @@
# CLAUDE.md — Kedge
> See the full architecture specification in the initial project prompt. This file captures conventions and development workflow.
## Quick Reference
- **Language**: Go (CNI plugin + DaemonSet), Python (YANG compiler)
- **Module**: `github.com/guildhouse-co/kedge`
- **Binaries**: `cmd/kedge-cni/` (CNI plugin), `cmd/kedge-daemon/` (DaemonSet)
- **Build**: `make container-build` (production), `make dev && make build` (development)
## Conventions
- Go: Standard project layout. `cmd/` for entrypoints, `internal/` for non-exported.
- Explicit error returns, no panics.
- CNI plugin follows CNI spec exactly.
- Python: Type hints required. `pyang` for YANG validation.
- YAML: 2-space indent.
- Secrets: Never in repo. `.gitignore` excludes `*.key`, `*.pem`, `vault_pass*`.
- Testing: Go `_test.go` files. Python `pytest`. Ansible `molecule/`.
- Container engine: podman (dev), k3s (prod).
---
## Development Workflow
### Container-Based Development (podman-compose)
```bash
# Build the dev container
make dev-build
# Start dev environment
make dev
# Open a shell in the dev container
make dev-exec
# Build binaries (in container)
make build
# Run tests (in container)
make test
# Run linter (in container)
make lint
# Generate proto code (in container)
make proto-gen
# Resolve Go dependencies (in container)
make mod-tidy
# Tear down dev environment
make dev-down
```
### Production Container Images
```bash
# Build production images (CNI + DaemonSet)
make container-build
```
### YANG Compiler
```bash
# Validate YANG models
make yang-validate
# Run Python compiler tests
make python-test
```
Important: The dev environment uses podman-compose. Go, protoc, buf, golangci-lint, and wireguard-tools are all available inside the dev container. No local Go installation is required.
---
## Future: Dynamic Posture Framework
The Guildhouse ecosystem supports a dynamic governance posture system with five levels (5 = normal operations, 1 = lockdown). Each level has associated accords that define allowed capabilities, modes, and operational scope. Kedge enforces posture at the network layer.
**Kedge's role is network enforcement and mesh management during posture transitions.**
### DaemonSet Posture Behavior
The Kedge DaemonSet reads the current posture level from the same `posture_level` BPF map that the shell eBPF programs use (or receives it via notification from the Bascule governance daemon). On posture change, the DaemonSet adjusts:
**Overlay mode (WireGuard mesh):**
| Level | Mesh Behavior |
|-------|---------------|
| 5 | Full mesh. All tunnels active. Standard keepalive. |
| 4 | Full mesh. Increased health check frequency (detect degradation faster). |
| 3 | Full mesh maintained, but new Shellstream sessions require fresh SAT re-verification at handshake. DaemonSet rejects cached SAT validations. |
| 2 | Mesh maintained for management traffic only. Non-critical Shellstream sessions terminated. DaemonSet sends DRAINING signal to active non-essential sessions. |
| 1 | All remote overlay tunnels torn down. Cloud anchor disconnected. Only local loopback Shellstream permitted. The appliance is network-isolated except for physical console. |
**Underlay mode (physical infrastructure programming):**
| Level | Underlay Behavior |
|-------|-------------------|
| 5 | Normal operations. All authorized tiers can trigger infrastructure mutations via Bascule SDK dispatch. NetworkMutationArtifacts notarized per accord. |
| 4 | Normal operations. YANG drift detection frequency increased. |
| 3 | Underlay mutations restricted to master + site_owner tiers. Lower-tier SATs denied underlay mode at Shellstream handshake. Mutation proposals still accepted from all tiers but execution restricted. |
| 2 | Underlay mutations restricted to site_owner only. YANG compiler frozen — no new compilations. Only pre-approved emergency changes permitted. |
| 1 | All underlay operations frozen. No infrastructure mutations. Device configs frozen in place. VLAN interfaces maintained for monitoring but no changes dispatched. |
**Shellstream handshake behavior:**
The DaemonSet's Shellstream handshake termination adjusts per level:
```
Level 5: Standard 3-way handshake. SAT validated from cache. Full capability grant.
Level 4: Standard handshake. SAT validated from cache. Full capability grant.
Additional telemetry: log all new sessions (not just sampled).
Level 3: Strict handshake. SAT re-verified against SPIRE/Vigil (no cache).
Capability grant narrowed per posture accord (PROPOSE revoked for
lower tiers). Handshake latency increases (~50-100ms for re-verification)
but security posture tightens.
Level 2: Restricted handshake. SAT re-verified. Only critical session types
permitted (management, incident response, health monitoring).
Non-critical session types rejected at ATTEST-VERIFY step with
posture-level denial reason. Existing non-critical sessions receive
DRAIN signal.
Level 1: No remote handshakes. All inbound ATTEST-INIT rejected.
All outbound sessions blocked (eBPF shell also enforces this).
Only local Shellstream (loopback) permitted for console management.
```
**CNI plugin behavior:**
The CNI plugin's route programming adjusts on posture transitions:
| Level | Route Changes |
|-------|---------------|
| 5 | Full routes per `shell_state.allowed_targets` and `allowed_mode`. |
| 4 | Same as 5. No route changes. |
| 3 | Remove underlay routes for pods whose `shell_state` shows tiers below master. Overlay routes maintained. |
| 2 | Remove all non-critical routes. Only routes to management endpoints (Quartermaster, SPIRE, Prometheus) and incident response targets retained. |
| 1 | Remove all overlay routes (tunnels down). Remove all underlay routes except monitoring-only (read-only SNMP/health endpoints). |
Route changes on posture transition are triggered by the DaemonSet (which watches `posture_level` changes) notifying the CNI plugin's route manager. The CNI plugin does NOT poll the posture level — it receives route update commands from the DaemonSet.
### Quartermaster Notarization Fidelity
Kedge's DaemonSet adjusts `SessionTransitArtifact` and `NetworkMutationArtifact` submission behavior per level:
| Level | Notarization |
|-------|-------------|
| 5 | Sampled — 10% of routine session transits. All mutations. |
| 4 | Increased — 50% of session transits. All mutations. |
| 3 | Full — 100% of session transits. 100% of mutations. |
| 2 | Full + real-time — 100% with immediate anchor flushing (no batching). |
| 1 | Forensic — 100% with full payload hashing. Anchor flush on every record. Session metadata includes full Shellstream frame headers. |
At level 1, Kedge's DaemonSet also emits a `PostureForensicSnapshot` to Quartermaster containing the current mesh topology, all active tunnel states, all programmed routes, and all pending session state. This is the network-layer forensic evidence.
### VPP Governance Plugin Coordination (Appliance Deployments)
On Guildhouse SD-WAN appliance deployments where VPP handles the data plane, the Kedge DaemonSet coordinates posture changes with the VPP governance plugin:
```
Posture transition signal received
├─ DaemonSet updates its own behavior (overlay/underlay/handshake)
├─ DaemonSet notifies VPP governance plugin via shared memory
│ governance_state.posture_level = new_level
│ VPP plugin adjusts on next poll cycle (~microseconds):
│ Level 5: standard flow rules
│ Level 4: deep packet classification enabled
│ Level 3: non-essential inter-zone traffic deprioritized
│ Level 2: emergency zone policy (critical flows only)
│ Level 1: allowlist-only (everything not pre-approved dropped)
└─ Both DaemonSet and VPP plugin emit posture change telemetry
to their respective ring buffers → Quartermaster
```
The DaemonSet and VPP plugin must reach the new posture level within the same transition window. The DaemonSet updates VPP shared memory BEFORE adjusting its own Shellstream handshake behavior — this ensures the data plane tightens before (or simultaneously with) the session layer.
### Mesh-Wide Posture Visibility
In a multi-cluster deployment, each cluster's Kedge DaemonSet maintains its own posture level. Cross-cluster posture is communicated via Shellstream:
- Each DaemonSet includes the local posture level in Shellstream handshake metadata
- When a cloud anchor DaemonSet receives an ATTEST-INIT from a homelab DaemonSet at level 3, the cloud anchor knows to apply level 3 session restrictions even if the cloud anchor itself is at level 5
- The more restrictive posture wins at any boundary crossing: if the source is at level 3 and the destination is at level 5, the session is governed by level 3 rules
- This is enforced per-session, not per-cluster. A cluster at level 5 can host sessions from a level 3 peer — it just applies level 3 restrictions to those specific sessions
---
## Working Notes
- **Posture transition latency budget**: Kedge's network-level posture transition should complete within 5 seconds for the overlay path (tunnel adjustments, Shellstream session draining) and 10 seconds for the underlay path (route removal, VLAN interface adjustments). The VPP plugin path should complete within 1ms (shared memory update + next poll cycle). These targets ensure that a DEFCON 5→3 escalation restricts network access before an attacker can exploit the window.
- **Graceful session draining on escalation**: When escalating from level 5 to level 3, existing Shellstream sessions from lower-tier operators should not be abruptly terminated. The DaemonSet sends a DRAIN signal, allowing in-flight operations to complete (with a timeout), then tears down the session. Only level 2+ escalation does abrupt termination of non-critical sessions.
- **Posture-aware `SessionTransitArtifact`**: The `SessionTransitArtifact` should include a `posture_level_at_transit` field recording the posture level at the time of the session boundary crossing. This allows Quartermaster queries like "show all session transits that occurred while the cluster was at DEFCON 3" — critical for post-incident forensics.
- **De-escalation route restoration**: When de-escalating (e.g., level 3 → level 4 → level 5), the DaemonSet must restore routes that were removed during escalation. This requires the DaemonSet to maintain a "pre-escalation route table" snapshot so it can cleanly restore. The snapshot is taken at escalation time and stored locally (not in the BPF map — too large).
- **Level 1 cloud anchor behavior**: At level 1, the homelab cluster tears down overlay tunnels to the cloud anchor. The cloud anchor's Kedge DaemonSet should detect this as a posture-driven disconnection (not a failure) based on the DRAIN signal preceding the tunnel teardown. The cloud anchor should NOT attempt reconnection until it receives a de-escalation signal (via an out-of-band channel — physical console, SMS API, or pre-arranged re-attestation ceremony).
- **Testing posture transitions**: Add `posture-transition-test.yml` to Ansible playbooks. Should simulate escalation from 5→1 and back, verifying at each level: correct routes programmed, correct Shellstream handshake behavior, correct notarization fidelity, correct VPP plugin state (if applicable). This is a critical system-level integration test.

33
Containerfile.build Normal file
View file

@ -0,0 +1,33 @@
FROM docker.io/library/golang:1.23-bookworm AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/kedge-cni ./cmd/kedge-cni
RUN CGO_ENABLED=0 go build -o /bin/kedge-daemon ./cmd/kedge-daemon
# --- CNI plugin image (installed into host /opt/cni/bin/) ---
FROM scratch AS cni
COPY --from=builder /bin/kedge-cni /kedge-cni
# --- DaemonSet image ---
FROM docker.io/library/debian:bookworm-slim AS daemon
RUN apt-get update && apt-get install -y --no-install-recommends \
wireguard-tools \
iproute2 \
iptables \
python3 \
python3-pip \
python3-venv \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
COPY --from=builder /bin/kedge-daemon /usr/local/bin/kedge-daemon
COPY yang/ /opt/kedge/yang/
RUN python3 -m venv /opt/kedge/venv \
&& /opt/kedge/venv/bin/pip install --no-cache-dir -r /opt/kedge/yang/requirements.txt
ENTRYPOINT ["kedge-daemon"]

27
Containerfile.dev Normal file
View file

@ -0,0 +1,27 @@
FROM docker.io/library/golang:1.23-bookworm
# Install system dependencies for CNI development.
RUN apt-get update && apt-get install -y --no-install-recommends \
wireguard-tools \
iproute2 \
iptables \
python3 \
python3-pip \
python3-venv \
protobuf-compiler \
ca-certificates \
curl \
git \
jq \
&& rm -rf /var/lib/apt/lists/*
# Install Go tools.
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@latest \
&& go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest \
&& go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
# Install buf for proto management.
RUN curl -sSL "https://github.com/bufbuild/buf/releases/latest/download/buf-$(uname -s)-$(uname -m)" \
-o /usr/local/bin/buf && chmod +x /usr/local/bin/buf
WORKDIR /src

76
Makefile Normal file
View file

@ -0,0 +1,76 @@
.PHONY: all build build-cni build-daemon test lint proto-gen yang-validate \
container-build dev dev-build dev-exec clean
CONTAINER_ENGINE ?= podman
COMPOSE ?= $(CONTAINER_ENGINE)-compose
DEV_CONTAINER := kedge-dev
CNI_BINARY := bin/kedge-cni
DAEMON_BINARY := bin/kedge-daemon
# ---------- Container-based builds (default) ----------
all: container-build
container-build:
$(CONTAINER_ENGINE) build --target cni -t kedge-cni -f Containerfile.build .
$(CONTAINER_ENGINE) build --target daemon -t kedge-daemon -f Containerfile.build .
# ---------- Dev environment (podman-compose) ----------
dev-build:
$(COMPOSE) build
dev:
$(COMPOSE) up -d dev
dev-exec:
$(COMPOSE) exec dev bash
dev-down:
$(COMPOSE) down
# ---------- In-container Go commands (via dev container) ----------
build: build-cni build-daemon
build-cni:
$(COMPOSE) exec dev go build -o $(CNI_BINARY) ./cmd/kedge-cni
build-daemon:
$(COMPOSE) exec dev go build -o $(DAEMON_BINARY) ./cmd/kedge-daemon
test:
$(COMPOSE) exec dev go test ./...
lint:
$(COMPOSE) exec dev golangci-lint run ./...
proto-gen:
$(COMPOSE) exec dev buf generate
mod-tidy:
$(COMPOSE) exec dev go mod tidy
# ---------- YANG / Python ----------
yang-validate:
$(COMPOSE) exec yang-compiler pyang --strict models/sovereign-sdwan.yang
python-test:
$(COMPOSE) exec yang-compiler python -m pytest tests/
# ---------- Local (if Go is installed) ----------
build-local:
go build -o $(CNI_BINARY) ./cmd/kedge-cni
go build -o $(DAEMON_BINARY) ./cmd/kedge-daemon
test-local:
go test ./...
# ---------- Clean ----------
clean:
rm -rf bin/
$(COMPOSE) down -v 2>/dev/null || true

55
README.md Normal file
View file

@ -0,0 +1,55 @@
# Kedge
Dual-mode Kubernetes CNI plugin and node-level DaemonSet for attested connectivity between clusters and managed infrastructure. Part of the [Guildhouse](https://github.com/guildhouse-co/guildhouse) ecosystem.
## Modes
- **Overlay**: Tunnels over networks Kedge doesn't control (WireGuard/VXLAN). For cloud anchors, remote sites, MSP-managed customer environments.
- **Underlay**: Programs the actual physical network fabric via vendor SDK dispatch through Bascule. For sites where you own the iron.
Both modes coexist at the same site. The capability token in a Shellstream handshake determines whether a session gets tunnel access (overlay) or infrastructure mutation authority (underlay).
## Components
| Component | Language | Description |
|-----------|----------|-------------|
| CNI Plugin | Go | Multus secondary network attachment (`net1`), route programming |
| DaemonSet | Go | WireGuard mesh, Shellstream termination, QM notarization, VLAN management |
| YANG Compiler | Python | Device-agnostic policy → vendor-specific config (FortiOS, VyOS, UniFi) |
## Build
```bash
make build # Build both binaries
make test # Run Go tests
make lint # golangci-lint
make proto-gen # Generate gRPC client code from protos
make yang-validate # Validate YANG models with pyang
make python-test # Run YANG compiler tests
```
## Project Structure
```
cmd/ # Binary entry points (kedge-cni, kedge-daemon)
internal/ # Go internal packages
cni/ # CNI plugin logic
mesh/ # WireGuard tunnel lifecycle (overlay)
vlan/ # VLAN interface management (underlay)
shellstream/ # Handshake termination, SAT validation
quartermaster/ # QM gRPC client, artifact types
underlay/ # YANG watch, compilation trigger
health/ # Prometheus metrics
headscale/ # Peer discovery
topology/ # Shared topology state
config/ # Configuration types
yang/ # YANG models, site configs, Python compiler
k8s/ # Kubernetes manifests
ansible/ # Bootstrap provisioning roles
terraform/ # Cloud anchor, mesh topology IaC
monitoring/ # Prometheus, Grafana dashboards
```
## Current Status
**Phase 1** (active): Two-cluster connectivity (homelab k3s + cloud anchor k3s). Overlay: WireGuard mesh, Shellstream handshake, SessionTransitArtifact. Underlay: VLAN bridges, YANG model, FortiOS + VyOS compiler targets, NetworkMutationArtifact.

View file

@ -0,0 +1,17 @@
---
# Variables shared across all hosts.
kedge_version: "0.1.0"
kedge_wg_port: 51820
kedge_shellstream_port: 8443
kedge_metrics_port: 9090
# SPIRE
spire_trust_domain: "guildhouse.local"
spire_socket_path: "/run/spire/sockets/agent.sock"
# Quartermaster
qm_endpoint: "quartermaster.guildhouse.svc:50051"
# k3s
k3s_version: "v1.31.4+k3s1"

View file

@ -0,0 +1,10 @@
---
# Cloud anchor specific variables.
kedge_mode: overlay
kedge_overlay_enabled: true
kedge_underlay_enabled: false
# Headscale runs on the cloud anchor.
headscale_enabled: true
headscale_listen_addr: "0.0.0.0:8080"

View file

@ -0,0 +1,18 @@
---
# Homelab specific variables.
kedge_mode: both
kedge_overlay_enabled: true
kedge_underlay_enabled: true
# Managed devices.
kedge_devices:
- name: fortigate.transit.local
type: fortios
address: 172.16.0.2
- name: vyos.transit.local
type: vyos
address: 172.16.0.3
- name: udr7.local
type: unifi
address: 192.168.1.1

View file

@ -0,0 +1,21 @@
---
all:
children:
cloud_anchor:
hosts:
anchor01:
ansible_host: anchor.guildhouse.example.com
ansible_user: deploy
kedge_mode: overlay
kedge_cluster_id: cloud-anchor
homelab:
hosts:
homelab01:
ansible_host: 10.0.1.10
ansible_user: tking
kedge_mode: both
kedge_cluster_id: homelab
kedge_vlans:
- {id: 100, parent: eth0, subnet: "172.16.0.0/24"}
- {id: 10, parent: eth0, subnet: "10.0.1.0/24"}
- {id: 50, parent: eth0, subnet: "192.168.50.0/24"}

View file

@ -0,0 +1,15 @@
---
# Bootstrap a cloud anchor cluster (overlay mode only).
- name: Bootstrap cloud anchor
hosts: cloud_anchor
become: true
roles:
- base-hardening
- wireguard-bootstrap
- headscale
- k3s-bootstrap
- monitoring
vars:
kedge_mode: overlay
kedge_overlay_enabled: true
kedge_underlay_enabled: false

View file

@ -0,0 +1,14 @@
---
# Bootstrap a homelab cluster (overlay + underlay mode).
- name: Bootstrap homelab
hosts: homelab
become: true
roles:
- base-hardening
- wireguard-bootstrap
- k3s-bootstrap
- monitoring
vars:
kedge_mode: both
kedge_overlay_enabled: true
kedge_underlay_enabled: true

View file

@ -0,0 +1,31 @@
---
# Simulate primary WAN failure and verify OOB path survivability.
- name: WAN failover test
hosts: homelab
become: true
tasks:
- name: Record current WireGuard peer status
ansible.builtin.command:
cmd: wg show wg0
register: wg_before
changed_when: false
- name: Display pre-failover state
ansible.builtin.debug:
var: wg_before.stdout_lines
# NOTE: Actual failover simulation (iptables rules to block primary WAN)
# is too destructive for automated runs. This playbook verifies the
# monitoring and detection components are working.
- name: Check Kedge DaemonSet dead peer detection
ansible.builtin.uri:
url: "http://localhost:{{ kedge_metrics_port }}/metrics"
return_content: true
register: metrics
- name: Verify mesh peer metrics exist
ansible.builtin.assert:
that:
- "'kedge_mesh_peer_count' in metrics.content"
fail_msg: "Mesh peer metrics not found — DaemonSet may not be running"

View file

@ -0,0 +1,30 @@
---
# Verify overlay mesh connectivity between all clusters.
- name: Mesh health check
hosts: all
become: true
tasks:
- name: Check WireGuard interface
ansible.builtin.command:
cmd: wg show wg0
register: wg_status
changed_when: false
- name: Display WireGuard status
ansible.builtin.debug:
var: wg_status.stdout_lines
- name: Ping overlay peers
ansible.builtin.command:
cmd: "ping -c 3 -W 2 {{ item }}"
loop: "{{ wg_peers | default([]) | map(attribute='allowed_ips') | flatten | map('regex_replace', '/.*', '') | list }}"
register: ping_results
changed_when: false
failed_when: false
- name: Check Kedge DaemonSet health
ansible.builtin.uri:
url: "http://localhost:{{ kedge_metrics_port }}/healthz"
return_content: true
register: health
failed_when: health.status != 200

View file

@ -0,0 +1,31 @@
---
# Audit underlay device configurations against YANG desired state.
# Detects config drift for the insurance observability story.
- name: Underlay configuration audit
hosts: homelab
become: true
tasks:
- name: Validate YANG model
ansible.builtin.command:
cmd: pyang --strict /opt/kedge/yang/models/sovereign-sdwan.yang
register: yang_validation
changed_when: false
- name: Run YANG compiler for current site
ansible.builtin.command:
cmd: >
python3 /opt/kedge/yang/compiler/compile.py
--site-config /opt/kedge/yang/site-config/homelab.xml
--output-format json
register: compiled_config
changed_when: false
- name: Display compiled desired state
ansible.builtin.debug:
var: compiled_config.stdout | from_json
# TODO: Fetch actual device configs via Bascule SDK and compare
# against compiled desired state. Report drift.
- name: Report audit status
ansible.builtin.debug:
msg: "Underlay audit complete. Drift detection requires Bascule SDK integration."

View file

@ -0,0 +1,2 @@
---
base_hardening_ssh_port: 22

View file

@ -0,0 +1,5 @@
---
- name: restart sshd
ansible.builtin.service:
name: sshd
state: restarted

View file

@ -0,0 +1,59 @@
---
- name: Update apt cache
ansible.builtin.apt:
update_cache: true
cache_valid_time: 3600
when: ansible_os_family == "Debian"
- name: Install base packages
ansible.builtin.package:
name:
- ufw
- fail2ban
- unattended-upgrades
- wireguard-tools
- jq
- curl
state: present
- name: Configure SSH hardening
ansible.builtin.lineinfile:
path: /etc/ssh/sshd_config
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
loop:
- {regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no'}
- {regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no'}
- {regexp: '^#?X11Forwarding', line: 'X11Forwarding no'}
notify: restart sshd
- name: Enable UFW with default deny
community.general.ufw:
state: enabled
default: deny
direction: incoming
- name: Allow SSH
community.general.ufw:
rule: allow
port: "22"
proto: tcp
- name: Allow WireGuard
community.general.ufw:
rule: allow
port: "{{ kedge_wg_port }}"
proto: udp
- name: Allow Kedge metrics
community.general.ufw:
rule: allow
port: "{{ kedge_metrics_port }}"
proto: tcp
src: "10.0.0.0/8"
- name: Enable fail2ban
ansible.builtin.service:
name: fail2ban
enabled: true
state: started

View file

@ -0,0 +1,2 @@
---
headscale_version: "0.23.0"

View file

@ -0,0 +1,43 @@
---
- name: Download Headscale binary
ansible.builtin.get_url:
url: "https://github.com/juanfont/headscale/releases/download/v{{ headscale_version }}/headscale_{{ headscale_version }}_linux_amd64"
dest: /usr/local/bin/headscale
mode: "0755"
- name: Create Headscale config directory
ansible.builtin.file:
path: /etc/headscale
state: directory
mode: "0755"
- name: Create Headscale data directory
ansible.builtin.file:
path: /var/lib/headscale
state: directory
mode: "0750"
- name: Deploy Headscale systemd unit
ansible.builtin.copy:
content: |
[Unit]
Description=Headscale mesh coordinator
After=network.target
[Service]
ExecStart=/usr/local/bin/headscale serve
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
dest: /etc/systemd/system/headscale.service
mode: "0644"
notify: restart headscale
- name: Enable and start Headscale
ansible.builtin.service:
name: headscale
enabled: true
state: started
daemon_reload: true

View file

@ -0,0 +1,2 @@
---
k3s_version: "v1.31.4+k3s1"

View file

@ -0,0 +1,37 @@
---
- name: Download k3s installer
ansible.builtin.get_url:
url: https://get.k3s.io
dest: /tmp/k3s-install.sh
mode: "0755"
- name: Install k3s
ansible.builtin.command:
cmd: /tmp/k3s-install.sh
creates: /usr/local/bin/k3s
environment:
INSTALL_K3S_VERSION: "{{ k3s_version }}"
K3S_KUBECONFIG_MODE: "644"
- name: Wait for k3s to be ready
ansible.builtin.command:
cmd: k3s kubectl get nodes
register: k3s_nodes
retries: 10
delay: 5
until: k3s_nodes.rc == 0
- name: Install Multus CNI
ansible.builtin.command:
cmd: k3s kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset-thick.yml
changed_when: true
- name: Create kedge namespace
ansible.builtin.command:
cmd: k3s kubectl create namespace kedge --dry-run=client -o yaml | k3s kubectl apply -f -
changed_when: true
- name: Deploy Kedge RBAC
ansible.builtin.command:
cmd: k3s kubectl apply -f /opt/kedge/k8s/rbac.yaml
changed_when: true

View file

@ -0,0 +1,2 @@
---
kedge_metrics_port: 9090

View file

@ -0,0 +1,30 @@
---
- name: Install node_exporter
ansible.builtin.package:
name: prometheus-node-exporter
state: present
when: ansible_os_family == "Debian"
- name: Enable node_exporter
ansible.builtin.service:
name: prometheus-node-exporter
enabled: true
state: started
- name: Create monitoring directory
ansible.builtin.file:
path: /etc/kedge/monitoring
state: directory
mode: "0755"
- name: Deploy Prometheus scrape config for Kedge
ansible.builtin.copy:
content: |
- job_name: kedge-daemon
static_configs:
- targets: ['localhost:{{ kedge_metrics_port }}']
labels:
cluster: '{{ kedge_cluster_id }}'
node: '{{ inventory_hostname }}'
dest: /etc/kedge/monitoring/prometheus-kedge.yml
mode: "0644"

View file

@ -0,0 +1,3 @@
---
wg_address: "10.100.0.1/24"
wg_peers: []

View file

@ -0,0 +1,43 @@
---
- name: Ensure WireGuard is installed
ansible.builtin.package:
name: wireguard-tools
state: present
- name: Create /etc/wireguard directory
ansible.builtin.file:
path: /etc/wireguard
state: directory
mode: "0700"
- name: Generate WireGuard private key
ansible.builtin.command:
cmd: wg genkey
creates: /etc/wireguard/private.key
register: wg_privkey
- name: Write private key
ansible.builtin.copy:
content: "{{ wg_privkey.stdout }}"
dest: /etc/wireguard/private.key
mode: "0600"
when: wg_privkey.changed
- name: Derive public key
ansible.builtin.shell:
cmd: cat /etc/wireguard/private.key | wg pubkey
register: wg_pubkey
changed_when: false
- name: Template WireGuard config
ansible.builtin.template:
src: wg0.conf.j2
dest: /etc/wireguard/wg0.conf
mode: "0600"
notify: restart wireguard
- name: Enable and start WireGuard
ansible.builtin.service:
name: "wg-quick@wg0"
enabled: true
state: started

View file

@ -0,0 +1,12 @@
[Interface]
PrivateKey = {{ lookup('file', '/etc/wireguard/private.key') }}
ListenPort = {{ kedge_wg_port }}
Address = {{ wg_address | default('10.100.0.1/24') }}
{% for peer in wg_peers | default([]) %}
[Peer]
PublicKey = {{ peer.public_key }}
Endpoint = {{ peer.endpoint }}
AllowedIPs = {{ peer.allowed_ips | join(', ') }}
PersistentKeepalive = 25
{% endfor %}

33
ansible/site.yml Normal file
View file

@ -0,0 +1,33 @@
---
# Kedge site bootstrap playbook.
# Provisions infrastructure for Kedge deployment at a new site.
- name: Apply base hardening to all nodes
hosts: all
become: true
roles:
- base-hardening
- name: Bootstrap WireGuard mesh
hosts: all
become: true
roles:
- wireguard-bootstrap
- name: Deploy Headscale coordinator
hosts: cloud_anchor
become: true
roles:
- headscale
- name: Bootstrap k3s clusters
hosts: all
become: true
roles:
- k3s-bootstrap
- name: Deploy monitoring stack
hosts: all
become: true
roles:
- monitoring

10
buf.gen.yaml Normal file
View file

@ -0,0 +1,10 @@
version: v2
plugins:
- remote: buf.build/protocolbuffers/go
out: api
opt:
- paths=source_relative
- remote: buf.build/grpc/go
out: api
opt:
- paths=source_relative

9
buf.yaml Normal file
View file

@ -0,0 +1,9 @@
version: v2
modules:
- path: proto
lint:
use:
- STANDARD
breaking:
use:
- FILE

42
compose.yaml Normal file
View file

@ -0,0 +1,42 @@
# Kedge development environment — podman-compose / docker-compose
# Usage: podman-compose up -d
services:
# Go build + dev container with hot reload support.
dev:
build:
context: .
dockerfile: Containerfile.dev
volumes:
- .:/src:z
- go-mod-cache:/go/pkg/mod
- go-build-cache:/root/.cache/go-build
working_dir: /src
command: sleep infinity
network_mode: host
privileged: true
cap_add:
- NET_ADMIN
- SYS_ADMIN
# YANG compiler dev container (Python).
yang-compiler:
image: docker.io/library/python:3.12-slim
volumes:
- ./yang:/src/yang:z
working_dir: /src/yang
command: >
bash -c "pip install -r requirements.txt && sleep infinity"
# Local Quartermaster mock for development.
# Uses the real QM image if available, otherwise a gRPC echo server.
quartermaster:
image: docker.io/library/golang:1.23-bookworm
command: >
bash -c "echo 'Quartermaster mock — replace with real QM image when available' && sleep infinity"
ports:
- "50051:50051"
volumes:
go-mod-cache:
go-build-cache:

10
docs/adding-a-site.md Normal file
View file

@ -0,0 +1,10 @@
# Adding a Site
## Steps
1. Provision infrastructure (Terraform `cloud-anchor` module or manual)
2. Run `ansible-playbook ansible/playbooks/bootstrap-<site>.yml`
3. Generate WireGuard keys (`scripts/generate-wireguard-keys.sh`)
4. Add peer to mesh ConfigMap (`k8s/configmap-mesh.yaml`)
5. If underlay: create site YANG instance data (`yang/site-config/<site>.xml`)
6. Deploy Kedge DaemonSet
7. Verify with `ansible-playbook ansible/playbooks/mesh-health.yml`

5
docs/architecture.md Normal file
View file

@ -0,0 +1,5 @@
# Architecture
Kedge is a dual-mode Kubernetes CNI plugin and DaemonSet providing attested connectivity between clusters and managed infrastructure.
See [CLAUDE.md](../CLAUDE.md) for the complete architecture specification.

12
docs/cni-plugin.md Normal file
View file

@ -0,0 +1,12 @@
# CNI Plugin
The Kedge CNI plugin is a Multus secondary network plugin that attaches a `net1` interface to pods requiring infrastructure access.
## Operation
- Invoked by Multus on pod ADD/DEL/CHECK
- Creates veth pair, moves one end into pod network namespace
- Programs routes for both overlay (WireGuard) and underlay (VLAN bridge) subnets
- Applies SVID-based network policy
## Configuration
Configured via Multus `NetworkAttachmentDefinition` — see `k8s/network-attachment.yaml`.

16
docs/failover-runbook.md Normal file
View file

@ -0,0 +1,16 @@
# Failover Runbook
## Primary WAN Failure
1. Dead peer detection triggers on WireGuard handshake timeout
2. If secondary circuit configured, mesh manager fails over
3. Overlay sessions are re-established over secondary path
4. Underlay operations pause — device management requires network path to device
## Verification
```bash
ansible-playbook ansible/playbooks/failover-test.yml
ansible-playbook ansible/playbooks/mesh-health.yml
```
## Recovery
When primary WAN recovers, mesh manager detects restored handshake and fails back.

View file

@ -0,0 +1,419 @@
# HFLKedge Network Integration
How Host Function Layer WASM module Shellstream sessions interact with Kedge's CNI plugin and DaemonSet.
> **Cross-references:**
> Substrate [`docs/shell-primitive.md`](../../substrate/docs/shell-primitive.md) — BPF map structures, eBPF hooks, shell lifecycle.
> Substrate [`docs/hfl-spec.md`](../../substrate/docs/hfl-spec.md) — WASM module grant attenuation, host function categories.
> Kedge [`internal/cni/plugin.go`](../internal/cni/plugin.go) — CNI `CmdAdd` flow.
> Kedge [`internal/cni/policy.go`](../internal/cni/policy.go) — SVID-scoped policy evaluation.
> Kedge [`internal/quartermaster/session_transit.go`](../internal/quartermaster/session_transit.go) — Session transit artifact type.
---
## 1. Overview
Kedge is the network data plane for the Guildhouse infrastructure. It operates as both a Kubernetes CNI plugin (creating the `net1` secondary interface via Multus) and a DaemonSet managing WireGuard tunnels, VLAN interfaces, and Shellstream session termination. The Host Function Layer (HFL) is a WASM-based runtime embedded in Bascule that exposes structured host functions — Quartermaster queries, telemetry reads, infrastructure state, YANG operations — to application workloads running inside attested shells.
When an HFL WASM module calls a host function that requires network access, the resulting Shellstream session traverses Kedge's network stack. The HFL does not interact with Kedge directly — it opens Shellstream connections that transit Kedge's `net1` interface like any other pod traffic. What makes HFL sessions distinct is the **three-layer enforcement model** that governs them:
1. **Kernel layer (eBPF shell):** The `shell_map` BPF map restricts which destinations and modes the pod's cgroup can reach. Kedge's CNI plugin reads `allowed_mode` from this map at interface creation time. The eBPF `CGROUP_SOCK_ADDR` hook enforces `allowed_targets` at connection time. These checks happen in the kernel — no userspace daemon is in the hot path.
2. **Application layer (HFL grants):** Before a host function opens any Shellstream session, the HFL runtime validates the WASM module's `ModuleGrant` against the requested operation. If the grant does not cover the operation (wrong capabilities, wrong registry, wrong mode), the request is rejected before it reaches the network.
3. **Cryptographic layer (SAT attenuation):** The per-request `RequestToken` carried in the Shellstream frame header is validated independently by the receiving service (Quartermaster, Bascule, Prometheus). Each layer operates without trusting the others.
These three layers compose but do not depend on each other. A compromised HFL runtime cannot bypass the eBPF shell. A compromised eBPF program cannot forge SAT tokens. This is defense in depth through independent enforcement points, not redundant checks in a single trust domain.
---
## 2. Shared BPF Map Architecture
Kedge and the shell primitive share state through `shell_map`, a `BPF_MAP_TYPE_HASH` map keyed by `cgroup_id`. There is no API intermediary, no sidecar, no IPC channel. Both the eBPF enforcement programs and Kedge's CNI plugin read from the same map.
### Map definition
```c
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, __u64); /* cgroup_id */
__type(value, struct shell_state);
__uint(max_entries, 1024);
} shell_map SEC(".maps");
```
### `shell_state` layout (212 bytes)
```c
struct shell_state {
__u8 sat_hash[32]; /* SHA-256 of the bound SAT */
__u32 capabilities; /* Capability bitmask */
__u32 allowed_mode; /* Overlay/underlay mode bitmask */
__u32 num_allowed_targets; /* Count of populated entries */
struct shell_target allowed_targets[16]; /* Destination whitelist */
__u64 accord_id; /* Hash of the governing accord */
__u64 session_counter; /* Monotonic counter for correlation IDs */
__u64 created_at; /* Nanosecond timestamp (CLOCK_MONOTONIC) */
__u64 last_updated; /* Nanosecond timestamp (CLOCK_MONOTONIC) */
__u32 flags; /* Shell state flags */
__u32 _padding; /* Alignment to 8-byte boundary */
};
```
```c
struct shell_target {
__be32 addr; /* IPv4 destination (network byte order) */
__be16 port; /* Destination port (0 = any port) */
__u8 protocol; /* IPPROTO_TCP (6), IPPROTO_UDP (17), 0 = any */
__u8 flags; /* Per-target flags */
};
```
### Fields relevant to Kedge
| Field | Kedge usage |
|-------|-------------|
| `allowed_mode` | CNI plugin reads this at `CmdAdd` time to decide which route types to program on `net1`. `OVERLAY` (0x1) → WireGuard routes. `UNDERLAY` (0x2) → VLAN bridge routes. Both bits → both route sets. |
| `allowed_targets` | eBPF `CGROUP_SOCK_ADDR` hook validates outbound connections against this whitelist at connect time. Kedge does not enforce targets — the kernel does. |
| `capabilities` | `READ` (0x1), `PROPOSE` (0x2), `MUTATE` (0x4), `ADMIN` (0x8). Kedge uses this when building `SessionTransitArtifact` records. The DaemonSet's health endpoint exposes capability distribution as Prometheus metrics. |
| `flags` | `ACTIVE` (0x1), `FROZEN` (0x2), `DRAINING` (0x4). DaemonSet watches flag transitions. When a shell enters `FROZEN`, the DaemonSet stops accepting new Shellstream handshakes for that session. When `DRAINING`, it tears down associated tunnel state. |
| `sat_hash` | Included in every `SessionTransitArtifact` for Quartermaster governance chain linkage. |
| `accord_id` | Used by the DaemonSet to look up the local accord policy that governs capability grants for incoming Shellstream sessions. |
### Access pattern
The BPF map is pinned at `/sys/fs/bpf/shell_map`. Kedge accesses it from Go via `cilium/ebpf` map operations:
- **CNI plugin** (`CmdAdd`): Looks up the pod's `cgroup_id`, reads `allowed_mode` and `flags`. This is a single `MapLookupElem` call — O(1), no syscall overhead beyond the BPF command. If no entry exists (shell is `UNBOUND`), the CNI plugin installs no routes on `net1`, effectively isolating the pod.
- **DaemonSet**: Subscribes to `shell_map` changes via the `shell_events` ring buffer (see `ShellProgrammer.subscribe_events()` in shell-primitive.md §7.1). Reacts to `bind_shell`, `freeze_shell`, `drain_shell`, and `destroy_shell` events.
---
## 3. Routing HFL Shellstream Sessions Through Kedge
An HFL-originated Shellstream session follows the same packet path as any other pod egress through `net1`. The difference is that three independent checks have already passed before the first SYN reaches the wire. This section traces the four common HFL session types.
### 3.1 Overlay: HFL → Quartermaster (cross-cluster query)
A WASM module calls `query-records` on the `session-transit` registry hosted by a remote Quartermaster instance.
```
WASM module
│ host.quartermaster.query_records("session-transit", filter, limit)
HFL Runtime
│ 1. Validate ModuleGrant: capabilities includes READ,
│ registries includes "session-transit",
│ allowed_mode includes OVERLAY
│ 2. Construct RequestToken (single_use=true, expires=30s)
│ 3. Open TCP connection to Quartermaster endpoint
eBPF CGROUP_SOCK_ADDR hook
│ 4. Look up cgroup_id in shell_map
│ 5. Check flags: ACTIVE (not FROZEN/DRAINING)
│ 6. Check destination against allowed_targets[0..N]
│ 7. Permit or deny at kernel level
net1 interface (Kedge CNI)
│ 8. Packet hits overlay route → forwarded to wg0
WireGuard tunnel (wg0)
│ 9. Encrypted transit to remote cluster
Remote Kedge DaemonSet (Shellstream listener)
│ 10. 3-way handshake: ATTEST-INIT / ATTEST-VERIFY / ATTEST-CONFIRM
│ 11. Validate SAT, evaluate capability against local accord
│ 12. Record SessionTransitArtifact
Quartermaster
│ 13. Validate RequestToken independently
│ 14. Execute query, return results
```
Steps 12 are HFL enforcement (application layer). Steps 47 are eBPF enforcement (kernel layer). Step 8 depends on Kedge having programmed overlay routes at `CmdAdd` time — which only happens if `shell_state.allowed_mode & OVERLAY != 0`. Steps 1013 are cryptographic enforcement (SAT + RequestToken).
### 3.2 Underlay: HFL → Bascule SDK dispatch (device read)
A WASM module calls `get-device-state` for a FortiGate on the local VLAN.
```
WASM module
│ host.infrastructure.get_device_state("fortigate.transit.local")
HFL Runtime
│ 1. Validate ModuleGrant: capabilities includes READ,
│ allowed_mode includes UNDERLAY
│ 2. Construct RequestToken
│ 3. Open Shellstream to Bascule (may be local or cross-cluster)
eBPF CGROUP_SOCK_ADDR hook
│ 4. Validate against shell_map (same as §3.1)
net1 interface (Kedge CNI)
│ 5. Packet hits underlay route → forwarded via VLAN bridge
VLAN bridge (br-mgmt / tagged interface)
│ 6. Frame reaches managed device on infrastructure VLAN
Bascule SDK dispatch
│ 7. Validate RequestToken
│ 8. Execute read-only SDK call (e.g., fortiosapi.get)
│ 9. Return device state to WASM module
```
The underlay path differs at step 5: traffic exits through the VLAN bridge rather than the WireGuard tunnel. The CNI plugin only programs underlay routes if `shell_state.allowed_mode & UNDERLAY != 0`.
### 3.3 Local: HFL → YANG compiler (no Shellstream)
YANG host functions (`validate-instance`, `compile-dry-run`, `diff-state`) call the Python YANG compiler within the same container. No Shellstream session is opened and no traffic traverses `net1`. The HFL runtime invokes the compiler as a local subprocess. Kedge is not involved.
This is worth documenting because it is the exception: HFL host functions do not always produce network traffic. The YANG category is purely local computation.
### 3.4 Local: HFL → Prometheus (telemetry query)
Telemetry queries (`query-prometheus`, `query-timescale`) may target a Prometheus instance on the same cluster. When the destination is local (same node or same cluster overlay), the path is:
```
WASM module → HFL Runtime → eBPF check → net1 → overlay route → wg0 → local Prometheus
```
Even local-cluster queries transit `net1` and the WireGuard tunnel because Kedge treats all overlay traffic uniformly. There is no "local bypass" optimization — this simplifies the enforcement model by ensuring every session, regardless of locality, passes through the same BPF map checks.
---
## 4. CNI Plugin: `shell_state`-Aware Route Programming
The CNI plugin's `CmdAdd` handler programs routes on the pod's `net1` interface based on `shell_state.allowed_mode`. This is the critical integration point: Kedge decides at pod creation time what network paths are available.
### Current `CmdAdd` flow (from `internal/cni/plugin.go`)
```
ParseNetConf → loadMeshTopology → createVethPair → attachOverlayRoutes
→ attachUnderlayRoutes → programPodRoutes → applySVIDPolicy
```
### Extended flow with `shell_state` integration
```
ParseNetConf → loadMeshTopology → createVethPair
→ lookupShellState(cgroup_id)
→ if shell_state is UNBOUND: return (no routes, pod is isolated)
→ if flags & ACTIVE == 0: return (shell not active, pod is isolated)
→ if allowed_mode & OVERLAY: attachOverlayRoutes
→ if allowed_mode & UNDERLAY: attachUnderlayRoutes
→ programPodRoutes (only for modes granted above)
→ applySVIDPolicy (SVID check is orthogonal to shell_state)
```
### `lookupShellState` pseudocode
```go
func lookupShellState(cgroupID uint64) (*ShellState, error) {
m, err := ebpf.LoadPinnedMap("/sys/fs/bpf/shell_map", nil)
if err != nil {
return nil, fmt.Errorf("shell_map not available: %w", err)
}
defer m.Close()
var state ShellState
if err := m.Lookup(cgroupID, &state); err != nil {
// No entry = UNBOUND. Fail closed: no routes.
return nil, nil
}
return &state, nil
}
```
### Route programming rules
| `allowed_mode` | Overlay routes | Underlay routes | Effect |
|-----------------|---------------|-----------------|--------|
| `0x0` (none) | No | No | Pod is network-isolated on `net1`. |
| `0x1` (OVERLAY) | Yes | No | Pod can reach remote clusters via WireGuard. Cannot reach infrastructure VLANs. |
| `0x2` (UNDERLAY) | No | Yes | Pod can reach infrastructure devices on local VLANs. Cannot reach remote clusters. |
| `0x3` (both) | Yes | Yes | Full access. Typical for admin-tier shells. |
### Fail-closed behavior
If `shell_map` is not pinned (e.g., the shell primitive is not yet loaded), the CNI plugin treats this as equivalent to `UNBOUND` — no routes are programmed. This is intentional: a pod without an active shell binding has no business reaching Kedge-managed networks.
If `shell_map` exists but contains no entry for the pod's `cgroup_id`, the same fail-closed behavior applies. The CNI plugin does not fall back to permissive routing.
---
## 5. DaemonSet: `shell_state` Watch and Lifecycle Coordination
The DaemonSet reacts to shell lifecycle transitions by adjusting tunnel state, Shellstream session handling, and telemetry.
### Event subscription
The DaemonSet calls `ShellProgrammer.subscribe_events()` (implemented via the `shell_events` BPF ring buffer) to receive shell state transitions. In Go, this is accessed through the `cilium/ebpf` ring buffer reader:
```go
reader, err := ringbuf.NewReader(shellEventsMap)
for {
record, err := reader.Read()
event := parseShellEvent(record.RawSample)
switch event.Type {
case EventBindShell: handleBind(event)
case EventFreezeShell: handleFreeze(event)
case EventDrainShell: handleDrain(event)
case EventDestroyShell: handleDestroy(event)
}
}
```
### Lifecycle event handling
| Event | DaemonSet action |
|-------|-----------------|
| `bind_shell` | Register the shell's `cgroup_id` in local session tracking. If the shell has `allowed_mode & OVERLAY`, ensure the associated WireGuard peers are configured. Log the binding with `sat_hash` and `accord_id`. |
| `freeze_shell` | Stop accepting new Shellstream handshakes for sessions associated with this `cgroup_id`. Existing established sessions continue (the shell is being rotated, not destroyed). Increment `kedge_shell_freeze_total` counter. |
| `drain_shell` | Stop accepting new handshakes AND begin tearing down established Shellstream sessions for this `cgroup_id`. The mesh manager marks associated peers as draining. Increment `kedge_shell_drain_total` counter. |
| `destroy_shell` | Remove all local state for this `cgroup_id`. Clean up any tunnel state that was exclusively serving this shell. Emit final telemetry. Increment `kedge_shell_destroy_total` counter. |
### SAT rotation coordination
When Bascule rotates a SAT, it freezes the shell, updates `sat_hash`, then unfreezes. The DaemonSet sees two events: `freeze_shell` followed by an `update_shell` (which carries the new `sat_hash`), then an implicit unfreeze (the `FROZEN` flag is cleared). During the freeze window, no new Shellstream handshakes are accepted, but existing sessions continue. This prevents a window where the old SAT is invalid but the new one hasn't propagated — the freeze ensures atomicity from the network's perspective.
---
## 6. Quartermaster Records for HFL Sessions
HFL-originated sessions produce the same `SessionTransitArtifact` and `NetworkMutationArtifact` records as any other Shellstream session. However, HFL sessions carry additional provenance that should be captured for auditability.
### Extended `SessionTransitArtifact`
The existing artifact (from `internal/quartermaster/session_transit.go`) is extended with optional HFL-specific fields:
```go
type SessionTransitArtifact struct {
// ... existing fields (session_id, sat_hash, source/dest cluster, etc.)
// HFL provenance (optional — zero values when session is not HFL-originated).
HFLModuleName string `json:"hfl_module_name,omitempty"`
HFLFunctionName string `json:"hfl_function_name,omitempty"`
HFLGrantHash []byte `json:"hfl_grant_hash,omitempty"`
HFLRequestTokenHash []byte `json:"hfl_request_token_hash,omitempty"`
}
```
| Field | Source | Purpose |
|-------|--------|---------|
| `hfl_module_name` | Shellstream frame header, set by HFL runtime | Identifies which WASM module initiated the session (e.g., `"telemetry-query"`). |
| `hfl_function_name` | Shellstream frame header, set by HFL runtime | Identifies which host function was called (e.g., `"query_flow_records"`). |
| `hfl_grant_hash` | SHA-256 of the `ModuleGrant` | Links the session back to the specific grant that authorized it. Enables audit queries like "show all sessions authorized by grant X". |
| `hfl_request_token_hash` | SHA-256 of the `RequestToken` | Links the session to the single-use per-request token. Since tokens are single-use, this provides a 1:1 mapping from token to session. |
These fields are populated by the Kedge DaemonSet's Shellstream listener when the `ATTEST-INIT` message includes HFL provenance headers. Non-HFL sessions (direct Shellstream connections) leave these fields empty.
### `NetworkMutationArtifact` — no HFL extension needed
Network mutations are dispatched through Bascule SDK, not through HFL host functions. The HFL can _propose_ mutations (via `submit-proposal` with `PROPOSE` capability), but the actual execution goes through Bascule's ceremony workflow. By the time Kedge records a `NetworkMutationArtifact`, the mutation has already been approved through a ceremony and dispatched by Bascule — the HFL origin is captured in the ceremony's proposal chain, not in the mutation artifact itself.
### Canonical serialization
The `CanonicalBytes()` method on `SessionTransitArtifact` must include the HFL fields when present. The RFC 8785 (JCS) canonical form ensures deterministic hashing regardless of field ordering. Fields with zero values (`omitempty`) are excluded from the canonical form to maintain backward compatibility with non-HFL session records.
---
## 7. Mode Authorization: End-to-End Flow
This section traces the complete authorization chain for an HFL-originated underlay session, showing how the three enforcement layers compose.
### Scenario
A WASM module `"proposal-builder"` calls `compile-dry-run` (local, no network — see §3.3), then calls `submit-proposal` to Quartermaster (overlay), then an operator approves the ceremony and Bascule dispatches the mutation (underlay). We trace the overlay leg (proposal submission) end to end.
### Step 1: SAT issuance (before Kedge)
The Vigil ceremony issues a SAT with:
- `capabilities`: `READ | PROPOSE` (0x3)
- `allowed_mode`: `OVERLAY` (0x1) — this module does not need underlay access
- `scope`: registries `["network-mutation", "session-transit"]`
Bascule calls `bind_shell(cgroup_id, shell_state)` with these values. The `shell_map` entry is created.
### Step 2: Grant attenuation (HFL)
The HFL runtime derives a `ModuleGrant` for `"proposal-builder"`:
```
ModuleGrant {
module_hash: sha256("proposal-builder.wasm"),
parent_sat_hash: shell_state.sat_hash,
capabilities: PROPOSE (0x2), // attenuated from READ|PROPOSE
registries: ["network-mutation"], // attenuated from full list
allowed_mode: OVERLAY (0x1), // same as SAT (cannot widen)
expires_at: min(sat.expires, now + 1h),
}
```
Attenuation is strictly narrowing: `grant.capabilities & sat.capabilities == grant.capabilities`. The grant cannot add `MUTATE` or `UNDERLAY` that the SAT does not carry.
### Step 3: Request token (HFL, per-call)
When the module calls `submit-proposal`, the HFL constructs:
```
RequestToken {
grant_hash: sha256(ModuleGrant),
operation: "submit-proposal",
parameters_hash: sha256(serialized_proposal),
single_use: true,
expires_at: now + 30s,
}
```
### Step 4: eBPF enforcement (kernel)
The `connect()` syscall triggers the `CGROUP_SOCK_ADDR` hook:
1. Look up `cgroup_id` in `shell_map` → finds the `shell_state`.
2. Check `flags`: `ACTIVE` is set, `FROZEN` is not → proceed.
3. Check destination IP/port against `allowed_targets[0..num_allowed_targets]` → match found.
4. Permit the connection.
If the destination were on an underlay VLAN (and `allowed_mode` only has `OVERLAY`), this check would still pass at the eBPF level — `allowed_targets` is an explicit whitelist, not mode-based. The mode enforcement happens at the routing level (step 5).
### Step 5: Kedge route enforcement (CNI)
The packet exits the pod via `net1`. Because the CNI plugin read `allowed_mode = OVERLAY` at `CmdAdd` time, only overlay (WireGuard) routes are programmed. The destination Quartermaster endpoint resolves to an overlay route through `wg0`. If the destination were on an underlay VLAN, there would be no route — the packet would be dropped by the kernel's routing table. This is Kedge's enforcement: not a firewall rule, but the absence of a route.
### Step 6: WireGuard transit (Kedge mesh)
The packet traverses the WireGuard tunnel to the remote cluster. Kedge's mesh manager maintains the peer configuration and monitors tunnel health.
### Step 7: Remote Shellstream handshake (Kedge DaemonSet)
The remote Kedge DaemonSet receives the connection on its Shellstream listener:
1. Read `ATTEST-INIT` (includes SAT token, HFL provenance headers).
2. Validate SAT against local SPIRE trust bundle.
3. Evaluate capability request against local accord policy (see `internal/shellstream/capability.go`).
4. Send `ATTEST-VERIFY` with granted capabilities.
5. Read `ATTEST-CONFIRM`.
6. Record `SessionTransitArtifact` with HFL fields populated.
### Step 8: Quartermaster validation (independent)
Quartermaster receives the `submit-proposal` RPC with the `RequestToken` in metadata:
1. Verify `RequestToken.signature` against the HFL runtime's signing key.
2. Check `single_use` — mark token as consumed.
3. Check `expires_at` — reject if expired.
4. Verify `grant_hash` chains back to a valid SAT.
5. Check that the grant's `capabilities` include `PROPOSE`.
6. Accept the proposal for ceremony processing.
### Enforcement summary
| Layer | What it checks | Where it runs | Failure mode |
|-------|---------------|---------------|-------------|
| HFL grant | Module has appropriate capability and registry scope | Userspace (HFL runtime, same pod) | Request rejected before network |
| eBPF shell | Destination is in `allowed_targets`, shell is `ACTIVE` | Kernel (`CGROUP_SOCK_ADDR` hook) | `connect()` returns `EACCES` |
| Kedge routes | `allowed_mode` permits the network path type | Kernel (routing table, programmed by CNI) | Packet dropped (no route to host) |
| Shellstream handshake | SAT is valid, capabilities match local accord | Userspace (remote Kedge DaemonSet) | Handshake rejected at `ATTEST-VERIFY` |
| Service token | `RequestToken` is valid, single-use, unexpired | Userspace (Quartermaster/Bascule) | RPC rejected with `PermissionDenied` |
Five independent checks, three distinct trust domains (pod kernel, pod userspace, remote service), two cryptographic validations (SAT, RequestToken). No single point of compromise grants unauthorized access.

12
docs/mesh-topology.md Normal file
View file

@ -0,0 +1,12 @@
# Mesh Topology
The WireGuard mesh connects clusters over overlay tunnels. Topology state is shared between the DaemonSet (writer) and CNI plugin (reader) via `/etc/kedge/mesh.json`.
## Peer Discovery
- Static peers from ConfigMap (`k8s/configmap-mesh.yaml`)
- Dynamic discovery via Headscale (Phase 3)
## Health Monitoring
- Periodic peer handshake checks
- Dead peer detection with configurable timeout
- Prometheus metrics: `kedge_mesh_peer_count`, `kedge_mesh_tunnel_up`

9
docs/overlay-mode.md Normal file
View file

@ -0,0 +1,9 @@
# Overlay Mode
WireGuard-based tunneling over networks Kedge doesn't control. Used for cloud anchors, remote sites, and MSP-managed customer environments.
## Components
- WireGuard mesh manager (`internal/mesh/`)
- Headscale peer discovery (`internal/headscale/`)
- Shellstream handshake termination (`internal/shellstream/`)
- SessionTransitArtifact notarization (`internal/quartermaster/`)

View file

@ -0,0 +1,13 @@
# Quartermaster Integration
Kedge DaemonSet is a Quartermaster client, submitting notarization records for all governed operations.
## Artifact Types
- **SessionTransitArtifact**: Recorded when a Shellstream session crosses the inter-cluster boundary (overlay)
- **NetworkMutationArtifact**: Recorded when physical infrastructure is mutated via Bascule SDK (underlay)
## Protocol
gRPC client connecting to `GovernanceService` (intent lifecycle) and `QuartermasterNotary` (merkle anchoring).
## Serialization
RFC 8785 (JCS) canonical JSON for deterministic hashing and merkle tree inclusion.

View file

@ -0,0 +1,11 @@
# Shellstream Boundary
Shellstream is the attested stream protocol for inter-cluster communication. Kedge DaemonSet terminates Shellstream handshakes at the cluster boundary.
## 3-Way Handshake
1. **ATTEST-INIT**: Remote peer sends SAT token with capability request
2. **ATTEST-VERIFY**: Local Kedge validates SAT, evaluates capabilities against accord policy, sends attenuated grant
3. **ATTEST-CONFIRM**: Session established with granted capabilities
## Mode Selection
The capability token determines overlay vs. underlay access, governed by MSP trust tier mappings in the local accord policy.

16
docs/underlay-mode.md Normal file
View file

@ -0,0 +1,16 @@
# Underlay Mode
Programs the actual physical network fabric via vendor SDK dispatch through Bascule. Used for sites where you own the infrastructure.
## Components
- VLAN interface manager (`internal/vlan/`)
- YANG watcher and compilation trigger (`internal/underlay/`)
- YANG compiler (`yang/compiler/`)
- NetworkMutationArtifact notarization (`internal/quartermaster/`)
## YANG Pipeline
1. YANG instance data changes (ConfigMap/CRD)
2. Watcher triggers compilation
3. Compiler produces vendor-specific payloads (FortiOS, VyOS, UniFi)
4. Payloads dispatched via Bascule SDK
5. Each mutation notarized via Quartermaster

17
docs/yang-compiler.md Normal file
View file

@ -0,0 +1,17 @@
# YANG Compiler
Transforms device-agnostic YANG site configuration into vendor-specific payloads.
## Pipeline
1. `sovereign-sdwan.yang` defines the schema
2. Per-site XML instance data (e.g., `homelab.xml`)
3. Compiler validates and produces vendor payloads
4. Vendor targets: FortiOS REST, VyOS NETCONF, UniFi API
## Usage
```bash
python3 -m compiler.compile --site-config site-config/homelab.xml --output-format json
```
## Adding a Vendor
Create a new `to_<vendor>.py` module in `yang/compiler/` implementing `compile_zones()`.

19
go.mod Normal file
View file

@ -0,0 +1,19 @@
module github.com/guildhouse-co/kedge
go 1.23
require (
github.com/containernetworking/cni v1.2.3
github.com/containernetworking/plugins v1.6.2
github.com/prometheus/client_golang v1.20.5
github.com/spiffe/go-spiffe/v2 v2.4.0
github.com/vishvananda/netlink v1.3.0
github.com/vishvananda/netns v0.0.5
go.uber.org/zap v1.27.0
golang.zx2c4.com/wireguard/wgctrl v0.0.0-20230429144221-925a1e7659e6
google.golang.org/grpc v1.69.4
google.golang.org/protobuf v1.36.3
k8s.io/api v0.32.1
k8s.io/apimachinery v0.32.1
k8s.io/client-go v0.32.1
)

37
internal/cni/bridge.go Normal file
View file

@ -0,0 +1,37 @@
package cni
import (
"fmt"
"github.com/vishvananda/netlink"
"github.com/guildhouse-co/kedge/internal/topology"
)
// attachUnderlayRoutes connects the host-side veth to VLAN bridge interfaces
// for underlay mode routing.
func attachUnderlayRoutes(hostVeth string, routes []SubnetRoute, topo *topology.MeshTopology) error {
if len(routes) == 0 {
return nil
}
hostLink, err := netlink.LinkByName(hostVeth)
if err != nil {
return fmt.Errorf("host veth %s not found: %w", hostVeth, err)
}
for _, route := range routes {
bridgeName := route.Via // e.g., "vlan100"
bridge, err := netlink.LinkByName(bridgeName)
if err != nil {
return fmt.Errorf("bridge %s not found for route %s: %w", bridgeName, route.Dst, err)
}
// Attach host veth to the VLAN bridge.
if err := netlink.LinkSetMaster(hostLink, bridge); err != nil {
return fmt.Errorf("failed to attach %s to bridge %s: %w", hostVeth, bridgeName, err)
}
}
return nil
}

36
internal/cni/config.go Normal file
View file

@ -0,0 +1,36 @@
package cni
import (
"encoding/json"
"github.com/containernetworking/cni/pkg/types"
)
// NetConf is the CNI network configuration parsed from the NetworkAttachmentDefinition.
type NetConf struct {
types.NetConf
// MeshConfig is the path to the mesh topology file (e.g., /etc/kedge/mesh.json).
MeshConfig string `json:"meshConfig"`
// UnderlayRoutes are subnets reachable via VLAN bridge interfaces (underlay mode).
UnderlayRoutes []SubnetRoute `json:"underlayRoutes,omitempty"`
// OverlayRoutes are subnets reachable via WireGuard/VXLAN tunnels (overlay mode).
OverlayRoutes []SubnetRoute `json:"overlayRoutes,omitempty"`
}
// SubnetRoute maps a destination CIDR to a network interface.
type SubnetRoute struct {
Dst string `json:"dst"` // e.g., "172.16.0.0/24"
Via string `json:"via"` // e.g., "vlan100" or "wg0"
}
// ParseNetConf parses the CNI network configuration from raw bytes.
func ParseNetConf(data []byte) (*NetConf, error) {
conf := &NetConf{}
if err := json.Unmarshal(data, conf); err != nil {
return nil, err
}
return conf, nil
}

126
internal/cni/netns.go Normal file
View file

@ -0,0 +1,126 @@
package cni
import (
"fmt"
"net"
"runtime"
"github.com/vishvananda/netlink"
"github.com/vishvananda/netns"
)
// createVethPair creates a veth pair with one end in the host namespace
// and the other moved into the pod's network namespace.
// Returns (hostVethName, podVethName, error).
func createVethPair(podNetns string, ifName string) (string, string, error) {
hostVethName := "kedge_h_" + ifName[:min(5, len(ifName))]
veth := &netlink.Veth{
LinkAttrs: netlink.LinkAttrs{Name: hostVethName},
PeerName: ifName,
}
if err := netlink.LinkAdd(veth); err != nil {
return "", "", fmt.Errorf("failed to create veth pair: %w", err)
}
// Get a handle to the pod's network namespace.
podNS, err := netns.GetFromPath(podNetns)
if err != nil {
netlink.LinkDel(veth)
return "", "", fmt.Errorf("failed to get pod netns: %w", err)
}
defer podNS.Close()
// Move the peer end into the pod namespace.
peer, err := netlink.LinkByName(ifName)
if err != nil {
netlink.LinkDel(veth)
return "", "", fmt.Errorf("failed to find peer veth: %w", err)
}
if err := netlink.LinkSetNsFd(peer, int(podNS)); err != nil {
netlink.LinkDel(veth)
return "", "", fmt.Errorf("failed to move peer to pod netns: %w", err)
}
// Bring up the host-side veth.
hostVeth, err := netlink.LinkByName(hostVethName)
if err != nil {
return "", "", fmt.Errorf("failed to find host veth: %w", err)
}
if err := netlink.LinkSetUp(hostVeth); err != nil {
return "", "", fmt.Errorf("failed to bring up host veth: %w", err)
}
// Bring up the pod-side veth inside the namespace.
if err := inNamespace(podNS, func() error {
link, err := netlink.LinkByName(ifName)
if err != nil {
return err
}
return netlink.LinkSetUp(link)
}); err != nil {
return "", "", fmt.Errorf("failed to bring up pod veth: %w", err)
}
return hostVethName, ifName, nil
}
// teardownInterface removes the named interface from the pod's network namespace.
func teardownInterface(podNetns string, ifName string) error {
ns, err := netns.GetFromPath(podNetns)
if err != nil {
// Namespace may already be gone during cleanup.
return nil
}
defer ns.Close()
return inNamespace(ns, func() error {
link, err := netlink.LinkByName(ifName)
if err != nil {
return nil // Already gone.
}
return netlink.LinkDel(link)
})
}
// verifyInterface checks that the named interface exists in the pod's network namespace.
func verifyInterface(podNetns string, ifName string) error {
ns, err := netns.GetFromPath(podNetns)
if err != nil {
return fmt.Errorf("failed to get pod netns: %w", err)
}
defer ns.Close()
return inNamespace(ns, func() error {
link, err := netlink.LinkByName(ifName)
if err != nil {
return fmt.Errorf("interface %s not found: %w", ifName, err)
}
if link.Attrs().Flags&net.FlagUp == 0 {
return fmt.Errorf("interface %s is down", ifName)
}
return nil
})
}
// inNamespace executes fn inside the given network namespace, restoring
// the original namespace afterward.
func inNamespace(ns netns.NsHandle, fn func() error) error {
runtime.LockOSThread()
defer runtime.UnlockOSThread()
origNS, err := netns.Get()
if err != nil {
return fmt.Errorf("failed to get current netns: %w", err)
}
defer origNS.Close()
if err := netns.Set(ns); err != nil {
return fmt.Errorf("failed to set netns: %w", err)
}
defer netns.Set(origNS) //nolint:errcheck
return fn()
}

93
internal/cni/plugin.go Normal file
View file

@ -0,0 +1,93 @@
package cni
import (
"fmt"
"github.com/containernetworking/cni/pkg/skel"
"github.com/containernetworking/cni/pkg/types"
current "github.com/containernetworking/cni/pkg/types/100"
)
// CmdAdd is called by the container runtime when a pod is created.
// It creates the net1 interface in the pod's network namespace and programs
// routes for both overlay (WireGuard) and underlay (VLAN bridge) subnets.
func CmdAdd(args *skel.CmdArgs) error {
conf, err := ParseNetConf(args.StdinData)
if err != nil {
return fmt.Errorf("failed to parse config: %w", err)
}
// Load mesh topology for current node state.
topo, err := loadMeshTopology(conf.MeshConfig)
if err != nil {
return fmt.Errorf("failed to load mesh topology: %w", err)
}
// Create veth pair and move one end into the pod's network namespace.
hostVeth, podVeth, err := createVethPair(args.Netns, args.IfName)
if err != nil {
return fmt.Errorf("failed to create veth pair: %w", err)
}
// Attach host-side veth to appropriate backend interfaces.
if err := attachOverlayRoutes(hostVeth, conf.OverlayRoutes, topo); err != nil {
return fmt.Errorf("failed to attach overlay routes: %w", err)
}
if err := attachUnderlayRoutes(hostVeth, conf.UnderlayRoutes, topo); err != nil {
return fmt.Errorf("failed to attach underlay routes: %w", err)
}
// Program routes inside the pod's network namespace.
if err := programPodRoutes(args.Netns, podVeth, conf.OverlayRoutes, conf.UnderlayRoutes); err != nil {
return fmt.Errorf("failed to program pod routes: %w", err)
}
// Apply SVID-based network policy.
if err := applySVIDPolicy(args, conf, topo); err != nil {
return fmt.Errorf("failed to apply SVID policy: %w", err)
}
// Build and return CNI result.
result := &current.Result{
CNIVersion: conf.CNIVersion,
Interfaces: []*current.Interface{
{Name: hostVeth, Sandbox: ""},
{Name: podVeth, Sandbox: args.Netns},
},
}
return types.PrintResult(result, conf.CNIVersion)
}
// CmdDel is called when a pod is deleted.
// Tears down the net1 interface and cleans up routes and bridge/tunnel attachments.
func CmdDel(args *skel.CmdArgs) error {
conf, err := ParseNetConf(args.StdinData)
if err != nil {
return fmt.Errorf("failed to parse config: %w", err)
}
if err := teardownInterface(args.Netns, args.IfName); err != nil {
return fmt.Errorf("failed to teardown interface: %w", err)
}
_ = conf // Routes are cleaned up with the interface
return nil
}
// CmdCheck verifies the net1 interface exists and routes are correct.
func CmdCheck(args *skel.CmdArgs) error {
conf, err := ParseNetConf(args.StdinData)
if err != nil {
return fmt.Errorf("failed to parse config: %w", err)
}
if err := verifyInterface(args.Netns, args.IfName); err != nil {
return fmt.Errorf("interface check failed: %w", err)
}
if err := verifyRoutes(args.Netns, conf.OverlayRoutes, conf.UnderlayRoutes); err != nil {
return fmt.Errorf("route check failed: %w", err)
}
return nil
}

52
internal/cni/policy.go Normal file
View file

@ -0,0 +1,52 @@
package cni
import (
"fmt"
"github.com/containernetworking/cni/pkg/skel"
"github.com/guildhouse-co/kedge/internal/topology"
)
// applySVIDPolicy enforces SVID-scoped network policy on the pod's net1 interface.
// Restricts which subnets are reachable and which mode (overlay/underlay) is authorized
// based on the pod's SPIFFE identity.
func applySVIDPolicy(args *skel.CmdArgs, conf *NetConf, topo *topology.MeshTopology) error {
// Phase 1: permissive — all Bascule runtime pods get full access.
// TODO: Fetch pod's SPIFFE SVID from SPIRE agent unix socket.
// TODO: Evaluate SVID against accord-defined subnet and mode policies.
// TODO: Program iptables/nftables rules scoped to the pod's veth pair.
_ = args
_ = conf
_ = topo
return nil
}
// PolicyRule defines a subnet access rule scoped to a SPIFFE identity.
type PolicyRule struct {
SVID string // SPIFFE Verifiable Identity Document URI
AllowedDst []string // Allowed destination CIDRs
Mode string // "overlay", "underlay", or "both"
Operations []string // "read", "mutate", "admin"
}
// EvaluatePolicy checks whether the given SVID is authorized for the requested
// subnets and mode per the local accord policy.
func EvaluatePolicy(svid string, rules []PolicyRule, requestedDst string, mode string) error {
for _, rule := range rules {
if rule.SVID != svid {
continue
}
if rule.Mode != "both" && rule.Mode != mode {
continue
}
for _, allowed := range rule.AllowedDst {
if allowed == requestedDst {
return nil
}
}
}
return fmt.Errorf("SVID %s not authorized for %s in mode %s", svid, requestedDst, mode)
}

84
internal/cni/routes.go Normal file
View file

@ -0,0 +1,84 @@
package cni
import (
"fmt"
"net"
"github.com/vishvananda/netlink"
"github.com/vishvananda/netns"
"github.com/guildhouse-co/kedge/internal/topology"
)
// programPodRoutes programs routes inside the pod's network namespace for
// both overlay and underlay destinations.
func programPodRoutes(podNetnsPath string, ifName string, overlay, underlay []SubnetRoute) error {
ns, err := netns.GetFromPath(podNetnsPath)
if err != nil {
return fmt.Errorf("failed to get pod netns: %w", err)
}
defer ns.Close()
return inNamespace(ns, func() error {
link, err := netlink.LinkByName(ifName)
if err != nil {
return fmt.Errorf("interface %s not found: %w", ifName, err)
}
allRoutes := append(overlay, underlay...)
for _, sr := range allRoutes {
_, dst, err := net.ParseCIDR(sr.Dst)
if err != nil {
return fmt.Errorf("invalid CIDR %s: %w", sr.Dst, err)
}
route := &netlink.Route{
LinkIndex: link.Attrs().Index,
Dst: dst,
}
if err := netlink.RouteAdd(route); err != nil {
return fmt.Errorf("failed to add route %s: %w", sr.Dst, err)
}
}
return nil
})
}
// verifyRoutes checks that expected routes exist in the pod's network namespace.
func verifyRoutes(podNetnsPath string, overlay, underlay []SubnetRoute) error {
ns, err := netns.GetFromPath(podNetnsPath)
if err != nil {
return fmt.Errorf("failed to get pod netns: %w", err)
}
defer ns.Close()
return inNamespace(ns, func() error {
routes, err := netlink.RouteList(nil, netlink.FAMILY_V4)
if err != nil {
return fmt.Errorf("failed to list routes: %w", err)
}
routeMap := make(map[string]bool)
for _, r := range routes {
if r.Dst != nil {
routeMap[r.Dst.String()] = true
}
}
allExpected := append(overlay, underlay...)
for _, sr := range allExpected {
if !routeMap[sr.Dst] {
return fmt.Errorf("missing route: %s", sr.Dst)
}
}
return nil
})
}
// loadMeshTopology reads the mesh topology file written by the DaemonSet.
func loadMeshTopology(meshConfigPath string) (*topology.MeshTopology, error) {
if meshConfigPath == "" {
meshConfigPath = "/etc/kedge/mesh.json"
}
return topology.LoadFromFile(meshConfigPath)
}

42
internal/cni/tunnel.go Normal file
View file

@ -0,0 +1,42 @@
package cni
import (
"fmt"
"net"
"github.com/vishvananda/netlink"
"github.com/guildhouse-co/kedge/internal/topology"
)
// attachOverlayRoutes programs routes on the host to direct overlay-destined
// traffic from the veth through the WireGuard tunnel interface.
func attachOverlayRoutes(hostVeth string, routes []SubnetRoute, topo *topology.MeshTopology) error {
if len(routes) == 0 {
return nil
}
for _, route := range routes {
tunnelIfName := route.Via // e.g., "wg0"
tunnelLink, err := netlink.LinkByName(tunnelIfName)
if err != nil {
return fmt.Errorf("tunnel interface %s not found for route %s: %w", tunnelIfName, route.Dst, err)
}
_, dst, err := net.ParseCIDR(route.Dst)
if err != nil {
return fmt.Errorf("invalid overlay CIDR %s: %w", route.Dst, err)
}
// Add route on host: dst → tunnel interface.
r := &netlink.Route{
LinkIndex: tunnelLink.Attrs().Index,
Dst: dst,
}
if err := netlink.RouteReplace(r); err != nil {
return fmt.Errorf("failed to add overlay route %s via %s: %w", route.Dst, tunnelIfName, err)
}
}
return nil
}

154
internal/config/config.go Normal file
View file

@ -0,0 +1,154 @@
package config
import (
"fmt"
"os"
"time"
)
// Config is the top-level configuration for the Kedge DaemonSet.
type Config struct {
NodeID string `json:"node_id"`
ClusterID string `json:"cluster_id"`
Mesh MeshConfig `json:"mesh"`
VLAN VLANConfig `json:"vlan"`
Shellstream ShellstreamConfig `json:"shellstream"`
Quartermaster QuartermasterConfig `json:"quartermaster"`
Underlay UnderlayConfig `json:"underlay"`
Health HealthConfig `json:"health"`
Headscale HeadscaleConfig `json:"headscale"`
}
// MeshConfig configures the WireGuard overlay mesh.
type MeshConfig struct {
Enabled bool `json:"enabled"`
InterfaceName string `json:"interface_name"`
ListenPort int `json:"listen_port"`
PrivateKeyPath string `json:"private_key_path"`
HealthInterval time.Duration `json:"health_interval"`
DeadPeerTimeout time.Duration `json:"dead_peer_timeout"`
InitialPeers []PeerEntry `json:"initial_peers"`
}
// PeerEntry defines a WireGuard peer from configuration.
type PeerEntry struct {
PublicKey string `json:"public_key"`
Endpoint string `json:"endpoint"`
AllowedIPs []string `json:"allowed_ips"`
}
// VLANConfig configures underlay VLAN interfaces.
type VLANConfig struct {
Enabled bool `json:"enabled"`
VLANs []VLANEntry `json:"vlans"`
}
// VLANEntry defines a VLAN to be managed on the node.
type VLANEntry struct {
ID int `json:"id"`
ParentInterface string `json:"parent_interface"`
Subnet string `json:"subnet"`
Description string `json:"description"`
}
// ShellstreamConfig configures the Shellstream handshake listener.
type ShellstreamConfig struct {
ListenAddr string `json:"listen_addr"`
ClusterID string `json:"cluster_id"`
TrustBundlePath string `json:"trust_bundle_path"`
AccordPolicyPath string `json:"accord_policy_path"`
}
// QuartermasterConfig configures the Quartermaster gRPC client.
type QuartermasterConfig struct {
Endpoint string `json:"endpoint"`
UseTLS bool `json:"use_tls"`
CertPath string `json:"cert_path,omitempty"`
KeyPath string `json:"key_path,omitempty"`
CAPath string `json:"ca_path,omitempty"`
}
// UnderlayConfig configures the YANG watcher and compilation trigger.
type UnderlayConfig struct {
Enabled bool `json:"enabled"`
Namespace string `json:"namespace"`
ConfigMapName string `json:"configmap_name"`
CompilerPath string `json:"compiler_path"`
}
// HealthConfig configures the health and metrics server.
type HealthConfig struct {
ListenAddr string `json:"listen_addr"`
}
// HeadscaleConfig configures Headscale peer discovery.
type HeadscaleConfig struct {
Enabled bool `json:"enabled"`
Endpoint string `json:"endpoint"`
APIKey string `json:"api_key"`
PollInterval time.Duration `json:"poll_interval"`
}
// Load reads the DaemonSet configuration from environment variables and
// the config file at /etc/kedge/config.json.
func Load() (*Config, error) {
cfg := &Config{
NodeID: getEnv("KEDGE_NODE_ID", ""),
ClusterID: getEnv("KEDGE_CLUSTER_ID", ""),
Mesh: MeshConfig{
Enabled: getEnv("KEDGE_OVERLAY_ENABLED", "true") == "true",
InterfaceName: getEnv("KEDGE_WG_INTERFACE", "wg0"),
ListenPort: 51820,
PrivateKeyPath: getEnv("KEDGE_WG_PRIVATE_KEY", "/etc/kedge/wg-private.key"),
HealthInterval: 30 * time.Second,
DeadPeerTimeout: 5 * time.Minute,
},
VLAN: VLANConfig{
Enabled: getEnv("KEDGE_UNDERLAY_ENABLED", "false") == "true",
},
Shellstream: ShellstreamConfig{
ListenAddr: getEnv("KEDGE_SHELLSTREAM_ADDR", ":8443"),
ClusterID: getEnv("KEDGE_CLUSTER_ID", ""),
TrustBundlePath: getEnv("KEDGE_TRUST_BUNDLE", "/run/spire/bundle/bundle.pem"),
AccordPolicyPath: getEnv("KEDGE_ACCORD_POLICY", "/etc/kedge/accord-policy.json"),
},
Quartermaster: QuartermasterConfig{
Endpoint: getEnv("KEDGE_QM_ENDPOINT", "quartermaster.guildhouse.svc:50051"),
},
Underlay: UnderlayConfig{
Enabled: getEnv("KEDGE_UNDERLAY_ENABLED", "false") == "true",
Namespace: getEnv("KEDGE_UNDERLAY_NAMESPACE", "kedge"),
ConfigMapName: getEnv("KEDGE_UNDERLAY_CONFIGMAP", "kedge-underlay"),
CompilerPath: getEnv("KEDGE_YANG_COMPILER", "/opt/kedge/yang/compiler/compile.py"),
},
Health: HealthConfig{
ListenAddr: getEnv("KEDGE_HEALTH_ADDR", ":9090"),
},
Headscale: HeadscaleConfig{
Enabled: getEnv("KEDGE_HEADSCALE_ENABLED", "false") == "true",
Endpoint: getEnv("KEDGE_HEADSCALE_ENDPOINT", ""),
PollInterval: 60 * time.Second,
},
}
if cfg.NodeID == "" {
hostname, err := os.Hostname()
if err != nil {
return nil, fmt.Errorf("KEDGE_NODE_ID not set and hostname unavailable: %w", err)
}
cfg.NodeID = hostname
}
if cfg.ClusterID == "" {
return nil, fmt.Errorf("KEDGE_CLUSTER_ID is required")
}
return cfg, nil
}
func getEnv(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
}
return fallback
}

View file

@ -0,0 +1,68 @@
package headscale
import (
"context"
"fmt"
"time"
"go.uber.org/zap"
"github.com/guildhouse-co/kedge/internal/config"
)
// PeerUpdater is implemented by the mesh manager to receive peer updates.
type PeerUpdater interface {
AddPeer(pubKey string, endpoint string, allowedIPs []string) error
}
// Client connects to Headscale for mesh peer discovery.
type Client struct {
cfg config.HeadscaleConfig
peers PeerUpdater
log *zap.SugaredLogger
}
// NewClient creates a new Headscale peer discovery client.
func NewClient(cfg config.HeadscaleConfig, peers PeerUpdater, log *zap.SugaredLogger) *Client {
return &Client{cfg: cfg, peers: peers, log: log}
}
// Run starts the Headscale peer discovery loop.
func (c *Client) Run(ctx context.Context) error {
if !c.cfg.Enabled {
c.log.Info("headscale peer discovery disabled")
<-ctx.Done()
return nil
}
c.log.Infof("starting headscale peer discovery (endpoint: %s, interval: %s)",
c.cfg.Endpoint, c.cfg.PollInterval)
ticker := time.NewTicker(c.cfg.PollInterval)
defer ticker.Stop()
// Initial discovery.
if err := c.discoverPeers(ctx); err != nil {
c.log.Warnw("initial peer discovery failed", "error", err)
}
for {
select {
case <-ctx.Done():
return nil
case <-ticker.C:
if err := c.discoverPeers(ctx); err != nil {
c.log.Warnw("peer discovery failed", "error", err)
}
}
}
}
func (c *Client) discoverPeers(ctx context.Context) error {
// TODO: Call Headscale gRPC API to list nodes in the mesh namespace.
// For each node, extract WireGuard public key, endpoint, and allowed IPs,
// then call c.peers.AddPeer() to update the mesh manager.
_ = ctx
return fmt.Errorf("headscale peer discovery not yet implemented")
}

135
internal/health/health.go Normal file
View file

@ -0,0 +1,135 @@
package health
import (
"context"
"fmt"
"net/http"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"go.uber.org/zap"
"github.com/guildhouse-co/kedge/internal/config"
)
// PeerCounter is implemented by the mesh manager.
type PeerCounter interface {
PeerCount() int
}
// VLANCounter is implemented by the VLAN manager.
type VLANCounter interface {
VLANCount() int
}
var (
meshPeerGauge = prometheus.NewGauge(prometheus.GaugeOpts{
Namespace: "kedge",
Subsystem: "mesh",
Name: "peer_count",
Help: "Number of active WireGuard mesh peers",
})
vlanInterfaceGauge = prometheus.NewGauge(prometheus.GaugeOpts{
Namespace: "kedge",
Subsystem: "vlan",
Name: "interface_count",
Help: "Number of managed VLAN interfaces",
})
sessionTransitCounter = prometheus.NewCounter(prometheus.CounterOpts{
Namespace: "kedge",
Subsystem: "quartermaster",
Name: "session_transits_total",
Help: "Total SessionTransitArtifacts submitted to Quartermaster",
})
networkMutationCounter = prometheus.NewCounter(prometheus.CounterOpts{
Namespace: "kedge",
Subsystem: "quartermaster",
Name: "network_mutations_total",
Help: "Total NetworkMutationArtifacts submitted to Quartermaster",
})
tunnelStatusGauge = prometheus.NewGaugeVec(prometheus.GaugeOpts{
Namespace: "kedge",
Subsystem: "mesh",
Name: "tunnel_up",
Help: "WireGuard tunnel status (1=up, 0=down)",
}, []string{"peer"})
)
func init() {
prometheus.MustRegister(
meshPeerGauge,
vlanInterfaceGauge,
sessionTransitCounter,
networkMutationCounter,
tunnelStatusGauge,
)
}
// Server serves health checks and Prometheus metrics.
type Server struct {
cfg config.HealthConfig
peers PeerCounter
vlans VLANCounter
log *zap.SugaredLogger
}
// NewServer creates a new health and metrics server.
func NewServer(cfg config.HealthConfig, peers PeerCounter, vlans VLANCounter, log *zap.SugaredLogger) *Server {
return &Server{cfg: cfg, peers: peers, vlans: vlans, log: log}
}
// Run starts the HTTP server for health checks and metrics.
func (s *Server) Run(ctx context.Context) error {
mux := http.NewServeMux()
mux.HandleFunc("/healthz", s.handleHealthz)
mux.HandleFunc("/readyz", s.handleReadyz)
mux.Handle("/metrics", promhttp.Handler())
srv := &http.Server{
Addr: s.cfg.ListenAddr,
Handler: mux,
}
go func() {
<-ctx.Done()
srv.Close()
}()
s.log.Infof("health server listening on %s", s.cfg.ListenAddr)
if err := srv.ListenAndServe(); err != http.ErrServerClosed {
return fmt.Errorf("health server error: %w", err)
}
return nil
}
func (s *Server) handleHealthz(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
fmt.Fprintln(w, "ok")
}
func (s *Server) handleReadyz(w http.ResponseWriter, r *http.Request) {
// Update metrics.
if s.peers != nil {
meshPeerGauge.Set(float64(s.peers.PeerCount()))
}
if s.vlans != nil {
vlanInterfaceGauge.Set(float64(s.vlans.VLANCount()))
}
w.WriteHeader(http.StatusOK)
fmt.Fprintln(w, "ok")
}
// RecordSessionTransit increments the session transit counter.
func RecordSessionTransit() {
sessionTransitCounter.Inc()
}
// RecordNetworkMutation increments the network mutation counter.
func RecordNetworkMutation() {
networkMutationCounter.Inc()
}

153
internal/mesh/manager.go Normal file
View file

@ -0,0 +1,153 @@
package mesh
import (
"context"
"fmt"
"sync"
"time"
"go.uber.org/zap"
"golang.zx2c4.com/wireguard/wgctrl"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"github.com/guildhouse-co/kedge/internal/config"
"github.com/guildhouse-co/kedge/internal/quartermaster"
)
// Manager manages WireGuard tunnel lifecycle for overlay mode.
type Manager struct {
cfg config.MeshConfig
qm *quartermaster.Client
log *zap.SugaredLogger
client *wgctrl.Client
mu sync.RWMutex
peers map[string]*Peer // keyed by public key
}
// NewManager creates a new WireGuard mesh manager.
func NewManager(cfg config.MeshConfig, qm *quartermaster.Client, log *zap.SugaredLogger) *Manager {
return &Manager{
cfg: cfg,
qm: qm,
log: log,
peers: make(map[string]*Peer),
}
}
// Run starts the mesh manager loop. It initializes the WireGuard interface,
// configures initial peers, and monitors tunnel health.
func (m *Manager) Run(ctx context.Context) error {
if !m.cfg.Enabled {
m.log.Info("overlay mode disabled, mesh manager idle")
<-ctx.Done()
return nil
}
var err error
m.client, err = wgctrl.New()
if err != nil {
return fmt.Errorf("failed to create wireguard client: %w", err)
}
defer m.client.Close()
// Ensure WireGuard interface exists.
if err := m.ensureInterface(); err != nil {
return fmt.Errorf("failed to ensure wg interface: %w", err)
}
// Configure initial peers from mesh config.
if err := m.configureInitialPeers(); err != nil {
return fmt.Errorf("failed to configure initial peers: %w", err)
}
// Health monitoring loop.
ticker := time.NewTicker(m.cfg.HealthInterval)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
m.log.Info("mesh manager shutting down")
return nil
case <-ticker.C:
m.checkPeerHealth()
}
}
}
// PeerCount returns the number of active peers.
func (m *Manager) PeerCount() int {
m.mu.RLock()
defer m.mu.RUnlock()
return len(m.peers)
}
// AddPeer adds or updates a WireGuard peer.
func (m *Manager) AddPeer(pubKey string, endpoint string, allowedIPs []string) error {
m.mu.Lock()
defer m.mu.Unlock()
key, err := wgtypes.ParseKey(pubKey)
if err != nil {
return fmt.Errorf("invalid public key: %w", err)
}
peer := &Peer{
PublicKey: pubKey,
Endpoint: endpoint,
AllowedIPs: allowedIPs,
State: PeerStateActive,
LastSeen: time.Now(),
}
m.peers[pubKey] = peer
_ = key // TODO: Apply to wgctrl config.
m.log.Infow("peer added", "pubkey", pubKey[:8]+"...", "endpoint", endpoint)
return nil
}
func (m *Manager) ensureInterface() error {
// TODO: Create WireGuard interface if it doesn't exist.
// Use netlink to create the interface, then wgctrl to configure it.
m.log.Infof("ensuring wireguard interface %s", m.cfg.InterfaceName)
return nil
}
func (m *Manager) configureInitialPeers() error {
for _, p := range m.cfg.InitialPeers {
if err := m.AddPeer(p.PublicKey, p.Endpoint, p.AllowedIPs); err != nil {
m.log.Warnw("failed to add initial peer", "pubkey", p.PublicKey[:8]+"...", "error", err)
}
}
return nil
}
func (m *Manager) checkPeerHealth() {
m.mu.Lock()
defer m.mu.Unlock()
device, err := m.client.Device(m.cfg.InterfaceName)
if err != nil {
m.log.Warnw("failed to get wg device", "error", err)
return
}
for _, wgPeer := range device.Peers {
pubKey := wgPeer.PublicKey.String()
peer, ok := m.peers[pubKey]
if !ok {
continue
}
if !wgPeer.LastHandshakeTime.IsZero() {
peer.LastSeen = wgPeer.LastHandshakeTime
peer.State = PeerStateActive
}
if time.Since(peer.LastSeen) > m.cfg.DeadPeerTimeout {
peer.State = PeerStateDead
m.log.Warnw("peer dead", "pubkey", pubKey[:8]+"...", "last_seen", peer.LastSeen)
}
}
}

21
internal/mesh/peer.go Normal file
View file

@ -0,0 +1,21 @@
package mesh
import "time"
// PeerState represents the connectivity state of a WireGuard peer.
type PeerState string
const (
PeerStateActive PeerState = "active"
PeerStateDead PeerState = "dead"
PeerStatePending PeerState = "pending"
)
// Peer represents a WireGuard mesh peer.
type Peer struct {
PublicKey string
Endpoint string
AllowedIPs []string
State PeerState
LastSeen time.Time
}

View file

@ -0,0 +1,102 @@
package quartermaster
import (
"context"
"fmt"
"time"
"go.uber.org/zap"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
"github.com/guildhouse-co/kedge/internal/config"
)
// Client is a gRPC client for the Quartermaster governance and notary services.
type Client struct {
cfg config.QuartermasterConfig
conn *grpc.ClientConn
log *zap.SugaredLogger
// TODO: Add typed gRPC stub clients when proto generation is wired up.
// governance quartermasterv1.GovernanceServiceClient
// notary quartermasterv1.QuartermasterNotaryClient
}
// NewClient creates a new Quartermaster gRPC client.
func NewClient(cfg config.QuartermasterConfig) (*Client, error) {
log := zap.L().Sugar().Named("quartermaster")
// TODO: Use mTLS with SPIFFE SVID for production.
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
}
conn, err := grpc.NewClient(cfg.Endpoint, opts...)
if err != nil {
return nil, fmt.Errorf("failed to connect to quartermaster at %s: %w", cfg.Endpoint, err)
}
log.Infof("connected to quartermaster at %s", cfg.Endpoint)
return &Client{
cfg: cfg,
conn: conn,
log: log,
}, nil
}
// Close closes the gRPC connection.
func (c *Client) Close() error {
if c.conn != nil {
return c.conn.Close()
}
return nil
}
// SubmitSessionTransit submits a SessionTransitArtifact to Quartermaster for notarization.
// This records overlay session crossings between clusters.
func (c *Client) SubmitSessionTransit(ctx context.Context, artifact *SessionTransitArtifact) error {
artifact.Timestamp = time.Now().UTC()
// Serialize to canonical JSON (RFC 8785) for hashing.
canonical, err := artifact.CanonicalBytes()
if err != nil {
return fmt.Errorf("failed to serialize session transit artifact: %w", err)
}
c.log.Infow("submitting session transit",
"session_id", artifact.SessionID,
"source", artifact.SourceCluster,
"dest", artifact.DestCluster,
"mode", artifact.GrantedMode,
"bytes", len(canonical),
)
// TODO: Call governance.CreateIntent + notary.CreateAnchor via gRPC stubs.
// For Phase 1, the anchor buffer flush is async — operations are not blocked.
_ = canonical
return nil
}
// SubmitNetworkMutation submits a NetworkMutationArtifact to Quartermaster for notarization.
// This records underlay infrastructure config changes.
func (c *Client) SubmitNetworkMutation(ctx context.Context, artifact *NetworkMutationArtifact) error {
artifact.Timestamp = time.Now().UTC()
canonical, err := artifact.CanonicalBytes()
if err != nil {
return fmt.Errorf("failed to serialize network mutation artifact: %w", err)
}
c.log.Infow("submitting network mutation",
"mutation_id", artifact.MutationID,
"device", artifact.Device,
"operation", artifact.Operation,
"bytes", len(canonical),
)
// TODO: Call governance.CreateIntent + notary.CreateAnchor via gRPC stubs.
_ = canonical
return nil
}

View file

@ -0,0 +1,53 @@
package quartermaster
import (
"encoding/json"
"time"
)
// NetworkMutationArtifact is recorded when Kedge (via Bascule SDK dispatch)
// mutates physical infrastructure (underlay mode).
type NetworkMutationArtifact struct {
MutationID string `json:"mutation_id"`
SATHash []byte `json:"sat_hash"`
SessionID string `json:"session_id"` // Links to SessionTransitArtifact
Device string `json:"device"` // e.g., "fortigate.transit.local"
DeviceType string `json:"device_type"` // e.g., "fortios"
Operation string `json:"operation"` // e.g., "vlan_create", "zone_policy_set"
YANGSourceHash []byte `json:"yang_source_hash"` // Hash of YANG instance data
ConfigBeforeHash []byte `json:"config_before_hash"` // Device config before mutation
ConfigAfterHash []byte `json:"config_after_hash"` // Device config after mutation
SDKMethod string `json:"sdk_method"` // e.g., "fortiosapi.set('firewall','policy',...)"
Timestamp time.Time `json:"timestamp"`
}
// ArtifactID returns the unique identifier for this artifact.
func (a *NetworkMutationArtifact) ArtifactID() string {
return a.MutationID
}
// RegistryType returns the registry type discriminator.
func (a *NetworkMutationArtifact) RegistryType() string {
return "network-mutation"
}
// CanonicalBytes returns the RFC 8785 (JCS) canonical JSON serialization
// for deterministic hashing and merkle anchoring.
func (a *NetworkMutationArtifact) CanonicalBytes() ([]byte, error) {
canonical := map[string]any{
"mutation_id": a.MutationID,
"sat_hash": a.SATHash,
"session_id": a.SessionID,
"device": a.Device,
"device_type": a.DeviceType,
"operation": a.Operation,
"yang_source_hash": a.YANGSourceHash,
"config_before_hash": a.ConfigBeforeHash,
"config_after_hash": a.ConfigAfterHash,
"sdk_method": a.SDKMethod,
"timestamp": a.Timestamp.UTC().Format(time.RFC3339Nano),
}
// TODO: Use a proper JCS (RFC 8785) library for canonical serialization.
return json.Marshal(canonical)
}

View file

@ -0,0 +1,60 @@
package quartermaster
import (
"encoding/json"
"sort"
"time"
)
// SessionTransitArtifact is recorded when a Shellstream session crosses
// the inter-cluster boundary (overlay mode).
type SessionTransitArtifact struct {
SessionID string `json:"session_id"`
SATHash []byte `json:"sat_hash"`
SourceCluster string `json:"source_cluster"`
DestCluster string `json:"dest_cluster"`
TargetDevice string `json:"target_device"`
TargetSubnet string `json:"target_subnet"`
GrantedMode string `json:"granted_mode"` // "overlay" | "underlay" | "both"
GrantedOperations []string `json:"granted_operations"`
OperationType string `json:"operation_type"`
Timestamp time.Time `json:"timestamp"`
CapabilityTokenHash []byte `json:"capability_token_hash"`
}
// ArtifactID returns the unique identifier for this artifact.
func (a *SessionTransitArtifact) ArtifactID() string {
return a.SessionID
}
// RegistryType returns the registry type discriminator.
func (a *SessionTransitArtifact) RegistryType() string {
return "session-transit"
}
// CanonicalBytes returns the RFC 8785 (JCS) canonical JSON serialization
// for deterministic hashing and merkle anchoring.
func (a *SessionTransitArtifact) CanonicalBytes() ([]byte, error) {
// Sort granted operations for deterministic output.
ops := make([]string, len(a.GrantedOperations))
copy(ops, a.GrantedOperations)
sort.Strings(ops)
canonical := map[string]any{
"session_id": a.SessionID,
"sat_hash": a.SATHash,
"source_cluster": a.SourceCluster,
"dest_cluster": a.DestCluster,
"target_device": a.TargetDevice,
"target_subnet": a.TargetSubnet,
"granted_mode": a.GrantedMode,
"granted_operations": ops,
"operation_type": a.OperationType,
"timestamp": a.Timestamp.UTC().Format(time.RFC3339Nano),
"capability_token_hash": a.CapabilityTokenHash,
}
// TODO: Use a proper JCS (RFC 8785) library for canonical serialization.
// For now, json.Marshal with sorted keys provides deterministic output.
return json.Marshal(canonical)
}

View file

@ -0,0 +1,94 @@
package shellstream
import (
"encoding/json"
"fmt"
"os"
)
// AccordPolicy defines the local node's policy for granting capabilities
// to incoming Shellstream sessions, derived from Guildhouse accords.
type AccordPolicy struct {
TrustTiers map[string]TierPolicy `json:"trust_tiers"`
}
// TierPolicy maps a Guildhouse trust tier to allowed Kedge modes and operations.
type TierPolicy struct {
Mode []string `json:"mode"` // ["overlay"], ["overlay", "underlay"]
Operations []string `json:"operations"` // ["read"], ["read", "mutate", "admin"]
Subnets []string `json:"subnets"` // ["172.16.0.0/24", "10.0.1.0/24"]
}
// EvaluateCapability evaluates a capability request against the local accord policy.
// Returns the attenuated grant (requested capabilities intersected with policy).
func EvaluateCapability(req CapabilityRequest, policyPath string) (*CapabilityGrant, error) {
policy, err := loadAccordPolicy(policyPath)
if err != nil {
return nil, fmt.Errorf("failed to load accord policy: %w", err)
}
// TODO: Determine the remote peer's trust tier from its SAT token.
// For Phase 1, assume "journeyman" tier.
tier := "journeyman"
tierPolicy, ok := policy.TrustTiers[tier]
if !ok {
return nil, fmt.Errorf("unknown trust tier: %s", tier)
}
// Attenuate: grant the intersection of requested and allowed.
grantedMode := attenuateMode(req.Mode, tierPolicy.Mode)
if grantedMode == "" {
return nil, fmt.Errorf("requested mode %s not allowed for tier %s", req.Mode, tier)
}
grantedOps := attenuateOperations(req.Operations, tierPolicy.Operations)
if len(grantedOps) == 0 {
return nil, fmt.Errorf("no operations allowed for tier %s", tier)
}
return &CapabilityGrant{
Mode: grantedMode,
Operations: grantedOps,
}, nil
}
func loadAccordPolicy(path string) (*AccordPolicy, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, err
}
var policy AccordPolicy
if err := json.Unmarshal(data, &policy); err != nil {
return nil, err
}
return &policy, nil
}
func attenuateMode(requested string, allowed []string) string {
for _, m := range allowed {
if m == requested || m == "both" {
return requested
}
}
// If "both" was requested but only one mode allowed, grant the allowed one.
if requested == "both" && len(allowed) > 0 {
return allowed[0]
}
return ""
}
func attenuateOperations(requested, allowed []string) []string {
allowedSet := make(map[string]bool)
for _, op := range allowed {
allowedSet[op] = true
}
var granted []string
for _, op := range requested {
if allowedSet[op] {
granted = append(granted, op)
}
}
return granted
}

View file

@ -0,0 +1,151 @@
package shellstream
import (
"context"
"fmt"
"net"
"go.uber.org/zap"
"github.com/guildhouse-co/kedge/internal/config"
"github.com/guildhouse-co/kedge/internal/quartermaster"
)
// Listener accepts incoming Shellstream connections and performs the
// 3-way attestation handshake (ATTEST-INIT → ATTEST-VERIFY → ATTEST-CONFIRM).
type Listener struct {
cfg config.ShellstreamConfig
qm *quartermaster.Client
log *zap.SugaredLogger
}
// NewListener creates a new Shellstream handshake listener.
func NewListener(cfg config.ShellstreamConfig, qm *quartermaster.Client, log *zap.SugaredLogger) *Listener {
return &Listener{cfg: cfg, qm: qm, log: log}
}
// Run starts listening for incoming Shellstream connections.
func (l *Listener) Run(ctx context.Context) error {
listener, err := net.Listen("tcp", l.cfg.ListenAddr)
if err != nil {
return fmt.Errorf("failed to listen on %s: %w", l.cfg.ListenAddr, err)
}
defer listener.Close()
l.log.Infof("shellstream listener started on %s", l.cfg.ListenAddr)
go func() {
<-ctx.Done()
listener.Close()
}()
for {
conn, err := listener.Accept()
if err != nil {
select {
case <-ctx.Done():
return nil
default:
l.log.Warnw("accept error", "error", err)
continue
}
}
go l.handleConnection(ctx, conn)
}
}
func (l *Listener) handleConnection(ctx context.Context, conn net.Conn) {
defer conn.Close()
// Phase 1: Receive ATTEST-INIT from remote.
initMsg, err := readAttestInit(conn)
if err != nil {
l.log.Warnw("failed to read ATTEST-INIT", "remote", conn.RemoteAddr(), "error", err)
return
}
// Validate the SAT token.
if err := ValidateSAT(initMsg.SATToken, l.cfg.TrustBundlePath); err != nil {
l.log.Warnw("SAT validation failed", "remote", conn.RemoteAddr(), "error", err)
return
}
// Evaluate capability request against local accord policy.
grant, err := EvaluateCapability(initMsg.Capabilities, l.cfg.AccordPolicyPath)
if err != nil {
l.log.Warnw("capability evaluation failed", "remote", conn.RemoteAddr(), "error", err)
return
}
// Phase 2: Send ATTEST-VERIFY with granted capabilities.
if err := writeAttestVerify(conn, grant); err != nil {
l.log.Warnw("failed to write ATTEST-VERIFY", "error", err)
return
}
// Phase 3: Receive ATTEST-CONFIRM.
if err := readAttestConfirm(conn); err != nil {
l.log.Warnw("failed to read ATTEST-CONFIRM", "error", err)
return
}
// Record session transit artifact.
artifact := quartermaster.SessionTransitArtifact{
SourceCluster: initMsg.SourceCluster,
DestCluster: l.cfg.ClusterID,
TargetDevice: initMsg.TargetDevice,
GrantedMode: grant.Mode,
GrantedOperations: grant.Operations,
}
if err := l.qm.SubmitSessionTransit(ctx, &artifact); err != nil {
l.log.Warnw("failed to submit session transit", "error", err)
}
l.log.Infow("session established",
"remote", conn.RemoteAddr(),
"mode", grant.Mode,
"target", initMsg.TargetDevice,
)
}
// AttestInitMsg represents the ATTEST-INIT message from the remote peer.
type AttestInitMsg struct {
SATToken []byte
SourceCluster string
TargetDevice string
TargetSubnet string
Capabilities CapabilityRequest
}
// CapabilityRequest describes what mode and operations the remote peer is requesting.
type CapabilityRequest struct {
Mode string // "overlay", "underlay", or "both"
Targets []string // Target device addresses
Operations []string // "read", "mutate"
}
// CapabilityGrant describes what was actually granted after accord evaluation.
type CapabilityGrant struct {
Mode string
Operations []string
}
func readAttestInit(conn net.Conn) (*AttestInitMsg, error) {
// TODO: Implement Shellstream wire protocol parsing.
_ = conn
return nil, fmt.Errorf("not yet implemented")
}
func writeAttestVerify(conn net.Conn, grant *CapabilityGrant) error {
// TODO: Implement Shellstream wire protocol serialization.
_ = conn
_ = grant
return fmt.Errorf("not yet implemented")
}
func readAttestConfirm(conn net.Conn) error {
// TODO: Implement Shellstream wire protocol parsing.
_ = conn
return fmt.Errorf("not yet implemented")
}

View file

@ -0,0 +1,44 @@
package shellstream
import (
"context"
"fmt"
"github.com/spiffe/go-spiffe/v2/svid/x509svid"
"github.com/spiffe/go-spiffe/v2/workloadapi"
)
// ValidateSAT validates a Substrate Attestation Token against the SPIRE/Vigil trust bundle.
// Phase 1: Simplified to SPIFFE SVID verification only.
// Future: Full SAT validation with TPM attestation and capability semantics via Vigil.
func ValidateSAT(token []byte, trustBundlePath string) error {
if len(token) == 0 {
return fmt.Errorf("empty SAT token")
}
// Phase 1: Parse as X.509 SVID and verify against the SPIRE trust bundle.
// The token is expected to be a DER-encoded X.509 certificate chain.
svid, err := x509svid.Parse(token, nil)
if err != nil {
return fmt.Errorf("failed to parse SVID from SAT: %w", err)
}
// Verify the SVID against the local workload API trust bundle.
ctx := context.Background()
source, err := workloadapi.NewX509Source(ctx)
if err != nil {
return fmt.Errorf("failed to create X509Source: %w", err)
}
defer source.Close()
bundle, err := source.GetX509BundleForTrustDomain(svid.ID.TrustDomain())
if err != nil {
return fmt.Errorf("failed to get trust bundle for %s: %w", svid.ID.TrustDomain(), err)
}
_ = bundle
// TODO: Verify the certificate chain against the bundle.
// For Phase 1, presence of a parseable SVID from a known trust domain is sufficient.
return nil
}

View file

@ -0,0 +1,48 @@
package topology
// MeshTopology represents the current mesh state shared between
// the CNI plugin (reader) and DaemonSet (writer).
type MeshTopology struct {
// NodeID identifies this node in the mesh.
NodeID string `json:"node_id"`
// ClusterID identifies the cluster this node belongs to.
ClusterID string `json:"cluster_id"`
// Mode indicates the active modes: "overlay", "underlay", or "both".
Mode string `json:"mode"`
// Peers are the WireGuard mesh peers (overlay mode).
Peers []PeerInfo `json:"peers,omitempty"`
// OverlaySubnets are subnets reachable via WireGuard tunnels.
OverlaySubnets []SubnetRoute `json:"overlay_subnets,omitempty"`
// UnderlaySubnets are subnets reachable via VLAN bridges.
UnderlaySubnets []SubnetRoute `json:"underlay_subnets,omitempty"`
// WireGuard interface configuration (overlay).
WireGuard WireGuardConfig `json:"wireguard,omitempty"`
}
// PeerInfo describes a WireGuard mesh peer.
type PeerInfo struct {
PublicKey string `json:"public_key"`
Endpoint string `json:"endpoint"`
AllowedIPs []string `json:"allowed_ips"`
ClusterID string `json:"cluster_id,omitempty"`
}
// SubnetRoute maps a destination CIDR to an interface.
type SubnetRoute struct {
Dst string `json:"dst"` // e.g., "172.16.0.0/24"
Via string `json:"via"` // e.g., "vlan100" or "wg0"
Mode string `json:"mode"` // "overlay" or "underlay"
}
// WireGuardConfig holds the local node's WireGuard configuration.
type WireGuardConfig struct {
InterfaceName string `json:"interface_name"` // e.g., "wg0"
ListenPort int `json:"listen_port"`
PrivateKeyPath string `json:"private_key_path"`
}

View file

@ -0,0 +1,76 @@
package topology
import (
"encoding/json"
"fmt"
"os"
"sync"
)
// Store provides thread-safe access to the mesh topology state.
// The DaemonSet writes topology updates, and the CNI plugin reads them.
// State is persisted to a JSON file on disk.
type Store struct {
mu sync.RWMutex
filePath string
topo *MeshTopology
}
// NewStore creates a topology store backed by the given file path.
func NewStore(filePath string) *Store {
return &Store{
filePath: filePath,
topo: &MeshTopology{},
}
}
// LoadFromFile reads a MeshTopology from a JSON file.
// Used by the CNI plugin to read the DaemonSet-written state.
func LoadFromFile(path string) (*MeshTopology, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("failed to read topology file %s: %w", path, err)
}
var topo MeshTopology
if err := json.Unmarshal(data, &topo); err != nil {
return nil, fmt.Errorf("failed to parse topology file: %w", err)
}
return &topo, nil
}
// Get returns a copy of the current topology.
func (s *Store) Get() MeshTopology {
s.mu.RLock()
defer s.mu.RUnlock()
return *s.topo
}
// Update atomically updates the topology and persists to disk.
func (s *Store) Update(fn func(*MeshTopology)) error {
s.mu.Lock()
defer s.mu.Unlock()
fn(s.topo)
return s.persist()
}
func (s *Store) persist() error {
data, err := json.MarshalIndent(s.topo, "", " ")
if err != nil {
return fmt.Errorf("failed to marshal topology: %w", err)
}
// Atomic write: write to temp file, then rename.
tmpPath := s.filePath + ".tmp"
if err := os.WriteFile(tmpPath, data, 0644); err != nil {
return fmt.Errorf("failed to write topology file: %w", err)
}
if err := os.Rename(tmpPath, s.filePath); err != nil {
return fmt.Errorf("failed to rename topology file: %w", err)
}
return nil
}

View file

@ -0,0 +1,69 @@
package underlay
import (
"context"
"crypto/sha256"
"fmt"
"os/exec"
"go.uber.org/zap"
"github.com/guildhouse-co/kedge/internal/quartermaster"
)
// Dispatcher triggers YANG compilation and coordinates SDK dispatch through Bascule.
type Dispatcher struct {
qm *quartermaster.Client
log *zap.SugaredLogger
// CompilerPath is the path to the YANG compiler script.
CompilerPath string
}
// NewDispatcher creates a new underlay dispatch coordinator.
func NewDispatcher(qm *quartermaster.Client, log *zap.SugaredLogger) *Dispatcher {
return &Dispatcher{
qm: qm,
log: log,
CompilerPath: "/opt/kedge/yang/compiler/compile.py",
}
}
// CompileAndDispatch runs the YANG compiler for the given site config and
// dispatches the resulting vendor-specific payloads via Bascule SDK.
func (d *Dispatcher) CompileAndDispatch(ctx context.Context, siteConfigPath string, sessionID string) error {
// Hash the YANG source for the mutation artifact.
yangHash, err := hashFile(siteConfigPath)
if err != nil {
return fmt.Errorf("failed to hash YANG source: %w", err)
}
// Run the YANG compiler as a subprocess.
cmd := exec.CommandContext(ctx, "python3", d.CompilerPath,
"--site-config", siteConfigPath,
"--output-format", "json",
)
output, err := cmd.Output()
if err != nil {
return fmt.Errorf("YANG compilation failed: %w", err)
}
d.log.Infow("YANG compilation complete", "site_config", siteConfigPath, "output_bytes", len(output))
// TODO: Parse compiler output into vendor-specific payloads.
// TODO: For each device mutation, dispatch SDK call through Bascule,
// capture before/after config hashes, and submit NetworkMutationArtifact.
_ = yangHash
_ = sessionID
return nil
}
func hashFile(path string) ([]byte, error) {
data, err := exec.Command("cat", path).Output()
if err != nil {
return nil, err
}
hash := sha256.Sum256(data)
return hash[:], nil
}

View file

@ -0,0 +1,56 @@
package underlay
import (
"context"
"fmt"
"go.uber.org/zap"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
"github.com/guildhouse-co/kedge/internal/config"
"github.com/guildhouse-co/kedge/internal/quartermaster"
)
// Watcher watches for YANG instance data changes in ConfigMaps and triggers
// compilation and SDK dispatch for underlay device configuration.
type Watcher struct {
cfg config.UnderlayConfig
qm *quartermaster.Client
log *zap.SugaredLogger
clientset kubernetes.Interface
}
// NewWatcher creates a new underlay YANG watcher.
func NewWatcher(cfg config.UnderlayConfig, qm *quartermaster.Client, log *zap.SugaredLogger) *Watcher {
return &Watcher{cfg: cfg, qm: qm, log: log}
}
// Run starts watching for YANG instance data changes.
func (w *Watcher) Run(ctx context.Context) error {
if !w.cfg.Enabled {
w.log.Info("underlay mode disabled, watcher idle")
<-ctx.Done()
return nil
}
// Build in-cluster K8s client.
k8sConfig, err := rest.InClusterConfig()
if err != nil {
return fmt.Errorf("failed to get in-cluster config: %w", err)
}
w.clientset, err = kubernetes.NewForConfig(k8sConfig)
if err != nil {
return fmt.Errorf("failed to create k8s client: %w", err)
}
w.log.Infof("watching ConfigMap %s/%s for YANG instance data changes",
w.cfg.Namespace, w.cfg.ConfigMapName)
// TODO: Set up informer to watch ConfigMap for changes.
// On change: trigger YANG compilation, then dispatch SDK calls via Bascule.
// Each mutation is notarized via Quartermaster.
<-ctx.Done()
return nil
}

99
internal/vlan/manager.go Normal file
View file

@ -0,0 +1,99 @@
package vlan
import (
"context"
"fmt"
"github.com/vishvananda/netlink"
"go.uber.org/zap"
"github.com/guildhouse-co/kedge/internal/config"
)
// Manager manages VLAN interfaces on the node for underlay mode.
type Manager struct {
cfg config.VLANConfig
log *zap.SugaredLogger
}
// NewManager creates a new VLAN interface manager.
func NewManager(cfg config.VLANConfig, log *zap.SugaredLogger) *Manager {
return &Manager{cfg: cfg, log: log}
}
// Run starts the VLAN manager. It ensures all configured VLAN interfaces
// and bridges exist on the node.
func (m *Manager) Run(ctx context.Context) error {
if !m.cfg.Enabled {
m.log.Info("underlay mode disabled, vlan manager idle")
<-ctx.Done()
return nil
}
// Ensure all configured VLANs exist.
for _, v := range m.cfg.VLANs {
if err := m.ensureVLAN(v); err != nil {
m.log.Warnw("failed to ensure VLAN", "vlan_id", v.ID, "error", err)
}
}
m.log.Infof("vlan manager started, managing %d VLANs", len(m.cfg.VLANs))
<-ctx.Done()
return nil
}
// VLANCount returns the number of managed VLAN interfaces.
func (m *Manager) VLANCount() int {
return len(m.cfg.VLANs)
}
// ensureVLAN creates a VLAN interface and bridge if they don't already exist.
func (m *Manager) ensureVLAN(v config.VLANEntry) error {
bridgeName := fmt.Sprintf("vlan%d", v.ID)
// Check if bridge already exists.
if _, err := netlink.LinkByName(bridgeName); err == nil {
m.log.Debugw("bridge already exists", "name", bridgeName)
return nil
}
// Get the parent interface.
parent, err := netlink.LinkByName(v.ParentInterface)
if err != nil {
return fmt.Errorf("parent interface %s not found: %w", v.ParentInterface, err)
}
// Create VLAN sub-interface.
vlanLink := &netlink.Vlan{
LinkAttrs: netlink.LinkAttrs{
Name: fmt.Sprintf("%s.%d", v.ParentInterface, v.ID),
ParentIndex: parent.Attrs().Index,
},
VlanId: v.ID,
}
if err := netlink.LinkAdd(vlanLink); err != nil {
return fmt.Errorf("failed to create VLAN %d: %w", v.ID, err)
}
if err := netlink.LinkSetUp(vlanLink); err != nil {
return fmt.Errorf("failed to bring up VLAN %d: %w", v.ID, err)
}
// Create bridge for the VLAN.
bridge := &netlink.Bridge{
LinkAttrs: netlink.LinkAttrs{Name: bridgeName},
}
if err := netlink.LinkAdd(bridge); err != nil {
return fmt.Errorf("failed to create bridge %s: %w", bridgeName, err)
}
if err := netlink.LinkSetUp(bridge); err != nil {
return fmt.Errorf("failed to bring up bridge %s: %w", bridgeName, err)
}
// Add VLAN interface to bridge.
if err := netlink.LinkSetMaster(vlanLink, bridge); err != nil {
return fmt.Errorf("failed to add VLAN to bridge: %w", err)
}
m.log.Infow("VLAN created", "id", v.ID, "bridge", bridgeName, "parent", v.ParentInterface)
return nil
}

34
k8s/configmap-mesh.yaml Normal file
View file

@ -0,0 +1,34 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: kedge-mesh
namespace: kedge
data:
cluster_id: "homelab"
mesh.json: |
{
"node_id": "",
"cluster_id": "homelab",
"mode": "both",
"peers": [
{
"public_key": "PLACEHOLDER_CLOUD_PUBKEY",
"endpoint": "anchor.guildhouse.example.com:51820",
"allowed_ips": ["10.100.0.0/24"],
"cluster_id": "cloud-anchor"
}
],
"overlay_subnets": [
{"dst": "10.100.0.0/24", "via": "wg0", "mode": "overlay"}
],
"underlay_subnets": [
{"dst": "172.16.0.0/24", "via": "vlan100", "mode": "underlay"},
{"dst": "10.0.1.0/24", "via": "vlan10", "mode": "underlay"},
{"dst": "192.168.50.0/24", "via": "vlan50", "mode": "underlay"}
],
"wireguard": {
"interface_name": "wg0",
"listen_port": 51820,
"private_key_path": "/etc/kedge/wg-private.key"
}
}

View file

@ -0,0 +1,33 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: kedge-underlay
namespace: kedge
data:
device-inventory.json: |
{
"devices": [
{
"name": "fortigate.transit.local",
"type": "fortios",
"address": "172.16.0.2",
"sdk": "fortiosapi",
"managed_zones": ["transit", "tyler-lab", "dmz"]
},
{
"name": "vyos.transit.local",
"type": "vyos",
"address": "172.16.0.3",
"sdk": "ncclient",
"managed_zones": ["transit"]
},
{
"name": "udr7.local",
"type": "unifi",
"address": "192.168.1.1",
"sdk": "unifi-api",
"managed_zones": ["roommate", "shared"]
}
]
}
yang-site-config: "homelab"

115
k8s/daemonset.yaml Normal file
View file

@ -0,0 +1,115 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kedge-daemon
namespace: kedge
labels:
app: kedge
component: daemon
spec:
selector:
matchLabels:
app: kedge
component: daemon
template:
metadata:
labels:
app: kedge
component: daemon
spec:
serviceAccountName: kedge-daemon
hostNetwork: true
hostPID: false
tolerations:
- operator: Exists
containers:
- name: kedge-daemon
image: kedge-daemon:latest
imagePullPolicy: IfNotPresent
securityContext:
privileged: true
capabilities:
add:
- NET_ADMIN
- SYS_ADMIN
ports:
- name: shellstream
containerPort: 8443
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
env:
- name: KEDGE_NODE_ID
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: KEDGE_CLUSTER_ID
valueFrom:
configMapKeyRef:
name: kedge-mesh
key: cluster_id
- name: KEDGE_OVERLAY_ENABLED
value: "true"
- name: KEDGE_UNDERLAY_ENABLED
value: "false"
- name: KEDGE_QM_ENDPOINT
value: "quartermaster.guildhouse.svc:50051"
volumeMounts:
- name: kedge-config
mountPath: /etc/kedge
readOnly: true
- name: kedge-run
mountPath: /var/run/kedge
- name: cni-bin
mountPath: /opt/cni/bin
- name: cni-conf
mountPath: /etc/cni/net.d
- name: spire-bundle
mountPath: /run/spire/bundle
readOnly: true
livenessProbe:
httpGet:
path: /healthz
port: metrics
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /readyz
port: metrics
initialDelaySeconds: 5
periodSeconds: 10
initContainers:
- name: install-cni
image: kedge-cni:latest
imagePullPolicy: IfNotPresent
command: ["cp", "/kedge-cni", "/opt/cni/bin/kedge-cni"]
volumeMounts:
- name: cni-bin
mountPath: /opt/cni/bin
volumes:
- name: kedge-config
projected:
sources:
- configMap:
name: kedge-mesh
- configMap:
name: kedge-underlay
optional: true
- name: kedge-run
hostPath:
path: /var/run/kedge
type: DirectoryOrCreate
- name: cni-bin
hostPath:
path: /opt/cni/bin
type: DirectoryOrCreate
- name: cni-conf
hostPath:
path: /etc/cni/net.d
type: Directory
- name: spire-bundle
configMap:
name: spire-bundle
optional: true

View file

@ -0,0 +1,21 @@
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
name: infra-mgmt
namespace: bascule
spec:
config: |
{
"cniVersion": "1.0.0",
"name": "kedge-infra",
"type": "kedge-cni",
"meshConfig": "/etc/kedge/mesh.json",
"underlayRoutes": [
{"dst": "172.16.0.0/24", "via": "vlan100"},
{"dst": "10.0.1.0/24", "via": "vlan10"},
{"dst": "192.168.50.0/24", "via": "vlan50"}
],
"overlayRoutes": [
{"dst": "10.100.0.0/24", "via": "wg0"}
]
}

45
k8s/rbac.yaml Normal file
View file

@ -0,0 +1,45 @@
apiVersion: v1
kind: Namespace
metadata:
name: kedge
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kedge-daemon
namespace: kedge
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: kedge-daemon
rules:
# Watch ConfigMaps for mesh topology and YANG instance data.
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "watch"]
# Read NetworkAttachmentDefinitions.
- apiGroups: ["k8s.cni.cncf.io"]
resources: ["network-attachment-definitions"]
verbs: ["get", "list", "watch"]
# List pods for SVID mapping.
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
# Read nodes for topology awareness.
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kedge-daemon
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kedge-daemon
subjects:
- kind: ServiceAccount
name: kedge-daemon
namespace: kedge

View file

@ -0,0 +1,18 @@
{
"dashboard": {
"title": "Kedge Device Mutations",
"uid": "kedge-device-mutations",
"panels": [
{
"title": "Network Mutations (rate)",
"type": "graph",
"targets": [{"expr": "rate(kedge_quartermaster_network_mutations_total[5m])"}]
},
{
"title": "Total Network Mutations",
"type": "stat",
"targets": [{"expr": "kedge_quartermaster_network_mutations_total"}]
}
]
}
}

View file

@ -0,0 +1,20 @@
{
"dashboard": {
"title": "Kedge Drift Detection",
"uid": "kedge-drift-detection",
"description": "Compares YANG desired state to actual device config. Populated by underlay-audit playbook.",
"panels": [
{
"title": "Managed VLAN Interfaces",
"type": "stat",
"targets": [{"expr": "kedge_vlan_interface_count"}]
},
{
"title": "Drift Events",
"type": "graph",
"description": "Placeholder — requires drift detection metrics from audit pipeline",
"targets": []
}
]
}
}

View file

@ -0,0 +1,18 @@
{
"dashboard": {
"title": "Kedge Mesh Health",
"uid": "kedge-mesh-health",
"panels": [
{
"title": "Active WireGuard Peers",
"type": "stat",
"targets": [{"expr": "kedge_mesh_peer_count"}]
},
{
"title": "Tunnel Status",
"type": "table",
"targets": [{"expr": "kedge_mesh_tunnel_up"}]
}
]
}
}

View file

@ -0,0 +1,18 @@
{
"dashboard": {
"title": "Kedge Session Transits",
"uid": "kedge-session-transits",
"panels": [
{
"title": "Session Transits (rate)",
"type": "graph",
"targets": [{"expr": "rate(kedge_quartermaster_session_transits_total[5m])"}]
},
{
"title": "Total Session Transits",
"type": "stat",
"targets": [{"expr": "kedge_quartermaster_session_transits_total"}]
}
]
}
}

View file

@ -0,0 +1,25 @@
---
# Health check definitions for Kedge components.
checks:
- name: kedge-daemon-health
endpoint: http://localhost:9090/healthz
interval: 30s
timeout: 5s
expected_status: 200
- name: kedge-daemon-ready
endpoint: http://localhost:9090/readyz
interval: 30s
timeout: 5s
expected_status: 200
- name: wireguard-interface
command: wg show wg0
interval: 60s
expect_exit_code: 0
- name: quartermaster-connectivity
command: grpcurl -plaintext quartermaster.guildhouse.svc:50051 list
interval: 120s
expect_exit_code: 0

View file

@ -0,0 +1,16 @@
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: kedge-daemon
static_configs:
- targets: ['localhost:9090']
labels:
component: daemon
- job_name: node-exporter
static_configs:
- targets: ['localhost:9100']
labels:
component: node

View file

@ -0,0 +1,119 @@
syntax = "proto3";
package quartermaster.v1;
import "google/protobuf/timestamp.proto";
// Governance service for intent lifecycle and SAT issuance.
service GovernanceService {
// Create a MutationIntent called by application at user-request time.
rpc CreateIntent(CreateIntentRequest) returns (CreateIntentResponse);
// Redeem a MutationIntent called by worker at execution time.
rpc RedeemIntent(RedeemIntentRequest) returns (RedeemIntentResponse);
// Revoke a MutationIntent called to cancel pending authorization.
rpc RevokeIntent(RevokeIntentRequest) returns (RevokeIntentResponse);
// Query intents for a tenant (admin/audit use).
rpc ListIntents(ListIntentsRequest) returns (ListIntentsResponse);
}
message CreateIntentRequest {
string registry_type = 1;
string verb = 2;
string artifact_scope = 3;
string tenant_id = 4;
// Identity claim one of these should be set.
oneof identity_claim {
string oidc_token = 5;
ExternalEventClaim external_event = 6;
}
uint32 ttl_seconds = 7;
uint32 max_redemptions = 8;
string idempotency_key = 9;
}
message ExternalEventClaim {
string source = 1;
string event_id = 2;
string event_type = 3;
string verification = 4;
}
message CreateIntentResponse {
string intent_id = 1;
google.protobuf.Timestamp expires_at = 2;
bytes intent_hash = 3;
string error = 4;
bool denied = 5;
string denial_reason = 6;
// If a governance ceremony is required, this field contains the
// ceremony ID. The intent status is "ceremony_pending" and cannot
// be redeemed until the ceremony resolves.
string ceremony_id = 7;
}
message RedeemIntentRequest {
string intent_id = 1;
}
message RedeemIntentResponse {
bool success = 1;
SatToken sat = 2;
int32 remaining_redemptions = 3;
string status = 4;
string error = 5;
}
message SatToken {
bytes sat_hash = 1;
string bearer_svid = 2;
repeated SatScopeMsg scopes = 3;
google.protobuf.Timestamp issued_at = 4;
google.protobuf.Timestamp expires_at = 5;
bytes signature = 6;
bytes sat_bytes = 7;
}
message SatScopeMsg {
string registry_type = 1;
repeated string verbs = 2;
string resource_pattern = 3;
}
message RevokeIntentRequest {
string intent_id = 1;
}
message RevokeIntentResponse {
bool success = 1;
string error = 2;
}
message ListIntentsRequest {
string tenant_id = 1;
string status_filter = 2;
int32 limit = 3;
}
message ListIntentsResponse {
repeated IntentSummary intents = 1;
}
message IntentSummary {
string intent_id = 1;
string registry_type = 2;
string verb = 3;
string artifact_scope = 4;
string tenant_id = 5;
string claim_type = 6;
string claim_subject = 7;
string status = 8;
int32 max_redemptions = 9;
int32 redeemed_count = 10;
google.protobuf.Timestamp authorized_at = 11;
google.protobuf.Timestamp expires_at = 12;
}

View file

@ -0,0 +1,48 @@
syntax = "proto3";
package quartermaster.v1;
import "google/protobuf/timestamp.proto";
service QuartermasterNotary {
rpc CreateAnchor (CreateAnchorRequest) returns (CreateAnchorResponse);
rpc GetLatestAnchor (GetLatestAnchorRequest) returns (GetLatestAnchorResponse);
rpc VerifyInclusion (VerifyInclusionRequest) returns (VerifyInclusionResponse);
}
message CreateAnchorRequest {
string cluster_id = 1;
repeated bytes leaves = 2;
int64 etcd_revision = 3; // 0 means not set
}
message CreateAnchorResponse {
string anchor_id = 1;
bytes merkle_root = 2;
bytes previous_root = 3;
int32 leaf_count = 4;
google.protobuf.Timestamp time = 5;
}
message GetLatestAnchorRequest {
string cluster_id = 1;
}
message GetLatestAnchorResponse {
string anchor_id = 1;
string cluster_id = 2;
bytes merkle_root = 3;
bytes previous_root = 4;
int64 etcd_revision = 5;
int32 leaf_count = 6;
google.protobuf.Timestamp time = 7;
}
message VerifyInclusionRequest {
string anchor_id = 1;
bytes leaf = 2;
repeated bytes proof = 3;
}
message VerifyInclusionResponse {
bool valid = 1;
}

View file

@ -0,0 +1,58 @@
syntax = "proto3";
package quartermaster.v1;
import "google/protobuf/timestamp.proto";
import "google/protobuf/struct.proto";
service QuartermasterRegistry {
rpc RegisterCluster (RegisterClusterRequest) returns (RegisterClusterResponse);
rpc GetCluster (GetClusterRequest) returns (GetClusterResponse);
rpc ListClusters (ListClustersRequest) returns (ListClustersResponse);
rpc UpdateClusterMetadata (UpdateClusterMetadataRequest) returns (UpdateClusterMetadataResponse);
}
message RegisterClusterRequest {
string cluster_name = 1;
string trust_domain = 2;
string role = 3; // "authority", "delegate", "peer"
string provisioned_by = 4; // optional cluster_id
}
message RegisterClusterResponse {
string cluster_id = 1;
string cluster_name = 2;
string trust_domain = 3;
string role = 4;
google.protobuf.Timestamp provisioned_at = 5;
}
message GetClusterRequest {
string cluster_id = 1;
}
message GetClusterResponse {
string cluster_id = 1;
string cluster_name = 2;
string trust_domain = 3;
string role = 4;
string provisioned_by = 5;
google.protobuf.Timestamp provisioned_at = 6;
string rmm_endpoint = 7;
google.protobuf.Struct metadata = 8;
}
message ListClustersRequest {}
message ListClustersResponse {
repeated GetClusterResponse clusters = 1;
}
message UpdateClusterMetadataRequest {
string cluster_id = 1;
google.protobuf.Struct metadata = 2;
}
message UpdateClusterMetadataResponse {
string cluster_id = 1;
google.protobuf.Struct metadata = 2;
}

29
scripts/bootstrap.sh Executable file
View file

@ -0,0 +1,29 @@
#!/usr/bin/env bash
set -euo pipefail
# Bootstrap a new Kedge site.
# Usage: ./bootstrap.sh <cloud|homelab>
SITE="${1:?Usage: $0 <cloud|homelab>}"
echo "=== Kedge Bootstrap: ${SITE} ==="
case "${SITE}" in
cloud)
echo "Bootstrapping cloud anchor (overlay only)..."
cd "$(dirname "$0")/../ansible"
ansible-playbook -i inventory/hosts.yml playbooks/bootstrap-cloud.yml
;;
homelab)
echo "Bootstrapping homelab (overlay + underlay)..."
cd "$(dirname "$0")/../ansible"
ansible-playbook -i inventory/hosts.yml playbooks/bootstrap-homelab.yml
;;
*)
echo "Unknown site type: ${SITE}"
echo "Usage: $0 <cloud|homelab>"
exit 1
;;
esac
echo "=== Bootstrap complete ==="

26
scripts/failover-test.sh Executable file
View file

@ -0,0 +1,26 @@
#!/usr/bin/env bash
set -euo pipefail
# Run WAN failover simulation.
# Usage: ./failover-test.sh
echo "=== Kedge Failover Test ==="
echo "1. Checking WireGuard status..."
wg show wg0 || echo "WARNING: WireGuard interface wg0 not found"
echo ""
echo "2. Checking Kedge DaemonSet health..."
curl -sf http://localhost:9090/healthz && echo " OK" || echo " FAILED"
echo ""
echo "3. Checking Kedge metrics..."
curl -sf http://localhost:9090/metrics | grep -c "kedge_" || echo "No kedge metrics found"
echo ""
echo "4. Running mesh health playbook..."
cd "$(dirname "$0")/../ansible"
ansible-playbook -i inventory/hosts.yml playbooks/mesh-health.yml
echo ""
echo "=== Failover test complete ==="

View file

@ -0,0 +1,19 @@
#!/usr/bin/env bash
set -euo pipefail
# Generate WireGuard keypair for a new site.
# Usage: ./generate-wireguard-keys.sh [output-dir]
OUTPUT_DIR="${1:-/etc/kedge}"
echo "Generating WireGuard keypair in ${OUTPUT_DIR}..."
mkdir -p "${OUTPUT_DIR}"
wg genkey | tee "${OUTPUT_DIR}/wg-private.key" | wg pubkey > "${OUTPUT_DIR}/wg-public.key"
chmod 600 "${OUTPUT_DIR}/wg-private.key"
chmod 644 "${OUTPUT_DIR}/wg-public.key"
echo "Private key: ${OUTPUT_DIR}/wg-private.key"
echo "Public key: $(cat "${OUTPUT_DIR}/wg-public.key")"

13
scripts/validate-yang.sh Executable file
View file

@ -0,0 +1,13 @@
#!/usr/bin/env bash
set -euo pipefail
# Validate YANG models with pyang --strict.
# Usage: ./validate-yang.sh
YANG_DIR="$(dirname "$0")/../yang"
echo "Validating YANG models..."
pyang --strict "${YANG_DIR}/models/sovereign-sdwan.yang"
echo "YANG validation passed."

View file

@ -0,0 +1,20 @@
module "wireguard_topology" {
source = "../../modules/wireguard-topology"
sites = [
{
name = "homelab"
public_key = var.homelab_wg_pubkey
endpoint = "${var.homelab_wan_ip}:51820"
allowed_ips = ["10.100.0.1/32", "172.16.0.0/24", "10.0.1.0/24"]
},
{
name = "cloud-anchor"
public_key = var.cloud_anchor_wg_pubkey
endpoint = "${var.cloud_anchor_ip}:51820"
allowed_ips = ["10.100.0.2/32"]
},
]
output_dir = "${path.module}/output"
}

View file

@ -0,0 +1,7 @@
# These values must be populated per-deployment.
# Never commit actual keys or IPs.
homelab_wg_pubkey = "PLACEHOLDER"
homelab_wan_ip = "PLACEHOLDER"
cloud_anchor_wg_pubkey = "PLACEHOLDER"
cloud_anchor_ip = "PLACEHOLDER"

View file

@ -0,0 +1,24 @@
module "cloud_anchor" {
source = "../../modules/cloud-anchor"
server_name = "kedge-anchor-prod"
server_type = "cpx11"
location = "ash"
ssh_keys = var.ssh_keys
admin_ips = var.admin_ips
}
module "dns" {
source = "../../modules/dns"
zone_id = var.cloudflare_zone_id
anchor_hostname = "anchor"
anchor_ip = module.cloud_anchor.server_ip
}
module "wireguard_topology" {
source = "../../modules/wireguard-topology"
sites = var.sites
output_dir = "${path.module}/output"
}

View file

@ -0,0 +1,8 @@
# Production environment configuration.
# Populate with actual values — never commit secrets.
ssh_keys = ["PLACEHOLDER"]
admin_ips = ["PLACEHOLDER/32"]
cloudflare_zone_id = "PLACEHOLDER"
sites = []

View file

@ -0,0 +1,59 @@
terraform {
required_providers {
hcloud = {
source = "hetznercloud/hcloud"
version = "~> 1.45"
}
}
}
resource "hcloud_server" "anchor" {
name = var.server_name
server_type = var.server_type
image = var.image
location = var.location
ssh_keys = var.ssh_keys
labels = {
role = "cloud-anchor"
cluster = "cloud-anchor"
kedge = "overlay-only"
}
user_data = <<-EOF
#!/bin/bash
apt-get update
apt-get install -y wireguard-tools
EOF
}
resource "hcloud_firewall" "anchor" {
name = "${var.server_name}-fw"
rule {
direction = "in"
protocol = "tcp"
port = "22"
source_ips = var.admin_ips
}
rule {
direction = "in"
protocol = "udp"
port = var.wireguard_port
source_ips = ["0.0.0.0/0", "::/0"]
}
rule {
direction = "in"
protocol = "tcp"
port = "6443"
source_ips = ["10.100.0.0/24"]
}
}
resource "hcloud_firewall_attachment" "anchor" {
firewall_id = hcloud_firewall.anchor.id
server_ids = [hcloud_server.anchor.id]
}

View file

@ -0,0 +1,9 @@
output "server_ip" {
description = "Public IPv4 address of the cloud anchor"
value = hcloud_server.anchor.ipv4_address
}
output "server_id" {
description = "Hetzner server ID"
value = hcloud_server.anchor.id
}

View file

@ -0,0 +1,39 @@
variable "server_name" {
description = "Name of the cloud anchor VPS"
type = string
default = "kedge-anchor-01"
}
variable "server_type" {
description = "Hetzner server type (2 vCPU minimum)"
type = string
default = "cpx11"
}
variable "image" {
description = "OS image"
type = string
default = "ubuntu-24.04"
}
variable "location" {
description = "Hetzner datacenter location"
type = string
default = "ash"
}
variable "ssh_keys" {
description = "SSH key names for access"
type = list(string)
}
variable "admin_ips" {
description = "Admin IP CIDRs for SSH access"
type = list(string)
}
variable "wireguard_port" {
description = "WireGuard listen port"
type = string
default = "51820"
}

View file

@ -0,0 +1,17 @@
terraform {
required_providers {
cloudflare = {
source = "cloudflare/cloudflare"
version = "~> 4.0"
}
}
}
resource "cloudflare_record" "anchor" {
zone_id = var.zone_id
name = var.anchor_hostname
content = var.anchor_ip
type = "A"
ttl = 300
proxied = false
}

View file

@ -0,0 +1,4 @@
output "fqdn" {
description = "Fully qualified domain name of the anchor"
value = cloudflare_record.anchor.hostname
}

View file

@ -0,0 +1,15 @@
variable "zone_id" {
description = "Cloudflare zone ID"
type = string
}
variable "anchor_hostname" {
description = "DNS hostname for the cloud anchor"
type = string
default = "anchor"
}
variable "anchor_ip" {
description = "Public IP of the cloud anchor"
type = string
}

View file

@ -0,0 +1,38 @@
terraform {
required_providers {
local = {
source = "hashicorp/local"
version = "~> 2.5"
}
}
}
# Declarative WireGuard mesh peer relationships.
# Generates peer configuration for each node in the mesh.
locals {
peer_pairs = flatten([
for i, site_a in var.sites : [
for j, site_b in var.sites : {
from = site_a
to = site_b
} if i < j
]
])
}
resource "local_file" "peer_config" {
for_each = { for site in var.sites : site.name => site }
filename = "${var.output_dir}/${each.key}-peers.json"
content = jsonencode({
site_id = each.value.name
peers = [
for site in var.sites : {
public_key = site.public_key
endpoint = site.endpoint
allowed_ips = site.allowed_ips
} if site.name != each.key
]
})
}

View file

@ -0,0 +1,4 @@
output "peer_config_files" {
description = "Map of site name to peer config file path"
value = { for k, v in local_file.peer_config : k => v.filename }
}

View file

@ -0,0 +1,15 @@
variable "sites" {
description = "List of sites in the WireGuard mesh"
type = list(object({
name = string
public_key = string
endpoint = string
allowed_ips = list(string)
}))
}
variable "output_dir" {
description = "Directory to write peer configuration files"
type = string
default = "./output"
}

View file

129
yang/compiler/compile.py Normal file
View file

@ -0,0 +1,129 @@
#!/usr/bin/env python3
"""
YANG compiler entry point.
Loads the sovereign-sdwan YANG schema, validates site instance data,
and dispatches to vendor-specific compiler targets to produce
device configuration payloads.
"""
import argparse
import json
import sys
from pathlib import Path
from typing import Any
from lxml import etree
from . import to_fortios, to_unifi, to_vyos
YANG_MODELS_DIR = Path(__file__).parent.parent / "models"
NAMESPACE = "urn:sovereign:sdwan"
def parse_site_config(path: str) -> etree._Element:
"""Parse and return the root element of a site config XML file."""
tree = etree.parse(path)
return tree.getroot()
def extract_zones(root: etree._Element) -> list[dict[str, Any]]:
"""Extract zone-policy entries from the site config."""
ns = {"s": NAMESPACE}
zones = []
for zone_elem in root.findall(".//s:zone-policy/s:zone", ns):
zone: dict[str, Any] = {
"name": zone_elem.findtext("s:name", default="", namespaces=ns),
"subnet": zone_elem.findtext("s:subnet", default="", namespaces=ns),
"vlan_id": int(zone_elem.findtext("s:vlan-id", default="0", namespaces=ns)),
"owner_device": zone_elem.findtext("s:owner-device", default="", namespaces=ns),
"policies": [],
}
for policy_elem in zone_elem.findall("s:policy", ns):
policy = {
"dst_zone": policy_elem.findtext("s:dst-zone", default="", namespaces=ns),
"action": policy_elem.findtext("s:action", default="deny", namespaces=ns),
"services": [
svc.text for svc in policy_elem.findall("s:services", ns) if svc.text
],
}
zone["policies"].append(policy)
zones.append(zone)
return zones
def extract_circuits(root: etree._Element) -> list[dict[str, Any]]:
"""Extract WAN circuit entries from the site config."""
ns = {"s": NAMESPACE}
circuits = []
for circuit_elem in root.findall(".//s:wan-circuits/s:circuit", ns):
circuit: dict[str, Any] = {
"name": circuit_elem.findtext("s:name", default="", namespaces=ns),
"type": circuit_elem.findtext("s:type", default="", namespaces=ns),
"interface": circuit_elem.findtext("s:interface-name", default="", namespaces=ns),
}
sla_elem = circuit_elem.find("s:sla", ns)
if sla_elem is not None:
circuit["sla"] = {
"latency_ms": int(sla_elem.findtext("s:latency-target-ms", default="0", namespaces=ns)),
"jitter_ms": int(sla_elem.findtext("s:jitter-target-ms", default="0", namespaces=ns)),
"loss_pct": float(sla_elem.findtext("s:loss-target-pct", default="0", namespaces=ns)),
}
circuits.append(circuit)
return circuits
def compile_site(site_config_path: str, output_format: str = "json") -> dict[str, Any]:
"""
Compile a site configuration into vendor-specific payloads.
Returns a dict mapping device type to its compiled payloads.
"""
root = parse_site_config(site_config_path)
zones = extract_zones(root)
circuits = extract_circuits(root)
result: dict[str, Any] = {"site_config": site_config_path, "devices": {}}
# Group zones by owner device and compile per-device payloads.
device_zones: dict[str, list[dict[str, Any]]] = {}
for zone in zones:
device = zone["owner_device"]
if device:
device_zones.setdefault(device, []).append(zone)
for device, dev_zones in device_zones.items():
if "fortigate" in device.lower():
result["devices"][device] = to_fortios.compile_zones(dev_zones)
elif "vyos" in device.lower():
result["devices"][device] = to_vyos.compile_zones(dev_zones)
elif "udr" in device.lower() or "unifi" in device.lower():
result["devices"][device] = to_unifi.compile_zones(dev_zones)
if circuits:
result["circuits"] = circuits
return result
def main() -> None:
parser = argparse.ArgumentParser(description="YANG site config compiler")
parser.add_argument("--site-config", required=True, help="Path to site config XML")
parser.add_argument("--output-format", default="json", choices=["json"], help="Output format")
args = parser.parse_args()
result = compile_site(args.site_config, args.output_format)
json.dump(result, sys.stdout, indent=2)
print()
if __name__ == "__main__":
main()

Some files were not shown because too many files have changed in this diff Show more