# cmd-chat → Collaborative Sandbox Sessions: Spec

> **Status:** Draft v1 · **Date:** 2026-05-30
> **Scope:** Evolves `cmd-chat` from an E2E-encrypted terminal chat into a
> multi-user collaborative session with a shared, sandboxed Linux environment.
> **Baseline reviewed:** `cmd_chat/` @ `main` (commit `dc1b5e5`).

---

## 0. Decisions locked (from product owner)

| # | Decision | Choice |
|---|----------|--------|
| A | Client language / TUI | **Rust + ratatui client**, **Python Sanic server unchanged**. Stable JSON-over-WebSocket wire protocol between them. |
| B | Sandbox backend | **Pluggable backend interface. Multipass default, Docker secondary.** |
| C | Shared-terminal model | **Single shared PTY** (collaborative, tmux-share style). Permissions = who may type. |
| D | Permission model | **Two layers:** app-level RBAC (owner/admin/member) **+** real VM unix users & `sudo` delegation. |

---

## 1. Vision & goals

Turn a cmd-chat "room" into a **shared workspace**: up to 4 people (infra for
more) join one encrypted session, chat, drop files/dirs into a shared space, and
collaboratively drive **one sandboxed Linux box** they can type commands into and
run scripts in, with real Linux permissions and a clear owner who can delegate
superuser rights.

### Goals

- Preserve the existing security guarantees: **zero-knowledge server**, **E2E Fernet encryption**, **SRP auth**, **RAM-only server**, **no IP leaks**.
- 4 concurrent users per session today; capacity is a single config constant, not an architectural limit.
- A genuinely nice **ratatui** TUI: panes, themes, custom colour/layout config.
- One-command launch of a **disposable sandbox** (Multipass VM or Docker container) that the whole room shares.
- File **and directory** upload into the shared session.
- Linux-grade permissions inside the sandbox + app-level roles governing the session itself.

### Non-goals (v1)

- Persistence / chat history survival across server restart (server stays RAM-only).
- Federation / multiple rooms per server process (one room per `serve`, as today).
- Running the sandbox *on the server* (see §3: it runs on the initiator's client).
- Mobile / GUI clients. Terminal only.
- Multi-VM topologies. One shared sandbox per session.

---

## 2. Baseline architecture (what exists today)

```
CLIENT (python+rich)             SERVER (sanic, RAM-only)             CLIENT
SRP handshake ───────────────►   /srp/init, /srp/verify ──────────►   (relay only)
HKDF(pw,salt) → room_key                                              server NEVER
Fernet(room_key) encrypt         WSS /ws/chat (broadcast)             sees plaintext
──── ciphertext ──────────────►  ConnectionManager.broadcast ─────►   decrypt(room_key)
file xfer = _ft JSON over the same encrypted message channel (64KB chunks, SHA-256)
```

- **Server modules:** `factory.py` (DI/wiring), `server.py` (TLS + run), `routes.py`, `views.py` (SRP + ws + admin), `managers.py` (`ConnectionManager`), `stores.py` (`MessageStore`, `UserSessionStore`), `srp_auth.py`, `models.py`, `helpers.py` (`RateLimiter`, `get_client_ip`).
- **Client:** single `client.py` containing the `Client` class, asyncio `receive_loop` + `input_loop`, rich console clear-and-reprint, and the `_ft` file-transfer protocol.
- **Wire types today:** `init`, `message`, `user_left`. File transfer rides inside `message.text` as JSON beginning `{"_ft": …}` (offer/accept/reject/chunk/done).

### What we keep vs. change

| Component | v1 plan |
|---|---|
| Sanic server, SRP, TLS, RateLimiter | **Keep**, extend with new relay message types + capacity check. |
| `ConnectionManager` broadcast | **Keep**; add `broadcast(exclude_user)` usage and roster events. |
| RAM-only / zero-knowledge | **Keep** (non-negotiable); everything new also rides as ciphertext. |
| Python rich client | **Replace** with Rust ratatui client (protocol-compatible). |
| File transfer `_ft` protocol | **Keep & extend** for directories (tar stream). |

---

## 3. Target architecture (high level)

```
┌─────────────────────────── SESSION (one room) ───────────────────────────┐
│                                                                          │
│  OWNER CLIENT (Rust/ratatui)            SERVER (Sanic, dumb relay)       │
│  ┌───────────────────────────┐          ┌─────────────────────────┐      │
│  │ ratatui UI                │          │ SRP / TLS / rate-limit  │      │
│  │ chat | roster | sandbox   │◄──WSS───►│ ConnectionManager       │◄──┐  │
│  │ E2E Fernet(room_key)      │  cipher  │ broadcast(opaque bytes) │   │  │
│  │ ┌───────────────────────┐ │  text    │ Stores (RAM only)       │   │  │
│  │ │ SANDBOX BROKER (local)│ │          └─────────────────────────┘   │  │
│  │ │ Multipass | Docker    │ │                                        │  │
│  │ │ PTY ⇄ encrypted frames│ │  MEMBER CLIENTS (Rust/ratatui) ────────┘  │
│  │ │ RBAC + unix-user map  │ │  decrypt → render shared PTY pane         │
│  │ └───────────────────────┘ │  send keystrokes (if permitted)           │
│  └───────────────────────────┘                                           │
└──────────────────────────────────────────────────────────────────────────┘
```

**Key principle: the server stays zero-knowledge.** The sandbox does **not** run
on the server (the server has no room key and must never see plaintext). Instead:

- The **client that launches the sandbox** (normally the owner) hosts a local **Sandbox Broker** that spawns the Multipass VM / Docker container and owns its PTY.
- PTY output is Fernet-encrypted with the room key and relayed through the server as opaque ciphertext, the same trust model as file transfer.
- Keystrokes from other clients travel encrypted to the broker, which is the **single policy-enforcement point** (it decides who may type / `sudo` / upload).

This is the only design that satisfies *both* "shared sandbox" *and* "server
can't read anything." It is documented as a constraint, not an accident.

---

## 4. Wire protocol (v2)

### 4.1 Versioning & envelope

- Add a top-level relay type **`hello`**, exchanged right after `init` and carrying `protocol_version` (`2`). The server relays it untouched. Clients negotiate down or warn on a mismatch.
- All new collaborative payloads ride **inside the encrypted channel** the same way `_ft` does: the server sees only a `message` frame, and that frame's `text` decrypts to JSON with a discriminator key. We generalise `_ft` to a namespaced envelope:

```jsonc
// decrypted application payload (server only ever sees the ciphertext of this)
{
  "v": 2,
  "kind": "chat" | "file" | "dir" | "sbx" | "perm" | "presence",
  "id": "uuid",        // correlation id
  "from": "username",  // asserted by sender, validated by broker for sbx/perm
  "ts": "iso8601",
  "body": { /* kind-specific */ }
}
```

> Rationale: keeps the server's relay role unchanged (still just `message`
> broadcast of opaque bytes) while giving the app a clean, extensible schema.
> Existing `_ft` messages map onto `kind:"file"`.

### 4.2 New server-visible relay types (cleartext metadata only)

The server learns nothing it shouldn't; these carry no plaintext content.

- `roster`: authoritative presence list + capacity (server-generated; see §5).
- `capacity_full`: sent on connect rejection (HTTP 409 / ws close code).
- Everything else stays `message` (opaque ciphertext).
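
The envelope from §4.1 could be built and validated roughly as follows. This is a minimal sketch using only the Python standard library, not the project's implementation: the helper names (`make_envelope`, `parse_envelope`, `wrap_legacy_ft`), the strict version check, and the example `body` fields are assumptions made for illustration. Encryption is left out; the JSON string produced here is what would be Fernet-encrypted and sent as `message.text`.

```python
import json
import uuid
from datetime import datetime, timezone

PROTOCOL_VERSION = 2
KINDS = {"chat", "file", "dir", "sbx", "perm", "presence"}
REQUIRED_FIELDS = ("v", "kind", "id", "from", "ts", "body")


def make_envelope(kind, sender, body):
    """Build the application payload that is encrypted before sending."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return {
        "v": PROTOCOL_VERSION,
        "kind": kind,
        "id": str(uuid.uuid4()),  # correlation id
        "from": sender,  # asserted only; the broker re-checks sbx/perm senders
        "ts": datetime.now(timezone.utc).isoformat(),
        "body": body,
    }


def parse_envelope(text):
    """Validate a decrypted `message.text` string and return the envelope."""
    data = json.loads(text)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("envelope must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Assumption: reject other versions outright; a real client might
    # negotiate down or only warn, as §4.1 allows.
    if data["v"] != PROTOCOL_VERSION:
        raise ValueError(f"unsupported version: {data['v']!r}")
    if data["kind"] not in KINDS:
        raise ValueError(f"unknown kind: {data['kind']!r}")
    return data


def wrap_legacy_ft(ft_payload, sender):
    """Assumed mapping: carry an old `_ft` payload unchanged as a "file" body."""
    return make_envelope("file", sender, ft_payload)


# Illustrative round trip; the `data` field name is hypothetical.
envelope = make_envelope("sbx", "alice", {"op": "pty_input", "data": "bHMK"})
wire_text = json.dumps(envelope)  # this string would be Fernet-encrypted
assert parse_envelope(wire_text) == envelope
```

Because the server only relays ciphertext, all of this validation lives in the clients and the broker; the server never parses the envelope.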

### 4.3 Sandbox sub-protocol (`kind:"sbx"`, encrypted, broker-authoritative)

| `body.op` | Direction | Meaning |
|---|---|---|
| `launch` | client→broker | request to start sandbox (owner/admin only) |
| `status` | broker→all | `starting`/`ready`/`stopped`/`error`, backend, image, specs |
| `pty_data` | broker→all | base64 PTY output chunk (the shared terminal stream) |
| `pty_input` | client→broker | keystrokes; broker enforces "may type" ACL |
| `resize` | client→broker | cols/rows; broker applies if sender is the active driver |
| `run_script` | client→broker | upload+exec a script in the VM (perm-gated) |
| `stop` | owner→broker | tear down sandbox |

### 4.4 Permission sub-protocol (`kind:"perm"`, broker-authoritative)

| `body.op` | Meaning |
|---|---|
| `grant` / `revoke` | change app role (admin/member); owner/admin only |
| `sudo_grant` / `sudo_revoke` | add/remove a user from VM sudoers; superuser only |
| `acl` | broker broadcasts the current authoritative ACL snapshot |

---

## 5. Feature 1: Multi-user sessions (4 now, N later)

**Current state:** `ConnectionManager` already supports arbitrarily many
websocket connections; `username_exists` blocks duplicate names. There is no
capacity cap and no real roster broadcast (only the `init` snapshot + `user_left`).

**Changes**

1. **Capacity constant** `SESSION_MAX_USERS = 4` in `factory.py` (env override `CMD_CHAT_MAX_USERS`). Enforced in `srp_verify` *and* on ws connect: reject with `409 Username/Room full` / ws close code `4004` when `session_store.count() >= max`.
2. **Authoritative roster.** Server emits a `roster` event (join, leave, role-change) so all clients converge on one presence list with roles. Replaces the ad-hoc `user_left` patching (kept for back-compat, superseded by `roster`).
3. **`user_joined` event** added (today only `user_left` exists) for live roster.
4. **Infra-for-more:** capacity is data, not code. Document that going beyond 4 users needs (a) a scrolling roster in the UI and (b) awareness that broadcast fan-out is O(N) per message, which is fine up to the low double digits; note the ceiling in the README. No protocol change is required to raise the cap.

**Acceptance**

- A 5th join attempt to a 4-cap room is cleanly refused with a user-visible reason.
- The roster in every client matches server truth within one broadcast round-trip.
- Setting `CMD_CHAT_MAX_USERS=8` raises the cap with zero code changes.

---

## 6. Feature 2: Rust ratatui client (enhanced + themeable)

**New crate:** `cmd-chat-tui/` (Rust 2021). Talks the v2 protocol to the existing
Sanic server. The Python client remains in-tree as a reference/fallback until the
Rust client reaches parity, then is deprecated.

### 6.1 Crate layout

```
cmd-chat-tui/
  Cargo.toml
  src/
    main.rs          # CLI (clap): connect/serve-shim, flags mirror python
    app.rs           # App state, event loop (tokio + crossterm)
    net/
      srp.rs         # SRP-6a client (crate: srp + sha2), matches python params
      ws.rs          # tokio-tungstenite WS client
    crypto.rs        # HKDF-SHA256 → Fernet (crate: fernet) room key
    proto.rs         # v2 envelope (serde) types
    ui/
      layout.rs      # ratatui layout regions
      chat.rs        # chat pane (scrollback, msg styling)
      roster.rs      # users + roles + sandbox status
      sandbox.rs     # shared PTY pane (vt100 parse: crate `vt100`)
      input.rs       # input box, command palette (/send /run /grant …)
      theme.rs       # theme model + loader
    sandbox/
      broker.rs      # local PTY broker (owner side), see §8
      backend.rs     # trait SandboxBackend
      multipass.rs   # default backend
      docker.rs      # secondary backend
    files.rs         # upload (file + dir/tar), SHA-256, chunking
    perms.rs         # client-side ACL view + enforcement hints
  themes/
    default.toml  nord.toml  gruvbox.toml  mono.toml
```

### 6.2 Crypto parity (must match Python exactly)

- SRP-6a, group **RFC5054 / SHA-256**, identity `b"chat"` (matches `srp_auth.py`). Verify interop against the live Sanic server early: this is the #1 integration risk.
- Room key: `HKDF(SHA256, len=32, salt=room_salt, info=b"cmd-chat-room-key")` over the password, then `Fernet(urlsafe_b64(key))`, byte-for-byte as `client.py::srp_authenticate`.
- WS auth: `ws_token` echoed from `/srp/verify`; HMAC token already server-side.

### 6.3 UI / layout

Default layout (resizable, ratatui `Layout` constraints):

```
┌ cmd-chat ── room: ── 🔒 E2E ── users 3/4 ─────────────────────┐
│ ┌─ chat ───────────────────────────┐ ┌─ roster ─────────────┐ │
│ │ 12:01 alice: hey                 │ │ ● alice  (owner,root)│ │
│ │ 12:01 bob: yo                    │ │ ● bob    (admin,sudo)│ │
│ │ …scrollback…                     │ │ ○ carol  (member)    │ │
│ └──────────────────────────────────┘ │ sandbox: ● ready     │ │
│ ┌─ sandbox (shared PTY · driver: alice) ─────────────────────┐│
│ │ ubuntu@sbx:~$ ./build.sh                                   ││
│ │ …live vt100 output for everyone…                           ││
│ └────────────────────────────────────────────────────────────┘│
│ > type message · /send /run /sbx /grant (F2 toggle PTY focus) │
└───────────────────────────────────────────────────────────────┘
```

- **Panes:** chat, roster (with roles + sandbox status), shared PTY, input.
- **Focus model:** Tab cycles panes; `F2` toggles "drive the PTY" (keystrokes go to sandbox) vs "chat input". Visible indicator of who currently holds the PTY driver token (see §8.3).
- **Command palette** (`/`): `/send`, `/sendd `, `/run