The server already ships the full RAM message backlog in the init frame; the
agent was discarding it. _seed_transcript now decrypts that history with the
room key (skipping our own lines, control frames, and undecryptable blobs) so
the agent has context the moment it joins instead of starting amnesiac.
_window() replaces the fixed last-12 slice on both the answer and sandbox
paths: it walks newest-to-oldest and keeps messages up to --token-budget
(approx, ~4 chars/token), still capped at --context-window count. Keeps small
local models inside their effective context. Nothing touches disk.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make connecting any model a config step, not a code change:
- models.toml named profiles (api_key_env names an env var, never the key)
- providers gain available_models(); add preflight + --list-models/--check
- /ai list and /ai models in-room; client probes local Ollama for
/ai models when no agent is running, and /ai list hints to summon one
- docs/providers.md provider guide + examples/echo_provider.py
- README: command table, AI section, layout updated
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add cmd_chat/agent: a headless client that joins a room via SRP, decrypts
broadcasts, and answers /ai <question> through a pluggable model provider
(ollama default + anthropic + openai-compatible + module:Class). Server and
zero-knowledge guarantees unchanged; the agent is just another encrypted client.
Also pin the lets-hack demo to a detached worktree of main (default) so running
it from dev still demos stable main without touching the working checkout.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>