hack-house

Author	SHA1	Message	Date
leetcrypt	ca1666fbbb	docs(sbx): VirtualBox backend spec, crypto pay-gate, save/load PoC Add the VirtualBox sandbox design spec (headless 4th backend + share-an-appliance GUI mode with detect-first install), the crypto pay-to-join gate design, and the save/load PoC writeup with its demo/film driver scripts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-03 10:10:44 -07:00
leetcrypt	69bce5ead8	feat(ai): stream agent replies token-by-token to the room Closes the cross-language half of token streaming (perf-plan A3). On the CPU-only box perceived latency is time-to-first-token, so showing the reply as it generates makes a slow model feel live. - Agent: OllamaProvider.stream() runs on a worker thread; bridge relays cumulative previews as throttled (~5/sec) `_ai:"stream"` control frames, then a `done` frame clears the preview as the final persisted chat message is posted. Providers without stream() fall back to blocking complete(). - Rust client: new Net::AiStream variant + parse_ai branch; App.ai_stream map holds the in-progress text per agent; draw_chat renders it as a dim, italic preview bubble below history. Cleared on done and on agent leave. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 22:42:08 -07:00
leetcrypt	26c651e9ac	perf(ai): CPU-tuned local inference + qwen2.5-coder sandbox path Tier A/B/C wins for the CPU-only Ollama box (no GPU → optimize TTFT and tokens/sec, not VRAM): - Separate qwen2.5-coder provider for the sandbox `!task` path; chat keeps the general model. Auto-selected when chat is Ollama and a coder build is present, override with --code-model. - OllamaProvider num_ctx default 8192→4096 (8192 was a GPU-mindset default that inflates prefill/TTFT on CPU); expose num_thread; add --num-ctx, --num-thread, --num-predict. token_budget default 3000→2000 to fit. - OllamaProvider.stream() generator over Ollama's stream=True chat endpoint (provider half of token streaming; agent/Rust rendering is a follow-up). - Few-shot request→shell exemplars in SANDBOX_SYSTEM to anchor the small model's fenced-command output. - Matryoshka embedding truncation: OllamaEmbedder truncate_dim=256 (--embed-dim) for faster pure-Python cosine and less RAM; query+stored share the dim. - docs/ai-perf-plan.md records all 8 items with status and the server-side env (OLLAMA_NUM_PARALLEL=1, keep_alive) that must be set where ollama serve runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 22:37:59 -07:00
leetcrypt	e5e1ad8dee	feat(ai): in-RAM semantic recall (RAG) for conversation context Give the agent recall of things said beyond the verbatim window, without breaking the RAM-only philosophy — nothing is persisted to disk. - MemoryIndex: a capped, in-memory pool of embedded messages with pure-Python cosine search (no numpy). Retains far more than the rolling transcript so old lines can be surfaced on demand; oldest evicted past the cap to bound RAM. - OllamaEmbedder: local embeddings via nomic-embed-text, on by default and independent of the chat provider (reuses the Ollama host when chat is Ollama). - Bridge: captured room messages (live + backfilled) are embedded on a background worker so a slow embedder can't stall frame draining. On a /ai question the agent retrieves top-k relevant lines, drops weak (<min_score) and windowed-duplicate hits, and prepends them as a clearly-fenced "recalled context" preamble — kept at user role, never elevated to system, so untrusted room text informs without instructing. Falls back to recency-only if the embedder is unreachable. - CLI: --no-rag, --embed-model, --embed-host, --rag-top-k. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 17:59:01 -07:00
leetcrypt	bbb9e82425	docs: plan for AI agent context + local-perf improvements Roadmap for deepening the /ai agent's conversational context while keeping the RAM-only philosophy, plus Ollama latency wins. Marks Tier 1 (backfill, token-budget window) and the perf tuning as in-scope now; RAG and in-RAM compaction staged next. Grounded in public Anthropic docs, not leaked source. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 17:43:02 -07:00
leetcrypt	5e8a409ec2	docs: higher-quality demo GIF (1280px, 15fps) Bump from 960px/12fps to 1280px/15fps with floyd-steinberg dithering for crisper, retina-legible terminal text — 7.4MB, under GitHub's 10MB inline-render limit. Exceeds the upstream example.gif (800px/15fps). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 15:56:01 -07:00
leetcrypt	40c9a72186	docs: embed demo GIF (multipass sandbox share) in README 4.7MB looping GIF rendered from the latest demo capture (alice+bob sharing a multipass box: summon, drive, per-user sudo). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 15:27:29 -07:00
leetcrypt	65df12de9e	feat(ai): model profiles, capability discovery, and agentless /ai list|models Make connecting any model a config step, not a code change: - models.toml named profiles (api_key_env names an env var, never the key) - providers gain available_models(); add preflight + --list-models/--check - /ai list and /ai models in-room; client probes local Ollama for /ai models when no agent is running, and /ai list hints to summon one - docs/providers.md provider guide + examples/echo_provider.py - README: command table, AI section, layout updated Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 15:25:07 -07:00
leetcrypt	700e33e3b1	docs: AI agent bridge spec (model-agnostic, /ai command, owner-gated ops) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 01:24:48 -07:00
leetcrypt	82a04f3e12	feat(coven): SRP/Fernet crypto parity + multi-user coven foundation ⛧ Begin the coven evolution of cmd-chat (see docs/spec-collaborative-sandbox.md): a Rust/ratatui client for the unchanged Python Sanic server, plus the multi-user + zero-knowledge groundwork. P0 — crypto parity (the spec's #1 risk), proven three ways: - Hand-rolled SRP-6a (NG_2048, SHA-256, rfc5054 padding) matching pysrp byte-for-byte, incl. the fixed b"chat" SRP identity and minimal-vs-256B width quirks. Golden-vector unit test + offline selftest. - Live handshake against the running server (H_AMK verified). - Cross-language E2E: Python client decrypts a Rust-encrypted Fernet message. P2 — multi-user coven (server): - CMD_CHAT_MAX_USERS capacity cap (default 4, infra-for-more). - Authoritative roster + user_joined broadcasts. - Free the slot/username on ws disconnect (was held until 1h stale sweep). Also: fix requirements.txt (was UTF-16, unparseable by pip). coven/ : Rust crate (crypto.rs proven; main.rs spike CLI: selftest/handshake/srpm) docs/ : full feature spec for the 6 requested features. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 11:47:25 -07:00
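
Commits e5e1ad8dee and 26c651e9ac describe a capped, in-RAM MemoryIndex with pure-Python cosine search, a min_score cutoff, and embedding truncation via truncate_dim. The sketch below is one possible reading of those messages, not the repository's code: the method names (add, search), the default values, and the choice to re-normalise after truncation are all assumptions made for illustration.

```python
import math
from collections import deque


def truncate(vec, dim):
    """Keep the first `dim` components and re-normalise (assumed behaviour)."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def cosine(a, b):
    """Cosine similarity in pure Python, no numpy."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class MemoryIndex:
    """Hypothetical sketch: a bounded pool of (text, vector) pairs held only in RAM."""

    def __init__(self, cap=1000, dim=None):
        # deque(maxlen=cap) drops the oldest entry once the cap is exceeded.
        self.items = deque(maxlen=cap)
        self.dim = dim

    def add(self, text, vec):
        if self.dim:
            vec = truncate(vec, self.dim)
        self.items.append((text, vec))

    def search(self, qvec, top_k=3, min_score=0.3):
        # Query and stored vectors must share the same truncated dimension.
        if self.dim:
            qvec = truncate(qvec, self.dim)
        scored = [(cosine(qvec, v), t) for t, v in self.items]
        scored = [s for s in scored if s[0] >= min_score]  # drop weak hits
        scored.sort(key=lambda s: s[0], reverse=True)
        return scored[:top_k]
```

A linear scan like this costs one dot product per stored message, which is why a smaller dimension (256 in the commit message) reduces both search time and memory.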

10 Commits
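
Commit 69bce5ead8 describes relaying cumulative previews as throttled (~5/sec) `_ai:"stream"` control frames, followed by a `done` frame. A minimal sketch of that throttling idea follows. The function name relay_stream, the "text" field, the exact shape of the done frame, and the injectable clock are assumptions for illustration and are not taken from the repository.

```python
import time


def relay_stream(tokens, send, min_interval=0.2, clock=time.monotonic):
    """Send cumulative previews at most once per `min_interval` seconds.

    `tokens` is any iterable of text pieces; `send` is a callable that
    accepts one frame (a dict). Both are hypothetical stand-ins.
    """
    text = ""
    last_sent = None
    for tok in tokens:
        text += tok
        now = clock()
        # The first token is always shown; later ones only once the interval has passed.
        if last_sent is None or now - last_sent >= min_interval:
            send({"_ai": "stream", "text": text})
            last_sent = now
    # Assumed frame shape: tells the client to clear the preview bubble.
    send({"_ai": "done"})
    return text
```

Because each preview carries the full text so far, a client can simply overwrite the previous preview, and a skipped frame loses nothing.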