hack-house

Author	SHA1	Message	Date
leetcrypt	07e9c30846	feat(sbx,ui): VM snapshot save/load + collapsible clustered help menu - /sbx save|load|snaps: docker commit → hh-snap:<label> image that survives /sbx stop; load relaunches a fresh sandbox from it; multipass delegates to `multipass snapshot`. Local backend unsupported. - Help overlay redesigned into topical clusters (SANDBOX, AI AGENTS, PERMISSIONS, FILES, APPEARANCE, KEYS, ROSTER GLYPHS), collapsed by default; up/down highlight a cluster, left/right/Enter expand-collapse it (tmux-style), PgUp/PgDn scroll overflow, Esc closes. - docstring: example uses --model qwen2.5:3b (the locally-pulled model), not llama3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 23:03:00 -07:00
leetcrypt	26c651e9ac	perf(ai): CPU-tuned local inference + qwen2.5-coder sandbox path Tier A/B/C wins for the CPU-only Ollama box (no GPU → optimize TTFT and tokens/sec, not VRAM): - Separate qwen2.5-coder provider for the sandbox `!task` path; chat keeps the general model. Auto-selected when chat is Ollama and a coder build is present, override with --code-model. - OllamaProvider num_ctx default 8192→4096 (8192 was a GPU-mindset default that inflates prefill/TTFT on CPU); expose num_thread; add --num-ctx, --num-thread, --num-predict. token_budget default 3000→2000 to fit. - OllamaProvider.stream() generator over Ollama's stream=True chat endpoint (provider half of token streaming; agent/Rust rendering is a follow-up). - Few-shot request→shell exemplars in SANDBOX_SYSTEM to anchor the small model's fenced-command output. - Matryoshka embedding truncation: OllamaEmbedder truncate_dim=256 (--embed-dim) for faster pure-Python cosine and less RAM; query+stored share the dim. - docs/ai-perf-plan.md records all 8 items with status and the server-side env (OLLAMA_NUM_PARALLEL=1, keep_alive) that must be set where ollama serve runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 22:37:59 -07:00
leetcrypt	e5e1ad8dee	feat(ai): in-RAM semantic recall (RAG) for conversation context Give the agent recall of things said beyond the verbatim window, without breaking the RAM-only philosophy — nothing is persisted to disk. - MemoryIndex: a capped, in-memory pool of embedded messages with pure-Python cosine search (no numpy). Retains far more than the rolling transcript so old lines can be surfaced on demand; oldest evicted past the cap to bound RAM. - OllamaEmbedder: local embeddings via nomic-embed-text, on by default and independent of the chat provider (reuses the Ollama host when chat is Ollama). - Bridge: captured room messages (live + backfilled) are embedded on a background worker so a slow embedder can't stall frame draining. On a /ai question the agent retrieves top-k relevant lines, drops weak (<min_score) and windowed-duplicate hits, and prepends them as a clearly-fenced "recalled context" preamble — kept at user role, never elevated to system, so untrusted room text informs without instructing. Falls back to recency-only if the embedder is unreachable. - CLI: --no-rag, --embed-model, --embed-host, --rag-top-k. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 17:59:01 -07:00
leetcrypt	9b85255d80	feat(ai): backfill context on join + token-budget window The server already ships the full RAM message backlog in the init frame; the agent was discarding it. _seed_transcript now decrypts that history with the room key (skipping our own lines, control frames, and undecryptable blobs) so the agent has context the moment it joins instead of starting amnesiac. _window() replaces the fixed last-12 slice on both the answer and sandbox paths: it walks newest-to-oldest and keeps messages up to --token-budget (approx, ~4 chars/token), still capped at --context-window count. Keeps small local models inside their effective context. Nothing touches disk. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-02 17:43:02 -07:00
leetcrypt	65df12de9e	feat(ai): model profiles, capability discovery, and agentless /ai list|models Make connecting any model a config step, not a code change: - models.toml named profiles (api_key_env names an env var, never the key) - providers gain available_models(); add preflight + --list-models/--check - /ai list and /ai models in-room; client probes local Ollama for /ai models when no agent is running, and /ai list hints to summon one - docs/providers.md provider guide + examples/echo_provider.py - README: command table, AI section, layout updated Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 15:25:07 -07:00
leetcrypt	54b7637ec8	feat(agent): model-agnostic AI agent bridge (PoC) + pin lets-hack demo to main Add cmd_chat/agent: a headless client that joins a room via SRP, decrypts broadcasts, and answers /ai <question> through a pluggable model provider (ollama default + anthropic + openai-compatible + module:Class). Server and zero-knowledge guarantees unchanged; the agent is just another encrypted client. Also pin the lets-hack demo to a detached worktree of main (default) so running it from dev still demos stable main without touching the working checkout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-01 02:05:48 -07:00

6 Commits
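
The token-budget window described in commit 9b85255d80 (walk newest-to-oldest, keep messages up to an approximate token budget at roughly 4 characters per token, still capped by a message count) can be sketched as follows. This is an illustration under stated assumptions, not the repository's `_window()`: the function names, the plain-string message type, and the default `max_count` are hypothetical; the 2000 default mirrors the token_budget value mentioned in commit 26c651e9ac.

```python
def approx_tokens(text, chars_per_token=4):
    """Rough token estimate: about 4 characters per token, never less than 1."""
    return max(1, len(text) // chars_per_token)


def window(messages, token_budget=2000, max_count=12):
    """Return the newest messages that fit the budget, in chronological order.

    `messages` is assumed to be a list of strings ordered oldest to newest.
    `max_count` is an illustrative default, not a value taken from the source.
    """
    kept = []
    used = 0
    for msg in reversed(messages):  # newest first
        cost = approx_tokens(msg)
        if len(kept) >= max_count or used + cost > token_budget:
            break  # stop at the first message that would overflow either limit
        kept.append(msg)
        used += cost
    kept.reverse()  # restore oldest-to-newest order for the prompt
    return kept


msgs = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
recent = window(msgs, token_budget=25)
```

Stopping at the first overflowing message keeps the window contiguous, so the model never sees a transcript with a gap in the middle.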