hack-house

Files

leetcrypt 69bce5ead8 feat(ai): stream agent replies token-by-token to the room

Closes the cross-language half of token streaming (perf-plan A3). On the
CPU-only box, perceived latency is effectively time-to-first-token, so showing
the reply as it generates makes a slow model feel live.

- Agent: OllamaProvider.stream() runs on a worker thread; bridge relays
  cumulative previews as throttled (~5/sec) `_ai:"stream"` control frames,
  then a `done` frame clears the preview as the final persisted chat message
  is posted. Providers without stream() fall back to blocking complete().
- Rust client: new Net::AiStream variant + parse_ai branch; App.ai_stream
  map holds the in-progress text per agent; draw_chat renders it as a dim,
  italic preview bubble below history. Cleared on done and on agent leave.
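
The agent-side flow described above can be sketched roughly as follows. This
is a minimal illustration, not the code in bridge.py: the frame fields other
than `_ai:"stream"` (`agent`, `text`, `done`), the `relay_stream` and
`generate` helpers, and the injectable `clock` are all assumptions made for
the example, and the worker thread is left out to keep it short.

```python
import json
import time
from typing import Callable, Iterable


def relay_stream(
    tokens: Iterable[str],
    send: Callable[[str], None],
    agent: str,
    min_interval: float = 0.2,  # ~5 frames/sec, matching the throttle above
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Send throttled cumulative previews, then a done frame; return the full text."""
    text = ""
    last_sent = None
    for tok in tokens:
        text += tok
        now = clock()
        # Always send the first preview; afterwards respect the throttle.
        if last_sent is None or now - last_sent >= min_interval:
            # Hypothetical frame shape: cumulative text, not a delta.
            send(json.dumps({"_ai": "stream", "agent": agent, "text": text}))
            last_sent = now
    # The done frame tells clients to clear the preview bubble.
    send(json.dumps({"_ai": "stream", "agent": agent, "done": True}))
    return text


def generate(provider, prompt: str, send: Callable[[str], None], agent: str) -> str:
    """Stream if the provider supports it, else fall back to blocking complete()."""
    stream = getattr(provider, "stream", None)
    if callable(stream):
        return relay_stream(stream(prompt), send, agent)
    return provider.complete(prompt)
```

Sending cumulative text rather than deltas means a client that misses a
throttled frame still renders a correct preview from the next one. The caller
would then post the returned text as the persisted chat message.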

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-06-02 22:42:08 -07:00

__init__.py

feat(agent): model-agnostic AI agent bridge (PoC) + pin lets-hack demo to main

2026-06-01 02:05:48 -07:00

__main__.py

perf(ai): CPU-tuned local inference + qwen2.5-coder sandbox path

2026-06-02 22:37:59 -07:00

bridge.py

feat(ai): stream agent replies token-by-token to the room

2026-06-02 22:42:08 -07:00

memory.py

feat(ai): in-RAM semantic recall (RAG) for conversation context