From 966b7e575796ff69558fa10b0da71dc363cc4efa Mon Sep 17 00:00:00 2001
From: "K. Hodges" <khodges42@gmail.com>
Date: Sun, 24 May 2026 16:05:24 -0700
Subject: [PATCH] Explainer and readme update. we are working.

---
 README.md                   | 434 +++++++++++++++++++------------
 docs/technical-explainer.md | 500 ++++++++++++++++++++++++++++++++++++
 2 files changed, 769 insertions(+), 165 deletions(-)
 create mode 100644 docs/technical-explainer.md

diff --git a/README.md b/README.md
index f42989f..5ecf6f3 100644
--- a/README.md
+++ b/README.md
@@ -1,228 +1,332 @@
 # Glassmind
 
-> Local-first semantic retrieval for Obsidian-like markdown knowledge bases and AI workflows.
+> Local-first retrieval for Obsidian-like markdown knowledge bases and AI workflows.
 
-* This is in development, it doesn't run yet. Want to help? Get in contact! * 
+Glassmind turns a folder of markdown notes into searchable local memory for humans, agents, and local model workflows.
 
-Glassmind turns folders of markdown notes into searchable semantic memory for AI tools and humans.
+It works well with Obsidian vaults, but Obsidian is not required. A plain directory of `.md` files is enough.
 
-It works especially well with Obsidian vaults, but Obsidian is not required.
+Your notes stay local. Markdown stays canonical. The SQLite database is a rebuildable cache.
 
-It indexes markdown, understands links/tags/headings, performs hybrid semantic retrieval, and exposes context through a CLI, HTTP API, and MCP tools.
+## Current Status
 
-Your notes stay local.
-Your vault stays canonical.
-The database is rebuildable.
-No cloud required.
+Glassmind now runs as a Rust CLI MVP.
 
----
+It can:
 
-## What is this?
+- scan a markdown vault
+- parse headings, paragraphs, lists, code blocks, tags, and wikilinks
+- split notes into heading-based retrieval chunks
+- store metadata and chunks in SQLite
+- index chunks with SQLite FTS5 keyword search
+- generate local deterministic embeddings
+- score results with keyword, semantic, recency, tag, and wikilink signals
+- build context bundles with token budgets
+- expose a small localhost HTTP API
+- expose MCP-style command output
+- write agent-owned memories, tasks, and decisions under `.agent/`
+- skip unchanged files with content hashes
+- audit retrievals for debugging
 
-Glassmind is **not**:
+Some pieces are still intentionally lightweight:
 
-* a chatbot
-* an obsidian plugin
-* an autonomous agent
-* a replacement for Obsidian
-* a SaaS startup trying to ingest your second brain into a valuation event
+- the Ollama backend has the right interface, but does not call Ollama over HTTP yet
+- vectors are stored as JSON in SQLite, not native `sqlite-vec` yet
+- the HTTP server is a small standard-library server, not Axum yet
+- MCP support is command-shaped, not a full MCP protocol server yet
+- watch mode is simple polling
 
-Glassmind is a **memory and retrieval layer**.
+The core local retrieval flow is in place and usable for testing.
 
-Think:
+## What Glassmind Is
+
+Glassmind is not:
+
+- a chatbot
+- an Obsidian plugin
+- an autonomous agent
+- a replacement for Obsidian
+- a cloud memory service
+
+Glassmind is a memory and retrieval layer.
 
 ```text
-Claude / Codex / Hermes / local model
-                ↓
+Claude / Codex / local model / your tooling
+                |
            Glassmind
-                ↓
-         Your Obsidian vault
+                |
+       your markdown vault
 ```
 
 The goal is simple:
 
-> “Given this task, what context from my vault actually matters?”
+> Given this task, what context from my vault actually matters?
 
----
+## Quick Start
 
-# Features
+Build it:
 
-## Current / Planned
+```powershell
+cargo build
+```
 
-* Markdown vault indexing
-* Semantic search
-* Hybrid retrieval
+Index the current repo:
 
-  * embeddings
-  * keyword search
-  * tags
-  * wikilinks
-  * recency
-* Context bundle generation
-* MCP integration
-* HTTP API
-* Local-first operation
-* Rebuildable indexes
-* Incremental indexing
-* Agent-safe `.agent/` workspace
-* Obsidian-compatible by default
+```powershell
+cargo run -- index --embeddings
+```
 
----
+Search:
 
-# Philosophy
+```powershell
+cargo run -- search "local memory" --debug-scores
+```
 
-Glassmind treats your vault like memory, not files.
+Build a context bundle:
+
+```powershell
+cargo run -- context "continue glassmind" --budget 3000
+```
+
+Use a personal Obsidian vault:
+
+```powershell
+cargo run -- --vault "E:\notes\Brain" index --embeddings
+cargo run -- --vault "E:\notes\Brain" search "project ideas" --debug-scores
+cargo run -- --vault "E:\notes\Brain" context "what was I thinking about local agents?"
+```
+
+If your vault path has spaces, keep the quotes.
+
+## Configuration
+
+Glassmind reads `glassmind.toml` by default.
+
+Useful defaults:
+
+```toml
+[vault]
+path = "."
+
+[database]
+path = ".agent/cache/glassmind.sqlite3"
+
+[index]
+include_agent_dir = true
+ignore_dirs = [".git", ".obsidian", ".trash", ".agent/cache"]
+chunk_target_tokens = 500
+chunk_overlap_tokens = 80
+
+[embeddings]
+backend = "ollama"
+model = "nomic-embed-text"
+url = "http://localhost:11434"
+
+[server]
+host = "127.0.0.1"
+port = 7331
+```
+
+The database lives under `.agent/cache`, so it stays out of Git. It can be rebuilt from the markdown files at any time.
+
+## CLI Commands
+
+Initialize config and agent workspace:
+
+```powershell
+cargo run -- init
+```
+
+Index once:
+
+```powershell
+cargo run -- index
+```
+
+Index and generate missing embeddings:
+
+```powershell
+cargo run -- index --embeddings
+```
+
+Poll and reindex every five seconds:
+
+```powershell
+cargo run -- index --watch
+```
+
+Search:
+
+```powershell
+cargo run -- search "obsidian rag memory"
+```
+
+Search with score breakdown:
+
+```powershell
+cargo run -- search "obsidian rag memory" --debug-scores
+```
+
+JSON search:
+
+```powershell
+cargo run -- search "obsidian rag memory" --output json
+```
+
+Context bundle:
+
+```powershell
+cargo run -- context "help me continue the Glassmind project" --budget 6000
+```
+
+Stats:
+
+```powershell
+cargo run -- stats
+```
+
+## Agent Memory
+
+Glassmind owns `.agent/`.
 
 ```text
-Obsidian markdown = source of truth
-SQLite = rebuildable index/cache
-Embeddings = semantic retrieval layer
+.agent/
+  memories/
+  summaries/
+  tasks/
+  decisions/
+  logs/
+  cache/
 ```
 
-Your notes remain human-readable markdown.
+Capture generated memory:
 
-Glassmind exists to make retrieval useful, fast, and agent-friendly without turning your vault into proprietary soup.
-
----
-
-# Example
-
-```bash
-glassmind index
-
-glassmind search "local memory tool ideas"
-
-glassmind context "help me continue the Glassmind project"
-
-glassmind serve
+```powershell
+cargo run -- capture memory --project Glassmind --text "Markdown remains canonical."
+cargo run -- capture task --project Glassmind --text "Wire real Ollama HTTP embeddings."
+cargo run -- capture decision --project Glassmind --text "SQLite is rebuildable cache."
 ```
 
----
+Those files are markdown and are indexed on the next run.
 
-# Why?
+## HTTP API
 
-Because existing “AI memory” systems tend to be one of:
+Start the local server:
 
-* cloud-first
-* opaque
-* startup-shaped
-* agent-shaped
-* overengineered
-* weirdly hostile to user ownership
+```powershell
+cargo run -- serve
+```
 
-Meanwhile, many of us are already using Obsidian as informal long-term memory.
-
-Glassmind formalizes that idea.
-
----
-
-# Documentation
-
-* [Design Document](docs/design.md)
-* [FAQ](docs/faq.md)
-* [HUH? (Beginners ELI5 guide)](docs/huh.md)
-
----
-
-# Architecture
+Default bind:
 
 ```text
-Obsidian Vault
-  ↓
-Indexer
-  ↓
-SQLite + Vector Search
-  ↓
-CLI / HTTP / MCP
-  ↓
-Agents and local models
+127.0.0.1:7331
 ```
 
----
+Endpoints:
 
-# Tech Stack
+- `GET /health`
+- `GET /stats`
+- `POST /search`
+- `POST /context`
+- `GET /notes/{path}`
 
-Planned v1 stack:
+Example:
+
+```powershell
+curl http://127.0.0.1:7331/health
+```
+
+## MCP-Style Commands
+
+List tools:
+
+```powershell
+cargo run -- mcp tools
+```
+
+Search:
+
+```powershell
+cargo run -- mcp search "local memory"
+```
+
+Context:
+
+```powershell
+cargo run -- mcp context "continue glassmind"
+```
+
+Read:
+
+```powershell
+cargo run -- mcp read "README.md"
+```
+
+## Architecture
 
 ```text
-Rust
-SQLite
-sqlite-vec
-Ollama embeddings
-Axum
-MCP
+Markdown vault
+  -> scanner
+  -> parser
+  -> heading chunker
+  -> SQLite metadata cache
+  -> FTS keyword index
+  -> embedding cache
+  -> hybrid retriever
+  -> CLI / HTTP / MCP-style tools
 ```
 
----
+Core principle:
 
-# Status
+```text
+markdown = source of truth
+sqlite = rebuildable cache
+embeddings = derived retrieval data
+.agent/ = Glassmind-owned workspace
+```
 
-Early development.
+## Documentation
 
-Currently building:
+- [Design Document](docs/design.md)
+- [FAQ](docs/faq.md)
+- [HUH? Beginners Guide](docs/huh.md)
+- [Technical Explainer](docs/technical-explainer.md)
 
-* vault indexer
-* chunking
-* semantic retrieval
-* context generation
-
----
-
-# Security / Privacy
-
-Glassmind is designed to run locally.
+## Security And Privacy
 
 By default:
 
-* binds to localhost
-* keeps notes local
-* avoids modifying user notes
-* stores indexes separately
-* treats markdown as canonical
+- runs locally
+- binds HTTP to localhost
+- keeps notes on disk
+- avoids modifying normal user notes
+- writes generated data under `.agent/`
+- stores indexes under `.agent/cache`
+- does not require cloud APIs
+- has no telemetry
 
-No telemetry is planned.
+## Tech Stack
 
-No cloud dependency is required.
+Current:
 
-No “AI-enhanced knowledge monetization platform” nonsense.
+- Rust
+- SQLite
+- SQLite FTS5
+- `rusqlite`
+- `clap`
+- `serde`
+- `pulldown-cmark`
+- `tracing`
 
-No enshitification ever. I stake my professional reputation on it.
+Planned improvements:
 
----
-
-# Name
-
-Why “Glassmind”?
-
-Because it’s supposed to feel like peering through semantic glass into your own thoughts.
-
-Also because `brainworm` felt a little aggressive for a tool people may actually deploy at work. 
-
-
----
-
-# Contributing
-
-Eventually.
-
-Right now the project is still in the “rapid architectural mutation” phase.
-
-If you want to throw me a PR or two I'll give you one (1) really good compliment.
-
----
-
-# Legal
-
-Glassmind is an independent project and is not affiliated with or endorsed by [Obsidian](https://obsidian.md).
-
----
-
-# I am a recruiter
-
-Hi.
-
-You may also enjoy:
-
-* [LinkedIn / khodges42](https://linkedin.com/in/khodges42?utm_source=chatgpt.com)
+- real Ollama HTTP embeddings
+- native `sqlite-vec`
+- Axum HTTP server
+- full MCP transport
+- filesystem watcher
 
+## Legal
 
+Glassmind is an independent project and is not affiliated with or endorsed by Obsidian.
diff --git a/docs/technical-explainer.md b/docs/technical-explainer.md
new file mode 100644
index 0000000..a11b11e
--- /dev/null
+++ b/docs/technical-explainer.md
@@ -0,0 +1,500 @@
+# Glassmind Technical Explainer
+
+This document explains how Glassmind works for someone who is comfortable with software development, but new to RAG, embeddings, vector search, Obsidian-style markdown indexing, or MCP-style tool surfaces.
+
+Glassmind is a local retrieval layer for a directory of markdown files. It treats your notes as the source of truth, derives a SQLite cache from them that can be rebuilt at any time, and uses that cache to answer search and context requests.
+
+The short version:
+
+```text
+markdown files
+  -> scanner
+  -> markdown parser
+  -> chunker
+  -> SQLite cache
+  -> keyword index
+  -> embedding cache
+  -> hybrid retriever
+  -> search results / context bundle / HTTP / MCP-style tools
+```
+
+## What Problem Glassmind Solves
+
+Large language models do not automatically know what is in your local notes. Even if an AI tool can read files, dumping an entire vault into a prompt is slow, expensive, noisy, and usually impossible because context windows are limited.
+
+RAG stands for Retrieval-Augmented Generation. The idea is simple:
+
+```text
+user asks something
+system retrieves relevant source material
+LLM receives only that relevant context
+LLM answers with better grounding
+```
+
+Glassmind is the retrieval part. It does not try to be the chatbot. It finds the pieces of your markdown vault that matter.
+
+## Core Design Rule
+
+Markdown is canonical.
+
+That means:
+
+- your `.md` files are the real data
+- SQLite is only a cache
+- embeddings are only derived data
+- deleting the database should not delete knowledge
+- indexing should be repeatable
+
+By default, Glassmind writes generated data only under `.agent/`. Normal notes are read, not edited.
+
+## Vault Scanning
+
+The scanner walks the configured vault path and finds markdown files.
+
+By default it skips folders such as:
+
+- `.git`
+- `.obsidian`
+- `.trash`
+- `.agent/cache`
+
+It intentionally does not skip all of `.agent/`, because generated memories, decisions, and task notes should be searchable. It only skips `.agent/cache`, which is where the SQLite database lives.
+
+For each markdown file, Glassmind records metadata:
+
+- relative path
+- filename
+- title
+- modified timestamp
+- file size
+- SHA-256 content hash
+
+The hash is important. It lets Glassmind tell whether a note changed since the last index run.
+
+## Markdown Parsing
+
+After reading a file, Glassmind parses useful markdown structure.
+
+It extracts:
+
+- headings
+- paragraphs
+- code blocks
+- list items
+- Obsidian-style wikilinks
+- tags
+
+Supported wikilinks include:
+
+```text
+[[note]]
+[[note|alias]]
+[[folder/note]]
+```
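+
+As a rough illustration of how those three forms could be pulled out of a line of text, here is a minimal sketch. The `WikiLink` struct and `extract_wikilinks` function are hypothetical names, not Glassmind's actual parser, and the sketch simply skips links that are empty or never closed.
+
+```rust
+/// A parsed wikilink: the target note and an optional display alias.
+#[derive(Debug, PartialEq)]
+pub struct WikiLink {
+    pub target: String,
+    pub alias: Option<String>,
+}
+
+/// Extract `[[target]]` and `[[target|alias]]` links from a piece of text.
+/// Hypothetical helper: unclosed or empty links are skipped.
+pub fn extract_wikilinks(text: &str) -> Vec<WikiLink> {
+    let mut links = Vec::new();
+    let mut rest = text;
+    while let Some(start) = rest.find("[[") {
+        let after_open = &rest[start + 2..];
+        // Stop scanning if the link is never closed.
+        let Some(end) = after_open.find("]]") else { break };
+        let inner = &after_open[..end];
+        // Everything before the first `|` is the target, the rest is the alias.
+        let (target, alias) = match inner.split_once('|') {
+            Some((t, a)) => (t.trim(), Some(a.trim().to_string())),
+            None => (inner.trim(), None),
+        };
+        if !target.is_empty() {
+            links.push(WikiLink { target: target.to_string(), alias });
+        }
+        rest = &after_open[end + 2..];
+    }
+    links
+}
+```
+
+A folder-qualified link such as `[[folder/note]]` needs no special handling here: the path is simply part of the target string.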
+
+Tags can come from inline markdown:
+
+```md
+This is about #rust and #local-first tooling.
+```
+
+Or frontmatter:
+
+```md
+---
+tags: [rust, retrieval, notes]
+---
+```
+
+Tags are normalized to lowercase and deduplicated.
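+
+A minimal sketch of that normalization step, assuming tags arrive as raw strings that may still carry a leading `#`. The function name `normalize_tags` is hypothetical, and returning the tags in sorted order is a choice of this sketch rather than documented behavior.
+
+```rust
+use std::collections::BTreeSet;
+
+/// Lowercase tags, strip a leading `#`, drop empties, and deduplicate.
+/// The sorted output comes from the BTreeSet and is an assumption of this sketch.
+pub fn normalize_tags<'a>(raw: impl IntoIterator<Item = &'a str>) -> Vec<String> {
+    let set: BTreeSet<String> = raw
+        .into_iter()
+        .map(|t| t.trim().trim_start_matches('#').to_lowercase())
+        .filter(|t| !t.is_empty())
+        .collect();
+    set.into_iter().collect()
+}
+```
+
+With this shape, `#Rust` from inline text and `rust` from frontmatter collapse into one tag.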
+
+## Chunking
+
+Search works better on smaller pieces of text than whole files. Those pieces are called chunks.
+
+Glassmind currently chunks by heading section first. For example:
+
+```md
+# Project
+
+Intro text.
+
+## Design
+
+Design text.
+
+## Tasks
+
+Task text.
+```
+
+This produces separate retrieval chunks for the top-level section and each child section. Each chunk keeps its heading path, so results can point back to where they came from. The two child sections, for example, carry:
+
+```text
+Project > Design
+Project > Tasks
+```
+
+If a section is too large, Glassmind splits it into smaller overlapping chunks. Overlap helps avoid cutting useful context exactly at a boundary.
+
+Each chunk stores:
+
+- note id
+- chunk index
+- heading path
+- content
+- chunk type
+- start line
+- end line
+- rough token estimate
+- chunk content hash
+
+The token estimate is currently simple word counting. It is not perfect, but it is good enough for budgeting context bundles.
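+
+The overlap splitting and the word-count estimate can be sketched as follows. This is an illustration under assumptions, not the actual chunker: `estimate_tokens` and `split_with_overlap` are hypothetical names, sizes are measured in words, and the sketch collapses whitespace, whereas a real chunker would preserve line structure and track start and end lines.
+
+```rust
+/// Rough token estimate: whitespace-separated word count.
+pub fn estimate_tokens(text: &str) -> usize {
+    text.split_whitespace().count()
+}
+
+/// Split an oversized section into windows of at most `target` words,
+/// where consecutive windows share `overlap` words.
+pub fn split_with_overlap(text: &str, target: usize, overlap: usize) -> Vec<String> {
+    assert!(target > overlap, "target must exceed overlap");
+    let words: Vec<&str> = text.split_whitespace().collect();
+    if words.is_empty() {
+        return Vec::new();
+    }
+    // Small sections stay in one piece.
+    if words.len() <= target {
+        return vec![words.join(" ")];
+    }
+    // Each new window starts `target - overlap` words after the previous one.
+    let step = target - overlap;
+    let mut chunks = Vec::new();
+    let mut start = 0;
+    loop {
+        let end = (start + target).min(words.len());
+        chunks.push(words[start..end].join(" "));
+        if end == words.len() {
+            break;
+        }
+        start += step;
+    }
+    chunks
+}
+```
+
+With the config defaults shown in the README (`chunk_target_tokens = 500`, `chunk_overlap_tokens = 80`), each window in this sketch would start 420 words after the previous one.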
+
+## SQLite Cache
+
+The local database lives here by default:
+
+```text
+.agent/cache/glassmind.sqlite3
+```
+
+It is ignored by Git and can be rebuilt from markdown.
+
+The main tables are:
+
+- `notes`: one row per markdown note
+- `chunks`: retrieval chunks
+- `tags`: normalized tag names
+- `note_tags`: many-to-many join table
+- `links`: wikilinks from notes
+- `embeddings`: vector cache for chunks
+- `retrieval_audit`: search history for debugging retrieval behavior
+- `memory_events`: generated memory records
+- `migrations`: schema bootstrap marker
+
+On index, Glassmind compares the current file hash with the hash stored in `notes`.
+
+If the hash matches and the index version matches, the note is skipped.
+
+If the note changed, Glassmind first deletes the rows derived from the old version:
+
+```text
+old chunks
+old FTS rows
+old embeddings
+old tag mappings
+old links
+```
+
+Then it inserts fresh rows built from the current file.
+
+If a file was deleted from the vault, the indexer removes that note and its derived rows from the cache.
+
+## Keyword Search With FTS
+
+SQLite includes a full-text search engine called FTS5. Glassmind creates an FTS table for chunk content.
+
+When chunks are written, matching FTS rows are written too.
+
+A keyword search runs roughly like this:
+
+```sql
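+-- Sketch only: aliases and column names are assumed, not the real schema.
+SELECT n.path, c.heading_path,
+       snippet(chunks_fts, 0, '[', ']', '...', 12) AS snippet,
+       bm25(chunks_fts) AS rank
+FROM chunks_fts
+JOIN chunks AS c ON c.id = chunks_fts.rowid
+JOIN notes AS n ON n.id = c.note_id
+WHERE chunks_fts MATCH :query
+ORDER BY bm25(chunks_fts)  -- bm25() is lower for better matches
+LIMIT :limit;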
+```
+
+FTS gives Glassmind:
+
+- fast local keyword search
+- ranked results
+- snippets with matched terms highlighted
+
+This is the most reliable baseline search mode because it does not require a model.
+
+## Embeddings
+
+Embeddings are numeric representations of text meaning.
+
+Conceptually:
+
+```text
+"local memory for agents"
+  -> [0.12, -0.04, 0.77, ...]
+```
+
+Texts with similar meaning should produce vectors that are close to each other.
+
+Glassmind has an `EmbeddingBackend` trait:
+
+```text
+text in
+vector out
+```
+
+Right now there is a deterministic local embedding backend. It is not a real language model embedding, but it lets the full pipeline work locally and predictably while the storage and retrieval flow stabilizes.
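+
+To make the idea of a deterministic, model-free backend concrete, here is one possible sketch. Only the trait name `EmbeddingBackend` comes from the project; the `embed` method signature, the `HashingBackend` struct, and the word-hashing scheme are assumptions chosen for illustration and may differ from what Glassmind actually does.
+
+```rust
+/// Hypothetical shape of the trait: text in, vector out.
+pub trait EmbeddingBackend {
+    fn embed(&self, text: &str) -> Vec<f32>;
+}
+
+/// Deterministic stand-in: hashes each word into one of `dims` buckets
+/// (the "hashing trick"), then L2-normalizes. Not a language-model embedding.
+pub struct HashingBackend {
+    pub dims: usize,
+}
+
+/// FNV-1a, written out so results are stable across runs and Rust versions.
+fn fnv1a(bytes: &[u8]) -> u64 {
+    let mut hash: u64 = 0xcbf29ce484222325;
+    for &b in bytes {
+        hash ^= b as u64;
+        hash = hash.wrapping_mul(0x100000001b3);
+    }
+    hash
+}
+
+impl EmbeddingBackend for HashingBackend {
+    fn embed(&self, text: &str) -> Vec<f32> {
+        assert!(self.dims > 0, "dims must be positive");
+        let mut v = vec![0.0f32; self.dims];
+        // Count how many words land in each bucket.
+        for word in text.split_whitespace() {
+            let hash = fnv1a(word.to_lowercase().as_bytes());
+            v[(hash % self.dims as u64) as usize] += 1.0;
+        }
+        // Scale to unit length so cosine similarity behaves consistently.
+        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
+        if norm > 0.0 {
+            for x in &mut v {
+                *x /= norm;
+            }
+        }
+        v
+    }
+}
+```
+
+A backend like this only captures word overlap, not meaning, which is why it works as plumbing but not as a substitute for a real embedding model.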
+
+There is also an Ollama-shaped backend stub. The code has the right boundary for an Ollama implementation, but the current version does not call Ollama over HTTP yet.
+
+Embeddings are stored in SQLite as JSON arrays in the `embeddings` table.
+
+This is not the final high-performance vector storage design. The intended future path is native `sqlite-vec`. The current implementation keeps everything runnable with plain SQLite while preserving the architecture.
+
+## Semantic Search
+
+Semantic search compares the query embedding to chunk embeddings.
+
+The comparison uses cosine similarity:
+
+```text
+1.0  = very similar
+0.0  = unrelated
+-1.0 = opposite direction
+```
+
+In practice, Glassmind:
+
+1. embeds the query
+2. loads candidate chunks
+3. compares query vector to chunk vectors
+4. assigns a semantic score
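+
+The comparison and ranking steps can be sketched like this. The function names are hypothetical, and the sketch assumes the candidate vectors are already loaded into memory, which matches the brute-force approach described here but not necessarily the exact code.
+
+```rust
+/// Cosine similarity of two equal-length vectors.
+/// Returns 0.0 if either vector is all zeros.
+pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
+    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
+    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
+    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
+    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
+    if norm_a == 0.0 || norm_b == 0.0 {
+        return 0.0;
+    }
+    dot / (norm_a * norm_b)
+}
+
+/// Brute-force ranking: score every candidate against the query, best first.
+/// Returns (candidate index, score) pairs.
+pub fn rank_by_similarity(query: &[f32], candidates: &[Vec<f32>]) -> Vec<(usize, f32)> {
+    let mut scored: Vec<(usize, f32)> = candidates
+        .iter()
+        .enumerate()
+        .map(|(i, v)| (i, cosine_similarity(query, v)))
+        .collect();
+    // Sort descending by score.
+    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
+    scored
+}
+```
+
+Scoring every candidate is linear in the number of chunks, which is fine for a personal vault and is the part a native vector index would eventually replace.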
+
+The current semantic path is useful as plumbing and scoring infrastructure. Search quality should improve once the Ollama backend produces real model embeddings. Native `sqlite-vec` storage is a separate step that makes vector lookup faster rather than more accurate.
+
+## Hybrid Retrieval
+
+Pure keyword search is brittle. Pure semantic search can be fuzzy or surprising.
+
+Glassmind combines multiple scoring signals:
+
+- keyword score
+- semantic score
+- recency score
+- tag score
+- wikilink score
+
+The config has weights:
+
+```toml
+[search]
+semantic_weight = 0.55
+keyword_weight = 0.25
+recency_weight = 0.10
+link_weight = 0.05
+tag_weight = 0.05
+```
+
+The final score is a weighted blend.
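+
+One plausible reading of "weighted blend" is a plain weighted sum, sketched below. The struct and function names are hypothetical, and the sketch assumes each signal has already been normalized to the range 0.0 to 1.0, which this document does not specify.
+
+```rust
+/// Per-signal scores for one chunk, assumed to be normalized to 0.0..=1.0.
+#[derive(Debug, Clone, Copy, Default)]
+pub struct Signals {
+    pub semantic: f32,
+    pub keyword: f32,
+    pub recency: f32,
+    pub link: f32,
+    pub tag: f32,
+}
+
+/// Weights mirroring the `[search]` config keys.
+#[derive(Debug, Clone, Copy)]
+pub struct Weights {
+    pub semantic: f32,
+    pub keyword: f32,
+    pub recency: f32,
+    pub link: f32,
+    pub tag: f32,
+}
+
+impl Default for Weights {
+    /// Values copied from the config example above.
+    fn default() -> Self {
+        Weights { semantic: 0.55, keyword: 0.25, recency: 0.10, link: 0.05, tag: 0.05 }
+    }
+}
+
+/// Weighted sum of the five signals.
+pub fn blend(s: Signals, w: Weights) -> f32 {
+    w.semantic * s.semantic
+        + w.keyword * s.keyword
+        + w.recency * s.recency
+        + w.link * s.link
+        + w.tag * s.tag
+}
+```
+
+Because the example weights sum to 1.0, a chunk that maxes out every signal scores 1.0 under this sketch, and a chunk that only matches on keywords scores 0.25.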
+
+You can inspect the pieces with:
+
+```powershell
+cargo run -- search "local memory" --debug-scores
+```
+
+That makes retrieval behavior less magical. If a result is weird, you can see whether it came from keyword matching, semantic similarity, recency, tags, or links.
+
+## Context Bundles
+
+Search results are useful for humans, but agents usually need a compact context packet.
+
+That is what `glassmind context` builds.
+
+Example:
+
+```powershell
+cargo run -- context "continue glassmind" --budget 4000
+```
+
+The context builder:
+
+1. runs retrieval
+2. takes the highest-scoring chunks
+3. respects the token budget
+4. outputs markdown by default
+5. includes source paths
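+
+The selection and rendering steps above can be sketched as a greedy pass over scored chunks. Everything here is illustrative: `ScoredChunk`, `select_within_budget`, and `render_bundle` are hypothetical names, and the markdown layout is an assumption. The sketch skips a chunk that does not fit and keeps looking for smaller ones, and the real builder may instead stop at the first chunk that overflows.
+
+```rust
+/// A retrieved chunk, reduced to the fields this sketch needs.
+#[derive(Debug, Clone)]
+pub struct ScoredChunk {
+    pub path: String,
+    pub content: String,
+    pub score: f32,
+    pub tokens: usize,
+}
+
+/// Greedy packing: walk chunks from highest score down and keep each one
+/// that still fits in the remaining token budget.
+pub fn select_within_budget(mut chunks: Vec<ScoredChunk>, budget: usize) -> Vec<ScoredChunk> {
+    chunks.sort_by(|a, b| b.score.total_cmp(&a.score));
+    let mut used = 0;
+    let mut picked = Vec::new();
+    for chunk in chunks {
+        if used + chunk.tokens <= budget {
+            used += chunk.tokens;
+            picked.push(chunk);
+        }
+    }
+    picked
+}
+
+/// Render the selection as markdown with a source path above each chunk.
+pub fn render_bundle(chunks: &[ScoredChunk]) -> String {
+    chunks
+        .iter()
+        .map(|c| format!("## Source: {}\n\n{}\n", c.path, c.content))
+        .collect::<Vec<_>>()
+        .join("\n")
+}
+```
+
+Keeping the source path next to each chunk is what lets an agent or a human trace a claim in the bundle back to the note it came from.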
+
+The result is meant to be pasted into an LLM prompt or returned to an agent.
+
+There is also a summarizer hook. It is disabled right now, but the interface exists so local summarization can be added later without changing the bundle format.
+
+## Agent Workspace
+
+Glassmind owns `.agent/`.
+
+The current structure is:
+
+```text
+.agent/
+  memories/
+  summaries/
+  tasks/
+  decisions/
+  logs/
+  cache/
+```
+
+Capture commands append markdown into this workspace:
+
+```powershell
+cargo run -- capture memory --project Glassmind --text "SQLite is rebuildable cache."
+cargo run -- capture task --project Glassmind --text "Wire real Ollama embeddings."
+cargo run -- capture decision --project Glassmind --text "Markdown remains canonical."
+```
+
+These generated files are indexed like normal markdown. That gives agents a place to write memory without touching user-owned notes.
+
+## HTTP API
+
+`glassmind serve` starts a small localhost HTTP server.
+
+Default bind:
+
+```text
+127.0.0.1:7331
+```
+
+Endpoints:
+
+- `GET /health`
+- `GET /stats`
+- `POST /search`
+- `POST /context`
+- `GET /notes/{path}`
+
+Example:
+
+```powershell
+curl http://127.0.0.1:7331/health
+```
+
+Search request:
+
+```json
+{
+  "query": "local memory",
+  "limit": 5
+}
+```
+
+Context request:
+
+```json
+{
+  "query": "continue glassmind",
+  "limit": 8,
+  "budget": 6000
+}
+```
+
+The current server is intentionally simple and uses the Rust standard library. A later version can swap this for Axum without changing the core indexing and retrieval flow.
+
+## MCP-Style Tool Commands
+
+MCP stands for Model Context Protocol. It is an open protocol that lets AI applications call external tools and read external data sources.
+
+Glassmind currently has an MCP-style command surface:
+
+```powershell
+cargo run -- mcp tools
+cargo run -- mcp search "local memory"
+cargo run -- mcp context "continue glassmind"
+cargo run -- mcp read "README.md"
+```
+
+This is not a full MCP transport yet. It is the command and response shape that a real MCP server can reuse later.
+
+## Watch Mode
+
+There is a simple polling watch mode:
+
+```powershell
+cargo run -- index --watch
+```
+
+It reindexes every five seconds.
+
+This is intentionally plain. A real filesystem watcher can replace it later, but the current loop proves the live indexing behavior without another moving part.
+
+## Retrieval Audit Logging
+
+Hybrid searches write audit rows into SQLite.
+
+The audit log stores:
+
+- query
+- returned paths
+- timestamp
+- client label
+
+This is for tuning. Retrieval systems are hard to improve if you cannot inspect what they returned and why.
+
+## Typical Local Workflow
+
+For this repo:
+
+```powershell
+cargo run -- index --embeddings
+cargo run -- search "glassmind local memory" --debug-scores
+cargo run -- context "continue glassmind" --budget 3000
+```
+
+For a personal Obsidian vault:
+
+```powershell
+cargo run -- --vault "E:\notes\Brain" index --embeddings
+cargo run -- --vault "E:\notes\Brain" search "project ideas" --debug-scores
+cargo run -- --vault "E:\notes\Brain" context "what was I thinking about local agents?"
+```
+
+If the path has spaces, keep the quotes.
+
+## What Is Real Now
+
+The working spine is:
+
+```text
+scan
+parse
+chunk
+hash
+cache
+FTS search
+embedding cache
+hybrid scoring
+context bundles
+HTTP surface
+MCP-style commands
+agent memory capture
+audit logging
+```
+
+That is enough to test the product shape end to end.
+
+## What Is Still Placeholder Or Lightweight
+
+Some pieces are intentionally MVP-level:
+
+- the Ollama backend has the right interface but does not call Ollama yet
+- vector storage is SQLite JSON, not native `sqlite-vec`
+- semantic search is brute-force over candidate chunks
+- HTTP uses a tiny standard-library server, not Axum
+- MCP is command-shaped, not a full MCP protocol server
+- watch mode is polling, not filesystem events
+
+These are implementation swaps, not architecture rewrites.
+
+The main architecture is already pointing in the right direction:
+
+```text
+markdown source of truth
+rebuildable local cache
+inspectable retrieval
+agent-safe writes
+human-readable output
+```
+