documentation update

2026-06-14 10:08:37 +00:00 · 2026-05-17 10:16:26 -07:00 · 2026-05-17 10:16:26 -07:00 · 9e3b56b214
commit 9e3b56b214
parent a8616a1062
3 changed files with 175 additions and 450 deletions
--- a/README.md
+++ b/README.md
@@ -4,26 +4,31 @@
  <img src="docs/images/logo.png" width="220">
 </p>
 Auditable local-first AI coding pipelines.
-NightShift is a deterministic pipeline runner for long-running AI-assisted coding workflows. It runs one markdown task at a time through a declarative YAML pipeline, records the important artifacts, and leaves the user with a reviewable work package.
+NightShift is a deterministic pipeline runner for AI-assisted coding work. It reads markdown tasks, builds bounded context, asks configured agents for plans or patches, validates and applies those patches through explicit stages, runs checks, and leaves a human-reviewable artifact trail.
 NightShift is not an autonomous software engineer. It is an orchestration layer that treats AI agents as unreliable workers inside bounded, testable, auditable workflows.
-## MVP Status
+## Current Status
-The core MVP is implemented:
+NightShift now supports the full local patch workflow:
- `nightshift init` creates starter config, task, and agent prompt files.
+- `nightshift init`, `validate`, `status`, `run`, `run --task`, `run --all`, and `web`.
- `nightshift validate` checks config structure, prompt paths, task parsing, scoped paths, and command safety.
+- Markdown task parsing with dependencies.
- `nightshift run` executes the next incomplete task.
+- Command, Ollama, and OpenAI-compatible agent backends.
- `nightshift run --task TASK-001` executes a specific task.
+- Per-agent model settings such as `temperature`.
- Command-backed agents receive compact prompt bundles on stdin.
+- Repo lookup tools: scoped `list_files`, `read_file`, and `grep`.
- Ollama-backed agents can call local models with `backend: ollama`.
+- Planner lookup requests with `files-inspected.md` artifacts.
- Command stages run through allowlist and forbidden-fragment checks.
+- `repo_context` stage for `context-pack.md`.
- Runs create `.nightshift/` artifacts, task context, retry context, command output, agent output, final notes, and run summaries.
+- Project context chart generation at `.nightshift/project-context-chart.md`.
- Unit tests cover config, safety, tasks, artifacts, commands, agents, pipeline retries, context, and reports.
+- `code_writer` stage that requires unified diff output.
 - `patch_normalizer`, `patch_validator`, and `patch_apply` stages.
 - Patch dry-run and apply modes.
 - Test/static failure repair loops through existing retry routing.
 - Run logs, dashboard log tails, git status artifacts, diffs, stage summaries, and final reports.
 The default posture remains local-first and review-first: agents propose; NightShift validates, applies, tests, and records.
 ## What NightShift Is
@@ -32,10 +37,11 @@ NightShift is built for reviewable automation:
 - local-first execution
 - declarative pipeline stages
 - markdown task files
- command-backed agent wrappers
+- command-backed and model-backed agent wrappers
 - explicit retry limits
 - scoped repository lookup
 - patch validation before mutation
 - command allowlists
 - scoped path checks
 - durable markdown/text artifacts
 - compact context handoff
 - final reports for human review
@@ -44,7 +50,7 @@ The goal is to wake up to useful artifacts and a repository state you can inspec
 ## What NightShift Is Not
-NightShift does not try to autonomously ship code. It does not push branches, deploy software, run arbitrary hooks, execute parallel task swarms, or grant agents unlimited repository access. Human review remains the final authority.
+NightShift does not push branches, deploy software, run unbounded task swarms, or grant agents unlimited repository access. Human review remains the final authority.
 ## Install
@@ -60,38 +66,37 @@ You can also run the CLI module directly from a checkout:
 python -m nightshift.cli --help
 ```
-NightShift currently uses the Python standard library for runtime behavior. PyYAML is used automatically if installed, but the starter config works with the built-in YAML subset parser.
+NightShift uses the Python standard library for runtime behavior where practical. PyYAML is used automatically if installed, but starter configs work with the built-in YAML subset parser.
 ## Quickstart
-Create starter files:
+Validate the included end-to-end patch example:
 ```bash
 python -m nightshift.cli validate --config examples/quickstart-lisp/nightshift.yaml
 ```
 Run the first task against a copy of the example project. The pipeline uses `patch_apply mode: apply`, so running it directly against `examples/quickstart-lisp/` will modify those files.
 ```bash
 cp -r examples/quickstart-lisp /tmp/nightshift-quickstart
 python -m nightshift.cli run --config /tmp/nightshift-quickstart/nightshift.yaml --task TASK-001
 ```
 For a new project:
 ```bash
 nightshift init
 ```
 Validate the project:
 ```bash
 nightshift validate
-```
+nightshift status
 Run the next incomplete task:
 ```bash
 nightshift run
 ```
 Run a specific task:
 ```bash
 nightshift run --task TASK-001
 ```
-Review artifacts:
+Open the read-only artifact dashboard:
-```text
+```bash
-.nightshift/runs/<run-id>/
+pip install flask
 nightshift web
 ```
 ## Task File Example
@@ -101,114 +106,106 @@ Tasks live in markdown checklist format:
 ```markdown
 # Tasks
- [ ] TASK-001: Add YAML config loading
+- [ ] TASK-001: Add parser support
 Description:
-Implement config loading for NightShift.
+Implement parsing for the target language.
 Acceptance Criteria:
- Loads `nightshift.yaml`
+- Parses numbers
- Validates required fields
+- Parses symbols
- Returns typed config objects
+- Parses nested lists
- Includes tests
+- Includes unit tests
 ```
-NightShift parses task id, title, completion state, description, acceptance criteria, optional dependency bullets, and raw task markdown.
+NightShift parses task id, title, completion state, description, acceptance criteria, dependency bullets, and raw task markdown.
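The parsing described above can be sketched roughly as follows. This is a minimal illustration and not NightShift's actual parser: the `Task` dataclass, the `parse_tasks` function, and the headline regex are hypothetical names chosen for this example, and dependency bullets are left out because their exact syntax is not shown in this diff.

```python
import re
from dataclasses import dataclass, field

# Assumed headline shape, based on the example: "- [ ] TASK-001: Title"
HEADLINE = re.compile(r"^- \[(?P<done>[ xX])\] (?P<id>[A-Z]+-\d+): (?P<title>.+)$")


@dataclass
class Task:
    task_id: str
    title: str
    done: bool
    description: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)


def parse_tasks(markdown: str) -> list[Task]:
    """Parse checklist-style tasks (hypothetical helper, not the real API)."""
    tasks: list[Task] = []
    section = None
    for raw in markdown.splitlines():
        line = raw.strip()
        match = HEADLINE.match(line)
        if match:
            # A new checklist headline starts a new task.
            done = match["done"].lower() == "x"
            tasks.append(Task(match["id"], match["title"].strip(), done))
            section = None
        elif not tasks or not line:
            # Skip blank lines and anything before the first task.
            continue
        elif line == "Description:":
            section = "description"
        elif line == "Acceptance Criteria:":
            section = "criteria"
        elif section == "description":
            current = tasks[-1]
            current.description = (current.description + " " + line).strip()
        elif section == "criteria" and line.startswith("- "):
            tasks[-1].acceptance_criteria.append(line[2:].strip())
    return tasks
```

Calling `parse_tasks` on the task file example yields one `Task` with id `TASK-001`, `done=False`, and four acceptance criteria.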
-## Config Example
+## Pipeline Example
 ```yaml
 project:
  name: example-project
  root: .
  task_file: tasks.md
  artifact_dir: .nightshift
 safety:
  require_clean_worktree: false
  scoped_paths:
    - .
  allowed_commands:
    - python -m unittest
  forbidden_commands:
    - rm -rf
    - git push
    - curl | bash
 agents:
  planner:
    backend: command
    command: echo
    system_prompt: agents/planner.md
  implementer:
    backend: command
    command: echo
    system_prompt: agents/implementer.md
  reviewer:
    backend: command
    command: echo
    system_prompt: agents/reviewer.md
 pipeline:
-  max_task_retries: 3
+  max_task_retries: 2
  continue_on_task_failure: false
  stages:
    - id: plan
      type: agent
      agent: planner
      output: plan.md
    - id: context
      type: repo_context
      output: context-pack.md
    - id: implement
-      type: agent
+      type: code_writer
      agent: implementer
-      output: implementation-log.md
+      output: proposed.patch
    - id: normalize
      type: patch_normalizer
      output: normalized.patch
    - id: validate_patch
      type: patch_validator
      output: patch-validation.md
      max_files: 8
      max_lines: 800
    - id: apply_patch
      type: patch_apply
      mode: apply
      output: patch-apply-output.txt
    - id: test
      type: command
      commands:
-        - python -m unittest
+        - python -m unittest discover -v
      output: test-output.txt
      on_fail: implement
    - id: review
      type: agent_review
      agent: reviewer
      on_fail: implement
      output: review.md
    - id: summarize
      type: summarize
      output: final-notes.md
 ```
 Use `mode: dry_run` for patch applicability checks without modifying files. Use `mode: apply` to write the validated patch to the target project.
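As a rough sketch of the difference between the two modes: docs/design.md in this commit notes that patch application uses `git apply`, and `git apply --check` tests applicability without touching the working tree. Whether NightShift maps `dry_run` to `--check` exactly like this is an assumption; `build_apply_command` and `run_patch_stage` are hypothetical helper names.

```python
import subprocess


def build_apply_command(patch_path: str, mode: str) -> list[str]:
    """Return the git argv for a patch_apply mode (illustrative only)."""
    if mode == "dry_run":
        # --check reports whether the patch would apply, without modifying files.
        return ["git", "apply", "--check", patch_path]
    if mode == "apply":
        return ["git", "apply", patch_path]
    raise ValueError(f"unknown patch_apply mode: {mode!r}")


def run_patch_stage(project_root: str, patch_path: str, mode: str) -> subprocess.CompletedProcess:
    """Run git apply in the project root and capture output for an artifact."""
    return subprocess.run(
        build_apply_command(patch_path, mode),
        cwd=project_root,
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is a stage failure, not an exception
    )
```

Keeping command construction separate from execution makes the mode mapping testable without a git repository.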
 ## Agent Backends
-NightShift supports `backend: command` and `backend: ollama`.
+NightShift supports:
-NightShift builds a prompt bundle containing:
+- `backend: command`
 - `backend: ollama`
 - `backend: openai_compatible`
- system prompt
+Example Ollama agent:
 - stage id and type
 - task markdown
 - acceptance criteria
 - project context
 - task context
 - previous stage output
 - retry notes
 - output contract
 The prompt is passed to the configured command or local Ollama model on stdin. stdout, stderr, exit code, duration, and the prompt are persisted as artifacts.
 Ollama example:
 ```yaml
 agents:
-  planner:
+  implementer:
    backend: ollama
    model: qwen2.5-coder:14b
-    system_prompt: agents/planner.md
+    temperature: 0.2
    system_prompt: agents/implementer.md
 ```
 Example OpenAI-compatible agent:
 ```yaml
 agents:
  implementer:
    backend: openai_compatible
    model: local-model
    base_url: http://localhost:11434/v1
    api_key_env: OPENAI_API_KEY
    temperature: 0.2
    system_prompt: agents/implementer.md
 ```
 NightShift passes prompt bundles to agents and persists stdout, stderr, exit code, duration, and prompt artifacts. Code writer agents should return unified diffs.
 Review agents should emit:
 ```yaml
@@ -220,14 +217,14 @@ context_update: <compact useful note>
 ## Safety Model
-NightShift validates paths and commands before execution.
+NightShift validates paths, commands, and patches before mutation.
 Path safety:
 - project roots are resolved with `pathlib`
- task files and prompt files must stay inside the project root
+- task and prompt files must stay inside the project root
 - artifact paths cannot escape `.nightshift/`
- task artifact writes cannot escape the task directory
+- repo lookup tools are constrained by `safety.scoped_paths`
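A scoped lookup check along these lines could look like the sketch below. It assumes `scoped_paths` entries are relative to the project root and that a requested path is accepted only if it resolves inside both the root and at least one scope; `resolve_scoped` is a hypothetical name, not NightShift's real function.

```python
from pathlib import Path


def resolve_scoped(project_root: str, scoped_paths: list[str], requested: str) -> Path:
    """Resolve `requested` and reject it unless it stays inside an allowed scope."""
    root = Path(project_root).resolve()
    candidate = (root / requested).resolve()

    # The candidate must never leave the project root, whatever the scopes say.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes project root: {requested}")

    for scope in scoped_paths:
        scope_root = (root / scope).resolve()
        if candidate == scope_root or scope_root in candidate.parents:
            return candidate
    raise PermissionError(f"path is outside scoped_paths: {requested}")
```

Resolving before comparing means `..` segments and absolute paths are normalized first, so `../outside.txt` is rejected even though it starts from an allowed directory.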
 Command safety:
@@ -236,7 +233,13 @@ Command safety:
 - command output and exit codes are recorded
 - command stages stop at the first failing or timed-out command
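The allowlist and forbidden-fragment checks can be illustrated with a small sketch. The matching rules here are assumptions: forbidden entries are treated as substrings and allowed entries as command prefixes, which is consistent with the pipeline example allowing `python -m unittest` while running `python -m unittest discover -v`. `check_command` is a hypothetical name.

```python
def check_command(command: str, allowed: list[str], forbidden: list[str]) -> tuple[bool, str]:
    """Return (ok, reason) for a command string (illustrative rules only)."""
    normalized = " ".join(command.split())  # collapse repeated whitespace

    # Forbidden fragments win over the allowlist.
    for fragment in forbidden:
        if fragment in normalized:
            return False, f"forbidden fragment: {fragment}"

    # Assumed rule: an allowed entry matches itself or itself plus arguments.
    for prefix in allowed:
        if normalized == prefix or normalized.startswith(prefix + " "):
            return True, "allowed"
    return False, "not in allowlist"
```

Checking forbidden fragments first means a command cannot become acceptable just because it begins with an allowed prefix.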
-The MVP does not push, deploy, create branches, or execute arbitrary Python hooks.
+Patch safety:
 - code changes are represented as unified diffs
 - patches are normalized and validated before apply
 - path traversal and forbidden paths are rejected
 - scoped paths, max files, and max changed lines are enforced
 - `patch_apply` records apply output and git status artifacts
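The patch checks listed above can be sketched as a single pass over a unified diff. This is an illustration under stated assumptions, not the `patch_validator` implementation: `validate_patch` is a hypothetical name, the defaults mirror `max_files: 8` and `max_lines: 800` from the pipeline example, and forbidden-path rules are omitted.

```python
import re
from pathlib import PurePosixPath

HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def validate_patch(patch_text: str, scoped_paths: list[str],
                   max_files: int = 8, max_lines: int = 800) -> list[str]:
    """Return a list of validation errors; an empty list means the patch passed."""
    errors: list[str] = []
    files: set[str] = set()
    changed = 0
    old_left = new_left = 0  # lines still expected in the current hunk

    for line in patch_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: count changes, never parse as a header.
            if line.startswith("+"):
                new_left -= 1
                changed += 1
            elif line.startswith("-"):
                old_left -= 1
                changed += 1
            elif not line.startswith("\\"):
                old_left -= 1
                new_left -= 1
            continue
        match = HUNK.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
        elif line.startswith(("--- ", "+++ ")):
            target = line[4:].split("\t")[0].strip()
            if target == "/dev/null":
                continue
            if target.startswith(("a/", "b/")):
                target = target[2:]
            if target in files:
                continue
            files.add(target)
            path = PurePosixPath(target)
            if path.is_absolute() or ".." in path.parts:
                errors.append(f"path traversal rejected: {target}")
            elif not any(PurePosixPath(s) == path or PurePosixPath(s) in path.parents
                         for s in scoped_paths):
                errors.append(f"outside scoped paths: {target}")

    if len(files) > max_files:
        errors.append(f"too many files: {len(files)} > {max_files}")
    if changed > max_lines:
        errors.append(f"too many changed lines: {changed} > {max_lines}")
    return errors
```

Tracking the line counts from each hunk header keeps a removed line such as `--- old comment` from being mistaken for a file header.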
 ## Artifact Layout
@@ -245,24 +248,38 @@ A run creates human-readable artifacts:
 ```text
 .nightshift/
  project-context.md
  project-context-chart.md
  nightshift.log
  runs/
    <run-id>/
      run.log
      run-summary.md
      config.snapshot.yaml
      run-metadata.md
      prompts/
        <agent-id>.md
      tasks/
        TASK-001/
          task.md
          context.md
          files-inspected.md
          context-pack.md
          plan.md
-          implementation-log.md
+          proposed.patch
          normalized.patch
          patch-validation.md
          applied.patch
          patch-apply-output.txt
          test-output.txt
          review.md
          stage-results.md
          context-out.md
          task-completion.md
          diff.patch
          final-notes.md
 ```
-Artifacts are written even when a stage fails where possible.
+Exact artifact names depend on configured stage `output` values.
 ## Development
@@ -278,29 +295,14 @@ Compile-check modules:
 python -m compileall nightshift tests
 ```
 Optional read-only dashboard:
 ```bash
 pip install flask
 nightshift web
 ```
 Additional docs:
 - [Quickstart](QUICKSTART.md)
 - [Config reference](docs/config-reference.md)
 - [Artifact review workflow](docs/artifact-review.md)
 - [Troubleshooting](docs/troubleshooting.md)
 - [Quickstart](QUICKSTART.md)
 - [Quickstart Lisp example](examples/quickstart-lisp/)
 ## Roadmap
-Next major work:
+The active roadmap now lives in [docs/design.md](docs/design.md). Completed phase checklists are cleared from that document so it stays focused on the current platform shape and the next important work.
 - richer local backend support beyond Ollama
 - optional branch isolation
 - live dashboard enhancements
 - stronger structured command definitions
 - longer-run reporting and resumability
 NightShift remains oriented around reviewable output, not blind autonomy.
--- a/docs/design.md
+++ b/docs/design.md
@@ -846,7 +846,7 @@ Mitigation:
 # 16. Implemented Baseline
-The MVP and post-MVP phases through phase 22 are implemented.
+The MVP and the patch-capable local runner are implemented.
 NightShift currently provides:
@@ -857,14 +857,24 @@ NightShift currently provides:
 * `nightshift run --task TASK-ID` for a specific task
 * `nightshift run --all` for sequential multi-task execution
 * `nightshift web` for a read-only artifact dashboard
 * Operational run logging to the CLI, per-run logs, and aggregate logs
 * Markdown task parsing with descriptions, acceptance criteria, completion state, and dependency bullets
 * Dependency validation for missing references and simple cycles
 * Dependency-aware task selection and task blocking
 * Declarative YAML pipeline execution
-* Command, agent, agent-review, review, and summarize stage handling
+* Command, agent, agent-review, review, summarize, repo-context, code-writer, patch-normalizer, patch-validator, and patch-apply stage handling
 * Retry redirection with a configured task retry limit
 * Command-backed agents
 * Ollama-backed local model agents
 * OpenAI-compatible local/server model agents
 * Per-agent temperature settings
 * Scoped repo lookup tools: `list_files`, `read_file`, and `grep`
 * Planner lookup requests, `files-inspected.md`, and planner reruns with retrieved context
 * Project context chart generation
 * Context pack generation
 * Unified diff code-writing contract
 * Patch normalization, validation, dry-run, and apply modes
 * Test/static failure repair loops via bounded stage retries
 * Prompt bundle construction with project, task, retry, and previous-stage context
 * Prompt snapshots and run metadata for experiment comparison
 * Optional experiment labels and prompt variant metadata
@@ -881,8 +891,8 @@ NightShift currently provides:
 * Per-run and per-task markdown/text artifacts
 * Project, task, retry, and context-out files
 * Final task notes, stage summaries, task completion artifacts, and run summaries
-* Documentation for config, artifact review, troubleshooting, and quickstart workflows
+* Documentation for config, artifact review, troubleshooting, quickstart, and patch workflows
-* A complete fake-agent quickstart Lisp example under `examples/quickstart-lisp/`
+* A complete fake-agent patch-mode quickstart Lisp example under `examples/quickstart-lisp/`
 The system remains sequential and local-first. It is designed to produce reviewable artifacts and repository state, not to deploy, push, or autonomously ship changes.
@@ -929,7 +939,13 @@ Current run artifacts include:
          task.md
          context.md
          plan.md
-          implementation-log.md
+          files-inspected.md
          context-pack.md
          proposed.patch
          normalized.patch
          patch-validation.md
          applied.patch
          patch-apply-output.txt
          test-output.txt
          review.md
          stage-results.md
@@ -969,144 +985,51 @@ Current limitations:
 * Execution is sequential; there is no parallel task runner.
 * The web dashboard is read-only and artifact-oriented.
 * Live run progress is limited to basic CLI prints and artifact inspection.
 * Flask is optional; `nightshift web` requires it to be installed.
-* Ollama support depends on the user's local Ollama installation and model availability.
+* Model backends depend on the user's local model server, Ollama installation, or command wrappers.
 * Git artifacts can be unavailable or degraded in non-git repositories or repositories blocked by Git safe-directory rules.
 * Task mutation is intentionally minimal and only flips matching checklist lines.
-* Command configuration is safer than the MVP but is still string-first for compatibility.
+* Patch application currently uses `git apply`; non-git workflows are limited.
 * Command configuration remains string-first for compatibility.
 * There is no branch isolation, resumable run state machine, approval workflow, or deployment integration.
 ---
-# 18. Next Major Update Plan
+# 18. Active Roadmap
-The next major update should improve operational visibility while preserving the current artifact-first model.
+Completed phase checklists are removed from this design document once they are reflected in the implemented baseline and user-facing docs. Track future phase work here only while it is active, using concise implementation notes when a decision needs durable context.
-Phase work is tracked in this design document by updating the relevant phase checklist and adding concise implementation notes only when a decision needs durable context. The old `docs/devlog/` phase files have been retired.
+The next important additions are:
-## Phase 23: Improved Logging and Live Visibility
+1. Branch isolation for patch runs
   Run each task on a dedicated branch or worktree, record branch metadata, and make rollback/review safer.
-NightShift should make active runs easier to observe from both the CLI and the web dashboard.
+2. Resumable run state
   Persist machine-readable run state so interrupted runs can continue from the last completed stage instead of restarting.
-Implementation tasks:
+3. Human approval gates
   Add optional approval stages before patch apply, after failed validation, or before task completion.
-* [x] Add a small logging module with structured operational events.
+4. Structured patch policy config
-* [x] Stream human-readable progress to the CLI during `run` and `run --all`.
+   Move max files, max lines, forbidden paths, allowed file types, binary rejection, and protected files into a reusable project-level write policy.
 * [x] Include run id, task id, stage id, agent/backend, command index, retry count, status, duration, and artifact path where available.
 * [x] Write a per-run log file such as `.nightshift/runs/<run-id>/run.log`.
 * [x] Optionally write or rotate an aggregate `.nightshift/nightshift.log` for cross-run troubleshooting.
 * [x] Keep logs operational; do not duplicate full prompts, full model responses, or full command output that already lives in artifacts.
 * [x] Redact or avoid secrets from logged environment/config values.
 * [x] Add dashboard support for viewing the latest log tail.
 * [x] Cap the dashboard log view to the last 100 lines by default.
 * [x] Keep the full per-run log file available as an artifact unless a later size cap is configured.
 * [x] Auto-refresh the dashboard log view with the existing dashboard refresh model.
 * [x] Add tests for log writing, CLI progress hooks, dashboard log rendering, missing log files, and the 100-line cap.
-Acceptance Criteria:
+5. Better model backend support
   Expand OpenAI-compatible behavior, add request metadata artifacts, support response format hints, and document local server patterns.
-* A user running NightShift from a terminal can tell which task and stage are active.
+6. Richer dashboard
-* Long Ollama or command stages show enough lifecycle information that the process does not appear hung.
+   Add task/stage navigation, patch views, validation status, run log tail, and artifact links without adding mutation controls.
 * The latest run log is visible from `nightshift web`.
 * The web client displays at most the last 100 log lines by default.
 * Logs point users to detailed artifacts instead of replacing them.
 * Missing or partial log files do not crash the dashboard.
-Notes:
+7. Project context chart improvements
   Use language-aware parsers where available, include import graphs, ownership hints, and stale-context detection.
-* This phase should not add process control, websockets, authentication, or write actions to the web client.
+8. Stronger repair feedback
-* If future live streaming is needed, the first version can still use file tailing plus refresh before introducing websockets.
+   Feed compact test/static failure summaries, patch apply errors, and reviewer objections into repair attempts with clearer bounded policies.
 * Operational logs should complement artifacts: artifacts remain the source of detailed prompts, responses, command output, diffs, and summaries.
-## Phase 24: Per-Agent Model Parameters
+9. End-to-end apply-mode examples
   Add more small target projects and fake-agent fixtures that exercise patch apply, repair, validation failure, and review retry paths.
- [x] Add `temperature` to agent config.
+10. Packaging and dependency extras
- [x] Pass temperature to Ollama/OpenAI-compatible backends.
+   Add optional extras such as `nightshift[web]`, document supported Python versions, and prepare the project for repeatable installation.
 - [x] Default safely if omitted.
 - [x] Add config validation tests.
 ## Phase 25: Repo Lookup Tools MVP
 - [x] Add tool interface for repo operations.
 - [x] Implement scoped `list_files`.
 - [x] Implement scoped `read_file`.
 - [x] Implement scoped `grep`.
 - [x] Enforce existing path safety rules.
 - [x] Log tool calls as artifacts.
 ## Phase 26: Planner Code-Discovery Support
 - [x] Teach planner prompt to request needed code context.
 - [x] Add structured planner output for lookup requests.
 - [x] Execute requested lookup tools.
 - [x] Save `files-inspected.md`.
 - [x] Re-run planner with retrieved context.
 ## Phase 27: Context Pack Builder
 - [x] Add `repo_context` stage.
 - [x] Generate `context-pack.md`.
 - [x] Include task, acceptance criteria, relevant files, snippets, and constraints.
 - [x] Add line-numbered excerpts.
 - [x] Add context-size caps.
 ## Phase 28: Project Context Chart MVP
 - [x] Generate `.nightshift/project-context-chart.md`.
 - [x] Include files, responsibilities, functions/classes, entry points, tests.
 - [x] Use simple regex/parser MVP.
 - [x] Update chart during planning.
 - [x] Store anchors/line numbers/search terms.
 ## Phase 29: Code Writer Stage
 - [x] Add `code_writer` stage type.
 - [x] Feed it task + context pack.
 - [x] Require unified diff output.
 - [x] Save `proposed.patch`.
 - [x] Save `implementation-summary.md`.
 ## Phase 30: Patch Normalization
 - [x] Add `patch_normalizer` stage.
 - [x] Support low-temperature formatter model.
 - [x] Convert messy model output to valid unified diff.
 - [x] Reject missing/ambiguous edits.
 - [x] Save `normalized.patch`.
 ## Phase 31: Patch Validation
 - [x] Parse unified diffs.
 - [x] Reject malformed patches.
 - [x] Enforce scoped paths.
 - [x] Reject path traversal.
 - [x] Enforce max files/max lines changed.
 - [x] Reject forbidden files.
 ## Phase 32: Patch Apply / Dry Run
 - [x] Add `patch_apply` stage.
 - [x] Support `mode: dry_run`.
 - [x] Support `mode: apply`.
 - [x] Save `applied.patch`.
 - [x] Preserve pre/post git status.
 - [x] Fail cleanly on apply errors.
 ## Phase 33: Test Feedback Repair Loop
 - [x] Feed test/static failure output back into implementer.
 - [x] Add bounded repair attempts.
 - [x] Save each repair patch.
 - [x] Save repair summaries.
 - [x] Stop after max retry count.
 ## Phase 34: End-to-End Coding Quickstart
 - [x] Update quickstart to modify real code.
 - [x] Include fake-agent test fixture.
 - [x] Demonstrate lookup → context pack → patch → apply → test.
 - [x] Document dry-run vs apply mode.
 ---
 # Appendix A: Design Decisions and Rationale
--- a/docs/vibe.md
+++ b/docs/vibe.md
@@ -652,206 +652,6 @@ Do not require real LLM calls in unit tests.
 ---
 # 13. MVP Task Checklist
 ## Phase 1: Skeleton
 * [ ] Create project package/module layout
 * [ ] Add CLI entry point
 * [ ] Add `nightshift init`
 * [ ] Generate example `nightshift.yaml`
 * [ ] Generate example `tasks.md`
 * [ ] Generate example agent prompt files
 Acceptance Criteria:
 * User can run init command
 * Expected files are created
 * Existing files are not overwritten without confirmation or force flag
 ---
 ## Phase 2: Config Loading
 * [ ] Implement YAML config loader
 * [ ] Define typed config objects
 * [ ] Validate required sections
 * [ ] Validate agent references
 * [ ] Validate pipeline stages
 * [ ] Add tests
 Acceptance Criteria:
 * Valid config loads
 * Invalid config fails with clear error
 * Pipeline stages cannot reference missing agents
 ---
 ## Phase 3: Safety Layer
 * [ ] Implement project root resolution
 * [ ] Implement scoped path validation
 * [ ] Implement safe artifact path creation
 * [ ] Implement command allowlist check
 * [ ] Implement forbidden command fragment check
 * [ ] Add tests for path traversal
 * [ ] Add tests for forbidden commands
 Acceptance Criteria:
 * Cannot write outside project root
 * Cannot execute commands outside allowlist
 * Dangerous command fragments are blocked
 ---
 ## Phase 4: Task Parser
 * [ ] Parse markdown task checklist
 * [ ] Extract task id
 * [ ] Extract title
 * [ ] Extract description
 * [ ] Extract acceptance criteria
 * [ ] Support selecting next incomplete task
 * [ ] Support selecting specific task id
 * [ ] Add tests
 Acceptance Criteria:
 * Parser handles documented task format
 * Parser returns useful errors for malformed tasks
 * Task selection works
 ---
 ## Phase 5: Artifact Store
 * [ ] Create `.nightshift/`
 * [ ] Create per-run directory
 * [ ] Create per-task directory
 * [ ] Write config snapshot
 * [ ] Write task snapshot
 * [ ] Write stage outputs
 * [ ] Write command outputs
 * [ ] Write final task notes
 * [ ] Add tests
 Acceptance Criteria:
 * Every run creates deterministic artifact structure
 * Artifacts are present even when stages fail
 ---
 ## Phase 6: Command Executor
 * [ ] Implement command stage execution
 * [ ] Capture stdout
 * [ ] Capture stderr
 * [ ] Capture exit code
 * [ ] Persist command output
 * [ ] Return structured stage result
 * [ ] Add tests with harmless commands
 Acceptance Criteria:
 * Passing command returns pass
 * Failing command returns fail
 * Output is written to artifact file
 ---
 ## Phase 7: Agent Executor
 * [ ] Implement `command` backend agent
 * [ ] Load system prompt file
 * [ ] Build prompt bundle
 * [ ] Pass prompt to command backend
 * [ ] Capture output
 * [ ] Persist output
 * [ ] Return structured stage result
 * [ ] Add fake-agent tests
 Acceptance Criteria:
 * Fake command agent can produce stage output
 * Prompt includes task and acceptance criteria
 * Agent output is stored in artifacts
 ---
 ## Phase 8: Pipeline Runner
 * [ ] Execute configured stages in order
 * [ ] Stop on unrecoverable failure
 * [ ] Support `on_fail` stage redirection
 * [ ] Track retry count
 * [ ] Enforce max task retries
 * [ ] Write per-stage summaries
 * [ ] Add tests
 Acceptance Criteria:
 * Happy path pipeline completes
 * Failed review can retry implementation
 * Retry limit is enforced
 * Final task status is recorded
 ---
 ## Phase 9: Context Manager
 * [ ] Create project context file if absent
 * [ ] Create task context file
 * [ ] Include project context in agent prompt bundle
 * [ ] Include prior stage notes in retry prompt
 * [ ] Write `context-out.md`
 * [ ] Add tests
 Acceptance Criteria:
 * Context files are created
 * Agent prompt receives compact context
 * Context output is persisted
 ---
 ## Phase 10: Reports
 * [ ] Generate task final report
 * [ ] Generate run summary
 * [ ] Include task status
 * [ ] Include retry count
 * [ ] Include modified files if available
 * [ ] Include test/static results
 * [ ] Include artifact paths
 * [ ] Add tests
 Acceptance Criteria:
 * User can inspect one summary after run
 * Summary explains what happened without reading every artifact
 ---
 ## Phase 11: README
 * [ ] Explain what NightShift is
 * [ ] Explain what it is not
 * [ ] Add quickstart
 * [ ] Add config example
 * [ ] Add task file example
 * [ ] Add safety model explanation
 * [ ] Add MVP status
 Acceptance Criteria:
 * A new user can understand and run the MVP
 * README emphasizes reviewable output, not blind autonomy
 ---
 # 14. Implementation Guidance
 ## 14.1 Prefer boring code