diff --git a/QUICKSTART.md b/QUICKSTART.md
index dd50a94..ba22fda 100644
--- a/QUICKSTART.md
+++ b/QUICKSTART.md
@@ -85,7 +85,7 @@ tasks/TASK-001/final-notes.md
 
 ## Example Templates
 
-Example run files are available in `templates/`.
+Example run files are available in `examples/templates/`.
 They are safe starter examples and use command-backed fake agents.
 
 The repository also includes a complete sample target project:
@@ -148,7 +148,7 @@ agents:
     system_prompt: agents/reviewer.md
 
 pipeline:
-  max_task_retries: 1
+  max_task_retries: 3
   continue_on_task_failure: false
   stages:
     - id: plan
diff --git a/README.md b/README.md
index dba8a8c..a3c4271 100644
--- a/README.md
+++ b/README.md
@@ -72,7 +72,7 @@ NightShift uses the Python standard library for runtime behavior where practical
 
 Start with the [Quickstart](QUICKSTART.md). It uses deterministic fake agents so you can verify lookup, context generation, patch validation, patch apply, tests, and artifacts without installing a model.
 
-After that works, continue with [Tutorial 01: Running NightShift With Real Local Models](docs/tutorial/01-intro.md). It swaps the fake agents for Ollama-backed agents such as `qwen2.5-coder:14b` and walks through dry-run and apply-mode patch generation.
+After that works, continue with [Tutorial 01: Running NightShift With Real Local Models](examples/tutorial/01-intro.md). It swaps the fake agents for Ollama-backed agents such as `qwen2.5-coder:14b` and walks through dry-run and apply-mode patch generation.
 
 ### Quickstart Commands
 
@@ -306,7 +306,7 @@ python -m compileall nightshift tests
 Additional docs:
 
 - [Quickstart](QUICKSTART.md)
-- [Tutorial: running real local models](docs/tutorial/01-intro.md)
+- [Tutorial: running real local models](examples/tutorial/01-intro.md)
 - [Config reference](docs/config-reference.md)
 - [Artifact review workflow](docs/artifact-review.md)
 - [Troubleshooting](docs/troubleshooting.md)
diff --git a/docs/design.md b/docs/design.md
index 7cd9acb..d64cb2a 100644
--- a/docs/design.md
+++ b/docs/design.md
@@ -1014,23 +1014,36 @@ The next important additions are:
    Move max files, max lines, forbidden paths, allowed file types, binary rejection, and protected files into a reusable project-level write policy.
 
 5. Better model backend support
-   Expand OpenAI-compatible behavior, add request metadata artifacts, support response format hints, and document local server patterns.
+   Expand OpenAI-compatible behavior, add request metadata artifacts, support response format hints, and document local server patterns. Prefer non-terminal APIs for machine-readable model output. In particular, avoid relying on interactive CLI streaming paths such as `ollama run` when exact patch text matters; use the Ollama HTTP API or OpenAI-compatible endpoint so terminal rendering, spinners, and line-wrapping behavior cannot corrupt artifacts.
 
-6. Richer dashboard
+6. Deterministic diff generation
+   Reduce direct reliance on models emitting perfect unified diffs. Add a workflow where the model returns complete file contents or a structured edit description, then NightShift writes the unified diff deterministically from before/after file snapshots. Keep the existing unified-diff contract for advanced agents, but make deterministic diff generation the preferred path for smaller local models.
+
+7. Retry artifact versioning
+   Preserve per-attempt artifacts instead of overwriting fixed filenames such as `proposed.patch`, `normalized.patch`, and `patch-validation.md`. Retry artifacts should include attempt numbers, while summary artifacts can point to the latest attempt. This makes repeated validation and repair failures diagnosable.
+
+8. Patch repair stage
+   Add an explicit patch repair or strict normalizer stage that receives the invalid patch, validation error, and relevant source excerpts, then returns a complete replacement patch. This stage should remain bounded by strict validation and should not silently guess intent for arbitrary malformed hunks.
+
+9. Richer dashboard
    Add task/stage navigation, patch views, validation status, run log tail, and artifact links without adding mutation controls.
 
-7. Project context chart improvements
+10. Project context chart improvements
    Use language-aware parsers where available, include import graphs, ownership hints, and stale-context detection.
 
-8. Stronger repair feedback
+11. Stronger repair feedback
    Feed compact test/static failure summaries, patch apply errors, and reviewer objections into repair attempts with clearer bounded policies.
 
-9. End-to-end apply-mode examples
+12. End-to-end apply-mode examples
    Add more small target projects and fake-agent fixtures that exercise patch apply, repair, validation failure, and review retry paths.
 
-10. Packaging and dependency extras
+13. Packaging and dependency extras
    Add optional extras such as `nightshift[web]`, document supported Python versions, and prepare the project for repeatable installation.
 
----
+
+Implementation note:
+
+Recent local-model patch experiments exposed repeated line-fragment artifacts where long generated lines were split and the tail was duplicated on the following line. This affected prose and unified diffs, producing malformed hunk lines that strict validation correctly rejected. Treat this as a backend/output-capture and patch-contract problem before adding editor or linter agents: remove terminal streaming from model capture, preserve retry artifacts, and prefer deterministic diff generation when exact syntax matters.
+---
 
 # Appendix A: Design Decisions and Rationale
diff --git a/examples/quickstart-lisp/nightshift.yaml b/examples/quickstart-lisp/nightshift.yaml
index 29ef7db..65588f6 100644
--- a/examples/quickstart-lisp/nightshift.yaml
+++ b/examples/quickstart-lisp/nightshift.yaml
@@ -36,7 +36,7 @@ agents:
     system_prompt: agents/reviewer.md
 
 pipeline:
-  max_task_retries: 1
+  max_task_retries: 3
   continue_on_task_failure: false
   stages:
     - id: plan
diff --git a/templates/agents/implementer.md b/examples/templates/agents/implementer.md
similarity index 100%
rename from templates/agents/implementer.md
rename to examples/templates/agents/implementer.md
diff --git a/templates/agents/planner.md b/examples/templates/agents/planner.md
similarity index 100%
rename from templates/agents/planner.md
rename to examples/templates/agents/planner.md
diff --git a/templates/agents/reviewer.md b/examples/templates/agents/reviewer.md
similarity index 100%
rename from templates/agents/reviewer.md
rename to examples/templates/agents/reviewer.md
diff --git a/templates/nightshift.yaml b/examples/templates/nightshift.yaml
similarity index 100%
rename from templates/nightshift.yaml
rename to examples/templates/nightshift.yaml
diff --git a/templates/tasks.md b/examples/templates/tasks.md
similarity index 100%
rename from templates/tasks.md
rename to examples/templates/tasks.md
diff --git a/docs/tutorial/01-intro.md b/examples/tutorial/01-intro.md
similarity index 99%
rename from docs/tutorial/01-intro.md
rename to examples/tutorial/01-intro.md
index b40b8a3..650bc2e 100644
--- a/docs/tutorial/01-intro.md
+++ b/examples/tutorial/01-intro.md
@@ -239,7 +239,7 @@ For real models, start conservatively:
 
 ```yaml
 pipeline:
-  max_task_retries: 1
+  max_task_retries: 3
   continue_on_task_failure: false
 ```
 
diff --git a/nightshift/agents.py b/nightshift/agents.py
index e67f1a1..94c5f05 100644
--- a/nightshift/agents.py
+++ b/nightshift/agents.py
@@ -469,6 +469,7 @@ def output_contract_for(stage: StageConfig) -> str:
                 "Do not include prose outside the patch.",
                 "Use diff --git headers and hunk headers.",
                 "For existing files, do not use new file mode or /dev/null headers.",
+                "On repair attempts, return a complete corrected replacement diff.",
             ]
         )
     if stage.type == "patch_normalizer":
diff --git a/nightshift/patches.py b/nightshift/patches.py
index 07c56b9..d582145 100644
--- a/nightshift/patches.py
+++ b/nightshift/patches.py
@@ -191,7 +191,7 @@ def _validate_hunk_lines(patch: str) -> None:
             continue
         raise PipelineError(
             "Patch validation failed: malformed hunk line "
-            f"{line_number}; expected ' ', '+', '-', or '\\'."
+            f"{line_number}; expected a leading space, '+', '-', or backslash."
         )
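
Review note: item 6 in the docs/design.md hunk (deterministic diff generation) describes writing the unified diff from before/after file snapshots instead of trusting model-emitted hunks. A minimal sketch of that idea follows, using only Python's standard `difflib`. It is an illustration, not NightShift's implementation: the function name `build_unified_diff` is hypothetical, and it assumes a single existing text file whose before and after snapshots both end with a newline.

```python
import difflib


def build_unified_diff(path: str, before: str, after: str) -> str:
    """Return a git-style unified diff for one existing file.

    Hypothetical helper for illustration only. Assumes both snapshots
    end with a newline, so no "\\ No newline at end of file" marker
    is needed. Returns an empty string when nothing changed.
    """
    if before == after:
        return ""
    body = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        n=3,  # three context lines, matching git's default
    )
    # difflib emits the ---/+++ headers and hunks; prepend the
    # "diff --git" line that the patch contract asks for.
    return f"diff --git a/{path} b/{path}\n" + "".join(body)


if __name__ == "__main__":
    before = (
        "pipeline:\n"
        "  max_task_retries: 1\n"
        "  continue_on_task_failure: false\n"
    )
    after = before.replace("max_task_retries: 1", "max_task_retries: 3")
    print(build_unified_diff("nightshift.yaml", before, after), end="")
```

Because the hunk lines come from `difflib` rather than from model output, every body line starts with a space, `+`, or `-` by construction, which is the property the strict hunk-line validation in `nightshift/patches.py` checks for.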