close out phase2 and support quickstart

K. Hodges 2026-05-17 10:11:59 -07:00
parent 4e502ba494
commit a8616a1062
8 changed files with 99 additions and 36 deletions


@@ -1,6 +1,6 @@
 # NightShift Quickstart
-This guide runs the current MVP with safe example files.
+This guide runs NightShift with safe example files, including an end-to-end patch workflow.
 ## 1. Install for Development
@@ -67,9 +67,16 @@ Useful files:
 ```text
 run-summary.md
 config.snapshot.yaml
+project-context-chart.md
 tasks/TASK-001/task.md
 tasks/TASK-001/context.md
 tasks/TASK-001/plan.md
+tasks/TASK-001/context-pack.md
+tasks/TASK-001/proposed.patch
+tasks/TASK-001/normalized.patch
+tasks/TASK-001/patch-validation.md
+tasks/TASK-001/applied.patch
+tasks/TASK-001/patch-apply-output.txt
 tasks/TASK-001/test-output.txt
 tasks/TASK-001/stage-results.md
 tasks/TASK-001/context-out.md
@@ -87,7 +94,7 @@ The repository also includes a complete sample target project:
 examples/quickstart-lisp/
 ```
-Copy that directory elsewhere if you want to test NightShift against a multi-task project.
+Copy that directory elsewhere if you want to test NightShift against a multi-task project that modifies real code through patch mode.
 ## Quickstart Test Project
@@ -127,12 +134,12 @@ safety:
 agents:
   planner:
     backend: command
-    command: echo
+    command: python agents/fake_planner.py
     system_prompt: agents/planner.md
   implementer:
     backend: command
-    command: echo
+    command: python agents/fake_code_writer.py
     system_prompt: agents/implementer.md
   reviewer:
@@ -149,16 +156,34 @@ pipeline:
     agent: planner
     output: plan.md
+  - id: context
+    type: repo_context
+    output: context-pack.md
   - id: implement
-    type: agent
+    type: code_writer
     agent: implementer
-    output: implementation-log.md
+    output: proposed.patch
+  - id: normalize
+    type: patch_normalizer
+    output: normalized.patch
+  - id: validate_patch
+    type: patch_validator
+    output: patch-validation.md
+  - id: apply_patch
+    type: patch_apply
+    mode: apply
+    output: patch-apply-output.txt
   - id: test
     type: command
     commands:
       - python -m unittest discover -v
     output: test-output.txt
+    on_fail: implement
   - id: review
     type: agent_review
@@ -171,7 +196,7 @@ pipeline:
     output: final-notes.md
 ```
-This uses fake command agents so the pipeline is safe and deterministic. Replace `command: echo` later with your real local agent wrapper.
+This uses fake command-backed planner and code-writer fixtures so the pipeline is deterministic but still inspects files and modifies real files through patch mode. Replace the fake agent commands later with your real local agent wrapper.
 ### 3. Add `tasks.md`
@@ -242,10 +267,12 @@ Do not write code. Include files to edit, tests to add, and risks.
 `agents/implementer.md`:
 ```markdown
-You are the implementation agent. Implement the smallest correct change.
-Preserve existing behavior and include tests.
+You are the implementation agent. Output only a unified diff.
+Preserve existing behavior and include tests when needed.
 ```
+For deterministic local fixtures, add `agents/fake_planner.py` that requests file lookups and `agents/fake_code_writer.py` that prints a unified diff. The included `examples/quickstart-lisp/` project contains working fixtures.
 `agents/reviewer.md`:
 ```markdown
@@ -288,7 +315,9 @@ Run all currently runnable tasks:
 nightshift run --all
 ```
-Because the example uses fake agents, it will not actually implement the Lisp interpreter by itself. It is meant to verify the pipeline, dependency handling, reports, and artifacts before you connect a real command-backed agent.
+The included `examples/quickstart-lisp/` fake code writer implements the first parser task by emitting a patch. It exercises lookup, context-pack generation, patch normalization, validation, application, tests, reports, and artifacts before you connect a real model-backed agent.
+Use `mode: dry_run` on the `patch_apply` stage when you want to verify that a patch would apply without changing files. Use `mode: apply` when the validated patch should be written to the target project.
 ### 7. Review Artifacts
@@ -297,10 +326,16 @@ After a run, inspect:
 ```text
 .nightshift/runs/<run-id>/run-summary.md
 .nightshift/runs/<run-id>/tasks/TASK-001/plan.md
-.nightshift/runs/<run-id>/tasks/TASK-001/implementation-log.md
+.nightshift/runs/<run-id>/tasks/TASK-001/files-inspected.md
+.nightshift/runs/<run-id>/tasks/TASK-001/context-pack.md
+.nightshift/runs/<run-id>/tasks/TASK-001/proposed.patch
+.nightshift/runs/<run-id>/tasks/TASK-001/normalized.patch
+.nightshift/runs/<run-id>/tasks/TASK-001/patch-validation.md
+.nightshift/runs/<run-id>/tasks/TASK-001/applied.patch
+.nightshift/runs/<run-id>/tasks/TASK-001/patch-apply-output.txt
 .nightshift/runs/<run-id>/tasks/TASK-001/test-output.txt
 .nightshift/runs/<run-id>/tasks/TASK-001/review.md
 .nightshift/runs/<run-id>/tasks/TASK-001/final-notes.md
 ```
-The useful signal is whether NightShift selected the right task, respected dependencies, ran the command stage, wrote artifacts, updated task completion, and produced a clear summary.
+The useful signal is whether NightShift selected the right task, respected dependencies, generated context, validated and applied a patch, ran tests, wrote artifacts, updated task completion, and produced a clear summary.
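
As a rough aid for the artifact review this hunk documents, the sketch below reports which of the listed per-task files are absent from a run directory. The helper name `missing_artifacts` and the existence-only check are assumptions made for illustration; they are not part of NightShift.

```python
from pathlib import Path

# Artifact names copied from the quickstart's "Review Artifacts" list.
EXPECTED_TASK_ARTIFACTS = (
    "plan.md",
    "files-inspected.md",
    "context-pack.md",
    "proposed.patch",
    "normalized.patch",
    "patch-validation.md",
    "applied.patch",
    "patch-apply-output.txt",
    "test-output.txt",
    "review.md",
    "final-notes.md",
)


def missing_artifacts(run_dir: Path, task_id: str = "TASK-001") -> list[str]:
    """Return the expected artifact names that do not exist for one task.

    Hypothetical helper: it only checks that each file exists,
    not that its contents are meaningful.
    """
    task_dir = run_dir / "tasks" / task_id
    return [name for name in EXPECTED_TASK_ARTIFACTS if not (task_dir / name).is_file()]
```

An empty result for `missing_artifacts(Path(".nightshift/runs/<run-id>"))` would suggest that every stage wrote its output; the files still need a manual read.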


@@ -1086,27 +1086,27 @@ Notes:
 ## Phase 32: Patch Apply / Dry Run
-- [ ] Add `patch_apply` stage.
-- [ ] Support `mode: dry_run`.
-- [ ] Support `mode: apply`.
-- [ ] Save `applied.patch`.
-- [ ] Preserve pre/post git status.
-- [ ] Fail cleanly on apply errors.
+- [x] Add `patch_apply` stage.
+- [x] Support `mode: dry_run`.
+- [x] Support `mode: apply`.
+- [x] Save `applied.patch`.
+- [x] Preserve pre/post git status.
+- [x] Fail cleanly on apply errors.
 ## Phase 33: Test Feedback Repair Loop
-- [ ] Feed test/static failure output back into implementer.
-- [ ] Add bounded repair attempts.
-- [ ] Save each repair patch.
-- [ ] Save repair summaries.
-- [ ] Stop after max retry count.
+- [x] Feed test/static failure output back into implementer.
+- [x] Add bounded repair attempts.
+- [x] Save each repair patch.
+- [x] Save repair summaries.
+- [x] Stop after max retry count.
 ## Phase 34: End-to-End Coding Quickstart
-- [ ] Update quickstart to modify real code.
-- [ ] Include fake-agent test fixture.
-- [ ] Demonstrate lookup → context pack → patch → apply → test.
-- [ ] Document dry-run vs apply mode.
+- [x] Update quickstart to modify real code.
+- [x] Include fake-agent test fixture.
+- [x] Demonstrate lookup → context pack → patch → apply → test.
+- [x] Document dry-run vs apply mode.
 ---
 # Appendix A: Design Decisions and Rationale
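
The Phase 33 items checked off in this hunk describe a bounded repair loop: failing test output is fed back to the implementer, every attempted patch is kept, and the loop stops at a maximum retry count. The sketch below shows one way that control flow could look. The names `run_repair_loop`, `write_patch`, `run_tests`, and `max_attempts` are hypothetical and are not taken from the NightShift source.

```python
from typing import Callable, Optional


def run_repair_loop(
    write_patch: Callable[[Optional[str]], str],
    run_tests: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> tuple[bool, list[str]]:
    """Retry patch generation until tests pass or attempts run out.

    write_patch receives the previous failure output (None on the first try)
    and returns a patch; run_tests returns (passed, output) for that patch.
    """
    patches: list[str] = []  # every attempted patch is kept for later review
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        patch = write_patch(feedback)
        patches.append(patch)
        passed, output = run_tests(patch)
        if passed:
            return True, patches
        feedback = output  # feed the failure back into the next attempt
    return False, patches
```

In the quickstart pipeline, `on_fail: implement` on the `test` stage appears to play the role of the `feedback` hand-off here.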


@@ -0,0 +1,20 @@
+"""Fake planner for the NightShift end-to-end quickstart."""
+from __future__ import annotations
+import sys
+prompt = sys.stdin.read()
+if "repo_lookup_results" in prompt:
+    print("# Plan")
+    print("")
+    print("- Use the context pack and inspected files.")
+    print("- Add parser functions to `lisp.py`.")
+    print("- Replace the smoke test with parser unit tests.")
+else:
+    print("lookup_requests:")
+    print("- tool: read_file")
+    print("  path: lisp.py")
+    print("- tool: read_file")
+    print("  path: tests/test_lisp.py")
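
The quickstart pairs this planner with `agents/fake_code_writer.py`, which this diff does not show. As a rough idea of what a fixture that prints a unified diff could look like, here is a sketch built on the standard library's `difflib`. The function name `make_patch`, the before/after file contents, and the `a/` and `b/` path prefixes are assumptions, not the actual fixture.

```python
import difflib


def make_patch(path: str, old_text: str, new_text: str) -> str:
    """Build a unified diff for one file with git-style a/ and b/ prefixes."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff_lines)


if __name__ == "__main__":
    # Hypothetical before/after contents for the target file.
    before = "def tokenize(source):\n    raise NotImplementedError\n"
    after = "def tokenize(source):\n    return source.replace('(', ' ( ').replace(')', ' ) ').split()\n"
    print(make_patch("lisp.py", before, after), end="")
```

A real fixture would presumably read the current file contents from the prompt or from disk rather than hard-coding them.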


@@ -1,3 +1,4 @@
 You are the implementation agent.
-Implement the smallest correct change and include tests.
+Output only a unified diff.
+Implement the smallest correct change and include tests when needed.


@@ -22,7 +22,7 @@ experiment:
 agents:
   planner:
     backend: command
-    command: echo
+    command: python agents/fake_planner.py
     system_prompt: agents/planner.md
   implementer:
@@ -44,6 +44,10 @@ pipeline:
     agent: planner
     output: plan.md
+  - id: context
+    type: repo_context
+    output: context-pack.md
   - id: implement
     type: code_writer
     agent: implementer


@@ -103,9 +103,9 @@ def format_validation_result(result: PatchValidationResult) -> str:
 def apply_patch_with_git(patch_path: Path, project_root: str | Path, mode: str = "dry_run") -> PatchApplyResult:
     root = resolve_project_root(project_root)
-    command = ["git", "apply", "--check", str(patch_path)]
+    command = ["git", "apply", "--ignore-whitespace", "--check", str(patch_path)]
     if mode == "apply":
-        command = ["git", "apply", str(patch_path)]
+        command = ["git", "apply", "--ignore-whitespace", str(patch_path)]
     completed = subprocess.run(
         command,
         cwd=root,
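
To make the two modes concrete, the following self-contained sketch mirrors the command selection in this hunk: `git apply --check` only verifies that the patch would apply, while `git apply` without `--check` writes the changes. The helper name `build_git_apply_command` is hypothetical; the real function also resolves the project root and returns a `PatchApplyResult`, both omitted here.

```python
from pathlib import Path


def build_git_apply_command(patch_path: Path, mode: str = "dry_run") -> list[str]:
    """Return the git command for a dry run (the default) or a real apply."""
    command = ["git", "apply", "--ignore-whitespace"]
    if mode != "apply":
        # --check validates the patch without touching the working tree.
        command.append("--check")
    command.append(str(patch_path))
    return command
```

As in the diff, any mode other than `"apply"` falls back to the non-destructive check, so a misspelled mode cannot modify files.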


@@ -155,6 +155,8 @@ class PipelineRunner:
                     reason=f"Unexpected OS error while running stage: {exc}",
                 )
             stage_results.append(result)
+            if stage.id in previous_outputs:
+                del previous_outputs[stage.id]
             previous_outputs[stage.id] = self._read_output(result.output_path)
             self.logger.event(
                 "stage.finish",

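A note on why the added `del` in the `PipelineRunner` hunk matters: assigning to an existing key in a Python `dict` keeps the key's original position, whereas deleting and reinserting moves it to the end. If later stages read `previous_outputs` in iteration order (an assumption, since the consuming code is not in this hunk), the change makes a re-run stage's output appear as the most recent entry. The function name and stage names below are illustrative only.

```python
def record_output(outputs: dict[str, str], stage_id: str, text: str) -> None:
    """Store a stage output so that the latest run is last in iteration order."""
    if stage_id in outputs:
        del outputs[stage_id]  # drop the old slot so the key is re-added at the end
    outputs[stage_id] = text


outputs = {"implement": "patch v1", "test": "1 failure"}
record_output(outputs, "implement", "patch v2")  # a repair attempt re-runs the stage
print(list(outputs))  # ['test', 'implement']
```
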

@@ -469,6 +469,7 @@ Acceptance Criteria:
             ),
             encoding="utf-8",
         )
+        test_command = 'python -c "from pathlib import Path; raise SystemExit(0 if Path(\'app.py\').read_text().strip() == \'new\' else 1)"'
         stages = (
             StageConfig(id="write", type="code_writer", agent="writer"),
             StageConfig(id="normalize", type="patch_normalizer"),
@@ -477,7 +478,7 @@ Acceptance Criteria:
             StageConfig(
                 id="test",
                 type="command",
-                commands=('python -c "from pathlib import Path; raise SystemExit(0 if Path(\'app.py\').read_text() == \'new\\\\n\' else 1)"',),
+                commands=(test_command,),
                 output="test-output.txt",
                 on_fail="write",
             ),
@@ -492,7 +493,7 @@ Acceptance Criteria:
             safety=SafetyConfig(
                 require_clean_worktree=False,
                 scoped_paths=(".",),
-                allowed_commands=('python -c "from pathlib import Path; raise SystemExit(0 if Path(\'app.py\').read_text() == \'new\\\\n\' else 1)"',),
+                allowed_commands=(test_command,),
                 forbidden_commands=("rm -rf",),
             ),
         )
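
The rewritten test command compares `read_text().strip()` against `'new'` instead of requiring `'new'` followed by exactly one newline. The commit does not state the motivation, but this presumably keeps the check from failing when the applied patch leaves the file without a final newline or with extra trailing whitespace. A small sketch of the difference, with hypothetical function names:

```python
def strict_match(text: str) -> bool:
    """Old-style check: the file must be exactly 'new' plus one newline."""
    return text == "new\n"


def tolerant_match(text: str) -> bool:
    """New-style check: leading and trailing whitespace is ignored."""
    return text.strip() == "new"


for sample in ("new\n", "new", "new\n\n"):
    print(repr(sample), strict_match(sample), tolerant_match(sample))
```

Binding the command to `test_command` once also keeps the `commands` and `allowed_commands` entries from drifting apart, since the allow-list has to match the stage command.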