mirror of
https://github.com/khodges42/nightShift.git
synced 2026-06-14 18:18:36 +00:00
improve communication, add code writing, change logo
parent 646c655314
commit 12e2c99a75
@ -983,6 +983,8 @@ Current limitations:
|
||||||
|
|
||||||
The next major update should improve operational visibility while preserving the current artifact-first model.
|
||||||
|
|
||||||
|
Phase work is tracked in this design document by updating the relevant phase checklist and adding concise implementation notes only when a decision needs durable context. The old `docs/devlog/` phase files have been retired.
|
||||||
|
|
||||||
## Phase 23: Improved Logging and Live Visibility
|
||||||
|
|
||||||
NightShift should make active runs easier to observe from both the CLI and the web dashboard.
|
||||||
|
|
@ -1051,36 +1053,36 @@ Notes:
|
||||||
|
|
||||||
## Phase 28: Project Context Chart MVP
|
||||||
|
|
||||||
- [ ] Generate `.nightshift/project-context-chart.md`.
|
- [x] Generate `.nightshift/project-context-chart.md`.
|
||||||
- [ ] Include files, responsibilities, functions/classes, entry points, tests.
|
- [x] Include files, responsibilities, functions/classes, entry points, tests.
|
||||||
- [ ] Use simple regex/parser MVP.
|
- [x] Use simple regex/parser MVP.
|
||||||
- [ ] Update chart during planning.
|
- [x] Update chart during planning.
|
||||||
- [ ] Store anchors/line numbers/search terms.
|
- [x] Store anchors/line numbers/search terms.
|
||||||
|
|
||||||
## Phase 29: Code Writer Stage
|
||||||
|
|
||||||
- [ ] Add `code_writer` stage type.
|
- [x] Add `code_writer` stage type.
|
||||||
- [ ] Feed it task + context pack.
|
- [x] Feed it task + context pack.
|
||||||
- [ ] Require unified diff output.
|
- [x] Require unified diff output.
|
||||||
- [ ] Save `proposed.patch`.
|
- [x] Save `proposed.patch`.
|
||||||
- [ ] Save `implementation-summary.md`.
|
- [x] Save `implementation-summary.md`.
|
||||||
|
|
||||||
## Phase 30: Patch Normalization
|
||||||
|
|
||||||
- [ ] Add `patch_normalizer` stage.
|
- [x] Add `patch_normalizer` stage.
|
||||||
- [ ] Support low-temperature formatter model.
|
- [x] Support low-temperature formatter model.
|
||||||
- [ ] Convert messy model output to valid unified diff.
|
- [x] Convert messy model output to valid unified diff.
|
||||||
- [ ] Reject missing/ambiguous edits.
|
- [x] Reject missing/ambiguous edits.
|
||||||
- [ ] Save `normalized.patch`.
|
- [x] Save `normalized.patch`.
|
||||||
|
|
||||||
## Phase 31: Patch Validation
|
||||||
|
|
||||||
- [ ] Parse unified diffs.
|
- [x] Parse unified diffs.
|
||||||
- [ ] Reject malformed patches.
|
- [x] Reject malformed patches.
|
||||||
- [ ] Enforce scoped paths.
|
- [x] Enforce scoped paths.
|
||||||
- [ ] Reject path traversal.
|
- [x] Reject path traversal.
|
||||||
- [ ] Enforce max files/max lines changed.
|
- [x] Enforce max files/max lines changed.
|
||||||
- [ ] Reject forbidden files.
|
- [x] Reject forbidden files.
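
The Phase 31 checks can be illustrated with a small validator. This is a minimal sketch under stated assumptions, not NightShift's implementation: `validate_patch`, `PatchError`, and every parameter name are hypothetical, and the sketch reads only `+++` file headers and `@@` hunk headers of a unified diff.

```python
import re
from pathlib import PurePosixPath

HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


class PatchError(ValueError):
    """Hypothetical error raised when a patch is rejected."""


def _check_path(raw, scoped_paths, forbidden_files):
    # Strip the conventional "b/" prefix. Deletions ("+++ /dev/null") are
    # rejected as absolute paths in this sketch.
    path = PurePosixPath(raw[2:] if raw.startswith("b/") else raw)
    if path.is_absolute() or ".." in path.parts:
        raise PatchError(f"path escapes the project root: {raw}")
    if str(path) in forbidden_files:
        raise PatchError(f"forbidden file: {path}")
    scopes = [PurePosixPath(scope) for scope in scoped_paths]
    if not any(path == scope or scope in path.parents for scope in scopes):
        raise PatchError(f"path is outside the scoped paths: {path}")
    return str(path)


def validate_patch(patch_text, scoped_paths, forbidden_files=(),
                   max_files=5, max_lines=200):
    """Return the target paths of a unified diff, or raise PatchError."""
    targets, changed = [], 0
    old_left = new_left = 0  # lines the current hunk header still promises
    for line in patch_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                new_left -= 1
                changed += 1
            elif line.startswith("-"):
                old_left -= 1
                changed += 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1
                new_left -= 1
            if old_left < 0 or new_left < 0:
                raise PatchError("hunk body does not match its header")
            continue
        match = HUNK_RE.match(line)
        if match:
            if not targets:
                raise PatchError("hunk appears before any file header")
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
        elif line.startswith("+++ "):
            raw = line[4:].split("\t")[0].strip()
            targets.append(_check_path(raw, scoped_paths, forbidden_files))
    if old_left or new_left:
        raise PatchError("patch ends inside a hunk")
    if not targets:
        raise PatchError("no file headers found")
    if len(targets) > max_files:
        raise PatchError(f"too many files changed: {len(targets)}")
    if changed > max_lines:
        raise PatchError(f"too many lines changed: {changed}")
    return targets
```

Counting hunk lines against the header, rather than scanning for prefixes, keeps an added line that happens to start with `++ ` from being mistaken for a file header.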
|
||||||
|
|
||||||
## Phase 32: Patch Apply / Dry Run
|
## Phase 32: Patch Apply / Dry Run
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,45 +0,0 @@
|
||||||
# MVP Devlog Summary
|
|
||||||
|
|
||||||
## Scope
|
|
||||||
|
|
||||||
The first MVP pass implemented phases 1 through 11 from `docs/vibe.md`.
|
|
||||||
|
|
||||||
## Completed Stages
|
|
||||||
|
|
||||||
- Phase 1: Python package skeleton, CLI entry point, starter project generation, and init tests.
|
|
||||||
- Phase 2: typed YAML config loading, structural validation, agent/stage reference checks, and config tests.
|
|
||||||
- Phase 3: project-root path safety, scoped path checks, artifact path safety, command allowlist checks, forbidden command fragments, and safety tests.
|
|
||||||
- Phase 4: markdown task parser, task selection helpers, useful task errors, and parser tests.
|
|
||||||
- Phase 5: artifact store, run/task directories, config and task snapshots, stage output writing, and artifact tests.
|
|
||||||
- Phase 6: command stage executor, stdout/stderr/exit code capture, output persistence, `StageResult`, and command tests.
|
|
||||||
- Phase 7: command-backed agent executor, prompt bundle construction, review output parsing, and fake-agent tests.
|
|
||||||
- Phase 8: deterministic pipeline runner, ordered stage execution, retry redirection, retry limit enforcement, CLI `run`, and pipeline tests.
|
|
||||||
- Phase 9: project/task/retry context files, agent context injection, `context-out.md`, and context tests.
|
|
||||||
- Phase 10: final task reports, stage summaries, run summaries, modified-file detection when available, and report tests.
|
|
||||||
- Phase 11: README updated to document the implemented MVP and current safety model.
|
|
||||||
|
|
||||||
## Major Decisions
|
|
||||||
|
|
||||||
- Runtime code stays dependency-light and uses the standard library where practical.
|
|
||||||
- YAML support uses PyYAML if installed, with a small fallback parser for starter configs.
|
|
||||||
- Pipelines are state machines, not DAGs.
|
|
||||||
- v1 executes one task at a time.
|
|
||||||
- Agents use the `command` backend first.
|
|
||||||
- Command stages require exact allowlist matches after whitespace normalization.
|
|
||||||
- Forbidden command fragments are checked before allowlist acceptance.
|
|
||||||
- Artifacts are markdown/text-first and are treated as product output, not debug leftovers.
|
|
||||||
- Context is compact and layered into project, task, and retry context.
|
|
||||||
|
|
||||||
## Current MVP State
|
|
||||||
|
|
||||||
NightShift can initialize a project, validate config and tasks, run a fake command-agent pipeline for one markdown task, enforce retry limits, persist artifacts, and produce reviewable summaries.
|
|
||||||
|
|
||||||
## Remaining Product Gaps
|
|
||||||
|
|
||||||
- Real local model backends are not implemented.
|
|
||||||
- `nightshift status` remains a placeholder.
|
|
||||||
- Clean-worktree enforcement is configured but not fully implemented.
|
|
||||||
- Diff patch capture is not implemented.
|
|
||||||
- Task completion mutation is not implemented.
|
|
||||||
- Dependency solving is not implemented.
|
|
||||||
- Multi-task overnight batching is not implemented.
|
|
||||||
|
|
@ -1,25 +0,0 @@
|
||||||
# Phase 1 Devlog: Skeleton
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Created the `nightshift` Python package.
|
|
||||||
- Added a CLI module with `nightshift init`, `nightshift validate`, and placeholder `run` / `status` commands.
|
|
||||||
- Added `pyproject.toml` with a console entry point.
|
|
||||||
- Added starter file generation for:
|
|
||||||
- `nightshift.yaml`
|
|
||||||
- `tasks.md`
|
|
||||||
- `agents/planner.md`
|
|
||||||
- `agents/implementer.md`
|
|
||||||
- `agents/reviewer.md`
|
|
||||||
- Added unit tests for initialization behavior.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Used `argparse` instead of a CLI dependency so the MVP works from a clean Python checkout.
|
|
||||||
- Implemented overwrite protection with a `--force` flag. Interactive confirmation was deferred to keep the command deterministic and scriptable.
|
|
||||||
- Added `run` and `status` as CLI placeholders only. The phase required an entry point, but actual execution belongs to later phases.
|
|
||||||
- Kept starter prompts short and human-readable so they can be revised easily as agent execution is implemented.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Phase 1 establishes the file layout expected by later phases without introducing model or pipeline execution behavior early.
|
|
||||||
|
|
@ -1,21 +0,0 @@
|
||||||
# Phase 10 Devlog: Reports
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/reports.py`.
|
|
||||||
- Generated final task notes.
|
|
||||||
- Generated `stage-results.md`.
|
|
||||||
- Generated run summaries.
|
|
||||||
- Included task status, retry count, final reason, acceptance criteria, stage results, artifact paths, and modified files when available.
|
|
||||||
- Wired report generation into the pipeline runner.
|
|
||||||
- Added report tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Report generation is separated from the pipeline runner so formatting can evolve without changing orchestration logic.
|
|
||||||
- Modified file detection uses `git status --short` when available, but report generation succeeds if Git is unavailable or rejects the repository.
|
|
||||||
- The summarize stage remains a pipeline stage artifact; Phase 10 final reports are always generated at task completion.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Reports are intentionally concise markdown. They are meant to be the morning review entry point, not a full replacement for detailed artifacts.
|
|
||||||
|
|
@ -1,23 +0,0 @@
|
||||||
# Phase 11 Devlog: README
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Rewrote `README.md` around the implemented MVP rather than the earlier planned MVP.
|
|
||||||
- Explained what NightShift is and what it is not.
|
|
||||||
- Added development install and direct module usage.
|
|
||||||
- Added quickstart commands for `init`, `validate`, `run`, and `run --task`.
|
|
||||||
- Added task file and config examples that match the current command-backed MVP.
|
|
||||||
- Documented command-backed agent behavior and review output contracts.
|
|
||||||
- Documented the current safety model.
|
|
||||||
- Documented the artifact layout created by the runner.
|
|
||||||
- Added testing commands and a concise roadmap.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Kept README focused on user-facing operation and reviewability instead of implementation internals.
|
|
||||||
- Described PyYAML as optional because the MVP has a small standard-library fallback parser for starter configs.
|
|
||||||
- Left future backend details in the roadmap rather than implying they already exist.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- The README now reflects the current MVP state through Phase 10.
|
|
||||||
|
|
@ -1,18 +0,0 @@
|
||||||
# Phase 12 Devlog: Status Command
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/status.py`.
|
|
||||||
- Implemented project status inspection for config path, project root, task counts, next runnable task, latest run directory, and warnings.
|
|
||||||
- Wired `nightshift status --config ...` into the CLI.
|
|
||||||
- Added status tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Status is read-only and uses existing config/task/artifact files.
|
|
||||||
- The next task is dependency-aware, so blocked tasks are not reported as runnable.
|
|
||||||
- Latest run detection is filesystem-based and uses the newest run directory by modification time.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Status warnings currently focus on dependency problems. Broader validation warnings can be added without changing the CLI shape.
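
Filesystem-based latest-run detection could look roughly like the following. This is a sketch only: the helper name `latest_run_dir` is hypothetical, and it assumes each run is a direct subdirectory of the runs directory.

```python
from pathlib import Path


def latest_run_dir(runs_root):
    """Return the most recently modified run directory, or None."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    run_dirs = [entry for entry in root.iterdir() if entry.is_dir()]
    # Modification time is read from the filesystem only, so this stays read-only.
    return max(run_dirs, key=lambda entry: entry.stat().st_mtime, default=None)
```

Because run ids are timestamp-shaped, sorting by name would likely give the same answer; the sketch follows the modification-time rule recorded in the decision above.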
|
|
||||||
|
|
@ -1,20 +0,0 @@
|
||||||
# Phase 13 Devlog: Git Safety and Diff Artifacts
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/git.py`.
|
|
||||||
- Implemented clean-worktree enforcement when `require_clean_worktree` is true.
|
|
||||||
- Captured pre-run and post-run git status artifacts.
|
|
||||||
- Wrote per-task `diff.patch` artifacts.
|
|
||||||
- Handled non-git repositories and git failures gracefully when clean worktree is not required.
|
|
||||||
- Added git tests with temporary repositories.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Clean-worktree enforcement runs before artifact creation so NightShift does not dirty a repo before checking it.
|
|
||||||
- If clean worktree is required and git status cannot be read, execution fails safely.
|
|
||||||
- Diff artifacts are written even when git is unavailable, with a readable explanation instead of crashing.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Existing final reports already include modified files when git status is available.
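
The fail-safe behaviour described above can be sketched as two small functions. The names `read_git_status` and `check_clean_worktree` are hypothetical, as is the `(ok, reason)` return shape; `git status --short` is a real Git invocation.

```python
import subprocess


def read_git_status(project_root):
    """Return `git status --short` output, or None if it cannot be read."""
    try:
        proc = subprocess.run(
            ["git", "status", "--short"],
            cwd=project_root, capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # git missing, unreadable directory, or a hung process
    return proc.stdout if proc.returncode == 0 else None


def check_clean_worktree(status, require_clean):
    """Decide whether a run may start. Returns (ok, reason)."""
    if not require_clean:
        return True, "clean worktree not required"
    if status is None:
        return False, "git status could not be read"  # fail safely
    if status.strip():
        return False, "worktree has uncommitted changes"
    return True, "worktree is clean"
```

Calling `check_clean_worktree(read_git_status(root), require_clean)` before any artifact directory is created matches the ordering decision recorded above.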
|
|
||||||
|
|
@ -1,19 +0,0 @@
|
||||||
# Phase 14 Devlog: Task Completion Updates
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added task-file mutation helper to mark successful tasks complete.
|
|
||||||
- Successful runs update the target task from `[ ]` to `[x]`.
|
|
||||||
- Failed runs leave tasks incomplete.
|
|
||||||
- Added `task-completion.md` artifacts recording the completion decision.
|
|
||||||
- Added tests for task completion mutation and pipeline completion artifacts.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Task completion uses a minimal line edit instead of rewriting the parsed task file.
|
|
||||||
- Already-completed tasks are treated as no-op updates.
|
|
||||||
- Completion happens before final report generation so reports can include task-file changes when git status is available.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- More advanced task-file formatting preservation can be revisited if broader markdown support is added.
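
A minimal line edit of this kind might look like the sketch below. It assumes task headers are checklist lines such as `- [ ] TASK-001: Title`; that exact header shape and the function name `mark_task_complete` are assumptions made for illustration.

```python
import re


def mark_task_complete(text, task_id):
    """Flip one task's checkbox to [x]. Returns (new_text, changed)."""
    pattern = re.compile(
        r"^(\s*[-*]\s*)\[( |x|X)\](\s+" + re.escape(task_id) + r"\b.*)$"
    )
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        body = line.rstrip("\r\n")
        match = pattern.match(body)
        if not match:
            continue
        if match.group(2) != " ":
            return text, False  # already complete: treated as a no-op
        ending = line[len(body):]  # preserve the original line ending
        lines[index] = match.group(1) + "[x]" + match.group(3) + ending
        return "".join(lines), True
    raise KeyError(f"unknown task id: {task_id}")
```

Only the matching line is rewritten, so the rest of the task file is preserved exactly as written.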
|
|
||||||
|
|
@ -1,22 +0,0 @@
|
||||||
# Phase 15 Devlog: Multi-Task Run Mode
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift run --all`.
|
|
||||||
- Added `PipelineRunner.run_tasks()`.
|
|
||||||
- Processes incomplete tasks in file order.
|
|
||||||
- Reuses one artifact store/run directory for the batch.
|
|
||||||
- Stops on first failure by default.
|
|
||||||
- Added `pipeline.continue_on_task_failure` config support, defaulting to false.
|
|
||||||
- Writes aggregate run summaries with completed and failed counts.
|
|
||||||
- Added multi-task tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- `--all` and `--task` are mutually exclusive.
|
|
||||||
- Failed and blocked tasks count as failed in aggregate summaries.
|
|
||||||
- The default remains conservative: stop on first failure unless explicitly configured otherwise.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Multi-task mode is still sequential. Parallel execution remains out of scope.
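
The stop-on-first-failure default can be sketched as a plain loop. This is not the real `PipelineRunner.run_tasks()`: the free function below, its `run_one` callback, and the returned dictionary are hypothetical stand-ins.

```python
def run_tasks(task_ids, run_one, continue_on_task_failure=False):
    """Run tasks in file order and return aggregate counts.

    `run_one(task_id)` is a hypothetical callback returning True on success.
    """
    completed, failed = [], []
    for task_id in task_ids:
        if run_one(task_id):
            completed.append(task_id)
            continue
        failed.append(task_id)
        if not continue_on_task_failure:
            break  # conservative default: stop at the first failure
    return {"completed": completed, "failed": failed}
```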
|
|
||||||
|
|
@ -1,21 +0,0 @@
|
||||||
# Phase 16 Devlog: Dependency Handling
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Parsed existing `Dependencies:` bullets into dependency lists.
|
|
||||||
- Added dependency validation for missing references and simple cycles.
|
|
||||||
- Added dependency-aware next-task selection.
|
|
||||||
- Blocked specific task runs when dependencies are incomplete.
|
|
||||||
- Blocked multi-task entries when dependencies are not satisfied by completed or earlier successful tasks.
|
|
||||||
- Reported dependency warnings through status.
|
|
||||||
- Added dependency tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Dependencies are simple task IDs listed as bullets under `Dependencies:`.
|
|
||||||
- Dependency enforcement is deterministic and follows task file order.
|
|
||||||
- Missing references and cycles are validation errors; incomplete dependencies are runtime blockers.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- No dependency solver or reordering is implemented. File order remains the source of execution order.
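
The split between validation errors and runtime blockers can be shown with a short sketch. The function names and the dictionary input (task id mapped to a list of dependency ids) are assumptions made for illustration.

```python
def validate_dependencies(deps):
    """Return error strings for missing references and cycles."""
    errors = []
    for task_id, needed in deps.items():
        for dep in needed:
            if dep not in deps:
                errors.append(f"task '{task_id}' depends on unknown task '{dep}'")

    state = {}  # task id -> "visiting" or "done"

    def visit(node, trail):
        state[node] = "visiting"
        for dep in deps.get(node, []):
            if dep not in deps:
                continue  # already reported as a missing reference
            if state.get(dep) == "visiting":
                errors.append("dependency cycle: " + " -> ".join(trail + [node, dep]))
            elif dep not in state:
                visit(dep, trail + [node])
        state[node] = "done"

    for task_id in deps:
        if task_id not in state:
            visit(task_id, [])
    return errors


def next_runnable(order, deps, completed):
    """Return the first incomplete task in file order whose dependencies are done."""
    for task_id in order:
        if task_id in completed:
            continue
        if all(dep in completed for dep in deps.get(task_id, [])):
            return task_id
    return None
```

`next_runnable` never reorders tasks; it only skips blocked entries, which keeps file order as the source of execution order.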
|
|
||||||
|
|
@ -1,21 +0,0 @@
|
||||||
# Phase 17 Devlog: Local Model Backend
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added first-class `backend: ollama` agent config support.
|
|
||||||
- Required `model` for Ollama agents.
|
|
||||||
- Kept `backend: command` unchanged.
|
|
||||||
- Reused the existing prompt bundle for Ollama.
|
|
||||||
- Invoked Ollama as `ollama run <model>` with prompt input on stdin.
|
|
||||||
- Persisted Ollama responses through the same agent artifact format.
|
|
||||||
- Added tests with mocked subprocess calls so Ollama is not required.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Ollama is implemented as a local subprocess backend instead of an HTTP API wrapper.
|
|
||||||
- Missing Ollama executable returns a failed agent invocation artifact rather than crashing.
|
|
||||||
- Backend artifacts remain comparable across command and Ollama agents.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Real model quality and model availability are user environment concerns; tests do not require a running Ollama daemon.
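
The subprocess approach can be sketched as follows. The `ollama run <model>` invocation with the prompt on stdin comes from the notes above; the function name, the `executable` parameter, and the result dictionary are hypothetical.

```python
import subprocess


def run_ollama(model, prompt, executable="ollama", timeout_seconds=300):
    """Invoke `ollama run <model>` with the prompt on stdin."""
    try:
        proc = subprocess.run(
            [executable, "run", model],
            input=prompt, capture_output=True, text=True, timeout=timeout_seconds,
        )
    except FileNotFoundError:
        # A missing executable becomes a failed result instead of a crash.
        return {"passed": False, "exit_code": None, "stdout": "",
                "stderr": f"executable not found: {executable}"}
    except subprocess.TimeoutExpired:
        return {"passed": False, "exit_code": None, "stdout": "",
                "stderr": f"timed out after {timeout_seconds} seconds"}
    return {"passed": proc.returncode == 0, "exit_code": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}
```

Keeping the executable name as a parameter is one way to exercise the failure path in tests without a running Ollama daemon.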
|
|
||||||
|
|
@ -1,18 +0,0 @@
|
||||||
# Phase 18 Devlog: Prompt and Pipeline Experiments
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added optional `experiment.label` and `experiment.prompt_variant` config fields.
|
|
||||||
- Snapshotted agent prompt files into `runs/<run-id>/prompts/`.
|
|
||||||
- Wrote `run-metadata.md` with project, experiment, agent backend, model, command, and prompt metadata.
|
|
||||||
- Included experiment metadata in final task reports and run summaries.
|
|
||||||
- Added tests for experiment config loading and prompt/metadata artifact creation.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Experiment metadata is descriptive only and does not alter execution semantics.
|
|
||||||
- Prompt snapshots are per-run, not per-task, because agent definitions are run-level configuration.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- This creates enough metadata to compare prompt/backend runs from artifacts without adding a database.
|
|
||||||
|
|
@ -1,20 +0,0 @@
|
||||||
# Phase 19 Devlog: Stronger Command Execution
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added command stage `shell` option, defaulting to true for backward compatibility.
|
|
||||||
- Added command stage `timeout_seconds` override.
|
|
||||||
- Added command stage `working_dir` restricted to the project root.
|
|
||||||
- Added `safety.allowed_env` for optional environment variable pass-through.
|
|
||||||
- Added argv-style execution path when `shell: false`.
|
|
||||||
- Added tests for shell-free execution and working-directory restrictions.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Existing string command config remains valid.
|
|
||||||
- `shell: false` still uses the same exact allowlist check before splitting into argv.
|
|
||||||
- `PATH` is preserved when an environment allowlist is configured so common executables remain discoverable.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Future hardening can move toward structured command definitions, but this phase avoids breaking current configs.
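
The environment allowlist and the argv path could be sketched like this. Both helper names are hypothetical, and inheriting the full environment when no allowlist is configured is an assumption rather than documented behaviour.

```python
import os
import shlex


def build_env(allowed_env, base_env=None):
    """Build the child environment from an optional allowlist of names."""
    base = dict(os.environ if base_env is None else base_env)
    if allowed_env is None:
        return base  # assumption: no allowlist means inherit everything
    keep = set(allowed_env) | {"PATH"}  # PATH is always preserved
    return {name: value for name, value in base.items() if name in keep}


def split_command(command):
    """Split an already-allowlisted command string into argv for shell-free use."""
    return shlex.split(command)
```

The argv list and environment can then be passed to `subprocess.run(argv, env=env, shell=False)`, so no shell ever interprets the string.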
|
|
||||||
|
|
@ -1,29 +0,0 @@
|
||||||
# Phase 2 Devlog: Config Loading
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added typed configuration objects for project, safety, agents, pipeline, and stages.
|
|
||||||
- Added `load_config()` for parsing `nightshift.yaml`.
|
|
||||||
- Added `validate_config()` for checking referenced task and prompt files.
|
|
||||||
- Added validation for:
|
|
||||||
- required top-level sections
|
|
||||||
- required project fields
|
|
||||||
- non-empty agents
|
|
||||||
- supported stage types
|
|
||||||
- agent stage references
|
|
||||||
- command stage command lists
|
|
||||||
- duplicate stage IDs
|
|
||||||
- `on_fail` references
|
|
||||||
- Added unit tests for valid config loading and key invalid config cases.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Used PyYAML automatically when available, but added a small standard-library fallback parser for the YAML subset emitted by `nightshift init`.
|
|
||||||
- Deferred full YAML edge-case support to a future dependency/install pass. The fallback is intentionally documented as a starter-config parser, not a general YAML implementation.
|
|
||||||
- Validation currently confirms that scoped paths resolve inside the project root, but it does not require every scoped path to already exist. That allows users to scaffold configs before creating all source/test directories.
|
|
||||||
- Kept config validation focused on structural correctness and references. Command safety enforcement is left for Phase 3.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- The config layer now catches missing agent references with explicit messages such as `pipeline stage 'plan' references unknown agent 'critic'`.
|
|
||||||
- Tests use `unittest` from the standard library so they can run before development dependencies are introduced.
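
The reference checks can be illustrated with a small function over plain dictionaries. This is a sketch: the function name and the dictionary shapes are assumptions, and only the unknown-agent message is taken from the note above.

```python
def validate_pipeline(agents, stages):
    """Return error strings for duplicate ids and dangling references.

    `agents` maps agent names to their config; `stages` is an ordered list of
    dictionaries with an "id" and optional "agent" / "on_fail" keys.
    """
    errors = []
    stage_ids = {stage["id"] for stage in stages}
    seen = set()
    for stage in stages:
        stage_id = stage["id"]
        if stage_id in seen:
            errors.append(f"duplicate stage id '{stage_id}'")
        seen.add(stage_id)
        agent = stage.get("agent")
        if agent is not None and agent not in agents:
            errors.append(
                f"pipeline stage '{stage_id}' references unknown agent '{agent}'"
            )
        target = stage.get("on_fail")
        if target is not None and target not in stage_ids:
            errors.append(
                f"pipeline stage '{stage_id}' has unknown on_fail target '{target}'"
            )
    return errors
```

Collecting every error before reporting lets a single `nightshift validate` run surface all structural problems at once; whether the real validator collects or fails fast is not stated here.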
|
|
||||||
|
|
@ -1,19 +0,0 @@
|
||||||
# Phase 20 Devlog: Documentation and Examples Refresh
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `docs/config-reference.md`.
|
|
||||||
- Added `docs/artifact-review.md`.
|
|
||||||
- Added `docs/troubleshooting.md`.
|
|
||||||
- Added a complete `examples/quickstart-lisp/` project.
|
|
||||||
- Updated quickstart docs to point users at the example project.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Documentation now distinguishes command and Ollama agent backends.
|
|
||||||
- The example project uses fake command agents so it can run without external services.
|
|
||||||
- The quickstart Lisp project is included as a target repo example rather than baked into NightShift runtime behavior.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- The example is intended for pipeline testing and artifact review, not as a full Lisp implementation.
|
|
||||||
|
|
@ -1,21 +0,0 @@
|
||||||
# Phase 21 Devlog: Read-Only Web Dashboard
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/web.py`.
|
|
||||||
- Added `nightshift web` CLI command.
|
|
||||||
- Implemented read-only artifact dashboard rendering.
|
|
||||||
- Listed runs from `.nightshift/runs/`.
|
|
||||||
- Rendered run summaries with simple auto-refresh.
|
|
||||||
- Added safe artifact reading that rejects path traversal.
|
|
||||||
- Added tests for missing runs, run listing, and artifact path handling.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Flask is an optional dependency. The CLI gives a clear error if Flask is missing.
|
|
||||||
- The dashboard is artifact-driven and does not control pipeline execution.
|
|
||||||
- No websockets, authentication, mutation, or live process control were added.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- This is intentionally a monitoring entry point, not an operations console.
|
|
||||||
|
|
@ -1,19 +0,0 @@
|
||||||
# Phase 22 Devlog: Quickstart Test Project
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added a guided Lisp interpreter quickstart project to `QUICKSTART.md`.
|
|
||||||
- Added concrete quickstart project files under `examples/quickstart-lisp/`.
|
|
||||||
- Included multi-task `tasks.md` with dependencies.
|
|
||||||
- Included a matching `nightshift.yaml`.
|
|
||||||
- Included planner, implementer, and reviewer prompt files.
|
|
||||||
- Included an initial passing unittest smoke test.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Kept the Lisp interpreter as the recommended test project because it is compact, incremental, and testable.
|
|
||||||
- Fake agents are used in the example so users can validate NightShift before connecting a real local model or coding agent.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Users can copy `examples/quickstart-lisp/` to a scratch directory and run `nightshift validate`, `nightshift status`, and `nightshift run --all`.
|
|
||||||
|
|
@ -1,24 +0,0 @@
|
||||||
# Phase 3 Devlog: Safety Layer
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/safety.py`.
|
|
||||||
- Implemented project root resolution.
|
|
||||||
- Implemented path resolution that rejects traversal outside the configured project root.
|
|
||||||
- Implemented scoped path validation.
|
|
||||||
- Implemented safe artifact path construction that rejects escapes from the artifact directory.
|
|
||||||
- Implemented command allowlist checks.
|
|
||||||
- Implemented forbidden command fragment checks.
|
|
||||||
- Wired command and path safety checks into `validate_config()`.
|
|
||||||
- Added tests for path traversal, artifact escapes, allowlist behavior, and forbidden command fragments.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Command matching uses normalized whitespace and exact allowlist entries. This keeps v1 predictable while still handling harmless spacing differences.
|
|
||||||
- Forbidden fragments are checked before allowlist acceptance, so a dangerous command cannot be made valid by adding it to `allowed_commands`.
|
|
||||||
- Scoped paths are validated for containment inside the project root, but they are not required to exist yet. This preserves the Phase 2 decision that configs can be scaffolded before all source directories exist.
|
|
||||||
- The safety layer raises `SafetyError`; config validation wraps those failures as config errors when they come from `nightshift validate`.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- This phase does not execute commands. It only validates whether a command would be permitted. Process execution belongs to Phase 6.
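
The two core checks can be sketched as follows. `SafetyError` is named in the decisions above; the function names and signatures are hypothetical, and forbidden fragments are matched as plain substrings, which is an assumption.

```python
from pathlib import Path


class SafetyError(Exception):
    """Raised when a path or command violates the safety rules."""


def resolve_inside(root, candidate):
    """Resolve `candidate` under `root`, rejecting anything that escapes it."""
    root_path = Path(root).resolve()
    resolved = (root_path / candidate).resolve()
    if resolved != root_path and root_path not in resolved.parents:
        raise SafetyError(f"path escapes the project root: {candidate}")
    return resolved


def check_command(command, allowed_commands, forbidden_fragments):
    """Return the normalized command if it is permitted."""
    normalized = " ".join(command.split())
    # Forbidden fragments are checked first, so allowlisting cannot override them.
    for fragment in forbidden_fragments:
        if fragment in normalized:
            raise SafetyError(f"forbidden fragment '{fragment}' in command")
    allowed = {" ".join(entry.split()) for entry in allowed_commands}
    if normalized not in allowed:
        raise SafetyError(f"command is not allowlisted: {normalized}")
    return normalized
```

The path check never requires the target to exist, which is consistent with the decision that scoped paths may be scaffolded before their directories are created.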
|
|
||||||
|
|
@ -1,22 +0,0 @@
|
||||||
# Phase 4 Devlog: Task Parser
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/tasks.py`.
|
|
||||||
- Implemented parsing for documented markdown checklist tasks.
|
|
||||||
- Extracted task id, title, completion state, description, acceptance criteria, dependency bullets, raw task markdown, and source line number.
|
|
||||||
- Added selection of the next incomplete task.
|
|
||||||
- Added selection of a specific task id.
|
|
||||||
- Added useful errors for malformed task headers, duplicate ids, missing acceptance criteria, missing files, traversal attempts, and unknown task ids.
|
|
||||||
- Added parser and selection tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- The parser intentionally supports the documented v1 format rather than broad Markdown. This keeps failure behavior explicit and testable.
|
|
||||||
- Acceptance criteria are required for each task because downstream pipeline stages need concrete review targets.
|
|
||||||
- Dependencies are parsed as simple bullets under a `Dependencies:` section, but no dependency solver is implemented in this phase.
|
|
||||||
- Completed tasks use `[x]` or `[X]`; incomplete tasks use `[ ]`.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Task mutation, completion updates, and dependency enforcement are deferred until later pipeline phases.
|
|
||||||
|
|
@ -1,24 +0,0 @@
|
||||||
# Phase 5 Devlog: Artifact Store
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/artifacts.py`.
|
|
||||||
- Created `.nightshift/`, per-run directories, and per-task directories.
|
|
||||||
- Created `project-context.md` and `run-summary.md` placeholders when a run is initialized.
|
|
||||||
- Added config snapshot copying to `config.snapshot.yaml`.
|
|
||||||
- Added task snapshot writing to `task.md`.
|
|
||||||
- Added generic stage output writing.
|
|
||||||
- Added command output writing.
|
|
||||||
- Added final task notes writing.
|
|
||||||
- Added tests for artifact tree creation, snapshot writing, and task-directory escape rejection.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- `ArtifactStore` accepts an optional `run_id` so tests and future pipeline code can produce deterministic artifact paths.
|
|
||||||
- Default run ids use UTC timestamps in `YYYYMMDDTHHMMSSZ` format.
|
|
||||||
- Stage output filenames are relative to the task artifact directory and may include subdirectories, but they cannot escape that task directory.
|
|
||||||
- Project context and run summary files are initialized with simple markdown headers. Later phases can append richer content.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- The artifact store is intentionally independent from pipeline execution so command, agent, context, and report phases can reuse it.
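
Run id generation and run initialization might be sketched as below. The `YYYYMMDDTHHMMSSZ` format is from the decisions above; the function names and the exact placement of `project-context.md` and `run-summary.md` are assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path


def default_run_id(now=None):
    """Format a UTC timestamp as YYYYMMDDTHHMMSSZ."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def init_run(project_root, run_id=None):
    """Create the run directory and placeholder files; return the run directory."""
    run_id = run_id or default_run_id()
    base = Path(project_root) / ".nightshift"
    run_dir = base / "runs" / run_id
    run_dir.mkdir(parents=True, exist_ok=True)
    placeholders = (
        (base / "project-context.md", "# Project Context"),  # assumed location
        (run_dir / "run-summary.md", "# Run Summary"),        # assumed location
    )
    for path, title in placeholders:
        if not path.exists():
            path.write_text(title + "\n", encoding="utf-8")
    return run_dir
```

Accepting an explicit `run_id` is what makes artifact paths deterministic in tests.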
|
|
||||||
|
|
@ -1,21 +0,0 @@
|
||||||
# Phase 6 Devlog: Command Executor
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/commands.py`.
|
|
||||||
- Added command-stage execution for configured `command` stages.
|
|
||||||
- Captured stdout, stderr, exit code, duration, and timeout state.
|
|
||||||
- Persisted command transcripts through the artifact store.
|
|
||||||
- Returned structured `StageResult` objects.
|
|
||||||
- Added tests for passing commands, failing commands, output persistence, and allowlist rejection.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Commands are validated through the Phase 3 safety layer immediately before execution, even though config validation also checks them. This keeps command execution safe if called directly in later code.
|
|
||||||
- Command stages stop at the first failing or timed-out command and persist the commands that ran.
|
|
||||||
- Commands run with `shell=True` because v1 config stores commands as shell-style strings. This is constrained by exact allowlist matching and forbidden fragment checks.
|
|
||||||
- The default timeout is 300 seconds. Tests can override it later if timeout-specific behavior needs coverage.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- This phase does not wire command execution into a full pipeline runner. That belongs to Phase 8.
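
Command capture along these lines can be sketched with `subprocess.run`. The fields of `StageResult` shown here are hypothetical, since the notes above name the type but not its shape, and the safety check that precedes execution is omitted.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageResult:
    """Hypothetical result shape for one executed command."""
    passed: bool
    exit_code: Optional[int]
    stdout: str
    stderr: str
    duration_seconds: float
    timed_out: bool


def _text(value):
    # TimeoutExpired may carry bytes or None even when text=True is requested.
    if value is None:
        return ""
    return value.decode(errors="replace") if isinstance(value, bytes) else value


def run_command(command, timeout_seconds=300, cwd=None):
    """Run one shell-style command string and capture its outcome."""
    started = time.monotonic()
    try:
        proc = subprocess.run(command, shell=True, cwd=cwd, capture_output=True,
                              text=True, timeout=timeout_seconds)
    except subprocess.TimeoutExpired as exc:
        return StageResult(False, None, _text(exc.stdout), _text(exc.stderr),
                           time.monotonic() - started, True)
    return StageResult(proc.returncode == 0, proc.returncode, proc.stdout,
                       proc.stderr, time.monotonic() - started, False)


def run_stage(commands, **kwargs):
    """Run commands in order, stopping after the first failure or timeout."""
    results = []
    for command in commands:
        results.append(run_command(command, **kwargs))
        if not results[-1].passed:
            break
    return results
```

`run_stage` returns only the commands that actually ran, which is what would be persisted as the stage transcript.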
|
|
||||||
|
|
@ -1,23 +0,0 @@
|
||||||
# Phase 7 Devlog: Agent Executor
|
|
||||||
|
|
||||||
## Implemented
|
|
||||||
|
|
||||||
- Added `nightshift/agents.py`.
|
|
||||||
- Implemented the v1 `command` backend for agents.
|
|
||||||
- Loaded system prompt files through project-root-safe path resolution.
|
|
||||||
- Built compact prompt bundles containing system prompt, task markdown, acceptance criteria, project context, previous stage output, retry notes, and output contract.
|
|
||||||
- Passed prompt bundles to command agents on stdin.
|
|
||||||
- Captured stdout, stderr, exit code, duration, and timeout state.
|
|
||||||
- Persisted agent output and prompt artifacts through the artifact store.
|
|
||||||
- Parsed structured review-agent output into `StageResult`.
|
|
||||||
- Added fake-agent tests.
|
|
||||||
|
|
||||||
## Decisions Made
|
|
||||||
|
|
||||||
- Agent commands are command strings and run with `shell=True`, matching the Phase 6 command-string model. Unlike validation/test commands, agent commands are configured agent backends rather than allowlisted project commands.
|
|
||||||
- Agent stages pass when the command exits successfully. Review stages must emit a valid `status:` field or they fail.
|
|
||||||
- Prompt artifacts include the exact prompt sent to the agent to support auditability and prompt debugging.
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- Only the `command` backend is implemented. Ollama, Codex CLI, Claude Code, and API backends remain future integrations.
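
Review parsing can be sketched as a scan for `key: value` lines. The `status:` field is required by the decision above, and `next_stage` and `context_update` are mentioned in other phase notes; the accepted status values (`pass` and `fail`), the line-based layout, and the function name are assumptions.

```python
REVIEW_KEYS = {"status", "next_stage", "context_update"}


def parse_review(output, valid_statuses=("pass", "fail")):
    """Extract review fields; a missing or invalid status fails the stage."""
    fields = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        key = key.strip().lower()
        if separator and key in REVIEW_KEYS and key not in fields:
            fields[key] = value.strip()  # first occurrence wins
    status = fields.get("status", "").lower()
    if status not in valid_statuses:
        return {"passed": False, "reason": "missing or invalid status field", **fields}
    return {"passed": status == "pass", **fields}
```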
|
|
||||||
|
|
@ -1,25 +0,0 @@
|
||||||
# Phase 8 Devlog: Pipeline Runner

## Implemented

- Added `nightshift/pipeline.py`.
- Executed configured stages in order for one task.
- Supported agent, agent-review, command, and summarize stages.
- Stopped on unrecoverable stage failure.
- Supported `on_fail` redirection and review-provided `next_stage` redirection.
- Tracked retry count per task.
- Enforced `pipeline.max_task_retries`.
- Wrote task snapshots, config snapshots, per-stage outputs, stage summaries, final task notes, and run summary.
- Wired `nightshift run --task TASK-001` into the CLI.
- Added tests for happy-path pipeline execution and retry-limit enforcement.

## Decisions Made

- `on_fail` takes precedence over review-provided `next_stage` because it is deterministic config controlled by the user.
- Retry count increments when a failing stage redirects to another stage. Once the configured maximum is reached, the task fails.
- The summarize stage writes a simple artifact from known stage outputs and retry notes. Rich report generation remains Phase 10.
- Pipeline execution runs one task at a time, matching the v1 constraint.
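
The redirection and retry rules above can be sketched as a small pure function. This is an illustration under stated assumptions, not the runner's actual code: `FailureDecision` and `decide_after_failure` are hypothetical names, and the exact boundary (failing once `retry_count` has reached `max_task_retries`) is an assumed reading of "once the configured maximum is reached".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureDecision:  # hypothetical helper type
    next_stage: Optional[str]
    retry_count: int
    task_failed: bool


def decide_after_failure(
    on_fail: Optional[str],
    review_next_stage: Optional[str],
    retry_count: int,
    max_task_retries: int,
) -> FailureDecision:
    """Pick the stage to run after a failed stage, or fail the task."""
    # The user-controlled `on_fail` config wins over the review's suggestion.
    target = on_fail if on_fail is not None else review_next_stage
    if target is None:
        # Nothing to redirect to: the failure is unrecoverable.
        return FailureDecision(None, retry_count, True)
    if retry_count >= max_task_retries:
        # Assumed boundary: no redirects left once the maximum is reached.
        return FailureDecision(None, retry_count, True)
    # A redirect consumes one retry.
    return FailureDecision(target, retry_count + 1, False)
```

Keeping the decision free of I/O makes the precedence rule easy to cover in unit tests alongside the retry-limit test mentioned above.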

## Notes

- The runner is now sufficient for fake command-agent pipelines. Context management and fuller reports are still deferred to later phases.

@@ -1,22 +0,0 @@
# Phase 9 Devlog: Context Manager

## Implemented

- Added `nightshift/context.py`.
- Created project context files when absent.
- Created per-task `context.md` files.
- Added compact task context with task id, title, description, and acceptance criteria.
- Passed project context, task context, and retry context into agent prompt bundles.
- Persisted `context-out.md` after task execution.
- Included review `context_update` values in retry/context output notes.
- Added context manager tests and prompt coverage for task/retry context.

## Decisions Made

- Context files are plain markdown artifacts so they remain readable and easy to edit.
- Retry context is built from compact retry notes rather than full previous transcripts.
- Durable project-context bubbling is implemented as an explicit helper, but the pipeline does not automatically append every task detail into project context.
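
As a rough illustration of building retry context from compact notes instead of full transcripts, the sketch below keeps only the most recent notes and collapses each to a single bounded line. `build_retry_context` and its `max_notes` and `max_chars` limits are assumptions made for this example, not values from `nightshift/context.py`.

```python
def build_retry_context(
    retry_notes: list[str],
    max_notes: int = 5,
    max_chars: int = 300,
) -> str:
    """Render recent retry notes as a short markdown section."""
    if not retry_notes:
        return ""
    lines = ["## Retry Context", ""]
    for note in retry_notes[-max_notes:]:
        # Collapse newlines and runs of whitespace so each note stays on one line.
        compact = " ".join(note.split())
        if len(compact) > max_chars:
            compact = compact[: max_chars - 3] + "..."
        lines.append(f"- {compact}")
    return "\n".join(lines) + "\n"
```

Because the output is plain markdown, it stays readable and editable in the same way as the other context artifacts.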

## Notes

- Later phases can decide which completed-task facts are worth promoting into project context.

Binary file not shown. Size before: 1.3 MiB. Size after: 1.1 MiB.

@@ -409,6 +409,22 @@ def _extract_openai_content(raw: str) -> str:
|
||||||
|
|
||||||
|
|
||||||
def output_contract_for(stage: StageConfig) -> str:
|
def output_contract_for(stage: StageConfig) -> str:
|
||||||
|
if stage.type == "code_writer":
|
||||||
|
return "\n".join(
|
||||||
|
[
|
||||||
|
"Return a unified diff only, suitable for saving as proposed.patch.",
|
||||||
|
"Do not include prose outside the patch.",
|
||||||
|
"Use diff --git headers and hunk headers.",
|
||||||
|
]
|
||||||
|
)
|
||||||
|
if stage.type == "patch_normalizer":
|
||||||
|
return "\n".join(
|
||||||
|
[
|
||||||
|
"Convert the supplied patch-like content to one valid unified diff.",
|
||||||
|
"Return only the normalized patch.",
|
||||||
|
"If the edit is missing or ambiguous, say that no valid unified diff can be produced.",
|
||||||
|
]
|
||||||
|
)
|
||||||
if stage.type in {"agent_review", "review"}:
|
if stage.type in {"agent_review", "review"}:
|
||||||
return "\n".join(
|
return "\n".join(
|
||||||
[
|
[
|
||||||
|
|
|
||||||
|
|
@@ -37,6 +37,7 @@ class ArtifactStore:
|
||||||
self.run_dir = self._artifact_path("runs", self.run_id)
|
self.run_dir = self._artifact_path("runs", self.run_id)
|
||||||
self.tasks_dir = self.run_dir / "tasks"
|
self.tasks_dir = self.run_dir / "tasks"
|
||||||
self.project_context_path = self.artifact_root / "project-context.md"
|
self.project_context_path = self.artifact_root / "project-context.md"
|
||||||
|
self.project_context_chart_path = self.artifact_root / "project-context-chart.md"
|
||||||
self.run_summary_path = self.run_dir / "run-summary.md"
|
self.run_summary_path = self.run_dir / "run-summary.md"
|
||||||
self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
|
self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
|
||||||
self.run_log_path = self.run_dir / "run.log"
|
self.run_log_path = self.run_dir / "run.log"
|
||||||
|
|
|
||||||
|
|
@@ -59,6 +59,10 @@ class StageConfig:
|
||||||
shell: bool = True
|
shell: bool = True
|
||||||
timeout_seconds: int | None = None
|
timeout_seconds: int | None = None
|
||||||
working_dir: Path | None = None
|
working_dir: Path | None = None
|
||||||
|
max_files: int | None = None
|
||||||
|
max_lines: int | None = None
|
||||||
|
forbidden_paths: tuple[str, ...] = ()
|
||||||
|
mode: str | None = None
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
@dataclass(frozen=True)
|
||||||
|
|
@@ -86,7 +90,14 @@ class NightShiftConfig:
|
||||||
|
|
||||||
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
|
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
|
||||||
COMMAND_STAGE_TYPES = {"command"}
|
COMMAND_STAGE_TYPES = {"command"}
|
||||||
SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {"repo_context", "summarize"}
|
SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {
|
||||||
|
"code_writer",
|
||||||
|
"patch_normalizer",
|
||||||
|
"patch_apply",
|
||||||
|
"patch_validator",
|
||||||
|
"repo_context",
|
||||||
|
"summarize",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
|
def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
|
||||||
|
|
@@ -282,6 +293,17 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
if timeout_seconds is not None and timeout_seconds <= 0:
|
if timeout_seconds is not None and timeout_seconds <= 0:
|
||||||
raise ConfigError(f"Config error: {stage_context}.timeout_seconds must be greater than zero.")
|
raise ConfigError(f"Config error: {stage_context}.timeout_seconds must be greater than zero.")
|
||||||
working_dir_raw = _optional_string(stage_raw.get("working_dir"), f"{stage_context}.working_dir")
|
working_dir_raw = _optional_string(stage_raw.get("working_dir"), f"{stage_context}.working_dir")
|
||||||
|
max_files = _optional_int_or_none(stage_raw.get("max_files"), f"{stage_context}.max_files")
|
||||||
|
max_lines = _optional_int_or_none(stage_raw.get("max_lines"), f"{stage_context}.max_lines")
|
||||||
|
if max_files is not None and max_files <= 0:
|
||||||
|
raise ConfigError(f"Config error: {stage_context}.max_files must be greater than zero.")
|
||||||
|
if max_lines is not None and max_lines <= 0:
|
||||||
|
raise ConfigError(f"Config error: {stage_context}.max_lines must be greater than zero.")
|
||||||
|
mode = _optional_string(stage_raw.get("mode"), f"{stage_context}.mode")
|
||||||
|
if stage_type == "patch_apply" and mode not in {None, "dry_run", "apply"}:
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: {stage_context}.mode must be 'dry_run' or 'apply'."
|
||||||
|
)
|
||||||
|
|
||||||
if stage_type in AGENT_STAGE_TYPES:
|
if stage_type in AGENT_STAGE_TYPES:
|
||||||
if agent is None:
|
if agent is None:
|
||||||
|
|
@@ -292,6 +314,21 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
f"Config error: pipeline stage '{stage_id}' references unknown agent "
|
f"Config error: pipeline stage '{stage_id}' references unknown agent "
|
||||||
f"'{agent}'. Defined agents: {defined}."
|
f"'{agent}'. Defined agents: {defined}."
|
||||||
)
|
)
|
||||||
|
if stage_type == "code_writer":
|
||||||
|
if agent is None:
|
||||||
|
raise ConfigError(f"Config error: code_writer stage '{stage_id}' must reference an agent.")
|
||||||
|
if agent not in agents:
|
||||||
|
defined = ", ".join(sorted(agents))
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: pipeline stage '{stage_id}' references unknown agent "
|
||||||
|
f"'{agent}'. Defined agents: {defined}."
|
||||||
|
)
|
||||||
|
if stage_type == "patch_normalizer" and agent is not None and agent not in agents:
|
||||||
|
defined = ", ".join(sorted(agents))
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: pipeline stage '{stage_id}' references unknown agent "
|
||||||
|
f"'{agent}'. Defined agents: {defined}."
|
||||||
|
)
|
||||||
|
|
||||||
if stage_type in COMMAND_STAGE_TYPES and not commands:
|
if stage_type in COMMAND_STAGE_TYPES and not commands:
|
||||||
raise ConfigError(f"Config error: command stage '{stage_id}' must define commands.")
|
raise ConfigError(f"Config error: command stage '{stage_id}' must define commands.")
|
||||||
|
|
@@ -311,6 +348,13 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
shell=_optional_bool(stage_raw.get("shell", True), f"{stage_context}.shell"),
|
shell=_optional_bool(stage_raw.get("shell", True), f"{stage_context}.shell"),
|
||||||
timeout_seconds=timeout_seconds,
|
timeout_seconds=timeout_seconds,
|
||||||
working_dir=Path(working_dir_raw) if working_dir_raw else None,
|
working_dir=Path(working_dir_raw) if working_dir_raw else None,
|
||||||
|
max_files=max_files,
|
||||||
|
max_lines=max_lines,
|
||||||
|
forbidden_paths=_string_tuple(
|
||||||
|
stage_raw.get("forbidden_paths", []),
|
||||||
|
f"{stage_context}.forbidden_paths",
|
||||||
|
),
|
||||||
|
mode=mode,
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
|
||||||
nightshift/patches.py (new file, 221 lines added)

@@ -0,0 +1,221 @@
"""Unified diff extraction and validation."""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
import re
import subprocess

from .config import SafetyConfig
from .errors import PipelineError, SafetyError
from .safety import resolve_inside_root, resolve_project_root, validate_scoped_paths


DEFAULT_MAX_FILES = 20
DEFAULT_MAX_CHANGED_LINES = 2000
DEFAULT_FORBIDDEN_PATHS = (".git", ".nightshift", ".env")


@dataclass(frozen=True)
class PatchValidationResult:
    files: tuple[str, ...]
    changed_lines: int


@dataclass(frozen=True)
class PatchApplyResult:
    status: str
    command: str
    exit_code: int
    stdout: str
    stderr: str
    mode: str


def extract_unified_diff(text: str) -> str:
    fenced = re.search(r"```(?:diff|patch)?\s*\n(.*?)```", text, flags=re.DOTALL | re.IGNORECASE)
    candidate = fenced.group(1) if fenced else text
    lines = candidate.splitlines()
    start = next((index for index, line in enumerate(lines) if line.startswith("diff --git ")), None)
    if start is None:
        start = next((index for index, line in enumerate(lines) if line.startswith("--- ")), None)
    if start is None:
        raise PipelineError("Patch error: no unified diff found.")
    patch = "\n".join(lines[start:]).strip()
    if not patch:
        raise PipelineError("Patch error: unified diff is empty.")
    return patch + "\n"


def normalize_patch_text(text: str) -> str:
    patch = extract_unified_diff(text)
    if "@@" not in patch:
        raise PipelineError("Patch error: unified diff has no hunks.")
    return patch


def validate_patch(
    patch: str,
    project_root: str | Path,
    safety: SafetyConfig,
    max_files: int = DEFAULT_MAX_FILES,
    max_changed_lines: int = DEFAULT_MAX_CHANGED_LINES,
    forbidden_paths: tuple[str, ...] = DEFAULT_FORBIDDEN_PATHS,
) -> PatchValidationResult:
    root = resolve_project_root(project_root)
    scoped_roots = validate_scoped_paths(root, safety.scoped_paths or (".",))
    files = _patch_files(patch)
    if not files:
        raise PipelineError("Patch validation failed: no changed files found.")
    if len(files) > max_files:
        raise PipelineError(f"Patch validation failed: touches {len(files)} files, max is {max_files}.")

    changed_lines = _changed_line_count(patch)
    if changed_lines <= 0:
        raise PipelineError("Patch validation failed: patch has no changed lines.")
    if changed_lines > max_changed_lines:
        raise PipelineError(
            f"Patch validation failed: changes {changed_lines} lines, max is {max_changed_lines}."
        )

    for path_text in files:
        _validate_patch_path(path_text, root, scoped_roots, forbidden_paths)
    return PatchValidationResult(files=tuple(sorted(files)), changed_lines=changed_lines)


def format_validation_result(result: PatchValidationResult) -> str:
    return "\n".join(
        [
            "# Patch Validation",
            "",
            "Status: pass",
            f"Changed files: {len(result.files)}",
            f"Changed lines: {result.changed_lines}",
            "",
            "## Files",
            "",
            *[f"- `{path}`" for path in result.files],
            "",
        ]
    )


def apply_patch_with_git(patch_path: Path, project_root: str | Path, mode: str = "dry_run") -> PatchApplyResult:
    root = resolve_project_root(project_root)
    command = ["git", "apply", "--check", str(patch_path)]
    if mode == "apply":
        command = ["git", "apply", str(patch_path)]
    completed = subprocess.run(
        command,
        cwd=root,
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",
    )
    return PatchApplyResult(
        status="pass" if completed.returncode == 0 else "fail",
        command=" ".join(command),
        exit_code=completed.returncode,
        stdout=completed.stdout or "",
        stderr=completed.stderr or "",
        mode=mode,
    )


def format_patch_apply_result(result: PatchApplyResult, patch_path: str) -> str:
    return "\n".join(
        [
            "# Patch Apply",
            "",
            f"Status: {result.status}",
            f"Mode: {result.mode}",
            f"Patch: `{patch_path}`",
            f"Command: `{result.command}`",
            f"Exit code: {result.exit_code}",
            "",
            "## stdout",
            "",
            "```text",
            result.stdout.rstrip(),
            "```",
            "",
            "## stderr",
            "",
            "```text",
            result.stderr.rstrip(),
            "```",
            "",
        ]
    )


def _patch_files(patch: str) -> set[str]:
    files: set[str] = set()
    saw_hunk = False
    for line in patch.splitlines():
        if line.startswith("@@"):
            saw_hunk = True
        if line.startswith("diff --git "):
            parts = line.split()
            if len(parts) >= 4:
                files.add(_strip_prefix(parts[3]))
        elif line.startswith("+++ "):
            target = line[4:].strip()
            if target != "/dev/null":
                files.add(_strip_prefix(target))
        elif line.startswith("--- "):
            source = line[4:].strip()
            if source != "/dev/null":
                files.add(_strip_prefix(source))
    if not saw_hunk:
        raise PipelineError("Patch validation failed: unified diff has no hunk headers.")
    return {path for path in files if path}


def _changed_line_count(patch: str) -> int:
    count = 0
    for line in patch.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            count += 1
    return count


def _validate_patch_path(
    path_text: str,
    root: Path,
    scoped_roots: tuple[Path, ...],
    forbidden_paths: tuple[str, ...],
) -> None:
    path = Path(path_text)
    if path.is_absolute() or ".." in path.parts:
        raise PipelineError(f"Patch validation failed: unsafe path `{path_text}`.")
    normalized = path.as_posix()
    for forbidden in forbidden_paths:
        forbidden_path = forbidden.strip("/\\")
        if normalized == forbidden_path or normalized.startswith(forbidden_path + "/"):
            raise PipelineError(f"Patch validation failed: forbidden path `{path_text}`.")
    try:
        resolved = resolve_inside_root(root, path, f"patch path '{path_text}'")
    except SafetyError as exc:
        raise PipelineError(f"Patch validation failed: {exc}") from exc
    for scoped_root in scoped_roots:
        try:
            resolved.relative_to(scoped_root)
            return
        except ValueError:
            continue
    scopes = ", ".join(item.relative_to(root).as_posix() for item in scoped_roots)
    raise PipelineError(
        f"Patch validation failed: path `{path_text}` is outside scoped paths: {scopes}."
    )


def _strip_prefix(path_text: str) -> str:
    path = path_text.strip()
    if path.startswith(("a/", "b/")):
        return path[2:]
    return path

@@ -14,6 +14,18 @@ from .context import ContextManager
|
||||||
from .errors import PipelineError
|
from .errors import PipelineError
|
||||||
from .errors import NightShiftError
|
from .errors import NightShiftError
|
||||||
from .git import ensure_clean_worktree, write_diff_artifact, write_git_artifacts
|
from .git import ensure_clean_worktree, write_diff_artifact, write_git_artifacts
|
||||||
|
from .patches import (
|
||||||
|
DEFAULT_FORBIDDEN_PATHS,
|
||||||
|
DEFAULT_MAX_CHANGED_LINES,
|
||||||
|
DEFAULT_MAX_FILES,
|
||||||
|
apply_patch_with_git,
|
||||||
|
extract_unified_diff,
|
||||||
|
format_patch_apply_result,
|
||||||
|
format_validation_result,
|
||||||
|
normalize_patch_text,
|
||||||
|
validate_patch,
|
||||||
|
)
|
||||||
|
from .project_chart import build_project_context_chart
|
||||||
from .reports import ReportGenerator
|
from .reports import ReportGenerator
|
||||||
from .repo_tools import RepoTools, extract_agent_stdout, parse_lookup_requests
|
from .repo_tools import RepoTools, extract_agent_stdout, parse_lookup_requests
|
||||||
from .runlog import RunLogger
|
from .runlog import RunLogger
|
||||||
|
|
@@ -102,6 +114,7 @@ class PipelineRunner:
|
||||||
)
|
)
|
||||||
self.artifacts.write_run_metadata(format_run_metadata(self.config))
|
self.artifacts.write_run_metadata(format_run_metadata(self.config))
|
||||||
self.artifacts.write_task_snapshot(task)
|
self.artifacts.write_task_snapshot(task)
|
||||||
|
self._write_project_context_chart()
|
||||||
write_git_artifacts(self.artifacts, task.id, "before")
|
write_git_artifacts(self.artifacts, task.id, "before")
|
||||||
self.context.ensure_project_context()
|
self.context.ensure_project_context()
|
||||||
self.context.create_task_context(task)
|
self.context.create_task_context(task)
|
||||||
|
|
@@ -128,7 +141,7 @@ class PipelineRunner:
|
||||||
retry_count=retry_count,
|
retry_count=retry_count,
|
||||||
)
|
)
|
||||||
try:
|
try:
|
||||||
result = self._run_stage(stage, task, previous_outputs, retry_notes)
|
result = self._run_stage(stage, task, previous_outputs, retry_notes, retry_count)
|
||||||
except NightShiftError as exc:
|
except NightShiftError as exc:
|
||||||
result = StageResult(
|
result = StageResult(
|
||||||
stage_id=stage.id,
|
stage_id=stage.id,
|
||||||
|
|
@@ -325,6 +338,7 @@ class PipelineRunner:
|
||||||
task: Task,
|
task: Task,
|
||||||
previous_outputs: dict[str, str],
|
previous_outputs: dict[str, str],
|
||||||
retry_notes: list[str],
|
retry_notes: list[str],
|
||||||
|
retry_count: int = 0,
|
||||||
) -> StageResult:
|
) -> StageResult:
|
||||||
if stage.type in {"agent", "agent_review", "review"}:
|
if stage.type in {"agent", "agent_review", "review"}:
|
||||||
context = self.context.read_context(task, retry_notes)
|
context = self.context.read_context(task, retry_notes)
|
||||||
|
|
@@ -351,6 +365,14 @@ class PipelineRunner:
|
||||||
return result
|
return result
|
||||||
if stage.type in COMMAND_STAGE_TYPES:
|
if stage.type in COMMAND_STAGE_TYPES:
|
||||||
return self.command_executor.run_stage(stage, task.id)
|
return self.command_executor.run_stage(stage, task.id)
|
||||||
|
if stage.type == "code_writer":
|
||||||
|
return self._run_code_writer_stage(stage, task, previous_outputs, retry_notes, retry_count)
|
||||||
|
if stage.type == "patch_normalizer":
|
||||||
|
return self._run_patch_normalizer_stage(stage, task, previous_outputs, retry_notes)
|
||||||
|
if stage.type == "patch_validator":
|
||||||
|
return self._run_patch_validator_stage(stage, task, previous_outputs)
|
||||||
|
if stage.type == "patch_apply":
|
||||||
|
return self._run_patch_apply_stage(stage, task, previous_outputs)
|
||||||
if stage.type == "repo_context":
|
if stage.type == "repo_context":
|
||||||
output_path = self.artifacts.write_stage_output(
|
output_path = self.artifacts.write_stage_output(
|
||||||
task.id,
|
task.id,
|
||||||
|
|
@@ -384,6 +406,224 @@ class PipelineRunner:
|
||||||
)
|
)
|
||||||
raise PipelineError(f"Pipeline error: unsupported stage type '{stage.type}'.")
|
raise PipelineError(f"Pipeline error: unsupported stage type '{stage.type}'.")
|
||||||

    def _write_project_context_chart(self) -> Path:
        chart = build_project_context_chart(self.config.project.root, self.config.safety)
        self.artifacts.initialize_run()
        self.artifacts.project_context_chart_path.write_text(chart, encoding="utf-8")
        self.logger.event(
            "artifact.write",
            "Wrote project context chart",
            artifact_path=self.artifacts.project_context_chart_path.relative_to(self.config.project.root),
        )
        return self.artifacts.project_context_chart_path

    def _run_code_writer_stage(
        self,
        stage: StageConfig,
        task: Task,
        previous_outputs: dict[str, str],
        retry_notes: list[str],
        retry_count: int = 0,
    ) -> StageResult:
        if stage.agent is None:
            raise PipelineError(f"Pipeline error: code_writer stage '{stage.id}' must reference an agent.")
        enriched_outputs = dict(previous_outputs)
        context_pack_path = self._latest_task_artifact(task.id, "context-pack.md")
        if context_pack_path is not None:
            enriched_outputs["context-pack.md"] = context_pack_path.read_text(encoding="utf-8", errors="replace")
        chart_path = self.artifacts.project_context_chart_path
        if chart_path.exists():
            enriched_outputs["project-context-chart.md"] = chart_path.read_text(encoding="utf-8", errors="replace")
        result = self.agent_executor.run_stage(
            stage,
            task,
            enriched_outputs,
            retry_notes,
            project_context=self.context.read_context(task, retry_notes).project_context,
            task_context=self.context.read_context(task, retry_notes).task_context,
            retry_context=self.context.read_context(task, retry_notes).retry_context,
        )
        raw_output = self._read_output(result.output_path)
        stdout = extract_agent_stdout(raw_output)
        try:
            patch = extract_unified_diff(stdout)
        except PipelineError as exc:
            self.artifacts.write_stage_output(
                task.id,
                "implementation-summary.md",
                f"# Implementation Summary\n\nStatus: fail\nReason: {exc}\n",
            )
            return StageResult(stage.id, "fail", str(exc), output_path=result.output_path)
        patch_filename = stage.output or ("proposed.patch" if retry_count == 0 else f"repair-{retry_count}.patch")
        summary_filename = "implementation-summary.md" if retry_count == 0 else f"repair-summary-{retry_count}.md"
        proposed_path = self.artifacts.write_stage_output(task.id, patch_filename, patch)
        summary_path = self.artifacts.write_stage_output(
            task.id,
            summary_filename,
            format_implementation_summary(
                stage.id,
                proposed_path.relative_to(self.config.project.root).as_posix(),
                retry_count=retry_count,
                retry_notes=retry_notes,
            ),
        )
        self.logger.event(
            "artifact.write",
            "Wrote proposed patch",
            stage_id=stage.id,
            task_id=task.id,
            artifact_path=proposed_path.relative_to(self.config.project.root),
        )
        return StageResult(
            stage.id,
            "pass",
            "Proposed patch written.",
            output_path=str(proposed_path.relative_to(self.config.project.root)),
            context_update=f"Implementation summary: {summary_path.relative_to(self.config.project.root).as_posix()}",
        )

    def _run_patch_normalizer_stage(
        self,
        stage: StageConfig,
        task: Task,
        previous_outputs: dict[str, str],
        retry_notes: list[str],
    ) -> StageResult:
        source = _latest_patch_like_output(previous_outputs)
        if stage.agent is not None:
            result = self.agent_executor.run_stage(
                stage,
                task,
                {"patch_input": source, **previous_outputs},
                retry_notes,
                project_context=self.context.read_context(task, retry_notes).project_context,
                task_context=self.context.read_context(task, retry_notes).task_context,
                retry_context=self.context.read_context(task, retry_notes).retry_context,
            )
            source = extract_agent_stdout(self._read_output(result.output_path))
        try:
            patch = normalize_patch_text(source)
        except PipelineError as exc:
            return StageResult(stage.id, "fail", str(exc))
        output_path = self.artifacts.write_stage_output(task.id, stage.output or "normalized.patch", patch)
        self.logger.event(
            "artifact.write",
            "Wrote normalized patch",
            stage_id=stage.id,
            task_id=task.id,
            artifact_path=output_path.relative_to(self.config.project.root),
        )
        return StageResult(
            stage.id,
            "pass",
            "Normalized patch written.",
            output_path=str(output_path.relative_to(self.config.project.root)),
        )

    def _run_patch_validator_stage(
        self,
        stage: StageConfig,
        task: Task,
        previous_outputs: dict[str, str],
    ) -> StageResult:
        source = _latest_patch_like_output(previous_outputs)
        try:
            patch = normalize_patch_text(source)
            result = validate_patch(
                patch,
                self.config.project.root,
                self.config.safety,
                max_files=stage.max_files or DEFAULT_MAX_FILES,
                max_changed_lines=stage.max_lines or DEFAULT_MAX_CHANGED_LINES,
                forbidden_paths=stage.forbidden_paths or DEFAULT_FORBIDDEN_PATHS,
            )
        except PipelineError as exc:
            output_path = self.artifacts.write_stage_output(
                task.id,
                stage.output or "patch-validation.md",
                f"# Patch Validation\n\nStatus: fail\nReason: {exc}\n",
            )
            return StageResult(
                stage.id,
                "fail",
                str(exc),
                output_path=str(output_path.relative_to(self.config.project.root)),
            )
        output_path = self.artifacts.write_stage_output(
            task.id,
            stage.output or "patch-validation.md",
            format_validation_result(result),
        )
        return StageResult(
            stage.id,
            "pass",
            "Patch validation passed.",
            output_path=str(output_path.relative_to(self.config.project.root)),
        )

    def _run_patch_apply_stage(
        self,
        stage: StageConfig,
        task: Task,
        previous_outputs: dict[str, str],
    ) -> StageResult:
        source = _latest_patch_like_output(previous_outputs)
        try:
            patch = normalize_patch_text(source)
            validate_patch(
                patch,
                self.config.project.root,
                self.config.safety,
                max_files=stage.max_files or DEFAULT_MAX_FILES,
                max_changed_lines=stage.max_lines or DEFAULT_MAX_CHANGED_LINES,
                forbidden_paths=stage.forbidden_paths or DEFAULT_FORBIDDEN_PATHS,
            )
        except PipelineError as exc:
            output_path = self.artifacts.write_stage_output(
                task.id,
                stage.output or "patch-apply-output.txt",
                f"# Patch Apply\n\nStatus: fail\nReason: {exc}\n",
            )
            return StageResult(
                stage.id,
                "fail",
                str(exc),
                output_path=str(output_path.relative_to(self.config.project.root)),
            )

        applied_path = self.artifacts.write_stage_output(task.id, "applied.patch", patch)
        write_git_artifacts(self.artifacts, task.id, "before-patch-apply")
        mode = stage.mode or "dry_run"
        apply_result = apply_patch_with_git(applied_path, self.config.project.root, mode=mode)
        write_git_artifacts(self.artifacts, task.id, "after-patch-apply")
        output_path = self.artifacts.write_stage_output(
            task.id,
            stage.output or "patch-apply-output.txt",
            format_patch_apply_result(
                apply_result,
                applied_path.relative_to(self.config.project.root).as_posix(),
            ),
        )
        if apply_result.status != "pass":
            return StageResult(
                stage.id,
                "fail",
                f"Patch apply failed with code {apply_result.exit_code}.",
                output_path=str(output_path.relative_to(self.config.project.root)),
                context_update=apply_result.stderr.strip() or apply_result.stdout.strip(),
            )
        reason = "Patch dry run passed." if mode == "dry_run" else "Patch applied."
        return StageResult(
            stage.id,
            "pass",
            reason,
            output_path=str(output_path.relative_to(self.config.project.root)),
        )

    def _latest_task_artifact(self, task_id: str, filename: str) -> Path | None:
        path = self.artifacts.create_task_dir(task_id).directory / filename
        return path if path.exists() else None

def _maybe_rerun_agent_with_repo_lookup(
|
def _maybe_rerun_agent_with_repo_lookup(
|
||||||
self,
|
self,
|
||||||
stage: StageConfig,
|
stage: StageConfig,
|
||||||
|
|
@@ -528,6 +768,39 @@ def format_task_completion(task: Task, status: str, changed: bool) -> str:
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def format_implementation_summary(
    stage_id: str,
    patch_path: str,
    retry_count: int = 0,
    retry_notes: list[str] | None = None,
) -> str:
    notes = retry_notes or []
    lines = [
        "# Implementation Summary",
        "",
        f"Stage: `{stage_id}`",
        "Status: pass",
        f"Repair attempt: {retry_count}",
        f"Patch: `{patch_path}`",
        "",
        "## Retry Feedback",
        "",
    ]
    lines.extend(f"- {note}" for note in notes[-5:]) if notes else lines.append("- None")
    lines.append("")
    return "\n".join(lines)


def _latest_patch_like_output(previous_outputs: dict[str, str]) -> str:
    for name in ("normalized.patch", "applied.patch", "proposed.patch", "patch_input"):
        if name in previous_outputs and previous_outputs[name].strip():
            return previous_outputs[name]
    for stage_id, content in reversed(list(previous_outputs.items())):
        if stage_id.endswith(".patch") or "diff --git " in content or "\n--- " in content:
            return content
    raise PipelineError("Patch error: no previous patch output found.")


def format_aggregate_run_summary(results: list[PipelineResult], status: str, reason: str) -> str:
|
def format_aggregate_run_summary(results: list[PipelineResult], status: str, reason: str) -> str:
|
||||||
lines = [
|
lines = [
|
||||||
"# Run Summary",
|
"# Run Summary",
|
||||||
|
|
|
||||||
nightshift/project_chart.py (new file, 148 lines added)
@@ -0,0 +1,148 @@
+"""Project context chart generation."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+import re
+
+from .config import SafetyConfig
+from .safety import resolve_project_root, validate_scoped_paths
+
+
+CODE_EXTENSIONS = {".py", ".js", ".ts", ".tsx", ".jsx", ".md", ".yaml", ".yml", ".toml"}
+
+
+@dataclass(frozen=True)
+class FileChart:
+    path: str
+    responsibility: str
+    functions: tuple[str, ...]
+    classes: tuple[str, ...]
+    anchors: tuple[str, ...]
+    is_entry_point: bool
+    is_test: bool
+
+
+def build_project_context_chart(project_root: str | Path, safety: SafetyConfig, max_files: int = 200) -> str:
+    root = resolve_project_root(project_root)
+    scoped_roots = validate_scoped_paths(root, safety.scoped_paths or (".",))
+    files: list[Path] = []
+    for scoped_root in scoped_roots:
+        if scoped_root.is_file():
+            candidates = [scoped_root]
+        else:
+            candidates = [item for item in scoped_root.rglob("*") if item.is_file()]
+        for candidate in candidates:
+            if _skip(candidate, root):
+                continue
+            if candidate not in files:
+                files.append(candidate)
+
+    charts = [_chart_file(path, root) for path in sorted(files)[:max_files]]
+    return format_project_context_chart(charts, truncated=max(0, len(files) - max_files))
+
+
+def format_project_context_chart(charts: list[FileChart], truncated: int = 0) -> str:
+    lines = ["# Project Context Chart", ""]
+    if not charts:
+        lines.append("No project files found.")
+        return "\n".join(lines)
+
+    lines.extend(["## Entry Points", ""])
+    entry_points = [chart for chart in charts if chart.is_entry_point]
+    lines.extend(f"- `{chart.path}`: {chart.responsibility}" for chart in entry_points)
+    if not entry_points:
+        lines.append("- None detected")
+
+    lines.extend(["", "## Tests", ""])
+    tests = [chart for chart in charts if chart.is_test]
+    lines.extend(f"- `{chart.path}`" for chart in tests)
+    if not tests:
+        lines.append("- None detected")
+
+    lines.extend(["", "## Files", ""])
+    for chart in charts:
+        lines.extend(
+            [
+                f"### `{chart.path}`",
+                "",
+                f"- Responsibility: {chart.responsibility}",
+                f"- Entry point: {str(chart.is_entry_point).lower()}",
+                f"- Test file: {str(chart.is_test).lower()}",
+                f"- Functions: {', '.join(chart.functions) if chart.functions else 'None detected'}",
+                f"- Classes: {', '.join(chart.classes) if chart.classes else 'None detected'}",
+                f"- Anchors/search terms: {', '.join(chart.anchors) if chart.anchors else 'None detected'}",
+                "",
+            ]
+        )
+    if truncated:
+        lines.append(f"Truncated {truncated} additional files.")
+    return "\n".join(lines)
+
+
+def _chart_file(path: Path, root: Path) -> FileChart:
+    relative = path.relative_to(root).as_posix()
+    text = path.read_text(encoding="utf-8", errors="replace")
+    lines = text.splitlines()
+    functions: list[str] = []
+    classes: list[str] = []
+    anchors: list[str] = []
+    for line_number, line in enumerate(lines, start=1):
+        function = re.match(r"\s*(?:async\s+)?def\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(", line)
+        js_function = re.match(r"\s*(?:export\s+)?function\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(", line)
+        klass = re.match(r"\s*class\s+([A-Za-z_][A-Za-z0-9_]*)", line)
+        if function:
+            name = function.group(1)
+            functions.append(f"{name}@L{line_number}")
+            anchors.append(name)
+        elif js_function:
+            name = js_function.group(1)
+            functions.append(f"{name}@L{line_number}")
+            anchors.append(name)
+        elif klass:
+            name = klass.group(1)
+            classes.append(f"{name}@L{line_number}")
+            anchors.append(name)
+
+    return FileChart(
+        path=relative,
+        responsibility=_responsibility(relative, lines),
+        functions=tuple(functions),
+        classes=tuple(classes),
+        anchors=tuple(dict.fromkeys(anchors[:12])),
+        is_entry_point=_is_entry_point(relative, text),
+        is_test=_is_test(relative),
+    )
+
+
+def _responsibility(relative: str, lines: list[str]) -> str:
+    for line in lines[:12]:
+        stripped = line.strip().strip("#").strip()
+        if stripped and not stripped.startswith(("from ", "import ")):
+            return stripped[:140]
+    if _is_test(relative):
+        return "Test coverage."
+    return "Source or project support file."
+
+
+def _is_entry_point(relative: str, text: str) -> bool:
+    name = Path(relative).name.lower()
+    return name in {"cli.py", "main.py", "__main__.py"} or "if __name__ == \"__main__\"" in text
+
+
+def _is_test(relative: str) -> bool:
+    parts = relative.lower().split("/")
+    name = parts[-1]
+    return "tests" in parts or name.startswith("test_") or name.endswith("_test.py")
+
+
+def _skip(path: Path, root: Path) -> bool:
+    try:
+        relative = path.relative_to(root)
+    except ValueError:
+        return True
+    parts = set(relative.parts)
+    if ".git" in parts or ".nightshift" in parts or "__pycache__" in parts:
+        return True
+    return path.suffix.lower() not in CODE_EXTENSIONS
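The line-by-line scan in `_chart_file` can be tried in isolation. The following is a minimal sketch rather than the module itself: `scan_definitions` and the sample text are hypothetical, and only the three regular expressions and the `name@L<line>` label format are taken from the diff.

```python
import re

# Patterns copied from the diff; the helper around them is illustrative only.
PY_DEF = re.compile(r"\s*(?:async\s+)?def\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(")
JS_FUNCTION = re.compile(r"\s*(?:export\s+)?function\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(")
CLASS_DEF = re.compile(r"\s*class\s+([A-Za-z_][A-Za-z0-9_]*)")


def scan_definitions(text: str) -> tuple[list[str], list[str]]:
    """Return (functions, classes), each labelled as name@L<line number>."""
    functions: list[str] = []
    classes: list[str] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Python and JavaScript function definitions share one result list.
        match = PY_DEF.match(line) or JS_FUNCTION.match(line)
        if match:
            functions.append(f"{match.group(1)}@L{line_number}")
            continue
        match = CLASS_DEF.match(line)
        if match:
            classes.append(f"{match.group(1)}@L{line_number}")
    return functions, classes


# Hypothetical input mixing Python and JavaScript syntax.
SAMPLE = "class Runner:\n    async def run(self):\n        pass\n\nexport function render() {}\n"
functions, classes = scan_definitions(SAMPLE)
print(functions)
print(classes)
```

Because the scan works one line at a time, a line inside a multi-line string that happens to begin with `def name(` would be reported as well; that trade-off seems consistent with the "simple regex/parser MVP" item in the Phase 28 checklist.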
@@ -223,19 +223,20 @@ def extract_agent_stdout(artifact_text: str) -> str:
     for index, line in enumerate(lines):
         if line.strip() != "## stdout":
             continue
-        start = None
-        for cursor in range(index + 1, len(lines)):
-            if lines[cursor].strip().startswith("```"):
-                start = cursor + 1
-                break
-        if start is None:
-            return ""
-        end = len(lines)
-        for cursor in range(start, len(lines)):
-            if lines[cursor].strip().startswith("```"):
-                end = cursor
-                break
-        return "\n".join(lines[start:end])
+        end = next(
+            (cursor for cursor in range(index + 1, len(lines)) if lines[cursor].strip() == "## stderr"),
+            len(lines),
+        )
+        section = lines[index + 1:end]
+        while section and not section[0].strip():
+            section = section[1:]
+        while section and not section[-1].strip():
+            section = section[:-1]
+        if section and section[0].strip().startswith("```"):
+            section = section[1:]
+        if section and section[-1].strip() == "```":
+            section = section[:-1]
+        return "\n".join(section)
     return artifact_text
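The new body of `extract_agent_stdout` takes everything between the `## stdout` heading and the `## stderr` heading, then trims blank lines and one outer fence, instead of stopping at the first closing fence. A standalone sketch of that logic follows. The name `extract_stdout_section`, the `FENCE` constant, the `lines = artifact_text.splitlines()` setup, and the sample artifact layout are all assumptions, since the lines before the hunk are not shown.

```python
FENCE = "`" * 3  # three backticks, built indirectly so the example stays readable


def extract_stdout_section(artifact_text: str) -> str:
    """Return the body of the '## stdout' section without its outer fence."""
    lines = artifact_text.splitlines()  # assumed setup; not visible in the hunk
    for index, line in enumerate(lines):
        if line.strip() != "## stdout":
            continue
        # The section runs to the '## stderr' heading, or to the end of the text.
        end = next(
            (cursor for cursor in range(index + 1, len(lines)) if lines[cursor].strip() == "## stderr"),
            len(lines),
        )
        section = lines[index + 1:end]
        # Drop blank padding on both sides.
        while section and not section[0].strip():
            section = section[1:]
        while section and not section[-1].strip():
            section = section[:-1]
        # Drop one outer fence pair, leaving any inner fences untouched.
        if section and section[0].strip().startswith(FENCE):
            section = section[1:]
        if section and section[-1].strip() == FENCE:
            section = section[:-1]
        return "\n".join(section)
    return artifact_text


# Hypothetical artifact: agent stdout that itself contains a fenced diff.
artifact = "\n".join(
    [
        "## stdout",
        "",
        FENCE + "text",
        "Here is the patch:",
        FENCE + "diff",
        "-old",
        "+new",
        FENCE,
        FENCE,
        "",
        "## stderr",
        "",
        FENCE + "text",
        FENCE,
    ]
)
result = extract_stdout_section(artifact)
print(result)
```

Run against the same sample, the removed fence-to-fence scan would stop at the inner `diff` fence and return only `Here is the patch:`, which is presumably why the section is now bounded by headings rather than fences.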
@@ -206,6 +206,46 @@ class ConfigTests(unittest.TestCase):
             self.assertEqual(test_stage.timeout_seconds, 30)
             self.assertEqual(test_stage.working_dir, Path("."))

+    def test_patch_validator_stage_options_load(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_path.write_text(
+                config_path.read_text(encoding="utf-8").replace(
+                    " - id: summarize",
+                    " - id: validate_patch\n type: patch_validator\n max_files: 2\n max_lines: 100\n forbidden_paths:\n - secrets\n\n - id: summarize",
+                    1,
+                ),
+                encoding="utf-8",
+            )
+
+            config = load_config(config_path)
+            patch_stage = next(stage for stage in config.pipeline.stages if stage.id == "validate_patch")
+
+            self.assertEqual(patch_stage.max_files, 2)
+            self.assertEqual(patch_stage.max_lines, 100)
+            self.assertEqual(patch_stage.forbidden_paths, ("secrets",))
+
+    def test_patch_apply_mode_loads(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_path.write_text(
+                config_path.read_text(encoding="utf-8").replace(
+                    " - id: summarize",
+                    " - id: apply_patch\n type: patch_apply\n mode: dry_run\n\n - id: summarize",
+                    1,
+                ),
+                encoding="utf-8",
+            )
+
+            config = load_config(config_path)
+            apply_stage = next(stage for stage in config.pipeline.stages if stage.id == "apply_patch")
+
+            self.assertEqual(apply_stage.mode, "dry_run")
+
     def test_agent_temperature_loads(self) -> None:
         with tempfile.TemporaryDirectory() as directory:
             root = Path(directory)
tests/test_patches.py (new file, 57 lines added)
@@ -0,0 +1,57 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.config import SafetyConfig
+from nightshift.errors import PipelineError
+from nightshift.patches import normalize_patch_text, validate_patch
+
+
+PATCH = """diff --git a/src/app.py b/src/app.py
+--- a/src/app.py
++++ b/src/app.py
+@@ -1 +1 @@
+-old
++new
+"""
+
+
+class PatchTests(unittest.TestCase):
+    def test_normalize_extracts_fenced_patch(self) -> None:
+        text = f"Here it is:\n```diff\n{PATCH}```\n"
+
+        self.assertEqual(normalize_patch_text(text), PATCH)
+
+    def test_validate_patch_enforces_scopes(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            (root / "src").mkdir()
+            safety = SafetyConfig(
+                require_clean_worktree=False,
+                scoped_paths=("src",),
+                allowed_commands=(),
+                forbidden_commands=(),
+            )
+
+            result = validate_patch(PATCH, root, safety)
+
+            self.assertEqual(result.files, ("src/app.py",))
+            self.assertEqual(result.changed_lines, 2)
+
+    def test_validate_patch_rejects_forbidden_path(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            safety = SafetyConfig(
+                require_clean_worktree=False,
+                scoped_paths=(".",),
+                allowed_commands=(),
+                forbidden_commands=(),
+            )
+            patch = PATCH.replace("src/app.py", ".nightshift/log.txt")
+
+            with self.assertRaisesRegex(PipelineError, "forbidden path"):
+                validate_patch(patch, root, safety)
+
+
+if __name__ == "__main__":
+    unittest.main()
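`test_validate_patch_enforces_scopes` expects `changed_lines == 2` for a patch with one removed and one added line. How `validate_patch` arrives at that number is not shown in this diff. The following is one possible sketch of such a count, assuming it means added plus removed lines inside hunks; `count_changed_lines`, `HUNK_HEADER`, and the `TRICKY` sample are hypothetical names, not part of `nightshift.patches`.

```python
import re

# Hunk header such as "@@ -1,2 +1 @@"; a missing count means 1.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def count_changed_lines(patch_text: str) -> int:
    """Count added and removed lines, using hunk lengths to skip file headers."""
    changed = 0
    old_left = 0  # old-side lines still expected in the current hunk
    new_left = 0  # new-side lines still expected in the current hunk
    for line in patch_text.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: only a hunk header is meaningful here.
            header = HUNK_HEADER.match(line)
            if header:
                old_left = int(header.group(1) or 1)
                new_left = int(header.group(2) or 1)
            continue
        if line.startswith("+"):
            new_left -= 1
            changed += 1
        elif line.startswith("-"):
            old_left -= 1
            changed += 1
        elif line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        else:
            # Context line: present on both sides.
            old_left -= 1
            new_left -= 1
    return changed


PATCH = "diff --git a/src/app.py b/src/app.py\n--- a/src/app.py\n+++ b/src/app.py\n@@ -1 +1 @@\n-old\n+new\n"
# A removed line whose own text starts with "-- " looks like a file header.
TRICKY = "--- a/q.sql\n+++ b/q.sql\n@@ -1,2 +1 @@\n--- old comment\n keep\n"
print(count_changed_lines(PATCH), count_changed_lines(TRICKY))
```

Tracking the hunk lengths is a design choice of this sketch: it keeps `---` and `+++` file headers out of the count without misreading a removed line whose content begins with dashes.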
@@ -303,6 +303,107 @@ Acceptance Criteria:
             self.assertIn("Context Pack", pack.read_text(encoding="utf-8"))
             self.assertIn("app.py", pack.read_text(encoding="utf-8"))

+    def test_project_context_chart_is_written_during_run(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            (root / "cli.py").write_text(
+                "def main():\n return 0\n\nif __name__ == \"__main__\":\n main()\n",
+                encoding="utf-8",
+            )
+            stages = (StageConfig(id="plan", type="agent", agent="planner", output="plan.md"),)
+            config = make_config(root, stages)
+            runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
+            task = parse_tasks(TASK_MD)[0]
+
+            runner.run_task(task)
+
+            chart = root / ".nightshift" / "project-context-chart.md"
+            self.assertTrue(chart.exists())
+            content = chart.read_text(encoding="utf-8")
+            self.assertIn("cli.py", content)
+            self.assertIn("main@L1", content)
+
+    def test_code_writer_normalizer_and_validator_pipeline(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            (root / "app.py").write_text("old\n", encoding="utf-8")
+            (root / "fake_writer.py").write_text(
+                "\n".join(
+                    [
+                        "print('```diff')",
+                        "print('diff --git a/app.py b/app.py')",
+                        "print('--- a/app.py')",
+                        "print('+++ b/app.py')",
+                        "print('@@ -1 +1 @@')",
+                        "print('-old')",
+                        "print('+new')",
+                        "print('```')",
+                    ]
+                ),
+                encoding="utf-8",
+            )
+            stages = (
+                StageConfig(id="context", type="repo_context", output="context-pack.md"),
+                StageConfig(id="write", type="code_writer", agent="writer"),
+                StageConfig(id="normalize", type="patch_normalizer"),
+                StageConfig(id="validate", type="patch_validator"),
+            )
+            config = make_config(root, stages)
+            config.agents["writer"] = AgentConfig(
+                id="writer",
+                backend="command",
+                command="python fake_writer.py",
+                system_prompt=Path("planner.md"),
+            )
+            runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
+            task = parse_tasks(TASK_MD)[0]
+
+            result = runner.run_task(task)
+
+            task_dir = root / ".nightshift" / "runs" / "test-run" / "tasks" / task.id
+            self.assertEqual(result.status, "complete")
+            self.assertTrue((task_dir / "proposed.patch").exists())
+            self.assertTrue((task_dir / "implementation-summary.md").exists())
+            self.assertTrue((task_dir / "normalized.patch").exists())
+            self.assertIn("Status: pass", (task_dir / "patch-validation.md").read_text(encoding="utf-8"))
+
+    def test_patch_validator_rejects_unsafe_patch(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            stages = (
+                StageConfig(id="write", type="code_writer", agent="writer"),
+                StageConfig(id="validate", type="patch_validator"),
+            )
+            (root / "fake_writer.py").write_text(
+                "\n".join(
+                    [
+                        "print('diff --git a/.nightshift/log.txt b/.nightshift/log.txt')",
+                        "print('--- a/.nightshift/log.txt')",
+                        "print('+++ b/.nightshift/log.txt')",
+                        "print('@@ -1 +1 @@')",
+                        "print('-old')",
+                        "print('+new')",
+                    ]
+                ),
+                encoding="utf-8",
+            )
+            config = make_config(root, stages)
+            config.agents["writer"] = AgentConfig(
+                id="writer",
+                backend="command",
+                command="python fake_writer.py",
+                system_prompt=Path("planner.md"),
+            )
+            runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
+
+            result = runner.run_task(parse_tasks(TASK_MD)[0])
+
+            self.assertEqual(result.status, "failed")
+            self.assertIn("forbidden path", result.reason)
+

 def _write_common_files(root: Path) -> None:
     (root / "nightshift.yaml").write_text("project:\n name: test\n", encoding="utf-8")