mirror of
https://github.com/khodges42/nightShift.git
synced 2026-06-14 10:08:37 +00:00
Clean up docs, tests, patch writing bug
Checked out commit from rsarv3006 which is super interesting, grabbed some inspiration from it and mentioned it in the ideas file.
parent 33b9de5441
commit e1e6803eb1
3
.gitignore
vendored
@@ -25,7 +25,10 @@ share/python-wheels/
 .installed.cfg
 *.egg
 MANIFEST
+
+# Codex working notes and generated analysis docs
 docs/codex/
+
 # PyInstaller
 # Usually these files are written by a python script from a template
 # before PyInstaller builds the exe, so as to inject date/other infos into it.
@ -1,119 +0,0 @@
|
||||||
# Iteration 1: SCENE-002 Update State Failure
|
|
||||||
|
|
||||||
Date: 2026-05-22
|
|
||||||
|
|
||||||
## Run Reviewed
|
|
||||||
|
|
||||||
- Sandbox: `integ_runs/20260522T214944.385761Z`
|
|
||||||
- Run: `.nightshift/runs/20260522T215005.188534Z`
|
|
||||||
- Task: `SCENE-002`
|
|
||||||
- Final status: failed
|
|
||||||
- Failed stage: `update_state`
|
|
||||||
|
|
||||||
## What Happened
|
|
||||||
|
|
||||||
The scene workflow mostly succeeded:
|
|
||||||
|
|
||||||
- `draft_scene` wrote the scene.
|
|
||||||
- `continuity_review` correctly failed the first draft for pronoun drift.
|
|
||||||
- `edit_scene` repaired the pronoun issue.
|
|
||||||
- `continuity_review` passed after edit.
|
|
||||||
- `style_review` passed.
|
|
||||||
|
|
||||||
The remaining failure happened in `update_state`.
|
|
||||||
|
|
||||||
NightShift reported:
|
|
||||||
|
|
||||||
```text
|
|
||||||
File writer error: no file blocks found. Expected FILE: path with ---CONTENT---/---END--- or fenced blocks like ```file:path.py.
|
|
||||||
```
|
|
||||||
|
|
||||||
The model output did contain visible `FILE:` blocks, but it omitted the required `---END---` delimiter. It emitted:
|
|
||||||
|
|
||||||
```text
|
|
||||||
FILE: story/plot-state.md
|
|
||||||
---CONTENT---
|
|
||||||
...
|
|
||||||
|
|
||||||
FILE: story/characters.md
|
|
||||||
---CONTENT---
|
|
||||||
...
|
|
||||||
```
|
|
||||||
|
|
||||||
The current parser requires `---END---`, so it rejected all of the blocks.
|
|
||||||
|
|
||||||
## Additional Risk Found
|
|
||||||
|
|
||||||
The rejected state update also tried to rewrite character canon in unsafe ways:
|
|
||||||
|
|
||||||
- It changed BLOODMONEY's pronoun reference to `he/him`.
|
|
||||||
- It changed Cricket's pronoun reference to `they/them`.
|
|
||||||
- It compressed/replaced larger parts of `story/characters.md`.
|
|
||||||
|
|
||||||
That means simply accepting unterminated blocks is not enough. The parser can be more tolerant, but the state updater still needs stronger constraints so durable canon does not drift.
|
|
||||||
|
|
||||||
## Suggested Fixes
|
|
||||||
|
|
||||||
Short-term fixes for this iteration:
|
|
||||||
|
|
||||||
1. Make `parse_file_updates` tolerate delimiter blocks that omit `---END---` when a new `FILE:` block or EOF clearly terminates the previous block.
|
|
||||||
2. Keep strict path validation and duplicate-file validation unchanged.
|
|
||||||
3. Strengthen the state-updater prompt:
|
|
||||||
- never edit `Pronouns / Reference` sections
|
|
||||||
- preserve existing character profiles
|
|
||||||
- prefer updating `plot-state.md`, `timeline.md`, and `unresolved-threads.md`
|
|
||||||
- edit `characters.md` only for small additive current-status facts
|
|
||||||
4. Add regression tests for unterminated delimiter parsing.
|
|
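The tolerant parsing in fix 1 can be sketched roughly as follows. This is a minimal illustration, not the actual `parse_file_updates` in `nightshift/patches.py`: the name `parse_delimiter_blocks`, the returned dict shape, and the trailing-newline handling are all assumptions, and path and duplicate-file validation are deliberately left out.

```python
import re

# Hypothetical helper for illustration; not NightShift's real parser.
FILE_HEADER = re.compile(r"^FILE:\s*(?P<path>\S.*?)\s*$")


def parse_delimiter_blocks(text: str) -> dict[str, str]:
    """Parse FILE:/---CONTENT--- blocks where ---END--- is optional.

    A block ends at ---END---, at the next FILE: header, or at EOF.
    A later block for the same path silently replaces the earlier one.
    """
    blocks: dict[str, str] = {}
    path: str | None = None
    body: list[str] | None = None  # stays None until ---CONTENT--- is seen

    def flush() -> None:
        nonlocal path, body
        if path is not None and body is not None:
            blocks[path] = "\n".join(body).strip("\n") + "\n"
        path, body = None, None

    for line in text.splitlines():
        header = FILE_HEADER.match(line)
        if header:
            flush()  # a new header terminates an unterminated block
            path = header.group("path")
        elif path is not None and body is None and line.strip() == "---CONTENT---":
            body = []
        elif body is not None and line.strip() == "---END---":
            flush()
        elif body is not None:
            body.append(line)
    flush()  # EOF terminates the last block
    return blocks
```

Because a bare `FILE:` line inside content would also end the block, a real parser would likely keep strict path validation on every header it accepts, as fix 2 requires.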
||||||
|
|
||||||
Longer-term follow-up:
|
|
||||||
|
|
||||||
- Add deterministic writing-state validation that rejects changes to protected canon sections such as `Pronouns / Reference`.
|
|
||||||
- Move character canon into structured data so pronoun constraints can be validated directly.
|
|
||||||
|
|
||||||
## Planned Changes
|
|
||||||
|
|
||||||
- Update delimiter block parsing in `nightshift/patches.py`.
|
|
||||||
- Add parser tests in `tests/test_patches.py`.
|
|
||||||
- Tighten `state-updater.md` in the tutorial novel template.
|
|
||||||
- Run focused parser tests and the full suite.
|
|
||||||
|
|
||||||
## Changes Made
|
|
||||||
|
|
||||||
- `parse_file_updates` now accepts delimiter-style file blocks that omit `---END---` when the next `FILE:` header or EOF clearly terminates the block.
|
|
||||||
- Added regression coverage for:
|
|
||||||
- unterminated delimiter blocks before another `FILE:`
|
|
||||||
- mixed terminated and unterminated delimiter blocks
|
|
||||||
- Strengthened the tutorial novel state updater prompt to protect character canon:
|
|
||||||
- never change `Pronouns / Reference`
|
|
||||||
- never change canonical pronouns, narrative reference, identity, or core wound
|
|
||||||
- prefer state/timeline/thread files over `characters.md`
|
|
||||||
- edit `characters.md` only for small additive current-status facts or new named characters
|
|
||||||
- Added deterministic protection in file-block patch generation:
|
|
||||||
- changes to existing `Pronouns / Reference` sections in `story/characters.md` are rejected before a patch is generated
|
|
||||||
- Added regression coverage for rejecting protected pronoun canon changes.
|
|
||||||
|
|
||||||
## Verification
|
|
||||||
|
|
||||||
Focused tests:
|
|
||||||
|
|
||||||
```powershell
|
|
||||||
python -m pytest tests/test_patches.py tests/test_pipeline.py -q
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
```text
|
|
||||||
56 passed
|
|
||||||
```
|
|
||||||
|
|
||||||
Full suite:
|
|
||||||
|
|
||||||
```powershell
|
|
||||||
python -m pytest -q
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
```text
|
|
||||||
199 passed, 4 subtests passed
|
|
||||||
```
|
|
||||||
|
|
@ -1,73 +0,0 @@
|
||||||
# NightShift Integration Failure Analysis
|
|
||||||
|
|
||||||
## Immediate Causes
|
|
||||||
|
|
||||||
I would separate the failures into four buckets:
|
|
||||||
|
|
||||||
1. The pastebin template is not truly incremental.
|
|
||||||
`tests/test_pastebin.py` already tests listing/filtering and expiration, even though `TASK-001` only asks for create/view. The stock app also already has a fairly complete `create_app` implementation. So the task is not "build feature 1"; it is "modify an already-complete app without breaking future-task behavior."
|
|
||||||
|
|
||||||
2. The retry stop policy is harsher than the config implies.
|
|
||||||
Even with `stop_on_repeated_failure_signature_after: 6`, `nightshift/escalation.py` unconditionally stops after the last 3 entries have the same stage and cause. That explains the "same stage same reason" stop before the configured repeated-signature threshold.
|
|
||||||
|
|
||||||
3. The model got bad or insufficient context early.
|
|
||||||
In the run artifacts, the planner asked for `app/models.py` and `app/routes.py`, both outside the actual scoped repo. That pushed it toward a hallucinated Flask/SQLAlchemy architecture. Later repairs added `tests/test_snippets.py` importing nonexistent `app`, then tried to repair by deleting large amounts of code, which patch validation correctly rejected.
|
|
||||||
|
|
||||||
4. The template and manual deletion created contradictory state.
|
|
||||||
In the latest project, `src/pastebin_app/__init__.py` imports `create_app`, but `src/pastebin_app/app.py` no longer defines it. `tests/test_pastebin.py` is now empty, while generated `tests/test_snippets.py` expects a different app shape. That is exactly the kind of broken intermediate state a local model will churn on unless the orchestrator gives it a very explicit recovery path.
|
|
||||||
|
|
||||||
## On Pre-Generated Code
|
|
||||||
|
|
||||||
I agree with your instinct: for this tutorial, pre-generated app code is hurting more than helping.
|
|
||||||
|
|
||||||
A better template would include:
|
|
||||||
|
|
||||||
- `pyproject.toml`
|
|
||||||
- package directories and empty `__init__.py`
|
|
||||||
- minimal templates if the task needs HTML later
|
|
||||||
- no complete app logic
|
|
||||||
- no future-task tests active during `TASK-001`
|
|
||||||
- a small `tests/test_task001.py` for only create/view
|
|
||||||
|
|
||||||
Then `TASK-002` adds list/filter tests, `TASK-003` adds expiration tests, etc. The AI should build forward, not preserve a hidden completed app.
|
|
||||||
|
|
||||||
## Why Claude/Codex Feel Different
|
|
||||||
|
|
||||||
Production coding agents usually have an inner loop:
|
|
||||||
|
|
||||||
- inspect files
|
|
||||||
- edit narrowly
|
|
||||||
- run targeted tests
|
|
||||||
- read exact failure
|
|
||||||
- inspect more files
|
|
||||||
- edit again
|
|
||||||
- rerun
|
|
||||||
|
|
||||||
NightShift currently has a coarser loop: generate one patch, normalize, apply, run tests, summarize, retry. That is auditable, but it means each retry is another sampled patch rather than an interactive repair session. Swapping models does not fix bad task shape, bad context, or contradictory repo state.
|
|
||||||
|
|
||||||
## Best Options
|
|
||||||
|
|
||||||
Option A: fix the current design conservatively.
|
|
||||||
|
|
||||||
- Remove pre-generated pastebin app logic.
|
|
||||||
- Split tests by task.
|
|
||||||
- Run only task-relevant tests during the task, then full suite after success.
|
|
||||||
- Move deterministic repo context before planning, or at least always include file tree plus full contents of likely target files.
|
|
||||||
- Make churn stopping obey config; do not hard-stop after 3 same-stage failures unless configured.
|
|
||||||
- Improve retry signatures to ignore pytest cache warnings and prefer project traceback lines.
|
|
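A config-respecting churn stop, as suggested in Option A, might look like the sketch below. Only the setting name `stop_on_repeated_failure_signature_after` comes from the analysis above; `FailureEntry` and `should_stop_for_churn` are hypothetical names, and this is not the code in `nightshift/escalation.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureEntry:
    """Hypothetical record of one failed attempt."""

    stage: str
    cause: str


def should_stop_for_churn(history: list[FailureEntry], threshold: int) -> bool:
    """Stop only when the last `threshold` failures share one stage and cause.

    `threshold` stands in for stop_on_repeated_failure_signature_after;
    a value of 0 or less disables the check entirely.
    """
    if threshold <= 0 or len(history) < threshold:
        return False
    recent = history[-threshold:]
    return len({(entry.stage, entry.cause) for entry in recent}) == 1
```

With a threshold of 6, three identical failures no longer trigger a stop, which is the behavior the config already implies.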
||||||
|
|
||||||
Option B: add a real repair micro-loop.
|
|
||||||
|
|
||||||
For command/test failures, run a bounded repair loop before consuming another global retry:
|
|
||||||
|
|
||||||
```text
|
|
||||||
failure -> classify -> inspect exact files -> produce small patch -> run targeted test -> repeat 2-4 times
|
|
||||||
```
|
|
||||||
|
|
||||||
That would make NightShift behave more like Codex/Claude while preserving artifacts.
|
|
||||||
|
|
||||||
Option C: delegate hard repairs to production agent backends.
|
|
||||||
|
|
||||||
Add a `codex`/`claude-code` backend stage for implementation/repair. NightShift still owns task selection, safety, artifacts, tests, and reports, but lets a stronger tool run the inner edit/test loop.
|
|
||||||
|
|
||||||
My recommendation: do A first, then B. The template/task mismatch is the largest avoidable failure source, and the unconditional churn stop is a real policy bug. Once those are fixed, the remaining failures will be much more informative.
|
|
||||||
|
|
@@ -256,3 +256,16 @@ Reason:
 - fallback makes artifacts harder to reason about
 - model variability is bad while debugging pipeline behavior
 - the default template should remain the reliability harness
+
+## P2: Adopt Useful Fork Ideas From rsarv3006
+
+Source: https://github.com/rsarv3006/nightShift/commit/649eef65546a4ae648170bf29663f939eb031d2c
+
+Author: GitHub user `rsarv3006`
+
+Useful ideas to consider porting:
+
+- Add `on_status` stage routing so review stages can route `pass`, `retry`, `fail`, and `escalate` to different follow-up stages.
+- Add configurable repo lookup exclusions, for example `safety.skip_repo_parts`, so projects can hide generated or irrelevant directories from planner/reviewer context tools.
+- Add configurable agent timeout, for example `pipeline.agent_timeout_seconds`, so long local-model runs can be tuned per project.
+- Add docs and focused tests around status-based routing behavior.
@ -1,396 +0,0 @@
|
||||||
# Agentic Novel Writing Workflow Idea
|
|
||||||
|
|
||||||
NightShift could plausibly support non-coding workflows, especially long-form fiction, because the core abstraction is not actually "write code." It is:
|
|
||||||
|
|
||||||
- read task context
|
|
||||||
- call one or more agents
|
|
||||||
- produce artifacts
|
|
||||||
- validate outputs
|
|
||||||
- update project state
|
|
||||||
- move to the next task
|
|
||||||
|
|
||||||
That maps surprisingly well to writing a novel.
|
|
||||||
|
|
||||||
## Core Realization
|
|
||||||
|
|
||||||
A novel workflow should not ask one model to write the whole book, or even necessarily one whole chapter.
|
|
||||||
|
|
||||||
The durable project files would act like the source of truth:
|
|
||||||
|
|
||||||
- `worldbuilding.md`
|
|
||||||
- `characters.md`
|
|
||||||
- `plot-state.md`
|
|
||||||
- `style-guide.md`
|
|
||||||
- `outline.md`
|
|
||||||
- `chapters/chapter-001.md`
|
|
||||||
- `chapters/chapter-001-scene-001.md`
|
|
||||||
- `tasks.md`
|
|
||||||
|
|
||||||
The task file would drive the work, similar to coding tasks:
|
|
||||||
|
|
||||||
```text
|
|
||||||
- [ ] SCENE-001: Opening scene at the border checkpoint
|
|
||||||
|
|
||||||
Description:
|
|
||||||
Write the opening scene where Mara tries to enter the city under a false work permit.
|
|
||||||
|
|
||||||
Acceptance Criteria:
|
|
||||||
- Introduces Mara's immediate goal
|
|
||||||
- Shows the checkpoint culture without exposition dump
|
|
||||||
- Mentions the salt tax conflict indirectly
|
|
||||||
- Ends with the inspector noticing the forged seal
|
|
||||||
- 900-1400 words
|
|
||||||
- Maintains close third-person POV
|
|
||||||
```
|
|
||||||
|
|
||||||
NightShift would run one scene or section at a time.
|
|
||||||
|
|
||||||
## What We Already Have
|
|
||||||
|
|
||||||
NightShift already has several useful primitives:
|
|
||||||
|
|
||||||
- task files for chunking the novel into scenes or chapter sections
|
|
||||||
- scoped paths so agents only edit allowed writing/project files
|
|
||||||
- artifact output so drafts, reviews, and notes are preserved
|
|
||||||
- retry loops for revision
|
|
||||||
- planner/reviewer/debugger-style roles
|
|
||||||
- repo context and semantic context retrieval
|
|
||||||
- command stages that could run deterministic checks
|
|
||||||
- file-writer stages that can update Markdown files
|
|
||||||
- `lookup_requests` so agents can ask to read worldbuilding or prior scenes
|
|
||||||
|
|
||||||
That means this may not require a totally new engine. It may mostly need a new template and some writing-specific validation/review stages.
|
|
||||||
|
|
||||||
## Likely Workflow
|
|
||||||
|
|
||||||
One practical pipeline:
|
|
||||||
|
|
||||||
```text
|
|
||||||
plan_scene
|
|
||||||
gather_context
|
|
||||||
draft_scene
|
|
||||||
validate_scene
|
|
||||||
continuity_review
|
|
||||||
style_review
|
|
||||||
update_plot_state
|
|
||||||
summarize
|
|
||||||
```
|
|
||||||
|
|
||||||
Possible roles:
|
|
||||||
|
|
||||||
- Planner: turns the scene task into a beat plan.
|
|
||||||
- Context agent: pulls relevant worldbuilding, character, and plot-state excerpts.
|
|
||||||
- Drafting agent: writes the scene.
|
|
||||||
- Continuity reviewer: checks contradictions against known state.
|
|
||||||
- Style reviewer: checks POV, tone, pacing, and prose constraints.
|
|
||||||
- State updater: updates `plot-state.md`, `characters.md`, and maybe `timeline.md`.
|
|
||||||
|
|
||||||
## Chunking Strategy
|
|
||||||
|
|
||||||
Do not make a task equal to "write chapter 4" unless chapters are short.
|
|
||||||
|
|
||||||
Better units:
|
|
||||||
|
|
||||||
- scene
|
|
||||||
- scene fragment
|
|
||||||
- chapter section
|
|
||||||
- revision pass for one scene
|
|
||||||
- continuity update after one scene
|
|
||||||
- prose polish for one scene
|
|
||||||
|
|
||||||
A chapter can be assembled from multiple scene files:
|
|
||||||
|
|
||||||
```text
|
|
||||||
chapters/
|
|
||||||
chapter-001/
|
|
||||||
scene-001.md
|
|
||||||
scene-002.md
|
|
||||||
scene-003.md
|
|
||||||
chapter-001.md
|
|
||||||
```
|
|
||||||
|
|
||||||
Then a later command or agent stage can compile `chapter-001.md`.
|
|
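The later "compile" step could be a small deterministic command. A minimal sketch, assuming scene files are named `scene-NNN.md` with zero-padded numbers so that name order equals scene order; `compile_chapter` is a hypothetical helper, not an existing NightShift function.

```python
from pathlib import Path


def compile_chapter(chapter_dir: Path, output_path: Path) -> int:
    """Concatenate scene-*.md files in name order; return the scene count."""
    scenes = sorted(chapter_dir.glob("scene-*.md"))
    parts = [scene.read_text(encoding="utf-8").strip() for scene in scenes]
    # Scenes are separated by one blank line; the file ends with a newline.
    output_path.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return len(scenes)
```

Called as `compile_chapter(Path("chapters/chapter-001"), Path("chapters/chapter-001.md"))`, it would rebuild the chapter file from the layout shown above without involving a model.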
||||||
|
|
||||||
## Durable State Files
|
|
||||||
|
|
||||||
The most important design piece is explicit state.
|
|
||||||
|
|
||||||
Recommended files:
|
|
||||||
|
|
||||||
```text
|
|
||||||
story/
|
|
||||||
worldbuilding.md
|
|
||||||
style-guide.md
|
|
||||||
characters.md
|
|
||||||
timeline.md
|
|
||||||
plot-state.md
|
|
||||||
unresolved-threads.md
|
|
||||||
continuity-rules.md
|
|
||||||
outline.md
|
|
||||||
chapters/
|
|
||||||
```
|
|
||||||
|
|
||||||
`plot-state.md` should be updated after every completed scene.
|
|
||||||
|
|
||||||
It should track:
|
|
||||||
|
|
||||||
- current character locations
|
|
||||||
- known secrets
|
|
||||||
- promises made to the reader
|
|
||||||
- unresolved questions
|
|
||||||
- relationships
|
|
||||||
- injuries/resources/items
|
|
||||||
- timeline date/time
|
|
||||||
- what each POV character currently knows
|
|
||||||
|
|
||||||
This is the fiction equivalent of application state.
|
|
||||||
|
|
||||||
## Validation Ideas
|
|
||||||
|
|
||||||
Some checks can be deterministic:
|
|
||||||
|
|
||||||
- word count range
|
|
||||||
- file exists
|
|
||||||
- only allowed files changed
|
|
||||||
- Markdown heading format
|
|
||||||
- no forbidden placeholders like `TODO`, `[insert]`, or `TBD`
|
|
||||||
- no accidental author notes in final prose
|
|
||||||
- required task terms are present
|
|
||||||
- output compiles into a chapter file
|
|
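Several of the deterministic checks above fit in a few lines. The sketch below covers word count, forbidden placeholders, and a heading check; `validate_scene` is a hypothetical name, the 900-1400 default simply reuses the bounds from the sample task, and the "first line is a `# ` heading" rule is an assumption about the heading format.

```python
import re

# Case-sensitive on purpose, so ordinary prose is not flagged by accident.
PLACEHOLDER = re.compile(r"\bTODO\b|\bTBD\b|\[insert\]")


def validate_scene(text: str, min_words: int = 900, max_words: int = 1400) -> list[str]:
    """Return a list of problems; an empty list means the draft passed."""
    problems: list[str] = []
    word_count = len(text.split())
    if not min_words <= word_count <= max_words:
        problems.append(f"word count {word_count} outside {min_words}-{max_words}")
    for match in PLACEHOLDER.finditer(text):
        problems.append(f"forbidden placeholder: {match.group(0)}")
    if not text.lstrip().startswith("# "):
        problems.append("missing top-level Markdown heading")
    return problems
```

Returning messages instead of raising keeps the output usable as retry notes for the drafting agent.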
||||||
|
|
||||||
Some checks need model review:
|
|
||||||
|
|
||||||
- continuity with worldbuilding
|
|
||||||
- character voice consistency
|
|
||||||
- POV discipline
|
|
||||||
- pacing
|
|
||||||
- whether the scene satisfies the beat plan
|
|
||||||
- whether exposition is too direct
|
|
||||||
- whether the state update accurately reflects the scene
|
|
||||||
|
|
||||||
The key is not to overtrust model review. It should produce actionable retry notes, not silently bless everything.
|
|
||||||
|
|
||||||
## What Might Be Missing
|
|
||||||
|
|
||||||
### 1. Better Non-Code Templates
|
|
||||||
|
|
||||||
This likely needs a dedicated template:
|
|
||||||
|
|
||||||
```text
|
|
||||||
tutorial-deaddrop
|
|
||||||
tutorial-novel
|
|
||||||
```
|
|
||||||
|
|
||||||
or:
|
|
||||||
|
|
||||||
```text
|
|
||||||
writer-novel
|
|
||||||
```
|
|
||||||
|
|
||||||
The template would include:
|
|
||||||
|
|
||||||
- starter story files
|
|
||||||
- writing prompts
|
|
||||||
- task examples
|
|
||||||
- validation commands
|
|
||||||
- allowed paths
|
|
||||||
- recommended pipeline
|
|
||||||
|
|
||||||
### 2. Better Markdown Patch/File Handling
|
|
||||||
|
|
||||||
The current file-writer flow can work, but fiction output may be long. It may be safer to require complete file blocks for one scene file at a time.
|
|
||||||
|
|
||||||
The workflow should avoid having an agent rewrite the whole novel or whole `plot-state.md` unless necessary.
|
|
||||||
|
|
||||||
### 3. Stronger State Update Governance
|
|
||||||
|
|
||||||
The risky part is not drafting prose. The risky part is bad state updates.
|
|
||||||
|
|
||||||
Example failure:
|
|
||||||
|
|
||||||
- the scene says Mara never saw the prince
|
|
||||||
- the state updater records that Mara recognized the prince
|
|
||||||
- future scenes build on the wrong state
|
|
||||||
|
|
||||||
A state update should probably be reviewed against the actual scene before being applied.
|
|
||||||
|
|
||||||
Possible pipeline:
|
|
||||||
|
|
||||||
```text
|
|
||||||
draft_scene -> review_scene -> propose_state_update -> review_state_update -> apply
|
|
||||||
```
|
|
||||||
|
|
||||||
### 4. Context Window Management
|
|
||||||
|
|
||||||
Worldbuilding documents can get large.
|
|
||||||
|
|
||||||
The agent should not receive the entire story bible every time. It should receive:
|
|
||||||
|
|
||||||
- the current task
|
|
||||||
- relevant worldbuilding excerpts
|
|
||||||
- relevant character entries
|
|
||||||
- recent scene summaries
|
|
||||||
- current plot state
|
|
||||||
- style guide
|
|
||||||
|
|
||||||
Semantic search is probably enough for a first version, but a novel template may want a more explicit index:
|
|
||||||
|
|
||||||
```text
|
|
||||||
world-index.md
|
|
||||||
character-index.md
|
|
||||||
location-index.md
|
|
||||||
```
|
|
||||||
|
|
||||||
### 5. Scene Dependency Tracking
|
|
||||||
|
|
||||||
Coding tasks already have dependencies. Fiction tasks would need the same:
|
|
||||||
|
|
||||||
```text
|
|
||||||
Dependencies:
|
|
||||||
- SCENE-001
|
|
||||||
- SCENE-002
|
|
||||||
```
|
|
||||||
|
|
||||||
This prevents writing a later scene before the required earlier story state exists.
|
|
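The dependency rule reduces to a simple readiness check. A minimal sketch, assuming dependencies have already been parsed into a dict keyed by task id; `ready_tasks` is a hypothetical helper and says nothing about how NightShift actually stores tasks.

```python
def ready_tasks(dependencies: dict[str, list[str]], completed: set[str]) -> list[str]:
    """Return unfinished task ids whose dependencies are all completed."""
    return [
        task_id
        for task_id, needs in dependencies.items()
        if task_id not in completed and all(dep in completed for dep in needs)
    ]
```

For example, with `SCENE-003` depending on `SCENE-001` and `SCENE-002`, it only becomes ready once both earlier scenes are in `completed`.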
||||||
|
|
||||||
### 6. Revision Workflows
|
|
||||||
|
|
||||||
Writing is not only forward generation.
|
|
||||||
|
|
||||||
Useful task types:
|
|
||||||
|
|
||||||
- draft new scene
|
|
||||||
- revise scene for pacing
|
|
||||||
- revise dialogue
|
|
||||||
- continuity repair
|
|
||||||
- line edit
|
|
||||||
- chapter assembly
|
|
||||||
- chapter-level review
|
|
||||||
- update outline after discovery writing
|
|
||||||
|
|
||||||
NightShift can already represent these as tasks, but the prompts should distinguish them clearly.
|
|
||||||
|
|
||||||
### 7. Output Length Controls
|
|
||||||
|
|
||||||
Long fiction output needs explicit limits.
|
|
||||||
|
|
||||||
Use:
|
|
||||||
|
|
||||||
- scene word count bounds
|
|
||||||
- `num_predict`
|
|
||||||
- task acceptance criteria
|
|
||||||
- smaller scene files
|
|
||||||
|
|
||||||
Do not ask for "write chapter 12" unless the chapter has already been broken into beats.
|
|
||||||
|
|
||||||
## Suggested First Template
|
|
||||||
|
|
||||||
Start with a minimal `writer-novel` template.
|
|
||||||
|
|
||||||
Files:
|
|
||||||
|
|
||||||
```text
|
|
||||||
nightshift.yaml
|
|
||||||
.nightshift/tasks.md
|
|
||||||
.nightshift/agents/planner.md
|
|
||||||
.nightshift/agents/drafter.md
|
|
||||||
.nightshift/agents/continuity-reviewer.md
|
|
||||||
.nightshift/agents/style-reviewer.md
|
|
||||||
.nightshift/agents/state-updater.md
|
|
||||||
story/worldbuilding.md
|
|
||||||
story/characters.md
|
|
||||||
story/style-guide.md
|
|
||||||
story/plot-state.md
|
|
||||||
story/timeline.md
|
|
||||||
story/unresolved-threads.md
|
|
||||||
story/chapters/.gitkeep
|
|
||||||
```
|
|
||||||
|
|
||||||
Pipeline:
|
|
||||||
|
|
||||||
```text
|
|
||||||
plan
|
|
||||||
semantic_context
|
|
||||||
context
|
|
||||||
draft
|
|
||||||
validate_draft
|
|
||||||
continuity_review
|
|
||||||
style_review
|
|
||||||
update_state
|
|
||||||
validate_state
|
|
||||||
summarize
|
|
||||||
```
|
|
||||||
|
|
||||||
Allowed paths:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
scoped_paths:
|
|
||||||
- story
|
|
||||||
- .nightshift/tasks.md
|
|
||||||
```
|
|
||||||
|
|
||||||
Draft stage allowed paths:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
allowed_paths:
|
|
||||||
- story/chapters
|
|
||||||
```
|
|
||||||
|
|
||||||
State update stage allowed paths:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
allowed_paths:
|
|
||||||
- story/plot-state.md
|
|
||||||
- story/characters.md
|
|
||||||
- story/timeline.md
|
|
||||||
- story/unresolved-threads.md
|
|
||||||
```
|
|
||||||
|
|
||||||
That separation matters. The drafter should not freely rewrite the world bible, and the state updater should not rewrite the scene prose.
|
|
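The per-stage separation can be enforced with a small path check. This is a sketch under the assumption that `allowed_paths` entries are either exact files or directory prefixes, as in the YAML above; `is_path_allowed` is a hypothetical name and the check is illustrative rather than a hardened sandbox.

```python
from pathlib import PurePosixPath


def is_path_allowed(path_text: str, allowed_paths: list[str]) -> bool:
    """True if the path equals an allowed entry or sits inside an allowed directory."""
    path = PurePosixPath(path_text.replace("\\", "/"))
    # Reject absolute paths and any attempt to climb out with "..".
    if path.is_absolute() or ".." in path.parts:
        return False
    for allowed in allowed_paths:
        base = PurePosixPath(allowed.replace("\\", "/"))
        if path == base or base in path.parents:
            return True
    return False
```

With the draft stage limited to `story/chapters`, a write to `story/worldbuilding.md` is rejected, and the state updater's file list likewise excludes every scene file.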
||||||
|
|
||||||
## What We Should Not Do First
|
|
||||||
|
|
||||||
Do not start with:
|
|
||||||
|
|
||||||
- automatic full-plot generation
|
|
||||||
- full chapter generation
|
|
||||||
- global rewrites of all prior chapters
|
|
||||||
- one giant `worldbuilding.md` dumped into every prompt
|
|
||||||
- trusting the model to maintain continuity without explicit state files
|
|
||||||
|
|
||||||
Those are likely to produce impressive-looking but unstable output.
|
|
||||||
|
|
||||||
## Practical First Experiment
|
|
||||||
|
|
||||||
A good first test:
|
|
||||||
|
|
||||||
1. Create a tiny worldbuilding document.
|
|
||||||
2. Create three characters.
|
|
||||||
3. Create five scene tasks.
|
|
||||||
4. Have NightShift draft one scene at a time.
|
|
||||||
5. After each scene, update `plot-state.md`.
|
|
||||||
6. Run continuity review against only the scene, state files, and relevant worldbuilding.
|
|
||||||
7. Inspect artifacts.
|
|
||||||
|
|
||||||
Success criteria:
|
|
||||||
|
|
||||||
- scenes land in the right files
|
|
||||||
- word counts stay bounded
|
|
||||||
- state updates are accurate
|
|
||||||
- future scenes use prior state correctly
|
|
||||||
- reviewers catch obvious contradictions
|
|
||||||
|
|
||||||
## Bottom Line
|
|
||||||
|
|
||||||
Theoretically, NightShift already has many of the needed utilities.
|
|
||||||
|
|
||||||
The missing piece is mostly a writing-oriented template with:
|
|
||||||
|
|
||||||
- scene-sized tasks
|
|
||||||
- durable story state files
|
|
||||||
- strict path separation between prose and state updates
|
|
||||||
- writing-specific prompts
|
|
||||||
- lightweight deterministic validators
|
|
||||||
- continuity/style review stages
|
|
||||||
|
|
||||||
This is viable, but it should start as a constrained scene-writing workflow, not an autonomous novel generator.
|
|
||||||
|
|
@@ -162,7 +162,6 @@ def generate_patch_from_file_updates(
         _validate_allowed_patch_path(normalized_path, root, allowed_paths)
         file_path = resolve_inside_root(root, normalized_path, f"file update '{normalized_path}'")
         old_text = file_path.read_text(encoding="utf-8", errors="replace") if file_path.exists() else ""
-        _validate_protected_character_canon(normalized_path, old_text, update.content)
         if old_text == update.content:
             continue
         patch_parts.extend(_diff_for_file(normalized_path, old_text, update.content, file_path.exists()))
@@ -225,51 +224,6 @@ def _validate_allowed_patch_path(path_text: str, root: Path, allowed_paths: tupl
         )
 
 
-def _validate_protected_character_canon(path_text: str, old_text: str, new_text: str) -> None:
-    if path_text.replace("\\", "/") != "story/characters.md" or not old_text:
-        return
-    old_sections = _pronoun_reference_sections(old_text)
-    if not old_sections:
-        return
-    new_sections = _pronoun_reference_sections(new_text)
-    changed = [
-        character
-        for character, old_section in old_sections.items()
-        if new_sections.get(character) != old_section
-    ]
-    if changed:
-        names = ", ".join(changed)
-        raise PipelineError(
-            "File writer error: protected character pronoun canon changed in "
-            f"`story/characters.md` for: {names}."
-        )
-
-
-def _pronoun_reference_sections(text: str) -> dict[str, str]:
-    sections: dict[str, str] = {}
-    current_character: str | None = None
-    lines = text.splitlines()
-    index = 0
-    while index < len(lines):
-        line = lines[index]
-        if line.startswith("## "):
-            current_character = line[3:].strip()
-            index += 1
-            continue
-        if current_character and line.strip() == "### Pronouns / Reference":
-            start = index
-            index += 1
-            while index < len(lines):
-                candidate = lines[index]
-                if candidate.startswith("## ") or candidate.startswith("### "):
-                    break
-                index += 1
-            sections[current_character] = "\n".join(lines[start:index]).strip()
-            continue
-        index += 1
-    return sections
-
-
 def format_validation_result(result: PatchValidationResult) -> str:
     return "\n".join(
         [
@@ -48,6 +48,7 @@ from .runlog import RunLogger
 from .stages import StageResult
 from .tasks import Task, mark_task_completed
 from .telemetry import TelemetryEntry, format_telemetry_summary, telemetry_from_stage_output
+from .writing_validators import validate_writing_file_updates
 
 
 @dataclass(frozen=True)
@ -776,6 +777,8 @@ class PipelineRunner:
|
||||||
updates,
|
updates,
|
||||||
retry_count,
|
retry_count,
|
||||||
)
|
)
|
||||||
|
if _is_writing_file_writer_stage(stage):
|
||||||
|
validate_writing_file_updates(updates, self.config.project.root)
|
||||||
patch = generate_patch_from_file_updates(
|
patch = generate_patch_from_file_updates(
|
||||||
updates,
|
updates,
|
||||||
self.config.project.root,
|
self.config.project.root,
|
||||||
|
|
@ -794,6 +797,8 @@ class PipelineRunner:
|
||||||
and len(allowed_updates) < len(updates)
|
and len(allowed_updates) < len(updates)
|
||||||
and "not allowed for this stage" in str(exc)
|
and "not allowed for this stage" in str(exc)
|
||||||
):
|
):
|
||||||
|
if _is_writing_file_writer_stage(stage):
|
||||||
|
validate_writing_file_updates(allowed_updates, self.config.project.root)
|
||||||
patch = generate_patch_from_file_updates(
|
patch = generate_patch_from_file_updates(
|
||||||
allowed_updates,
|
allowed_updates,
|
||||||
self.config.project.root,
|
self.config.project.root,
|
||||||
|
|
@ -1322,8 +1327,9 @@ class PipelineRunner:
|
||||||
"Previous review output was malformed. Return exactly four lines: status, reason, next_stage, context_update. Do not return prose, headings, or analysis.",
|
"Previous review output was malformed. Return exactly four lines: status, reason, next_stage, context_update. Do not return prose, headings, or analysis.",
|
||||||
]
|
]
|
||||||
strict_outputs = _review_previous_outputs(previous_outputs)
|
strict_outputs = _review_previous_outputs(previous_outputs)
|
||||||
|
malformed_stdout = self._read_agent_stdout(malformed_result.output_path).strip()
|
||||||
strict_outputs["malformed_review_output"] = _compact_previous_output(
|
strict_outputs["malformed_review_output"] = _compact_previous_output(
|
||||||
self._read_output(malformed_result.output_path),
|
malformed_stdout if malformed_stdout else self._read_output(malformed_result.output_path),
|
||||||
max_chars=800,
|
max_chars=800,
|
||||||
)
|
)
|
||||||
result = self.agent_executor.run_stage(
|
result = self.agent_executor.run_stage(
|
||||||
|
|
@ -1336,6 +1342,17 @@ class PipelineRunner:
|
||||||
retry_context="\n".join(f"- {note}" for note in strict_notes),
|
retry_context="\n".join(f"- {note}" for note in strict_notes),
|
||||||
)
|
)
|
||||||
if _is_malformed_review_result(result):
|
if _is_malformed_review_result(result):
|
||||||
|
if stage.id == "style_review" and _previous_continuity_review_passed(previous_outputs):
|
||||||
|
return StageResult(
|
||||||
|
result.stage_id,
|
||||||
|
"pass",
|
||||||
|
(
|
||||||
|
"Style review output remained malformed after strict retry; "
|
||||||
|
"continuing because continuity review passed and deterministic validators already ran."
|
||||||
|
),
|
||||||
|
output_path=result.output_path,
|
||||||
|
context_update="Style review was malformed twice; treated as soft-pass after continuity passed.",
|
||||||
|
)
|
||||||
return StageResult(
|
return StageResult(
|
||||||
result.stage_id,
|
result.stage_id,
|
||||||
"fail",
|
"fail",
|
||||||
|
|
@ -1784,6 +1801,13 @@ def _failure_target_stage(stage: StageConfig, result: StageResult) -> str | None
|
||||||
return stage.on_fail
|
return stage.on_fail
|
||||||
|
|
||||||
|
|
||||||
|
def _previous_continuity_review_passed(previous_outputs: dict[str, str]) -> bool:
|
||||||
|
for name, output in previous_outputs.items():
|
||||||
|
if "continuity" in name and re.search(r"(?im)^status:\s*pass\s*$", output):
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _review_previous_outputs(previous_outputs: dict[str, str], max_chars: int = 1600) -> dict[str, str]:
|
def _review_previous_outputs(previous_outputs: dict[str, str], max_chars: int = 1600) -> dict[str, str]:
|
||||||
compacted: dict[str, str] = {}
|
compacted: dict[str, str] = {}
|
||||||
priority_names = {
|
priority_names = {
|
||||||
|
|
@ -1896,6 +1920,18 @@ def _is_scene_edit_stage(stage: StageConfig) -> bool:
|
||||||
return stage.type == "file_writer" and stage.id.startswith("edit_") and "story/chapters" in allowed
|
return stage.type == "file_writer" and stage.id.startswith("edit_") and "story/chapters" in allowed
|
||||||
|
|
||||||
|
|
||||||
|
def _is_writing_file_writer_stage(stage: StageConfig) -> bool:
|
||||||
|
allowed = {path.replace("\\", "/").rstrip("/") for path in stage.allowed_paths}
|
||||||
|
writing_paths = {
|
||||||
|
"story/chapters",
|
||||||
|
"story/plot-state.md",
|
||||||
|
"story/characters.md",
|
||||||
|
"story/timeline.md",
|
||||||
|
"story/unresolved-threads.md",
|
||||||
|
}
|
||||||
|
return stage.type == "file_writer" and bool(allowed & writing_paths)
|
||||||
|
|
||||||
|
|
||||||
def _task_story_chapter_paths(task: Task) -> tuple[str, ...]:
|
def _task_story_chapter_paths(task: Task) -> tuple[str, ...]:
|
||||||
paths: list[str] = []
|
paths: list[str] = []
|
||||||
seen: set[str] = set()
|
seen: set[str] = set()
|
||||||
|
|
|
||||||
|
|
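The soft-pass path added above depends on `_previous_continuity_review_passed`, which looks for a line reading exactly `status: pass` in any earlier output whose name contains `continuity`. The following is a minimal standalone sketch of that check, useful for seeing which outputs count as a pass. The names `STATUS_PASS` and `continuity_passed` are illustrative assumptions and are not part of the commit.

```python
import re

# Same pattern as in the diff: case-insensitive, anchored to a whole line.
STATUS_PASS = re.compile(r"(?im)^status:\s*pass\s*$")


def continuity_passed(previous_outputs: dict[str, str]) -> bool:
    """Return True if any continuity-named output has a `status: pass` line."""
    return any(
        "continuity" in name and STATUS_PASS.search(output) is not None
        for name, output in previous_outputs.items()
    )
```

Because the pattern is anchored with `^` and `$` under the multiline flag, a line such as `status: passed` or a pass reported by a stage without `continuity` in its name does not trigger the soft-pass.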
@ -23,6 +23,15 @@ Do not fail the scene because durable state files are not updated yet. State fil
|
||||||
|
|
||||||
Wrong pronouns are a continuity failure. If a drafted scene uses non-canonical pronouns for a named character, return `status: fail` and explain which character drifted. Do not pass the scene with only `context_update` guidance.
|
Wrong pronouns are a continuity failure. If a drafted scene uses non-canonical pronouns for a named character, return `status: fail` and explain which character drifted. Do not pass the scene with only `context_update` guidance.
|
||||||
|
|
||||||
|
Pronoun canon quick reference:
|
||||||
|
- Proxy: she/her
|
||||||
|
- BLOODMONEY: narrative default they/them; he/him allowed only when dialogue or close character voice has a specific reason
|
||||||
|
- Cricket: she/her
|
||||||
|
- Saint: he/him
|
||||||
|
- Miette: she/her
|
||||||
|
|
||||||
|
If retry notes, previous reviewer output, or generated scene text conflict with `story/characters.md`, obey `story/characters.md`. Do not infer pronouns from a previous failure note. Before failing a pronoun issue, verify the character's `Pronouns / Reference` section.
|
||||||
|
|
||||||
Output exactly:
|
Output exactly:
|
||||||
|
|
||||||
status: pass | fail | retry | escalate
|
status: pass | fail | retry | escalate
|
||||||
|
|
|
||||||
|
|
@ -9,6 +9,8 @@ Rules:
|
||||||
- Use `story/style-guide.md` for POV, tense, tone, and prose rules.
|
- Use `story/style-guide.md` for POV, tense, tone, and prose rules.
|
||||||
- Use `story/characters.md`, especially `Pronouns / Reference`, as hard canon.
|
- Use `story/characters.md`, especially `Pronouns / Reference`, as hard canon.
|
||||||
- Wrong pronouns are mandatory fixes.
|
- Wrong pronouns are mandatory fixes.
|
||||||
|
- If retry notes or reviewer feedback conflict with `story/characters.md`, obey `story/characters.md`.
|
||||||
|
- Never change correct canonical pronouns because a review note claims a different canon.
|
||||||
- Do not edit state files, worldbuilding, outline, continuity rules, or style guide.
|
- Do not edit state files, worldbuilding, outline, continuity rules, or style guide.
|
||||||
- Do not resolve future plot threads unless the task explicitly asks for that.
|
- Do not resolve future plot threads unless the task explicitly asks for that.
|
||||||
- Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
|
- Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
|
||||||
|
|
|
||||||
175
nightshift/writing_validators.py
Normal file
|
|
@ -0,0 +1,175 @@
|
||||||
|
"""Writing-workflow validators.
|
||||||
|
|
||||||
|
These checks are intentionally kept out of the generic patch generator so code
|
||||||
|
generation can continue to treat file blocks as ordinary project files.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .errors import PipelineError
|
||||||
|
from .patches import FileUpdate
|
||||||
|
|
||||||
|
|
||||||
|
def validate_writing_file_updates(updates: tuple[FileUpdate, ...], project_root: Path) -> None:
|
||||||
|
"""Validate writing-specific invariants for novel scene/state file updates."""
|
||||||
|
|
||||||
|
root = Path(project_root)
|
||||||
|
characters_path = root / "story" / "characters.md"
|
||||||
|
character_sections = (
|
||||||
|
_pronoun_reference_sections(characters_path.read_text(encoding="utf-8", errors="replace"))
|
||||||
|
if characters_path.is_file()
|
||||||
|
else {}
|
||||||
|
)
|
||||||
|
for update in updates:
|
||||||
|
normalized_path = update.path.replace("\\", "/").strip().strip("/")
|
||||||
|
if normalized_path == "story/characters.md":
|
||||||
|
_validate_protected_character_canon(normalized_path, character_sections, update.content)
|
||||||
|
if normalized_path.startswith("story/chapters/") and normalized_path.endswith(".md"):
|
||||||
|
_validate_scene_pronoun_canon(normalized_path, update.content, character_sections)
|
||||||
|
|
||||||
|
|
||||||
|
def _validate_protected_character_canon(
|
||||||
|
path_text: str,
|
||||||
|
old_sections: dict[str, str],
|
||||||
|
new_text: str,
|
||||||
|
) -> None:
|
||||||
|
if path_text != "story/characters.md" or not old_sections:
|
||||||
|
return
|
||||||
|
new_sections = _pronoun_reference_sections(new_text)
|
||||||
|
changed = [
|
||||||
|
character
|
||||||
|
for character, old_section in old_sections.items()
|
||||||
|
if new_sections.get(character) != old_section
|
||||||
|
]
|
||||||
|
if changed:
|
||||||
|
names = ", ".join(changed)
|
||||||
|
raise PipelineError(
|
||||||
|
"File writer error: protected character pronoun canon changed in "
|
||||||
|
f"`story/characters.md` for: {names}."
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _validate_scene_pronoun_canon(
|
||||||
|
path_text: str,
|
||||||
|
scene_text: str,
|
||||||
|
sections: dict[str, str],
|
||||||
|
) -> None:
|
||||||
|
if not sections:
|
||||||
|
return
|
||||||
|
rules = _pronoun_rules_from_sections(sections)
|
||||||
|
if not rules:
|
||||||
|
return
|
||||||
|
aliases = {alias: character for character in rules for alias in _character_aliases(character)}
|
||||||
|
active_character: str | None = None
|
||||||
|
for sentence in _scene_sentences(scene_text):
|
||||||
|
present = {
|
||||||
|
character
|
||||||
|
for alias, character in aliases.items()
|
||||||
|
if re.search(rf"\b{re.escape(alias)}\b", sentence)
|
||||||
|
}
|
||||||
|
if len(present) > 1:
|
||||||
|
active_character = None
|
||||||
|
continue
|
||||||
|
character = next(iter(present)) if present else active_character
|
||||||
|
if character is None:
|
||||||
|
continue
|
||||||
|
forbidden = rules[character]
|
||||||
|
if present:
|
||||||
|
bad = _first_forbidden_pronoun(sentence, forbidden)
|
||||||
|
active_character = character
|
||||||
|
else:
|
||||||
|
bad = _leading_forbidden_pronoun(sentence, forbidden)
|
||||||
|
if not bad:
|
||||||
|
active_character = None
|
||||||
|
if bad:
|
||||||
|
excerpt = sentence.strip()
|
||||||
|
if len(excerpt) > 160:
|
||||||
|
excerpt = excerpt[:157].rstrip() + "..."
|
||||||
|
raise PipelineError(
|
||||||
|
"File writer error: scene pronoun canon violation for "
|
||||||
|
f"{character}: found `{bad}` near character reference. Excerpt: {excerpt}"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _first_forbidden_pronoun(sentence: str, forbidden: tuple[str, ...]) -> str | None:
|
||||||
|
return next(
|
||||||
|
(
|
||||||
|
pronoun
|
||||||
|
for pronoun in forbidden
|
||||||
|
if re.search(rf"\b{re.escape(pronoun)}\b", sentence, flags=re.IGNORECASE)
|
||||||
|
),
|
||||||
|
None,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _leading_forbidden_pronoun(sentence: str, forbidden: tuple[str, ...]) -> str | None:
|
||||||
|
stripped = sentence.strip()
|
||||||
|
return next(
|
||||||
|
(
|
||||||
|
pronoun
|
||||||
|
for pronoun in forbidden
|
||||||
|
if re.match(rf"^{re.escape(pronoun)}\b", stripped, flags=re.IGNORECASE)
|
||||||
|
),
|
||||||
|
None,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _pronoun_rules_from_sections(sections: dict[str, str]) -> dict[str, tuple[str, ...]]:
|
||||||
|
rules: dict[str, tuple[str, ...]] = {}
|
||||||
|
for character, section in sections.items():
|
||||||
|
match = re.search(r"(?im)^-\s*Pronouns:\s*(?P<pronouns>.+?)\s*$", section)
|
||||||
|
if not match:
|
||||||
|
continue
|
||||||
|
pronouns = match.group("pronouns").lower()
|
||||||
|
forbidden: set[str] = set()
|
||||||
|
if "she/her" not in pronouns:
|
||||||
|
forbidden.update({"she", "her", "hers", "herself"})
|
||||||
|
if "he/him" not in pronouns:
|
||||||
|
forbidden.update({"he", "him", "his", "himself"})
|
||||||
|
if "they/them" not in pronouns:
|
||||||
|
forbidden.update({"they", "them", "their", "theirs", "themselves"})
|
||||||
|
if forbidden:
|
||||||
|
rules[character] = tuple(sorted(forbidden))
|
||||||
|
return rules
|
||||||
|
|
||||||
|
|
||||||
|
def _character_aliases(character: str) -> tuple[str, ...]:
|
||||||
|
base = re.sub(r"\s*\([^)]*\)", "", character).strip()
|
||||||
|
aliases = {base}
|
||||||
|
if base.startswith("DJ "):
|
||||||
|
aliases.add(base[3:].strip())
|
||||||
|
if " aka " in base:
|
||||||
|
aliases.update(part.strip() for part in base.split(" aka ") if part.strip())
|
||||||
|
return tuple(alias for alias in aliases if alias)
|
||||||
|
|
||||||
|
|
||||||
|
def _scene_sentences(text: str) -> tuple[str, ...]:
|
||||||
|
return tuple(part for part in re.split(r"(?<=[.!?])\s+|\n{2,}", text) if part.strip())
|
||||||
|
|
||||||
|
|
||||||
|
def _pronoun_reference_sections(text: str) -> dict[str, str]:
|
||||||
|
sections: dict[str, str] = {}
|
||||||
|
current_character: str | None = None
|
||||||
|
lines = text.splitlines()
|
||||||
|
index = 0
|
||||||
|
while index < len(lines):
|
||||||
|
line = lines[index]
|
||||||
|
if line.startswith("## "):
|
||||||
|
current_character = line[3:].strip()
|
||||||
|
index += 1
|
||||||
|
continue
|
||||||
|
if current_character and line.strip() == "### Pronouns / Reference":
|
||||||
|
start = index
|
||||||
|
index += 1
|
||||||
|
while index < len(lines):
|
||||||
|
candidate = lines[index]
|
||||||
|
if candidate.startswith("## ") or candidate.startswith("### "):
|
||||||
|
break
|
||||||
|
index += 1
|
||||||
|
sections[current_character] = "\n".join(lines[start:index]).strip()
|
||||||
|
continue
|
||||||
|
index += 1
|
||||||
|
return sections
|
||||||
|
|
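The rule derivation in `_pronoun_rules_from_sections` can be read as: find the `- Pronouns:` line in a character's section, then forbid every pronoun family that line does not mention. Below is a small self-contained sketch of that idea for a single section. The names `PRONOUN_FAMILIES` and `forbidden_pronouns` are hypothetical helpers introduced for illustration; the regular expression and word lists are copied from the diff.

```python
import re

# Pronoun families as listed in the diff; the dict itself is an illustrative assumption.
PRONOUN_FAMILIES = {
    "she/her": ("she", "her", "hers", "herself"),
    "he/him": ("he", "him", "his", "himself"),
    "they/them": ("they", "them", "their", "theirs", "themselves"),
}


def forbidden_pronouns(section: str) -> tuple[str, ...]:
    """Return the sorted pronouns a character's section does not allow."""
    match = re.search(r"(?im)^-\s*Pronouns:\s*(?P<pronouns>.+?)\s*$", section)
    if not match:
        return ()  # No declared pronouns, so nothing is forbidden.
    declared = match.group("pronouns").lower()
    forbidden: set[str] = set()
    for label, words in PRONOUN_FAMILIES.items():
        if label not in declared:  # Substring test, as in the diff.
            forbidden.update(words)
    return tuple(sorted(forbidden))
```

Since the check is a substring test on the declared line, a character whose line mentions two families (for example `they/them` plus a conditional `he/him`) has both families allowed, and only the remaining family is forbidden.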
@ -375,48 +375,5 @@ new
|
||||||
|
|
||||||
self.assertEqual(patch.count("diff --git a/app.py b/app.py"), 1)
|
self.assertEqual(patch.count("diff --git a/app.py b/app.py"), 1)
|
||||||
|
|
||||||
def test_file_updates_reject_character_pronoun_canon_changes(self) -> None:
|
|
||||||
with tempfile.TemporaryDirectory() as directory:
|
|
||||||
root = Path(directory)
|
|
||||||
(root / "story").mkdir()
|
|
||||||
(root / "story" / "characters.md").write_text(
|
|
||||||
"""# Characters
|
|
||||||
|
|
||||||
## Cricket
|
|
||||||
|
|
||||||
### Pronouns / Reference
|
|
||||||
- Pronouns: she/her
|
|
||||||
- Narrative reference: Cricket; she/her
|
|
||||||
|
|
||||||
Scavenger.
|
|
||||||
""",
|
|
||||||
encoding="utf-8",
|
|
||||||
)
|
|
||||||
safety = SafetyConfig(
|
|
||||||
require_clean_worktree=False,
|
|
||||||
scoped_paths=("story",),
|
|
||||||
allowed_commands=(),
|
|
||||||
forbidden_commands=(),
|
|
||||||
)
|
|
||||||
updates = parse_file_updates(
|
|
||||||
"""FILE: story/characters.md
|
|
||||||
---CONTENT---
|
|
||||||
# Characters
|
|
||||||
|
|
||||||
## Cricket
|
|
||||||
|
|
||||||
### Pronouns / Reference
|
|
||||||
- Pronouns: they/them
|
|
||||||
- Narrative reference: Cricket; they/them
|
|
||||||
|
|
||||||
Scavenger.
|
|
||||||
---END---
|
|
||||||
"""
|
|
||||||
)
|
|
||||||
|
|
||||||
with self.assertRaisesRegex(PipelineError, "protected character pronoun canon changed"):
|
|
||||||
generate_patch_from_file_updates(updates, root, safety)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
unittest.main()
|
unittest.main()
|
||||||
|
|
|
||||||
|
|
@ -269,6 +269,48 @@ class PipelineRunnerTests(unittest.TestCase):
|
||||||
self.assertIn("files", (task_dir / "review.md").read_text(encoding="utf-8"))
|
self.assertIn("files", (task_dir / "review.md").read_text(encoding="utf-8"))
|
||||||
self.assertIn("strict retry ok", (task_dir / "review-1.md").read_text(encoding="utf-8"))
|
self.assertIn("strict retry ok", (task_dir / "review-1.md").read_text(encoding="utf-8"))
|
||||||
|
|
||||||
|
def test_malformed_review_retry_uses_stdout_summary_not_full_prompt_artifact(self) -> None:
|
||||||
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
|
root = Path(directory)
|
||||||
|
_write_common_files(root)
|
||||||
|
(root / "fake_reviewer.py").write_text(
|
||||||
|
"\n".join(
|
||||||
|
[
|
||||||
|
"import sys",
|
||||||
|
"prompt = sys.stdin.read()",
|
||||||
|
"if 'Previous review output was malformed' in prompt:",
|
||||||
|
" open('retry-prompt.txt', 'w', encoding='utf-8').write(prompt)",
|
||||||
|
" print('status: pass')",
|
||||||
|
" print('reason: strict retry ok')",
|
||||||
|
" print('next_stage:')",
|
||||||
|
" print('context_update:')",
|
||||||
|
"else:",
|
||||||
|
" print('No extra text. No JSON.')",
|
||||||
|
]
|
||||||
|
),
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
stages = (
|
||||||
|
StageConfig(id="implement", type="agent", agent="planner", output="implementation-log.md"),
|
||||||
|
StageConfig(id="review", type="agent_review", agent="reviewer", output="review.md"),
|
||||||
|
)
|
||||||
|
config = make_config(root, stages, max_retries=1)
|
||||||
|
config.agents["reviewer"] = AgentConfig(
|
||||||
|
id="reviewer",
|
||||||
|
backend="command",
|
||||||
|
command="python fake_reviewer.py",
|
||||||
|
system_prompt=Path("reviewer.md"),
|
||||||
|
)
|
||||||
|
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
|
||||||
|
|
||||||
|
result = runner.run_task(parse_tasks(TASK_MD)[0])
|
||||||
|
|
||||||
|
retry_prompt = (root / "retry-prompt.txt").read_text(encoding="utf-8")
|
||||||
|
self.assertEqual(result.status, "complete")
|
||||||
|
self.assertIn("malformed_review_output", retry_prompt)
|
||||||
|
self.assertIn("No extra text. No JSON.", retry_prompt)
|
||||||
|
self.assertNotIn("## Prompt", retry_prompt)
|
||||||
|
|
||||||
def test_malformed_review_stops_without_on_fail_redraft(self) -> None:
|
def test_malformed_review_stops_without_on_fail_redraft(self) -> None:
|
||||||
with tempfile.TemporaryDirectory() as directory:
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
root = Path(directory)
|
root = Path(directory)
|
||||||
|
|
@ -304,6 +346,31 @@ class PipelineRunnerTests(unittest.TestCase):
|
||||||
self.assertTrue((task_dir / "review.md").exists())
|
self.assertTrue((task_dir / "review.md").exists())
|
||||||
self.assertTrue((task_dir / "review-1.md").exists())
|
self.assertTrue((task_dir / "review-1.md").exists())
|
||||||
|
|
||||||
|
def test_malformed_style_review_soft_passes_after_continuity_pass(self) -> None:
|
||||||
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
|
root = Path(directory)
|
||||||
|
_write_common_files(root)
|
||||||
|
(root / "fake_style.py").write_text("print('No extra text. No JSON.')\n", encoding="utf-8")
|
||||||
|
stages = (
|
||||||
|
StageConfig(id="continuity_review", type="agent_review", agent="reviewer", output="continuity-review.md"),
|
||||||
|
StageConfig(id="style_review", type="agent_review", agent="style", output="style-review.md"),
|
||||||
|
StageConfig(id="summarize", type="summarize", output="final-notes.md"),
|
||||||
|
)
|
||||||
|
config = make_config(root, stages, max_retries=1)
|
||||||
|
config.agents["style"] = AgentConfig(
|
||||||
|
id="style",
|
||||||
|
backend="command",
|
||||||
|
command="python fake_style.py",
|
||||||
|
system_prompt=Path("reviewer.md"),
|
||||||
|
)
|
||||||
|
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
|
||||||
|
|
||||||
|
result = runner.run_task(parse_tasks(TASK_MD)[0])
|
||||||
|
|
||||||
|
self.assertEqual(result.status, "complete")
|
||||||
|
self.assertIn("Style review output remained malformed", result.stage_results[1].reason)
|
||||||
|
self.assertEqual([item.stage_id for item in result.stage_results], ["continuity_review", "style_review", "summarize"])
|
||||||
|
|
||||||
def test_passing_review_next_stage_is_ignored(self) -> None:
|
def test_passing_review_next_stage_is_ignored(self) -> None:
|
||||||
with tempfile.TemporaryDirectory() as directory:
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
root = Path(directory)
|
root = Path(directory)
|
||||||
|
|
|
||||||
104
tests/test_writing_validators.py
Normal file
|
|
@ -0,0 +1,104 @@
|
||||||
|
from pathlib import Path
|
||||||
|
import tempfile
|
||||||
|
import unittest
|
||||||
|
|
||||||
|
from nightshift.errors import PipelineError
|
||||||
|
from nightshift.patches import FileUpdate
|
||||||
|
from nightshift.writing_validators import validate_writing_file_updates
|
||||||
|
|
||||||
|
|
||||||
|
class WritingValidatorTests(unittest.TestCase):
|
||||||
|
def test_rejects_character_pronoun_canon_changes(self) -> None:
|
||||||
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
|
root = Path(directory)
|
||||||
|
(root / "story").mkdir()
|
||||||
|
(root / "story" / "characters.md").write_text(
|
||||||
|
"""# Characters
|
||||||
|
|
||||||
|
## Cricket
|
||||||
|
|
||||||
|
### Pronouns / Reference
|
||||||
|
- Pronouns: she/her
|
||||||
|
- Narrative reference: Cricket; she/her
|
||||||
|
|
||||||
|
Scavenger.
|
||||||
|
""",
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
updates = (
|
||||||
|
FileUpdate(
|
||||||
|
path="story/characters.md",
|
||||||
|
content="""# Characters
|
||||||
|
|
||||||
|
## Cricket
|
||||||
|
|
||||||
|
### Pronouns / Reference
|
||||||
|
- Pronouns: they/them
|
||||||
|
- Narrative reference: Cricket; they/them
|
||||||
|
|
||||||
|
Scavenger.
|
||||||
|
""",
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
with self.assertRaisesRegex(PipelineError, "protected character pronoun canon changed"):
|
||||||
|
validate_writing_file_updates(updates, root)
|
||||||
|
|
||||||
|
def test_rejects_scene_pronoun_drift(self) -> None:
|
||||||
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
|
root = Path(directory)
|
||||||
|
(root / "story" / "chapters").mkdir(parents=True)
|
||||||
|
(root / "story" / "characters.md").write_text(
|
||||||
|
"""# Characters
|
||||||
|
|
||||||
|
## Proxy
|
||||||
|
|
||||||
|
### Pronouns / Reference
|
||||||
|
- Pronouns: she/her
|
||||||
|
- Narrative reference: Proxy; she/her
|
||||||
|
""",
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
updates = (
|
||||||
|
FileUpdate(
|
||||||
|
path="story/chapters/chapter-001/scene-001.md",
|
||||||
|
content="Proxy checked the rack. He shut down the bad job.\n",
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
with self.assertRaisesRegex(PipelineError, "scene pronoun canon violation for Proxy"):
|
||||||
|
validate_writing_file_updates(updates, root)
|
||||||
|
|
||||||
|
def test_allows_scene_pronouns_when_multiple_characters_make_ambiguous_sentence(self) -> None:
|
||||||
|
with tempfile.TemporaryDirectory() as directory:
|
||||||
|
root = Path(directory)
|
||||||
|
(root / "story" / "chapters" / "chapter-001").mkdir(parents=True)
|
||||||
|
(root / "story" / "characters.md").write_text(
|
||||||
|
"""# Characters
|
||||||
|
|
||||||
|
## Proxy
|
||||||
|
|
||||||
|
### Pronouns / Reference
|
||||||
|
- Pronouns: she/her
|
||||||
|
- Narrative reference: Proxy; she/her
|
||||||
|
|
||||||
|
## Saint
|
||||||
|
|
||||||
|
### Pronouns / Reference
|
||||||
|
- Pronouns: he/him
|
||||||
|
- Narrative reference: Saint; he/him
|
||||||
|
""",
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
updates = (
|
||||||
|
FileUpdate(
|
||||||
|
path="story/chapters/chapter-001/scene-001.md",
|
||||||
|
content="Proxy watched Saint as he picked up the phone.\n",
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
validate_writing_file_updates(updates, root)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
unittest.main()
|
||||||