Update story mode for invariants

K. Hodges 2026-05-22 17:13:27 -07:00
parent 4ad0c310d1
commit c4d88fced5
17 changed files with 1112 additions and 68 deletions


@@ -0,0 +1,119 @@
# Iteration 1: SCENE-002 Update State Failure
Date: 2026-05-22
## Run Reviewed
- Sandbox: `integ_runs/20260522T214944.385761Z`
- Run: `.nightshift/runs/20260522T215005.188534Z`
- Task: `SCENE-002`
- Final status: failed
- Failed stage: `update_state`
## What Happened
The scene workflow mostly succeeded:
- `draft_scene` wrote the scene.
- `continuity_review` correctly failed the first draft for pronoun drift.
- `edit_scene` repaired the pronoun issue.
- `continuity_review` passed after edit.
- `style_review` passed.
The remaining failure happened in `update_state`.
NightShift reported:
```text
File writer error: no file blocks found. Expected FILE: path with ---CONTENT---/---END--- or fenced blocks like ```file:path.py.
```
The model output did contain visible `FILE:` blocks, but it omitted the required `---END---` delimiter. It emitted:
```text
FILE: story/plot-state.md
---CONTENT---
...
FILE: story/characters.md
---CONTENT---
...
```
The current parser requires `---END---`, so it rejected all of the blocks.
## Additional Risk Found
The rejected state update also tried to rewrite character canon in unsafe ways:
- It changed BLOODMONEY's pronoun reference to `he/him`.
- It changed Cricket's pronoun reference to `they/them`.
- It compressed/replaced larger parts of `story/characters.md`.
That means simply accepting unterminated blocks is not enough. The parser can be more tolerant, but the state updater still needs stronger constraints so durable canon does not drift.
## Suggested Fixes
Short-term fixes for this iteration:
1. Make `parse_file_updates` tolerate delimiter blocks that omit `---END---` when a new `FILE:` block or EOF clearly terminates the previous block.
2. Keep strict path validation and duplicate-file validation unchanged.
3. Strengthen the state-updater prompt:
   - never edit `Pronouns / Reference` sections
   - preserve existing character profiles
   - prefer updating `plot-state.md`, `timeline.md`, and `unresolved-threads.md`
   - edit `characters.md` only for small additive current-status facts
4. Add regression tests for unterminated delimiter parsing.
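The tolerant parsing in item 1 can be sketched roughly as follows. This is a minimal illustration, assuming every block starts with `FILE: <path>` followed by `---CONTENT---`; the name `parse_delimited_blocks` and its tuple return type are hypothetical and not NightShift's API.

```python
import re

# Header of a delimiter block: "FILE: <path>" followed by "---CONTENT---".
HEADER = re.compile(r"(?m)^FILE:[ \t]*(?P<path>[^\n]+)\n---CONTENT---\n")
END = re.compile(r"(?m)^---END---[ \t]*$")


def parse_delimited_blocks(text: str) -> list[tuple[str, str]]:
    """Return (path, content) pairs; a block ends at ---END---, the next header, or EOF."""
    headers = list(HEADER.finditer(text))
    blocks: list[tuple[str, str]] = []
    for index, header in enumerate(headers):
        path = header.group("path").strip().strip("`")
        # The next header (or EOF) bounds the block when ---END--- is missing.
        stop = headers[index + 1].start() if index + 1 < len(headers) else len(text)
        body = text[header.end():stop]
        end = END.search(body)
        if end:
            body = body[: end.start()]
        if path:
            blocks.append((path, body.rstrip("\r\n") + "\n"))
    return blocks


# One unterminated block followed by one terminated block.
sample = (
    "FILE: story/plot-state.md\n---CONTENT---\nScene two complete.\n"
    "FILE: story/timeline.md\n---CONTENT---\nDay 2: arrival.\n---END---\n"
)
blocks = parse_delimited_blocks(sample)
```

Path validation and duplicate-file validation would still run on the parsed blocks afterwards, as item 2 requires.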
Longer-term follow-up:
- Add deterministic writing-state validation that rejects changes to protected canon sections such as `Pronouns / Reference`.
- Move character canon into structured data so pronoun constraints can be validated directly.
## Planned Changes
- Update delimiter block parsing in `nightshift/patches.py`.
- Add parser tests in `tests/test_patches.py`.
- Tighten `state-updater.md` in the tutorial novel template.
- Run focused parser tests and the full suite.
## Changes Made
- `parse_file_updates` now accepts delimiter-style file blocks that omit `---END---` when the next `FILE:` header or EOF clearly terminates the block.
- Added regression coverage for:
  - unterminated delimiter blocks before another `FILE:`
  - mixed terminated and unterminated delimiter blocks
- Strengthened the tutorial novel state updater prompt to protect character canon:
  - never change `Pronouns / Reference`
  - never change canonical pronouns, narrative reference, identity, or core wound
  - prefer state/timeline/thread files over `characters.md`
  - edit `characters.md` only for small additive current-status facts or new named characters
- Added deterministic protection in file-block patch generation:
  - changes to existing `Pronouns / Reference` sections in `story/characters.md` are rejected before a patch is generated
- Added regression coverage for rejecting protected pronoun canon changes.
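As a rough illustration of the deterministic protection, the check can compare each character's `### Pronouns / Reference` section before and after an update. The sketch below assumes characters are `## Name` headings with `###` subsections; `pronoun_sections`, `changed_protected_sections`, and the character "Ada" are hypothetical names used only for this example.

```python
def pronoun_sections(text: str) -> dict[str, str]:
    """Map each `## Character` heading to the body of its `### Pronouns / Reference` section."""
    sections: dict[str, str] = {}
    character = None
    capturing = False
    for line in text.splitlines():
        if line.startswith("## "):
            character, capturing = line[3:].strip(), False
        elif line.startswith("### "):
            # Only the protected subsection of a known character is captured.
            capturing = character is not None and line.strip() == "### Pronouns / Reference"
            if capturing:
                sections[character] = ""
        elif capturing:
            sections[character] += line.strip() + "\n"
    return {name: body.strip() for name, body in sections.items()}


def changed_protected_sections(old_text: str, new_text: str) -> list[str]:
    """Return characters whose existing protected section was altered or removed."""
    new_sections = pronoun_sections(new_text)
    return [
        name
        for name, body in pronoun_sections(old_text).items()
        if new_sections.get(name) != body
    ]


# Hypothetical sample data, not taken from the tutorial novel.
old = "## Ada\n### Pronouns / Reference\nshe/her\n### Status\nAt the docks.\n"
drifted = old.replace("she/her", "they/them")
status_only = old.replace("At the docks.", "At the market.")
```

A caller would reject the update when `changed_protected_sections` returns a non-empty list, and allow edits that touch only other sections.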
## Verification
Focused tests:
```powershell
python -m pytest tests/test_patches.py tests/test_pipeline.py -q
```
Result:
```text
56 passed
```
Full suite:
```powershell
python -m pytest -q
```
Result:
```text
199 passed, 4 subtests passed
```


@@ -99,33 +99,17 @@ Examples:
 This keeps the initial useful output visible even when strict rerun output is worse.
-## P1: Store Raw Agent Invocations As JSON
-The human-readable agent artifact wraps stdout, stderr, and prompts in markdown fences. Nested markdown fences from model output can confuse downstream parsing.
-Write a machine-readable artifact alongside the markdown artifact:
-```text
-<stage>-agent-output.json
-```
-Suggested fields:
-```json
-{
-  "agent_id": "drafter",
-  "stage_id": "draft_scene",
-  "command": "POST http://localhost:11434/api/generate",
-  "exit_code": 0,
-  "timed_out": false,
-  "duration_seconds": 12.3,
-  "stdout": "...",
-  "stderr": "...",
-  "prompt": "..."
-}
-```
-Pipeline parsing should read raw JSON fields instead of recovering stdout from markdown.
+## P1: Classify Writing Review Failures For Repair Routing
+The tutorial novel now has a short-term editor stage for review failures, but review failures should eventually be classified before routing.
+Candidate classes:
+- `local_edit`: pronoun drift, small continuity issue, missing beat, light style correction
+- `redraft`: wrong premise, broken scene structure, impossible chronology, severe acceptance mismatch
+- `escalate`: ambiguous canon conflict or user preference needed
+Route `local_edit` to the editor, `redraft` to the drafter, and `escalate` to a clear user-facing failure. Keep original draft and edited draft artifacts side by side for comparison.
 ## P1: Add A Writing-Mode Validator
@@ -140,6 +124,31 @@ Add deterministic checks for prose workflows:
This should run before model review stages.
## P1: Use Structured State Events For Writing Workflows
Replace model-written full state-file rewrites with compact structured state events, then let NightShift deterministically merge them into durable files such as:
- `story/plot-state.md`
- `story/characters.md`
- `story/timeline.md`
- `story/unresolved-threads.md`
Candidate state updater output:
```yaml
events:
- file: story/plot-state.md
section: Completed Scenes
add:
- SCENE-001 complete; Saint and Miette introduced.
- file: story/unresolved-threads.md
section: Open Threads
add:
- Saint depends emotionally on Miette and needs compute tokens to keep her present.
```
NightShift would validate allowed files/sections, reject unknown targets, and apply append/update operations deterministically. This avoids asking a writing model to rewrite entire durable state files after every scene.
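A deterministic merge of such events could look roughly like the sketch below. It assumes the YAML has already been parsed into dictionaries, that sections are `## ` headings, and that an allow-list of file/section pairs exists; `ALLOWED_TARGETS`, `apply_event`, and the sample file text are illustrative assumptions, not NightShift code.

```python
# Hypothetical allow-list of (file -> sections) that events may target.
ALLOWED_TARGETS = {
    "story/plot-state.md": {"Completed Scenes"},
    "story/unresolved-threads.md": {"Open Threads"},
}


def apply_event(files: dict[str, str], event: dict) -> None:
    """Append the event's `add` items as bullets at the end of the named section."""
    path, section = event["file"], event["section"]
    if section not in ALLOWED_TARGETS.get(path, set()):
        raise ValueError(f"Unknown state target: {path} / {section}")
    lines = files[path].splitlines()
    heading = f"## {section}"
    if heading not in lines:
        raise ValueError(f"Section '{section}' not found in {path}")
    # Insert just before the next `## ` heading, or at the end of the file.
    insert_at = len(lines)
    for index in range(lines.index(heading) + 1, len(lines)):
        if lines[index].startswith("## "):
            insert_at = index
            break
    # Keep new bullets directly under existing content, above trailing blank lines.
    while not lines[insert_at - 1].strip():
        insert_at -= 1
    # Skip bullets that are already present so re-applying an event is a no-op.
    new_items = [f"- {item}" for item in event.get("add", []) if f"- {item}" not in lines]
    lines[insert_at:insert_at] = new_items
    files[path] = "\n".join(lines) + "\n"


# Made-up starting state for the example.
files = {
    "story/plot-state.md": "# Plot State\n## Completed Scenes\n- SCENE-000 setup.\n## Next\n- TBD\n"
}
event = {
    "file": "story/plot-state.md",
    "section": "Completed Scenes",
    "add": ["SCENE-001 complete; Saint and Miette introduced."],
}
apply_event(files, event)
apply_event(files, event)  # second application changes nothing
```

Because the merge only appends inside allow-listed sections, an event cannot rewrite or compress existing state text.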
## P2: Add A Test Analyzer Agent For TDD
Defer until generated tests are stable.

docs/writer-and-coder.md Normal file

@@ -0,0 +1,136 @@
# Writer And Coder Compatibility Audit
Date: 2026-05-22
## Summary
The recent writer workflow changes do not intentionally alter the code-generation templates or their stage routing.
During this audit, one possible shared-pipeline regression was found and fixed: generic `file_writer` stages were compacting large previous outputs on the first attempt. Since coding templates use `file_writer` for implementation, that could have reduced coding context before the implementer saw it. The behavior now preserves full first-attempt previous outputs while still stripping wrapped agent prompts from prior agent artifacts.
After that correction, the automated test suite passes.
## Writer Changes Reviewed
- Tutorial novel added a scene editor repair path:
  - failed continuity/style review routes to `edit_scene`
  - edited scene is normalized, validated, applied, then routed back to review
  - passing `style_review` skips editor and routes to `update_state`
- Tutorial novel prompts now include stricter pronoun and state-update guidance.
- State update file-writer stages receive focused current state context.
- Scene editor file-writer stages receive `current_scene_file`.
- Agent invocations now write a sibling JSON artifact for reliable stdout/stderr extraction.
- Pipeline config now supports optional `on_pass` routing.
## Coding Impact Findings
### Finding 1: Coding templates were not directly changed
No non-novel project template files changed in the current diff:
- `basic`
- `real-simple`
- `real-long-running`
- `tutorial-deaddrop`
- `tutorial-imageboard`
- `tutorial-lisp`
The new `editor` agent and review repair routing are only configured in `tutorial-novel/nightshift.yaml`.
### Finding 2: `on_pass` is inert for existing coding configs
`on_pass` defaults to `None`, so existing coding templates keep their prior linear pass behavior unless they explicitly opt in.
Passing review stages still ignore model-provided `next_stage` values. This preserves the existing safety behavior where reviewers cannot jump around the pipeline on a pass unless the config has an explicit `on_pass`.
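The pass-routing rule in this finding can be summarized in a small sketch. `Stage` and `pass_target` are illustrative stand-ins rather than NightShift's actual types, and the `"agent"` stage type in the usage is hypothetical; the logic mirrors the behavior described above.

```python
from dataclasses import dataclass
from typing import Optional

REVIEW_TYPES = {"agent_review", "review"}


@dataclass(frozen=True)
class Stage:
    id: str
    type: str
    on_pass: Optional[str] = None  # defaults to None, so existing configs are unaffected


def pass_target(stage: Stage, model_next_stage: Optional[str]) -> Optional[str]:
    """Return the stage to jump to after a pass, or None to continue linearly."""
    if stage.type in REVIEW_TYPES:
        # Reviewers cannot redirect the pipeline on a pass; only the config can.
        return stage.on_pass
    return model_next_stage or stage.on_pass


style_review = Stage("style_review", "agent_review", on_pass="update_state")
plain_review = Stage("review", "review")
planner = Stage("plan", "agent")
```

With these definitions, a reviewer that passes while requesting `draft_scene` is still routed to its configured `on_pass` target, and a reviewer without `on_pass` simply advances to the next stage.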
### Finding 3: Code writer stages still use the same direct patch path
`code_writer` stages still:
- call the configured agent
- parse stdout as a unified diff
- support lookup-request reruns
- write implementation summaries
- feed patch normalizer/validator/apply stages as before
The JSON agent artifact change only changes how NightShift reads agent stdout internally; it does not change the prompt contract or patch contract.
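For illustration, reading stdout from the sibling JSON artifact with a fallback signal might look like this. It is a sketch under the assumption that the JSON file shares the markdown artifact's name with a `.json` suffix and stores a string `stdout` field; `sibling_json_path` and `read_stdout_from_json` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def sibling_json_path(output_path: Path) -> Path:
    """Return the JSON artifact path that sits next to a stage output file."""
    if output_path.suffix:
        return output_path.with_suffix(".json")
    return output_path.with_name(output_path.name + ".json")


def read_stdout_from_json(output_path: Path) -> Optional[str]:
    """Return raw agent stdout from the JSON artifact, or None if it is unusable."""
    json_path = sibling_json_path(output_path)
    if not json_path.exists():
        return None
    try:
        data = json.loads(json_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    stdout = data.get("stdout") if isinstance(data, dict) else None
    return stdout if isinstance(stdout, str) else None


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "draft_scene.md"
    artifact.write_text("wrapped markdown artifact", encoding="utf-8")
    artifact.with_suffix(".json").write_text(
        json.dumps({"stdout": "raw model output"}), encoding="utf-8"
    )
    recovered = read_stdout_from_json(artifact)
    missing = read_stdout_from_json(Path(tmp) / "other_stage.md")
```

A `None` result is the cue to fall back to extracting stdout from the markdown artifact, so runs without a JSON sibling keep working.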
### Finding 4: File-writer implementers had one possible context regression; fixed
Potential issue found:
- `_file_writer_previous_outputs` had started compacting large previous outputs even on first attempt.
- Coding templates such as DeadDrop use `file_writer` for implementation.
- That could have shortened planner/context output before the implementer saw it.
Fix applied:
- First-attempt `file_writer` stages now preserve full previous outputs.
- Retry attempts still compact large previous outputs to control prompt bloat.
- Wrapped agent artifacts still strip down to stdout so old prompts do not pollute later prompts.
Regression coverage added:
- `test_file_writer_first_attempt_preserves_large_previous_outputs`
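The intended first-attempt versus retry behavior can be sketched as below. This is not the real `_file_writer_previous_outputs`; the helper names, the truncation marker, and the `max_chars` default are assumptions for illustration, and the stripping of wrapped agent artifacts is not modeled.

```python
def compact(text: str, max_chars: int) -> str:
    """Keep the head and tail of an oversized output and mark the cut."""
    if len(text) <= max_chars:
        return text
    head = max_chars // 2
    tail = max_chars - head
    return text[:head] + "\n...[truncated]...\n" + text[len(text) - tail:]


def writer_previous_outputs(
    previous_outputs: dict[str, str], retry_count: int, max_chars: int = 2000
) -> dict[str, str]:
    """Preserve full outputs on the first attempt; compact them only on retries."""
    if retry_count == 0:
        return dict(previous_outputs)
    return {name: compact(text, max_chars) for name, text in previous_outputs.items()}


plan = "p" * 5000
first = writer_previous_outputs({"plan": plan}, retry_count=0)
retry = writer_previous_outputs({"plan": plan}, retry_count=1)
```

Here `first["plan"]` is the untouched 5000-character plan, while `retry["plan"]` is shortened and carries the truncation marker.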
### Finding 5: State/editor special context branches are narrowly gated
The new context enrichment branches are guarded by stage shape:
- state update branch only applies to `file_writer` stages whose allowed paths are state files:
  - `story/plot-state.md`
  - `story/characters.md`
  - `story/timeline.md`
  - `story/unresolved-threads.md`
- scene editor branch only applies to `file_writer` stages whose id starts with `edit_` and whose allowed paths include `story/chapters`
Normal coding implementer stages such as `implement`, `implement_junior`, and `implement_senior` do not match either branch.
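A hedged sketch of the two gates follows. The real predicates take a `StageConfig`; here they take plain values, and treating "include `story/chapters`" as a path-prefix match is an assumption of this example.

```python
STATE_PATHS = {
    "story/plot-state.md",
    "story/characters.md",
    "story/timeline.md",
    "story/unresolved-threads.md",
}


def _normalize(paths: tuple[str, ...]) -> set[str]:
    """Use forward slashes so Windows-style paths compare equal."""
    return {path.replace("\\", "/") for path in paths}


def is_state_update_stage(stage_type: str, allowed_paths: tuple[str, ...]) -> bool:
    """True only for file_writer stages whose allowed paths are all state files."""
    normalized = _normalize(allowed_paths)
    return stage_type == "file_writer" and bool(normalized) and normalized <= STATE_PATHS


def is_scene_edit_stage(stage_id: str, stage_type: str, allowed_paths: tuple[str, ...]) -> bool:
    """True only for file_writer stages named edit_* that may write under story/chapters."""
    return (
        stage_type == "file_writer"
        and stage_id.startswith("edit_")
        and any(path.startswith("story/chapters") for path in _normalize(allowed_paths))
    )
```

Under these rules a coding stage such as `implement` with allowed paths under `src` matches neither gate, which is why the new context branches stay inert for coding templates.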
## Template Validation Notes
Validated successfully:
- `basic`
- `tutorial-deaddrop`
- `tutorial-novel`
Validation still fails for these templates because `debugger` is configured but `.nightshift/agents/debugger.md` is missing:
- `real-simple`
- `real-long-running`
- `tutorial-imageboard`
- `tutorial-lisp`
Those failures are not caused by the writer changes; there is no current diff in those template directories.
## Verification
Focused tests:
```powershell
python -m pytest tests/test_pipeline.py tests/test_config.py tests/test_agents.py -q
```
Result:
```text
71 passed, 4 subtests passed
```
Full suite:
```powershell
python -m pytest -q
```
Result:
```text
196 passed, 4 subtests passed
```
## Conclusion
After the first-attempt file-writer context fix, I do not see evidence that the writer workflow changes degrade code generation. The shared changes are either opt-in (`on_pass`), artifact-reading improvements (JSON stdout), or narrowly gated to novel state/editor stages.
Remaining non-writer issue: several coding-oriented templates still reference a missing `debugger.md` prompt. That should be handled separately from this writer/coder compatibility pass.


@@ -3,6 +3,7 @@
 from __future__ import annotations

 from dataclasses import dataclass
+from dataclasses import asdict
 import json
 import os
 from pathlib import Path
@@ -119,6 +120,11 @@ class AgentExecutor:
         output_filename = stage.output or f"{stage.id}.md"
         output = format_agent_invocation(stage.id, invocation)
         output_path = self.artifacts.write_stage_output(task.id, output_filename, output)
+        json_output_path = self.artifacts.write_stage_output(
+            task.id,
+            _agent_invocation_json_filename(output_filename),
+            format_agent_invocation_json(stage.id, invocation),
+        )
         self.logger.event(
             "artifact.write",
             "Wrote agent artifact",
@@ -126,6 +132,7 @@ class AgentExecutor:
             task_id=task.id,
             agent_id=agent.id,
             artifact_path=output_path.relative_to(self.project_root),
+            json_artifact_path=json_output_path.relative_to(self.project_root),
         )

         if invocation.timed_out:
@@ -520,13 +527,30 @@ def _file_writer_block_contract(stage: StageConfig) -> str:
         return "\n".join(
             [
                 "Use exactly this delimiter format for the scene file:",
-                "FILE: story/chapters/chapter-001/scene-001.md",
+                "FILE: <the exact story/chapters path listed under Writes in the current task>",
                 "---CONTENT---",
                 "<complete scene prose>",
                 "---END---",
                 "Do not use markdown code fences for prose scene output.",
             ]
         )
+    state_paths = {
+        "story/plot-state.md",
+        "story/characters.md",
+        "story/timeline.md",
+        "story/unresolved-threads.md",
+    }
+    if set(normalized).issubset(state_paths) and normalized:
+        return "\n".join(
+            [
+                "Use exactly this delimiter format for each state file you update:",
+                "FILE: story/plot-state.md",
+                "---CONTENT---",
+                "<complete updated state file>",
+                "---END---",
+                "Do not use markdown code fences for state update output.",
+            ]
+        )
     return "\n".join(
         [
             "Use one fenced block per file with this exact opening form:",
@@ -622,3 +646,18 @@ def format_agent_invocation(stage_id: str, invocation: AgentInvocation) -> str:
             "",
         ]
     )
+
+
+def format_agent_invocation_json(stage_id: str, invocation: AgentInvocation) -> str:
+    data = {
+        **asdict(invocation),
+        "stage_id": stage_id,
+    }
+    return json.dumps(data, ensure_ascii=False, indent=2) + "\n"
+
+
+def _agent_invocation_json_filename(output_filename: str) -> str:
+    path = Path(output_filename)
+    if path.suffix:
+        return path.with_suffix(".json").as_posix()
+    return path.with_name(path.name + ".json").as_posix()


@@ -61,6 +61,7 @@ class StageConfig:
     commands: tuple[str, ...] = ()
     output: str | None = None
     on_fail: str | None = None
+    on_pass: str | None = None
     shell: bool = True
     timeout_seconds: int | None = None
     working_dir: Path | None = None
@@ -392,6 +393,7 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
                 commands=commands,
                 output=_optional_string(stage_raw.get("output"), f"{stage_context}.output"),
                 on_fail=_optional_string(stage_raw.get("on_fail"), f"{stage_context}.on_fail"),
+                on_pass=_optional_string(stage_raw.get("on_pass"), f"{stage_context}.on_pass"),
                 shell=_optional_bool(stage_raw.get("shell", True), f"{stage_context}.shell"),
                 timeout_seconds=timeout_seconds,
                 working_dir=Path(working_dir_raw) if working_dir_raw else None,
@@ -416,6 +418,10 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
             raise ConfigError(
                 f"Config error: stage '{stage.id}' on_fail references unknown stage '{stage.on_fail}'."
             )
+        if stage.on_pass and stage.on_pass not in stage_ids:
+            raise ConfigError(
+                f"Config error: stage '{stage.id}' on_pass references unknown stage '{stage.on_pass}'."
+            )

     return NightShiftConfig(
         path=config_path,


@@ -112,10 +112,26 @@ def parse_file_updates(text: str) -> tuple[FileUpdate, ...]:
 def _parse_delimited_file_updates(text: str) -> list[FileUpdate]:
+    updates: list[FileUpdate] = []
+    header_pattern = re.compile(r"(?m)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n")
+    matches = list(header_pattern.finditer(text))
+    for index, match in enumerate(matches):
+        path = match.group("path").strip().strip("`")
+        content_start = match.end()
+        next_file_start = matches[index + 1].start() if index + 1 < len(matches) else len(text)
+        raw_content = text[content_start:next_file_start]
+        end_match = re.search(r"(?m)^---END---\s*$", raw_content)
+        if end_match:
+            raw_content = raw_content[: end_match.start()]
+        content = raw_content.rstrip("\r\n") + "\n"
+        if path:
+            updates.append(FileUpdate(path=path, content=content))
+    if updates:
+        return updates
     pattern = re.compile(
         r"(?ms)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n(?P<content>.*?)\n---END---\s*$"
     )
-    updates: list[FileUpdate] = []
     for match in pattern.finditer(text):
         path = match.group("path").strip().strip("`")
         content = match.group("content")
@@ -146,6 +162,7 @@ def generate_patch_from_file_updates(
         _validate_allowed_patch_path(normalized_path, root, allowed_paths)
         file_path = resolve_inside_root(root, normalized_path, f"file update '{normalized_path}'")
         old_text = file_path.read_text(encoding="utf-8", errors="replace") if file_path.exists() else ""
+        _validate_protected_character_canon(normalized_path, old_text, update.content)
         if old_text == update.content:
             continue
         patch_parts.extend(_diff_for_file(normalized_path, old_text, update.content, file_path.exists()))
@@ -208,6 +225,51 @@ def _validate_allowed_patch_path(path_text: str, root: Path, allowed_paths: tupl
         )


+def _validate_protected_character_canon(path_text: str, old_text: str, new_text: str) -> None:
+    if path_text.replace("\\", "/") != "story/characters.md" or not old_text:
+        return
+    old_sections = _pronoun_reference_sections(old_text)
+    if not old_sections:
+        return
+    new_sections = _pronoun_reference_sections(new_text)
+    changed = [
+        character
+        for character, old_section in old_sections.items()
+        if new_sections.get(character) != old_section
+    ]
+    if changed:
+        names = ", ".join(changed)
+        raise PipelineError(
+            "File writer error: protected character pronoun canon changed in "
+            f"`story/characters.md` for: {names}."
+        )
+
+
+def _pronoun_reference_sections(text: str) -> dict[str, str]:
+    sections: dict[str, str] = {}
+    current_character: str | None = None
+    lines = text.splitlines()
+    index = 0
+    while index < len(lines):
+        line = lines[index]
+        if line.startswith("## "):
+            current_character = line[3:].strip()
+            index += 1
+            continue
+        if current_character and line.strip() == "### Pronouns / Reference":
+            start = index
+            index += 1
+            while index < len(lines):
+                candidate = lines[index]
+                if candidate.startswith("## ") or candidate.startswith("### "):
+                    break
+                index += 1
+            sections[current_character] = "\n".join(lines[start:index]).strip()
+            continue
+        index += 1
+    return sections
+
+
 def format_validation_result(result: PatchValidationResult) -> str:
     return "\n".join(
         [


@@ -3,6 +3,7 @@
 from __future__ import annotations

 from dataclasses import dataclass, replace
+import json
 from pathlib import Path
 import re
 import subprocess
@@ -181,7 +182,7 @@ class PipelineRunner:
             stage_results.append(result)
             if stage.id in previous_outputs:
                 del previous_outputs[stage.id]
-            previous_outputs[stage.id] = self._read_output(result.output_path)
+            previous_outputs[stage.id] = self._read_context_output(result.output_path)
             telemetry_entries.append(self._telemetry_entry(stage, result, retry_count))
             self._write_telemetry(task.id, telemetry_entries)
             self.logger.event(
@@ -198,6 +199,7 @@ class PipelineRunner:
                 retry_notes.append(f"Context update from '{stage.id}': {result.context_update}")

             if result.status == "pass":
+                pass_target_stage = result.next_stage or stage.on_pass
                 if stage.type in {"agent_review", "review"} and result.next_stage:
                     self.logger.event(
                         "stage.next_ignored",
@@ -207,13 +209,12 @@ class PipelineRunner:
                         stage_id=stage.id,
                         requested_next_stage=result.next_stage,
                     )
-                    index += 1
-                    continue
-                if result.next_stage:
-                    if result.next_stage not in stage_indexes:
+                    pass_target_stage = stage.on_pass
+                if pass_target_stage:
+                    if pass_target_stage not in stage_indexes:
                         final_status = "failed"
                         final_reason = (
-                            f"Stage '{stage.id}' requested unknown next stage '{result.next_stage}'."
+                            f"Stage '{stage.id}' requested unknown next stage '{pass_target_stage}'."
                         )
                         break
                     self.logger.event(
@@ -222,18 +223,14 @@ class PipelineRunner:
                         run_id=self.artifacts.run_id,
                         task_id=task.id,
                         stage_id=stage.id,
-                        next_stage=result.next_stage,
+                        next_stage=pass_target_stage,
                     )
-                    index = stage_indexes[result.next_stage]
+                    index = stage_indexes[pass_target_stage]
                     continue
                 index += 1
                 continue

-            target_stage = result.next_stage or (
-                stage.on_fail
-                if not (stage.type in {"agent_review", "review"} and _is_malformed_review_result(result))
-                else None
-            )
+            target_stage = _failure_target_stage(stage, result)
             analysis_note = self._write_failure_diagnostics(stage, task, result, retry_count)
             if analysis_note:
                 retry_notes.append(analysis_note)
@@ -629,8 +626,7 @@ class PipelineRunner:
             task_context=context.task_context,
             retry_context=context.retry_context,
         )
-        raw_output = self._read_output(result.output_path)
-        stdout = extract_agent_stdout(raw_output)
+        stdout = self._read_agent_stdout(result.output_path)
         lookup_requests = parse_lookup_requests(stdout)
         if lookup_requests and "diff --git " not in stdout:
             lookup_context = self.repo_tools.execute_requests(
@@ -660,8 +656,7 @@ class PipelineRunner:
                 task_context=context.task_context,
                 retry_context="\n".join(f"- {note}" for note in rerun_notes),
             )
-            raw_output = self._read_output(result.output_path)
-            stdout = extract_agent_stdout(raw_output)
+            stdout = self._read_agent_stdout(result.output_path)
         try:
             patch = extract_unified_diff(stdout)
         except PipelineError as exc:
@@ -709,6 +704,17 @@ class PipelineRunner:
     ) -> StageResult:
         if stage.agent is None:
             raise PipelineError(f"Pipeline error: file_writer stage '{stage.id}' must reference an agent.")
-        enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
+        if _is_state_update_stage(stage):
+            enriched_outputs = _state_update_previous_outputs(previous_outputs)
+            allowed_file_contents = self._allowed_file_contents(stage)
+            if allowed_file_contents:
+                enriched_outputs["current_allowed_files"] = allowed_file_contents
+        elif _is_scene_edit_stage(stage):
+            enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
+            current_scene = self._task_scene_file_contents(task)
+            if current_scene:
+                enriched_outputs["current_scene_file"] = current_scene
+        else:
+            enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
         context_pack_path = self._latest_task_artifact(task.id, "context-pack.md")
         if context_pack_path is not None:
@@ -727,8 +733,7 @@ class PipelineRunner:
             task_context=context.task_context,
             retry_context=context.retry_context,
         )
-        raw_output = self._read_output(result.output_path)
-        stdout = extract_agent_stdout(raw_output)
+        stdout = self._read_agent_stdout(result.output_path)
         lookup_requests = parse_lookup_requests(stdout)
         if lookup_requests and "```file:" not in stdout.lower() and "```path:" not in stdout.lower():
             lookup_context = self.repo_tools.execute_requests(
@@ -758,8 +763,7 @@ class PipelineRunner:
                 task_context=context.task_context,
                 retry_context="\n".join(f"- {note}" for note in rerun_notes),
             )
-            raw_output = self._read_output(result.output_path)
-            stdout = extract_agent_stdout(raw_output)
+            stdout = self._read_agent_stdout(result.output_path)
         invalid_rerun_done = False
         candidate_index_path: Path | None = None
         while True:
@@ -803,7 +807,7 @@ class PipelineRunner:
                     strict_notes = [
                         *retry_notes,
                         "Previous file_writer output was invalid. Return complete file blocks now. Do not output lookup_requests, prose, or 'lookup failed'.",
-                        "Use complete fenced file blocks with both the opening ```file:path and closing ``` fence.",
+                        _file_writer_repair_format_note(stage),
                     ]
                     result = self.agent_executor.run_stage(
                         agent_stage,
@@ -814,8 +818,7 @@ class PipelineRunner:
                         task_context=context.task_context,
                         retry_context="\n".join(f"- {note}" for note in strict_notes),
                     )
-                    raw_output = self._read_output(result.output_path)
-                    stdout = extract_agent_stdout(raw_output)
+                    stdout = self._read_agent_stdout(result.output_path)
                     continue
                 try:
                     patch = normalize_patch_text(stdout)
@@ -923,6 +926,44 @@ class PipelineRunner:
             lines.append("")
         return self.artifacts.write_stage_output(task_id, f"{base}/index.md", "\n".join(lines))

+    def _allowed_file_contents(self, stage: StageConfig, max_chars: int = 2400) -> str:
+        sections: list[str] = []
+        for path_text in stage.allowed_paths:
+            path = self.config.project.root / path_text
+            if not path.is_file():
+                continue
+            content = path.read_text(encoding="utf-8", errors="replace")
+            sections.extend(
+                [
+                    f"## {path_text}",
+                    "",
+                    "```text",
+                    _compact_previous_output(content, max_chars=max_chars).rstrip(),
+                    "```",
+                    "",
+                ]
+            )
+        return "\n".join(sections).strip()
+
+    def _task_scene_file_contents(self, task: Task, max_chars: int = 10000) -> str:
+        sections: list[str] = []
+        for path_text in _task_story_chapter_paths(task):
+            path = self.config.project.root / path_text
+            if not path.is_file():
+                continue
+            content = path.read_text(encoding="utf-8", errors="replace")
+            sections.extend(
+                [
+                    f"## {path_text}",
+                    "",
+                    "```text",
+                    _compact_previous_output(content, max_chars=max_chars).rstrip(),
+                    "```",
+                    "",
+                ]
+            )
+        return "\n".join(sections).strip()
+
     def _writer_agent_stage(self, stage: StageConfig, retry_count: int) -> StageConfig:
         suffix = f"-{retry_count}" if retry_count else ""
         return replace(
@@ -975,7 +1016,7 @@ class PipelineRunner:
             task_context=self.context.read_context(task, retry_notes).task_context,
             retry_context=self.context.read_context(task, retry_notes).retry_context,
         )
-        source = extract_agent_stdout(self._read_output(result.output_path))
+        source = self._read_agent_stdout(result.output_path)
         try:
             patch = normalize_patch_text(source)
         except PipelineError as exc:
@@ -1127,8 +1168,7 @@ class PipelineRunner:
         task: Task,
         result: StageResult,
     ) -> StageResult | None:
-        output_text = self._read_output(result.output_path)
-        requests = parse_resource_requests(extract_agent_stdout(output_text))
+        requests = parse_resource_requests(self._read_agent_stdout(result.output_path))
         if not requests:
             return None
         paths = satisfy_resource_requests(self.artifacts, task.id, requests)
@@ -1338,8 +1378,7 @@ class PipelineRunner:
     ) -> StageResult:
         if result.status != "pass" or result.output_path is None:
             return result
-        output_text = self._read_output(result.output_path)
-        requests = parse_lookup_requests(extract_agent_stdout(output_text))
+        requests = parse_lookup_requests(self._read_agent_stdout(result.output_path))
         if not requests:
             return result
         lookup_context = self.repo_tools.execute_requests(
@@ -1457,6 +1496,25 @@ class PipelineRunner:
             return ""
         return path.read_text(encoding="utf-8")

+    def _read_context_output(self, output_path: str | None) -> str:
+        stdout = self._read_agent_stdout(output_path)
+        return stdout if stdout else self._read_output(output_path)
+
+    def _read_agent_stdout(self, output_path: str | None) -> str:
+        if output_path is None:
+            return ""
+        path = self.config.project.root / Path(output_path)
+        json_path = _agent_invocation_json_path(path)
+        if json_path.exists():
+            try:
+                data = json.loads(json_path.read_text(encoding="utf-8"))
+            except json.JSONDecodeError:
+                data = {}
+            stdout = data.get("stdout")
+            if isinstance(stdout, str):
+                return stdout
+        return extract_agent_stdout(self._read_output(output_path))
+
     def _format_retry_note(
         self,
         retry_count: int,
@@ -1468,6 +1526,16 @@ class PipelineRunner:
f"Retry {retry_count}: stage '{stage.id}' returned " f"Retry {retry_count}: stage '{stage.id}' returned "
f"{result.status} ({result.reason}); redirecting to '{target_stage}'." f"{result.status} ({result.reason}); redirecting to '{target_stage}'."
) )
if (
target_stage == "update_state"
and "deletion-heavy patch" in result.reason.lower()
):
note = (
f"{note}\n"
"Repair guidance: preserve existing durable state text unless it directly conflicts "
"with the accepted scene. Make minimal additive edits instead of replacing whole "
"sections or compressing character/world files."
)
excerpt = self._failure_excerpt(result.output_path) excerpt = self._failure_excerpt(result.output_path)
if not excerpt: if not excerpt:
return note return note
@@ -1683,6 +1751,16 @@ def _is_malformed_review_result(result: StageResult) -> bool:
) )
def _failure_target_stage(stage: StageConfig, result: StageResult) -> str | None:
if stage.type not in {"agent_review", "review"}:
return result.next_stage or stage.on_fail
if _is_malformed_review_result(result):
return None
if result.next_stage and result.next_stage != stage.id:
return result.next_stage
return stage.on_fail
def _review_previous_outputs(previous_outputs: dict[str, str], max_chars: int = 1600) -> dict[str, str]: def _review_previous_outputs(previous_outputs: dict[str, str], max_chars: int = 1600) -> dict[str, str]:
compacted: dict[str, str] = {} compacted: dict[str, str] = {}
priority_names = { priority_names = {
@@ -1736,6 +1814,15 @@ def _file_writer_stage_guidance(stage: StageConfig) -> str:
return "" return ""
def _file_writer_repair_format_note(stage: StageConfig) -> str:
if _is_state_update_stage(stage):
return (
"Use delimiter file blocks only: FILE: path, ---CONTENT---, complete file content, "
"---END---. Do not use markdown code fences for state update output."
)
return "Use complete fenced file blocks with both the opening ```file:path and closing ``` fence."
def _candidate_artifact_name(path_text: str) -> str: def _candidate_artifact_name(path_text: str) -> str:
name = path_text.replace("\\", "/").strip().strip("/") name = path_text.replace("\\", "/").strip().strip("/")
name = re.sub(r"[^A-Za-z0-9_.-]+", "_", name) name = re.sub(r"[^A-Za-z0-9_.-]+", "_", name)
@@ -1748,14 +1835,70 @@ def _file_writer_previous_outputs(
     retry_count: int,
     max_chars: int = 1200,
 ) -> dict[str, str]:
-    if retry_count <= 0:
-        return dict(previous_outputs)
     compacted: dict[str, str] = {}
     for name, output in previous_outputs.items():
-        compacted[name] = _compact_previous_output(output, max_chars=max_chars)
+        clean_output = _compact_agent_artifact_output(output)
+        if retry_count <= 0:
+            compacted[name] = clean_output
+            continue
+        compacted[name] = _compact_previous_output(clean_output, max_chars=max_chars)
     return compacted
def _is_state_update_stage(stage: StageConfig) -> bool:
state_paths = {
"story/plot-state.md",
"story/characters.md",
"story/timeline.md",
"story/unresolved-threads.md",
}
allowed = {path.replace("\\", "/").rstrip("/") for path in stage.allowed_paths}
return stage.type == "file_writer" and bool(allowed) and allowed.issubset(state_paths)
def _is_scene_edit_stage(stage: StageConfig) -> bool:
allowed = {path.replace("\\", "/").rstrip("/") for path in stage.allowed_paths}
return stage.type == "file_writer" and stage.id.startswith("edit_") and "story/chapters" in allowed
def _task_story_chapter_paths(task: Task) -> tuple[str, ...]:
paths: list[str] = []
seen: set[str] = set()
for match in re.finditer(r"story/chapters/[^\s`]+?\.md", task.raw_markdown):
path = match.group(0).strip().strip("`")
if path not in seen:
paths.append(path)
seen.add(path)
return tuple(paths)
def _state_update_previous_outputs(previous_outputs: dict[str, str]) -> dict[str, str]:
compacted: dict[str, str] = {}
for name in ("draft_scene", "apply_draft", "continuity_review", "style_review"):
output = previous_outputs.get(name)
if output:
compacted[name] = _compact_previous_output(_compact_agent_artifact_output(output), max_chars=1800)
for name, output in previous_outputs.items():
if name in compacted or name in {"plan", "semantic_context", "context"}:
continue
if "draft" in name or "review" in name or "apply" in name:
compacted[name] = _compact_previous_output(_compact_agent_artifact_output(output), max_chars=1200)
return compacted
def _compact_agent_artifact_output(output: str) -> str:
if "# Agent Output:" not in output or "## Prompt" not in output:
return output
stdout = extract_agent_stdout(output).strip()
return stdout if stdout else output
def _agent_invocation_json_path(output_path: Path) -> Path:
if output_path.suffix:
return output_path.with_suffix(".json")
return output_path.with_name(output_path.name + ".json")
def _compact_previous_output(output: str, max_chars: int = 1200) -> str: def _compact_previous_output(output: str, max_chars: int = 1200) -> str:
if len(output) <= max_chars: if len(output) <= max_chars:
return output return output
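Reviewer sketch: the `_read_agent_stdout` change prefers a JSON sidecar written next to the markdown artifact and only falls back to parsing the wrapped markdown when the sidecar is missing or unusable. The standalone version below illustrates that lookup order under stated assumptions: `read_stdout` and `sidecar_json_path` are hypothetical names, `str.upper` stands in for the real `extract_agent_stdout` (not shown in this diff), and the extra `isinstance(data, dict)` guard is an addition that the committed code does not have.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def sidecar_json_path(output_path: Path) -> Path:
    """Same rule as the diff: swap the suffix for .json, or append .json if there is none."""
    if output_path.suffix:
        return output_path.with_suffix(".json")
    return output_path.with_name(output_path.name + ".json")


def read_stdout(output_path: Path, fallback: Callable[[str], str]) -> str:
    """Hypothetical helper: prefer the sidecar's raw stdout, else parse the artifact text."""
    json_path = sidecar_json_path(output_path)
    if json_path.exists():
        try:
            data = json.loads(json_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            data = {}
        # Assumption: guard against valid JSON that is not an object.
        stdout = data.get("stdout") if isinstance(data, dict) else None
        if isinstance(stdout, str):
            return stdout
    text = output_path.read_text(encoding="utf-8") if output_path.exists() else ""
    return fallback(text)


with tempfile.TemporaryDirectory() as directory:
    artifact = Path(directory) / "plan.md"
    artifact.write_text("# Agent Output: plan\nwrapped", encoding="utf-8")
    # No sidecar yet: the fallback parser sees the markdown artifact.
    from_markdown = read_stdout(artifact, fallback=str.upper)
    sidecar_json_path(artifact).write_text(json.dumps({"stdout": "raw stdout"}), encoding="utf-8")
    # Sidecar present: the raw stdout wins and the fallback is never called.
    from_sidecar = read_stdout(artifact, fallback=str.upper)
```

Reading the raw stream from JSON avoids re-parsing fenced markdown, which is presumably why the new test exercises stdout that itself contains code fences.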


@@ -13,11 +13,16 @@ Review the drafted scene against:
Check for: Check for:
- contradictions - contradictions
- wrong character knowledge - wrong character knowledge
- wrong character pronouns or narrative reference, using `Pronouns / Reference` in `story/characters.md` as hard canon
- impossible locations or timing - impossible locations or timing
- accidental resolution of future threads - accidental resolution of future threads
- missing required beats from the task - missing required beats from the task
- invented lore that should have been added deliberately - invented lore that should have been added deliberately
Do not fail the scene because durable state files are not updated yet. State files are updated by a later `update_state` stage after review. If the task lists `Updates:`, treat those as future state-update requirements and mention them only as `context_update` guidance.
Wrong pronouns are a continuity failure. If a drafted scene uses non-canonical pronouns for a named character, return `status: fail` and explain which character drifted. Do not pass the scene with only `context_update` guidance.
Output exactly: Output exactly:
status: pass | fail | retry | escalate status: pass | fail | retry | escalate
@@ -25,4 +30,4 @@ reason: <short explanation>
 next_stage: <optional stage id>
 context_update: <compact useful note>
 
-When `status: pass`, leave `next_stage` blank. Use `retry` when the scene can be repaired by drafting again.
+When `status: pass`, leave `next_stage` blank. Use `retry` when the scene can be repaired by drafting again. For retryable scene issues, leave `next_stage` blank; NightShift will route back to the configured drafting stage.


@@ -7,12 +7,14 @@ Rules:
 - Do not edit `story/worldbuilding.md`, `story/characters.md`, `story/style-guide.md`, `story/plot-state.md`, `story/timeline.md`, `story/unresolved-threads.md`, `story/continuity-rules.md`, or `story/outline.md`.
 - Use `story/style-guide.md` for POV, tense, tone, and prose rules.
 - Use `story/plot-state.md` and `story/timeline.md` as current state.
+- Use the `Pronouns / Reference` sections in `story/characters.md` as hard canon.
+- Do not infer, vary, or "smooth out" character pronouns. Use canonical narrative reference exactly.
 - Keep the scene bounded to the task acceptance criteria.
 - Do not resolve future plot threads unless the task explicitly asks for that.
 - Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
 Output only one complete file block using this delimiter format:
-FILE: story/chapters/chapter-001/scene-001.md
+FILE: <the exact story/chapters path listed under Writes in the current task>
 ---CONTENT---
 <complete scene prose>
 ---END---


@@ -0,0 +1,26 @@
You are the scene editor for a NightShift novel-writing workflow.
Edit an already drafted scene after a continuity or style review failure.
Rules:
- Preserve the existing scene's structure, voice, events, pacing, and best lines.
- Make the smallest changes needed to satisfy the review failure and task acceptance criteria.
- Do not restart, summarize, replace the scene premise, or change scene direction.
- Use `story/style-guide.md` for POV, tense, tone, and prose rules.
- Use `story/characters.md`, especially `Pronouns / Reference`, as hard canon.
- Wrong pronouns are mandatory fixes.
- Do not edit state files, worldbuilding, outline, continuity rules, or style guide.
- Do not resolve future plot threads unless the task explicitly asks for that.
- Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
Use the `current_scene_file` context as the source text to edit.
Use the retry notes and latest review output to identify the required repair.
Output only one complete file block using this delimiter format:
FILE: <the exact story/chapters path listed under Writes in the current task>
---CONTENT---
<complete edited scene prose>
---END---
Do not use markdown code fences for scene prose output.
Do not output a plan, notes, analysis, or any text outside the delimiter block.


@@ -21,8 +21,26 @@ State updates should reflect only what happened in the accepted scene:
 Do not invent events that are not in the scene.
 
+Preserve existing durable state. Make minimal additive edits:
+- append new scene facts, timeline bullets, character knowledge, and unresolved threads
+- update current locations/status only where the accepted scene changes them
+- do not remove or compress existing character profiles, faction notes, world notes, or open threads
+- do not rewrite whole files for style, brevity, or cleanup
+- if a section already contains useful detail, keep it and add only the new facts needed
+
+Protect character canon:
+- Never change any `Pronouns / Reference` section.
+- Never change a character's canonical pronouns, narrative reference, identity, or core wound.
+- Prefer updating `story/plot-state.md`, `story/timeline.md`, and `story/unresolved-threads.md`.
+- Edit `story/characters.md` only when the accepted scene adds a small current-status fact or introduces a new named character.
+- If editing `story/characters.md`, preserve all existing sections and add only the minimal new status/detail needed.
+
 Output only complete file content blocks.
-Use one fenced block per file:
+Use this delimiter format for each state file you update:
 
-```file:story/plot-state.md
+FILE: story/plot-state.md
+---CONTENT---
 <complete updated state file>
-```
+---END---
+
+Do not use markdown code fences. Do not include prose outside FILE blocks.
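Reviewer sketch: the prompt rules above are advisory, so a code-level guard is still needed; a test elsewhere in this commit expects `generate_patch_from_file_updates` to raise "protected character pronoun canon changed". The real check is not shown in this diff. The following is only one plausible shape for it, assuming `## Name` character headings with a `### Pronouns / Reference` subsection made of bullet lines; `pronoun_sections` and `changed_pronoun_canon` are hypothetical names.

```python
def pronoun_sections(markdown: str) -> dict[str, str]:
    """Map each `## Character` heading to the bullets of its `### Pronouns / Reference` section."""
    sections: dict[str, str] = {}
    character = None
    collecting = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            character = line[3:].strip()
            collecting = False
        elif line.startswith("### "):
            collecting = character is not None and line[4:].strip() == "Pronouns / Reference"
            if collecting:
                sections[character] = ""
        elif collecting:
            stripped = line.strip()
            if stripped.startswith("- "):
                sections[character] += stripped + "\n"
            elif stripped:
                collecting = False  # prose after the bullets is not treated as canon


    return sections


def changed_pronoun_canon(before: str, after: str) -> list[str]:
    """Return characters whose existing pronoun bullets differ or vanished in `after`."""
    old = pronoun_sections(before)
    new = pronoun_sections(after)
    return [name for name, body in old.items() if new.get(name) != body]


before = (
    "# Characters\n\n## Cricket\n\n### Pronouns / Reference\n\n"
    "- Pronouns: she/her\n- Narrative reference: Cricket; she/her\n\nScavenger.\n"
)
drifted = changed_pronoun_canon(before, before.replace("she/her", "they/them"))
status_only = changed_pronoun_canon(before, before + "Now at the depot.\n")
```

Comparing only sections that already exist means a newly introduced character does not trip the guard, which matches the prompt's allowance for adding new named characters.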


@@ -16,6 +16,8 @@ Check for:
- placeholders such as TODO, TBD, `[insert]`, or author notes - placeholders such as TODO, TBD, `[insert]`, or author notes
- scene length far outside the requested range - scene length far outside the requested range
Do not fail the scene because durable state files are not updated yet. State files are updated by a later `update_state` stage after review.
Output exactly: Output exactly:
status: pass | fail | retry | escalate status: pass | fail | retry | escalate
@@ -23,4 +25,4 @@ reason: <short explanation>
 next_stage: <optional stage id>
 context_update: <compact useful note>
 
-When `status: pass`, leave `next_stage` blank. Use `retry` when the drafter should revise the scene.
+When `status: pass`, leave `next_stage` blank. Use `retry` when the drafter should revise the scene. For retryable scene issues, leave `next_stage` blank; NightShift will route back to the configured drafting stage.


@@ -37,6 +37,14 @@ agents:
num_predict: 8192 num_predict: 8192
system_prompt: .nightshift/agents/drafter.md system_prompt: .nightshift/agents/drafter.md
editor:
backend: ollama
model: nightshift-writer
temperature: 0.3
num_ctx: 16384
num_predict: 8192
system_prompt: .nightshift/agents/editor.md
continuity_reviewer: continuity_reviewer:
backend: ollama backend: ollama
model: nightshift-base model: nightshift-base
@@ -110,13 +118,42 @@ pipeline:
       type: agent_review
       agent: continuity_reviewer
       output: continuity-review.md
-      on_fail: draft_scene
+      on_fail: edit_scene
 
     - id: style_review
       type: agent_review
       agent: style_reviewer
       output: style-review.md
-      on_fail: draft_scene
+      on_fail: edit_scene
+      on_pass: update_state
+
+    - id: edit_scene
+      type: file_writer
+      agent: editor
+      output: scene-edit.patch
+      allowed_paths:
+        - story/chapters
+
+    - id: normalize_edit
+      type: patch_normalizer
+      output: normalized-edit.patch
+
+    - id: validate_edit
+      type: patch_validator
+      output: edit-validation.md
+      max_files: 2
+      max_lines: 1200
+      max_delete_ratio: 0.50
+      allowed_paths:
+        - story/chapters
+      on_fail: edit_scene
+
+    - id: apply_edit
+      type: patch_apply
+      mode: apply
+      output: edit-apply-output.txt
+      on_fail: edit_scene
+      on_pass: continuity_review
 
     - id: update_state
       type: file_writer


@@ -4,7 +4,7 @@ import unittest
 from unittest.mock import MagicMock, patch
 
 from nightshift.agents import AgentExecutor, build_prompt_bundle, parse_review_output, strip_ansi_escape_sequences
-from nightshift.agents import AgentInvocation, format_agent_invocation
+from nightshift.agents import AgentInvocation, format_agent_invocation, format_agent_invocation_json
 from nightshift.artifacts import ArtifactStore
 from nightshift.config import AgentConfig, StageConfig
 from nightshift.tasks import parse_tasks
@@ -93,7 +93,7 @@ class AgentExecutorTests(unittest.TestCase):
         self.assertIn("Use only paths under these project-relative targets: `story/chapters`.", prompt)
         self.assertIn("This is the drafting stage", prompt)
-        self.assertIn("FILE: story/chapters/chapter-001/scene-001.md", prompt)
+        self.assertIn("FILE: <the exact story/chapters path listed under Writes in the current task>", prompt)
         self.assertIn("---CONTENT---", prompt)
         self.assertIn("---END---", prompt)
         self.assertIn("Do not use markdown code fences", prompt)
@@ -125,6 +125,10 @@ class AgentExecutorTests(unittest.TestCase):
output = (root / result.output_path).read_text(encoding="utf-8") output = (root / result.output_path).read_text(encoding="utf-8")
self.assertIn("TASK-001", output) self.assertIn("TASK-001", output)
self.assertIn("Plan carefully.", output) self.assertIn("Plan carefully.", output)
json_output = (root / ".nightshift" / "runs" / "test-run" / "tasks" / task.id / "plan.json")
self.assertTrue(json_output.exists())
self.assertIn('"stage_id": "plan"', json_output.read_text(encoding="utf-8"))
self.assertIn('"stdout"', json_output.read_text(encoding="utf-8"))
def test_review_output_parser_accepts_structured_status(self) -> None: def test_review_output_parser_accepts_structured_status(self) -> None:
status, reason, next_stage, context_update = parse_review_output( status, reason, next_stage, context_update = parse_review_output(
@@ -238,6 +242,23 @@ class AgentExecutorTests(unittest.TestCase):
self.assertIn("Agent: `planner`", output) self.assertIn("Agent: `planner`", output)
self.assertIn("## stderr", output) self.assertIn("## stderr", output)
def test_agent_invocation_json_preserves_raw_streams(self) -> None:
invocation = AgentInvocation(
agent_id="planner",
command="cmd",
prompt="prompt with ``` fences",
exit_code=0,
stdout="stdout with ``` fences",
stderr="stderr",
duration_seconds=0.1,
)
output = format_agent_invocation_json("plan", invocation)
self.assertIn('"stage_id": "plan"', output)
self.assertIn('stdout with ``` fences', output)
self.assertIn('prompt with ``` fences', output)
def test_strip_ansi_escape_sequences(self) -> None: def test_strip_ansi_escape_sequences(self) -> None:
self.assertEqual(strip_ansi_escape_sequences("\x1b[?25lthinking\x1b[0m"), "thinking") self.assertEqual(strip_ansi_escape_sequences("\x1b[?25lthinking\x1b[0m"), "thinking")


@@ -55,6 +55,40 @@ class ConfigTests(unittest.TestCase):
with self.assertRaisesRegex(ConfigError, "on_fail references unknown stage"): with self.assertRaisesRegex(ConfigError, "on_fail references unknown stage"):
load_config(config_path) load_config(config_path)
def test_on_pass_must_reference_existing_stage(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
init_project(root)
config_path = root / "nightshift.yaml"
config_path.write_text(
config_path.read_text(encoding="utf-8").replace(
"on_fail: plan", "on_pass: missing_stage", 1
),
encoding="utf-8",
)
with self.assertRaisesRegex(ConfigError, "on_pass references unknown stage"):
load_config(config_path)
def test_on_pass_loads(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
init_project(root)
config_path = root / "nightshift.yaml"
config_path.write_text(
config_path.read_text(encoding="utf-8").replace(
" output: plan.md",
" output: plan.md\n on_pass: summarize",
1,
),
encoding="utf-8",
)
config = load_config(config_path)
plan_stage = next(stage for stage in config.pipeline.stages if stage.id == "plan")
self.assertEqual(plan_stage.on_pass, "summarize")
def test_validate_requires_prompt_files(self) -> None: def test_validate_requires_prompt_files(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)


@@ -260,6 +260,47 @@ Sunlight did not belong here.
self.assertEqual(updates[0].path, "story/chapters/chapter-001/scene-001.md") self.assertEqual(updates[0].path, "story/chapters/chapter-001/scene-001.md")
self.assertEqual(updates[0].content, "Sunlight did not belong here.\n") self.assertEqual(updates[0].content, "Sunlight did not belong here.\n")
def test_file_updates_parse_delimiters_without_end_before_next_file(self) -> None:
updates = parse_file_updates(
"""Intro prose is ignored.
FILE: story/plot-state.md
---CONTENT---
# Plot State
- Scene two happened.
FILE: story/timeline.md
---CONTENT---
# Timeline
- SCENE-002 complete.
"""
)
self.assertEqual(len(updates), 2)
self.assertEqual(updates[0].path, "story/plot-state.md")
self.assertEqual(updates[0].content, "# Plot State\n\n- Scene two happened.\n")
self.assertEqual(updates[1].path, "story/timeline.md")
self.assertEqual(updates[1].content, "# Timeline\n\n- SCENE-002 complete.\n")
def test_file_updates_parse_mixed_delimiter_end_and_next_file(self) -> None:
updates = parse_file_updates(
"""FILE: story/plot-state.md
---CONTENT---
first
---END---
FILE: story/timeline.md
---CONTENT---
second
"""
)
self.assertEqual(len(updates), 2)
self.assertEqual(updates[0].content, "first\n")
self.assertEqual(updates[1].content, "second\n")
def test_file_updates_reject_duplicate_blocks(self) -> None: def test_file_updates_reject_duplicate_blocks(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)
@@ -334,6 +375,48 @@ new
self.assertEqual(patch.count("diff --git a/app.py b/app.py"), 1) self.assertEqual(patch.count("diff --git a/app.py b/app.py"), 1)
def test_file_updates_reject_character_pronoun_canon_changes(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
(root / "story").mkdir()
(root / "story" / "characters.md").write_text(
"""# Characters
## Cricket
### Pronouns / Reference
- Pronouns: she/her
- Narrative reference: Cricket; she/her
Scavenger.
""",
encoding="utf-8",
)
safety = SafetyConfig(
require_clean_worktree=False,
scoped_paths=("story",),
allowed_commands=(),
forbidden_commands=(),
)
updates = parse_file_updates(
"""FILE: story/characters.md
---CONTENT---
# Characters
## Cricket
### Pronouns / Reference
- Pronouns: they/them
- Narrative reference: Cricket; they/them
Scavenger.
---END---
"""
)
with self.assertRaisesRegex(PipelineError, "protected character pronoun canon changed"):
generate_patch_from_file_updates(updates, root, safety)
if __name__ == "__main__": if __name__ == "__main__":
unittest.main() unittest.main()
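Reviewer sketch: the new parser tests above pin down the tolerant behaviour, where a block ends at `---END---`, at the next `FILE:` header, or at EOF. The actual `parse_file_updates` change is not part of this excerpt. The following is a minimal illustration of that termination rule, assuming a hypothetical `parse_delimiter_blocks` that returns `(path, content)` tuples and does no path or duplicate validation.

```python
import re

FILE_HEADER = re.compile(r"^FILE:\s*(\S.*)$")


def parse_delimiter_blocks(text: str) -> list[tuple[str, str]]:
    """Parse FILE/---CONTENT---/---END--- blocks; a new FILE: header or EOF ends an open block."""
    blocks: list[tuple[str, str]] = []
    path = None
    body = None  # stays None until ---CONTENT--- is seen for the current path

    def flush() -> None:
        nonlocal path, body
        if path is not None and body is not None:
            content = "\n".join(body).strip("\n")
            blocks.append((path, content + "\n" if content else ""))
        path, body = None, None

    for line in text.splitlines():
        header = FILE_HEADER.match(line)
        if header:
            flush()  # implicit end of the previous block
            path = header.group(1).strip()
        elif path is not None and body is None and line.strip() == "---CONTENT---":
            body = []
        elif body is not None and line.strip() == "---END---":
            flush()  # explicit end
        elif body is not None:
            body.append(line)
    flush()  # EOF ends the last block
    return blocks


unterminated = parse_delimiter_blocks(
    "Intro prose is ignored.\n"
    "FILE: story/plot-state.md\n---CONTENT---\n# Plot State\n\n- Scene two happened.\n\n"
    "FILE: story/timeline.md\n---CONTENT---\n# Timeline\n\n- SCENE-002 complete.\n"
)
mixed = parse_delimiter_blocks(
    "FILE: story/plot-state.md\n---CONTENT---\nfirst\n---END---\n"
    "FILE: story/timeline.md\n---CONTENT---\nsecond\n"
)
```

One trade-off of this rule: any content line that itself starts with `FILE:` is read as a new header, so the tolerant mode is only safe for files that never contain such lines.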


@@ -105,6 +105,30 @@ class PipelineRunnerTests(unittest.TestCase):
) )
self.assertIn("Modified Files", (root / ".nightshift" / "runs" / "test-run" / "run-summary.md").read_text(encoding="utf-8")) self.assertIn("Modified Files", (root / ".nightshift" / "runs" / "test-run" / "run-summary.md").read_text(encoding="utf-8"))
def test_on_pass_jumps_to_configured_stage(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
stages = (
StageConfig(id="first", type="agent", agent="planner", output="first.md", on_pass="third"),
StageConfig(
id="second",
type="command",
commands=('python -c "print(\'should not run\')"',),
output="second-output.txt",
),
StageConfig(id="third", type="summarize", output="final-notes.md"),
)
config = make_config(root, stages)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
task_dir = root / ".nightshift" / "runs" / "test-run" / "tasks" / "TASK-001"
self.assertEqual(result.status, "complete")
self.assertEqual([item.stage_id for item in result.stage_results], ["first", "third"])
self.assertFalse((task_dir / "second-output.txt").exists())
def test_task_preflight_fails_when_task_specific_test_file_is_missing(self) -> None: def test_task_preflight_fails_when_task_specific_test_file_is_missing(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)
@@ -153,6 +177,46 @@ class PipelineRunnerTests(unittest.TestCase):
self.assertIn("Retry limit reached", result.reason) self.assertIn("Retry limit reached", result.reason)
self.assertEqual([item.stage_id for item in result.stage_results], ["implement", "review", "implement", "review", "implement", "review"]) self.assertEqual([item.stage_id for item in result.stage_results], ["implement", "review", "implement", "review", "implement", "review"])
def test_failing_review_self_next_stage_routes_to_on_fail(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
config = make_config(root, (), max_retries=1)
config.agents["reviewer"] = AgentConfig(
id="reviewer",
backend="command",
command=(
"python -c \"print('status: fail\\nreason: needs draft repair\\n"
"next_stage: review\\ncontext_update: add concrete details')\""
),
system_prompt=Path("reviewer.md"),
)
config = replace(
config,
pipeline=PipelineConfig(
max_task_retries=1,
stages=(
StageConfig(id="implement", type="agent", agent="planner", output="implementation-log.md"),
StageConfig(
id="review",
type="agent_review",
agent="reviewer",
on_fail="implement",
output="review.md",
),
),
),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
task = parse_tasks(TASK_MD)[0]
result = runner.run_task(task)
self.assertEqual(result.retry_count, 1)
self.assertEqual([item.stage_id for item in result.stage_results], ["implement", "review", "implement", "review"])
log = (root / ".nightshift" / "runs" / "test-run" / "run.log").read_text(encoding="utf-8")
self.assertIn("next_stage=implement", log)
def test_malformed_review_gets_strict_retry_without_redrafting(self) -> None: def test_malformed_review_gets_strict_retry_without_redrafting(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)
@@ -544,6 +608,34 @@ Acceptance Criteria:
self.assertIn("response = self.client.get('/board/general')", note) self.assertIn("response = self.client.get('/board/general')", note)
self.assertIn("self.assertEqual(response.status_code, 200)", note) self.assertIn("self.assertEqual(response.status_code, 200)", note)
def test_state_update_retry_note_guides_deletion_heavy_repairs(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
config = make_config(root, ())
runner = PipelineRunner(config, artifacts)
output_path = artifacts.write_stage_output(
"TASK-001",
"state-validation.md",
"# Patch Validation\n\nStatus: fail\nReason: Patch validation failed: deletion-heavy patch exceeds max_delete_ratio 0.35.\n",
)
note = runner._format_retry_note(
1,
StageConfig(id="validate_state", type="patch_validator", on_fail="update_state"),
StageResult(
stage_id="validate_state",
status="fail",
reason="Patch validation failed: deletion-heavy patch exceeds max_delete_ratio 0.35.",
output_path=str(output_path.relative_to(root)),
),
"update_state",
)
self.assertIn("preserve existing durable state text", note)
self.assertIn("minimal additive edits", note)
def test_code_writer_normalizer_and_validator_pipeline(self) -> None: def test_code_writer_normalizer_and_validator_pipeline(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)
@@ -892,6 +984,60 @@ Acceptance Criteria:
self.assertIn("... <truncated>", retry_prompt) self.assertIn("... <truncated>", retry_prompt)
self.assertLess(len(retry_prompt), 9000) self.assertLess(len(retry_prompt), 9000)
def test_state_file_writer_invalid_output_retry_uses_delimiter_format(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
story = root / "story"
story.mkdir()
(story / "plot-state.md").write_text("old\n", encoding="utf-8")
(root / "fake_writer.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"if 'Previous file_writer output was invalid' not in prompt:",
" print('lookup failed')",
"else:",
" (open('retry-prompt.txt', 'w', encoding='utf-8').write(prompt))",
" print('FILE: story/plot-state.md')",
" print('---CONTENT---')",
" print('old')",
" print('new')",
" print('---END---')",
]
),
encoding="utf-8",
)
stages = (
StageConfig(
id="update_state",
type="file_writer",
agent="writer",
allowed_paths=(
"story/plot-state.md",
"story/characters.md",
"story/timeline.md",
"story/unresolved-threads.md",
),
),
)
config = make_config(root, stages)
config.agents["writer"] = AgentConfig(
id="writer",
backend="command",
command="python fake_writer.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
retry_prompt = (root / "retry-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("Use delimiter file blocks only", retry_prompt)
self.assertNotIn("Use complete fenced file blocks", retry_prompt)
def test_file_writer_retry_compacts_large_previous_outputs(self) -> None: def test_file_writer_retry_compacts_large_previous_outputs(self) -> None:
outputs = { outputs = {
"scene-draft.patch": "a" * 5000, "scene-draft.patch": "a" * 5000,
@@ -904,6 +1050,162 @@ Acceptance Criteria:
self.assertLess(len(compacted["scene-draft.patch"]), 180) self.assertLess(len(compacted["scene-draft.patch"]), 180)
self.assertEqual(compacted["draft-validation.md"], "Patch validation failed") self.assertEqual(compacted["draft-validation.md"], "Patch validation failed")
def test_file_writer_first_attempt_preserves_large_previous_outputs(self) -> None:
outputs = {"plan": "a" * 5000}
compacted = _file_writer_previous_outputs(outputs, retry_count=0, max_chars=100)
self.assertEqual(compacted["plan"], "a" * 5000)
def test_file_writer_previous_outputs_strip_wrapped_agent_prompts(self) -> None:
output = "\n".join(
[
"# Agent Output: plan",
"",
"## stdout",
"",
"```text",
"useful plan",
"```",
"",
"## stderr",
"",
"```text",
"```",
"",
"## Prompt",
"",
"```markdown",
"huge prompt marker",
"```",
]
)
compacted = _file_writer_previous_outputs({"plan": output}, retry_count=0)
self.assertEqual(compacted["plan"], "useful plan")
self.assertNotIn("huge prompt marker", compacted["plan"])
def test_state_update_file_writer_gets_focused_context_and_current_files(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
(root / "story").mkdir()
(root / "story" / "plot-state.md").write_text("# Plot State\n\n- Before\n", encoding="utf-8")
(root / "fake_state_writer.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"open('state-prompt.txt', 'w', encoding='utf-8').write(prompt)",
"if 'current_allowed_files' in prompt and 'huge-plan-marker' not in prompt:",
" print('FILE: story/plot-state.md')",
" print('---CONTENT---')",
" print('# Plot State')",
" print()",
" print('- Before')",
" print('- After')",
" print('---END---')",
"else:",
" print('')",
]
),
encoding="utf-8",
)
config = make_config(
root,
(
StageConfig(id="plan", type="agent", agent="planner", output="plan.md"),
StageConfig(
id="update_state",
type="file_writer",
agent="state_updater",
allowed_paths=("story/plot-state.md",),
),
),
)
config.agents["planner"] = AgentConfig(
id="planner",
backend="command",
command="python -c \"print('huge-plan-marker' * 1000)\"",
system_prompt=Path("planner.md"),
)
config.agents["state_updater"] = AgentConfig(
id="state_updater",
backend="command",
command="python fake_state_writer.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
prompt = (root / "state-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("current_allowed_files", prompt)
self.assertIn("# Plot State", prompt)
self.assertNotIn("huge-plan-marker", prompt)
def test_scene_editor_file_writer_gets_current_scene_file(self) -> None:
task_md = """# Tasks
- [ ] SCENE-001: Edit scene
Description:
Repair the scene.
Acceptance Criteria:
- Writes:
- `story/chapters/chapter-001/scene-001.md`
"""
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
(root / "tasks.md").write_text(task_md, encoding="utf-8")
scene_path = root / "story" / "chapters" / "chapter-001" / "scene-001.md"
scene_path.parent.mkdir(parents=True)
scene_path.write_text("Proxy walked home.\n", encoding="utf-8")
(root / "fake_editor.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"open('editor-prompt.txt', 'w', encoding='utf-8').write(prompt)",
"if 'current_scene_file' in prompt and 'Proxy walked home.' in prompt:",
" print('FILE: story/chapters/chapter-001/scene-001.md')",
" print('---CONTENT---')",
" print('Proxy walked home corrected.')",
" print('---END---')",
"else:",
" print('')",
]
),
encoding="utf-8",
)
stages = (
StageConfig(
id="edit_scene",
type="file_writer",
agent="editor",
allowed_paths=("story/chapters",),
),
)
config = make_config(root, stages)
config.agents["editor"] = AgentConfig(
id="editor",
backend="command",
command="python fake_editor.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(task_md)[0])
prompt = (root / "editor-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("current_scene_file", prompt)
self.assertIn("Proxy walked home.", prompt)
def test_patch_validator_rejects_unsafe_patch(self) -> None: def test_patch_validator_rejects_unsafe_patch(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
root = Path(directory) root = Path(directory)