Update story mode for invariants

K. Hodges 2026-05-22 17:13:27 -07:00
parent 4ad0c310d1
commit c4d88fced5
17 changed files with 1112 additions and 68 deletions


@ -0,0 +1,119 @@
# Iteration 1: SCENE-002 Update State Failure
Date: 2026-05-22
## Run Reviewed
- Sandbox: `integ_runs/20260522T214944.385761Z`
- Run: `.nightshift/runs/20260522T215005.188534Z`
- Task: `SCENE-002`
- Final status: failed
- Failed stage: `update_state`
## What Happened
The scene workflow mostly succeeded:
- `draft_scene` wrote the scene.
- `continuity_review` correctly failed the first draft for pronoun drift.
- `edit_scene` repaired the pronoun issue.
- `continuity_review` passed after edit.
- `style_review` passed.
The remaining failure happened in `update_state`.
NightShift reported:
```text
File writer error: no file blocks found. Expected FILE: path with ---CONTENT---/---END--- or fenced blocks like ```file:path.py.
```
The model output did contain visible `FILE:` blocks, but it omitted the required `---END---` delimiter. It emitted:
```text
FILE: story/plot-state.md
---CONTENT---
...
FILE: story/characters.md
---CONTENT---
...
```
The current parser requires `---END---`, so it rejected all of the blocks.
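The rejection can be reproduced in isolation. The sketch below reuses the strict delimiter pattern that `nightshift/patches.py` still carries as a fallback; the helper name `strict_block_paths` and the sample text are illustrative assumptions, not project code.

```python
import re

# Strict delimiter pattern: a block only counts if ---END--- closes it.
STRICT_BLOCK = re.compile(
    r"(?ms)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n(?P<content>.*?)\n---END---\s*$"
)


def strict_block_paths(text: str) -> list[str]:
    """Return the path of every block the strict pattern accepts (hypothetical helper)."""
    return [match.group("path").strip() for match in STRICT_BLOCK.finditer(text)]


# Illustrative stand-in for the model output; "..." marks elided content.
model_output = (
    "FILE: story/plot-state.md\n"
    "---CONTENT---\n"
    "...\n"
    "FILE: story/characters.md\n"
    "---CONTENT---\n"
    "...\n"
)

# No ---END--- appears anywhere, so the strict pattern matches nothing.
print(strict_block_paths(model_output))  # prints []
```

Note that appending a single trailing `---END---` does not recover both blocks under this pattern: the lazy content group runs to the first terminator it finds, so the second `FILE:` header is absorbed into the first block's content.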
## Additional Risk Found
The rejected state update also tried to rewrite character canon in unsafe ways:
- It changed BLOODMONEY's pronoun reference to `he/him`.
- It changed Cricket's pronoun reference to `they/them`.
- It compressed/replaced larger parts of `story/characters.md`.
That means simply accepting unterminated blocks is not enough. The parser can be more tolerant, but the state updater still needs stronger constraints so durable canon does not drift.
## Suggested Fixes
Short-term fixes for this iteration:
1. Make `parse_file_updates` tolerate delimiter blocks that omit `---END---` when a new `FILE:` block or EOF clearly terminates the previous block.
2. Keep strict path validation and duplicate-file validation unchanged.
3. Strengthen the state-updater prompt:
- never edit `Pronouns / Reference` sections
- preserve existing character profiles
- prefer updating `plot-state.md`, `timeline.md`, and `unresolved-threads.md`
- edit `characters.md` only for small additive current-status facts
4. Add regression tests for unterminated delimiter parsing.
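As a rough illustration of fix 1, a tolerant splitter can treat `---END---`, the next `FILE:` header, or EOF as the end of a block. This is a simplified sketch that returns plain tuples; the name `parse_tolerant` is an assumption and the real `parse_file_updates` may differ in detail.

```python
import re

HEADER = re.compile(r"(?m)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n")
END = re.compile(r"(?m)^---END---\s*$")


def parse_tolerant(text: str) -> list[tuple[str, str]]:
    """Split delimiter blocks; each block ends at ---END---, the next header, or EOF."""
    headers = list(HEADER.finditer(text))
    updates: list[tuple[str, str]] = []
    for i, header in enumerate(headers):
        # The next FILE: header (or EOF) bounds the block when ---END--- is missing.
        stop = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[header.end():stop]
        end = END.search(body)
        if end:
            body = body[: end.start()]
        path = header.group("path").strip().strip("`")
        if path:
            updates.append((path, body.rstrip("\r\n") + "\n"))
    return updates


mixed = (
    "FILE: a.md\n---CONTENT---\none\n"
    "FILE: b.md\n---CONTENT---\ntwo\n---END---\n"
)
print(parse_tolerant(mixed))
```

One trade-off of this approach: prose that itself contains a line starting with `FILE:` followed by `---CONTENT---` would be split into a new block, which is one reason to keep path validation strict.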
Longer-term follow-up:
- Add deterministic writing-state validation that rejects changes to protected canon sections such as `Pronouns / Reference`.
- Move character canon into structured data so pronoun constraints can be validated directly.
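A deterministic check on protected sections could look roughly like the following. It assumes `story/characters.md` uses `## <Character>` headings with a `### Pronouns / Reference` subsection; the function names and the sample character are hypothetical.

```python
def pronoun_sections(text: str) -> dict[str, str]:
    """Map each `## Character` heading to the body of its Pronouns / Reference section."""
    sections: dict[str, str] = {}
    character = None
    capturing = False
    for line in text.splitlines():
        if line.startswith("## "):
            character = line[3:].strip()
            capturing = False
        elif line.startswith("### "):
            # Any other subsection heading ends the protected section.
            capturing = character is not None and line.strip() == "### Pronouns / Reference"
            if capturing:
                sections[character] = ""
        elif capturing and line.strip():
            sections[character] += line.strip() + "\n"
    return sections


def changed_pronoun_canon(old_text: str, new_text: str) -> list[str]:
    """Return characters whose protected section was edited or removed."""
    old, new = pronoun_sections(old_text), pronoun_sections(new_text)
    return [name for name, body in old.items() if new.get(name) != body]


# Hypothetical character file used only for illustration.
old = (
    "## Example Character\n"
    "### Pronouns / Reference\n"
    "- she/her\n"
    "### Current Status\n"
    "- resting\n"
)
print(changed_pronoun_canon(old, old.replace("she/her", "they/them")))
```

An update that only touches other subsections returns an empty list, so small additive status edits remain possible.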
## Planned Changes
- Update delimiter block parsing in `nightshift/patches.py`.
- Add parser tests in `tests/test_patches.py`.
- Tighten `state-updater.md` in the tutorial novel template.
- Run focused parser tests and the full suite.
## Changes Made
- `parse_file_updates` now accepts delimiter-style file blocks that omit `---END---` when the next `FILE:` header or EOF clearly terminates the block.
- Added regression coverage for:
  - unterminated delimiter blocks before another `FILE:`
  - mixed terminated and unterminated delimiter blocks
- Strengthened the tutorial novel state updater prompt to protect character canon:
  - never change `Pronouns / Reference`
  - never change canonical pronouns, narrative reference, identity, or core wound
  - prefer state/timeline/thread files over `characters.md`
  - edit `characters.md` only for small additive current-status facts or new named characters
- Added deterministic protection in file-block patch generation:
  - changes to existing `Pronouns / Reference` sections in `story/characters.md` are rejected before a patch is generated
- Added regression coverage for rejecting protected pronoun canon changes.
## Verification
Focused tests:
```powershell
python -m pytest tests/test_patches.py tests/test_pipeline.py -q
```
Result:
```text
56 passed
```
Full suite:
```powershell
python -m pytest -q
```
Result:
```text
199 passed, 4 subtests passed
```


@ -99,33 +99,17 @@ Examples:
This keeps the initial useful output visible even when strict rerun output is worse.
## P1: Store Raw Agent Invocations As JSON
## P1: Classify Writing Review Failures For Repair Routing
The human-readable agent artifact wraps stdout, stderr, and prompts in markdown fences. Nested markdown fences from model output can confuse downstream parsing.
The tutorial novel now has a short-term editor stage for review failures, but review failures should eventually be classified before routing.
Write a machine-readable artifact alongside the markdown artifact:
Candidate classes:
```text
<stage>-agent-output.json
```
- `local_edit`: pronoun drift, small continuity issue, missing beat, light style correction
- `redraft`: wrong premise, broken scene structure, impossible chronology, severe acceptance mismatch
- `escalate`: ambiguous canon conflict or user preference needed
Suggested fields:
```json
{
"agent_id": "drafter",
"stage_id": "draft_scene",
"command": "POST http://localhost:11434/api/generate",
"exit_code": 0,
"timed_out": false,
"duration_seconds": 12.3,
"stdout": "...",
"stderr": "...",
"prompt": "..."
}
```
Pipeline parsing should read raw JSON fields instead of recovering stdout from markdown.
Route `local_edit` to the editor, `redraft` to the drafter, and `escalate` to a clear user-facing failure. Keep original draft and edited draft artifacts side by side for comparison.
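The routing rule could be sketched as a small lookup. The class labels come from the list above; the stage ids `edit_scene` and `draft_scene` are the tutorial novel's, while the function name and the fallback for unknown labels are assumptions.

```python
from __future__ import annotations

from enum import Enum


class FailureClass(Enum):
    LOCAL_EDIT = "local_edit"
    REDRAFT = "redraft"
    ESCALATE = "escalate"


# None means: stop and surface a clear user-facing failure.
ROUTES: dict[FailureClass, str | None] = {
    FailureClass.LOCAL_EDIT: "edit_scene",
    FailureClass.REDRAFT: "draft_scene",
    FailureClass.ESCALATE: None,
}


def route_review_failure(label: str) -> str | None:
    """Return the repair stage id for a classified failure, or None to escalate."""
    try:
        failure_class = FailureClass(label.strip().lower())
    except ValueError:
        # Assumption: unknown labels escalate instead of being guessed at.
        failure_class = FailureClass.ESCALATE
    return ROUTES[failure_class]


print(route_review_failure("local_edit"))
```

Keeping the table in one place would make it easy to test the routing without invoking a model.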
## P1: Add A Writing-Mode Validator
@ -140,6 +124,31 @@ Add deterministic checks for prose workflows:
This should run before model review stages.
## P1: Use Structured State Events For Writing Workflows
Replace model-written full state-file rewrites with compact structured state events, then let NightShift deterministically merge them into durable files such as:
- `story/plot-state.md`
- `story/characters.md`
- `story/timeline.md`
- `story/unresolved-threads.md`
Candidate state updater output:
```yaml
events:
- file: story/plot-state.md
section: Completed Scenes
add:
- SCENE-001 complete; Saint and Miette introduced.
- file: story/unresolved-threads.md
section: Open Threads
add:
- Saint depends emotionally on Miette and needs compute tokens to keep her present.
```
NightShift would validate allowed files/sections, reject unknown targets, and apply append/update operations deterministically. This avoids asking a writing model to rewrite entire durable state files after every scene.
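A minimal sketch of the deterministic merge, assuming events have already been parsed into dictionaries and that sections are `## ` headings. The allow-list contents, the function name, and the dedupe rule are all assumptions for illustration.

```python
# Assumed allow-list of (file, section) targets; the real set is not defined here.
ALLOWED_TARGETS = {
    ("story/plot-state.md", "Completed Scenes"),
    ("story/unresolved-threads.md", "Open Threads"),
}


def apply_events(files: dict[str, str], events: list[dict]) -> dict[str, str]:
    """Append each event's `add` items as bullets under its `## section` heading."""
    # Validate every target first so a bad event leaves all files untouched.
    for event in events:
        if (event["file"], event["section"]) not in ALLOWED_TARGETS:
            raise ValueError(f"unknown target: {event['file']} / {event['section']}")
    merged = dict(files)
    for event in events:
        lines = merged.get(event["file"], "").splitlines()
        heading = f"## {event['section']}"
        if heading not in lines:
            raise ValueError(f"missing section: {heading}")
        # Find the end of the section: the next `## ` heading or EOF.
        insert_at = lines.index(heading) + 1
        while insert_at < len(lines) and not lines[insert_at].startswith("## "):
            insert_at += 1
        # Step back over trailing blank lines so bullets stay contiguous.
        while not lines[insert_at - 1].strip():
            insert_at -= 1
        new_items = [f"- {item}" for item in event["add"] if f"- {item}" not in lines]
        lines[insert_at:insert_at] = new_items
        merged[event["file"]] = "\n".join(lines) + "\n"
    return merged


state = {"story/plot-state.md": "## Completed Scenes\n- SCENE-000 setup.\n"}
event = {"file": "story/plot-state.md", "section": "Completed Scenes", "add": ["SCENE-001 complete."]}
print(apply_events(state, [event])["story/plot-state.md"])
```

Because existing bullets are skipped, applying the same event twice leaves the file unchanged, which keeps retries safe.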
## P2: Add A Test Analyzer Agent For TDD
Defer until generated tests are stable.

docs/writer-and-coder.md (new file, 136 lines)

@ -0,0 +1,136 @@
# Writer And Coder Compatibility Audit
Date: 2026-05-22
## Summary
The recent writer workflow changes do not intentionally alter the code-generation templates or their stage routing.
During this audit, one possible shared-pipeline regression was found and fixed: generic `file_writer` stages were compacting large previous outputs on the first attempt. Since coding templates use `file_writer` for implementation, that could have reduced coding context before the implementer saw it. The behavior now preserves full first-attempt previous outputs while still stripping wrapped agent prompts from prior agent artifacts.
After that correction, the automated test suite passes.
## Writer Changes Reviewed
- Tutorial novel added a scene editor repair path:
  - failed continuity/style review routes to `edit_scene`
  - edited scene is normalized, validated, applied, then routed back to review
  - passing `style_review` skips editor and routes to `update_state`
- Tutorial novel prompts now include stricter pronoun and state-update guidance.
- State update file-writer stages receive focused current state context.
- Scene editor file-writer stages receive `current_scene_file`.
- Agent invocations now write a sibling JSON artifact for reliable stdout/stderr extraction.
- Pipeline config now supports optional `on_pass` routing.
## Coding Impact Findings
### Finding 1: Coding templates were not directly changed
No non-novel project template files changed in the current diff:
- `basic`
- `real-simple`
- `real-long-running`
- `tutorial-deaddrop`
- `tutorial-imageboard`
- `tutorial-lisp`
The new `editor` agent and review repair routing are only configured in `tutorial-novel/nightshift.yaml`.
### Finding 2: `on_pass` is inert for existing coding configs
`on_pass` defaults to `None`, so existing coding templates keep their prior linear pass behavior unless they explicitly opt in.
Passing review stages still ignore model-provided `next_stage` values. This preserves the existing safety behavior where reviewers cannot jump around the pipeline on a pass unless the config has an explicit `on_pass`.
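The pass-routing rule described here can be condensed into one function. This is a simplified sketch of the behavior, not the runner's code; the function name and flat arguments are assumptions.

```python
from __future__ import annotations

REVIEW_TYPES = {"agent_review", "review"}


def pass_target(stage_type: str, on_pass: str | None, model_next_stage: str | None) -> str | None:
    """Return the stage to jump to after a pass, or None to advance linearly."""
    if stage_type in REVIEW_TYPES:
        # Reviewers cannot redirect the pipeline on a pass; only config can.
        return on_pass
    return model_next_stage or on_pass


# A review stage with no on_pass keeps the old linear behavior,
# even if the model asked for a different next stage.
print(pass_target("agent_review", None, "update_state"))  # prints None
```

With `on_pass` unset, every existing config resolves to `None` for review stages, which is why the option is inert until a template opts in.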
### Finding 3: Code writer stages still use the same direct patch path
`code_writer` stages still:
- call the configured agent
- parse stdout as a unified diff
- support lookup-request reruns
- write implementation summaries
- feed patch normalizer/validator/apply stages as before
The JSON agent artifact change only changes how NightShift reads agent stdout internally; it does not change the prompt contract or patch contract.
### Finding 4: File-writer implementers had one possible context regression; fixed
Potential issue found:
- `_file_writer_previous_outputs` had started compacting large previous outputs even on first attempt.
- Coding templates such as DeadDrop use `file_writer` for implementation.
- That could have shortened planner/context output before the implementer saw it.
Fix applied:
- First-attempt `file_writer` stages now preserve full previous outputs.
- Retry attempts still compact large previous outputs to control prompt bloat.
- Wrapped agent artifacts still strip down to stdout so old prompts do not pollute later prompts.
Regression coverage added:
- `test_file_writer_first_attempt_preserves_large_previous_outputs`
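The intended split between first attempts and retries can be sketched as follows. The `truncate` helper is a hypothetical stand-in for the project's compaction function, whose exact behavior is not shown in this audit, and the sketch omits the stripping of wrapped agent artifacts.

```python
def truncate(text: str, max_chars: int) -> str:
    """Hypothetical compaction: keep the head of the text and mark the cut."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "\n[...truncated...]"


def writer_previous_outputs(
    previous: dict[str, str], retry_count: int, max_chars: int = 1200
) -> dict[str, str]:
    """Pass outputs through untouched on the first attempt; compact them on retries."""
    if retry_count <= 0:
        return dict(previous)
    return {name: truncate(text, max_chars) for name, text in previous.items()}


plan = "step\n" * 1000  # 5000 characters of planner output
first = writer_previous_outputs({"plan": plan}, retry_count=0)
retry = writer_previous_outputs({"plan": plan}, retry_count=1)
print(len(first["plan"]), len(retry["plan"]))
```

A regression test in this shape only needs to assert that the first-attempt value equals the input and that the retry value is shorter.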
### Finding 5: State/editor special context branches are narrowly gated
The new context enrichment branches are guarded by stage shape:
- state update branch only applies to `file_writer` stages whose allowed paths are state files:
  - `story/plot-state.md`
  - `story/characters.md`
  - `story/timeline.md`
  - `story/unresolved-threads.md`
- scene editor branch only applies to `file_writer` stages whose id starts with `edit_` and whose allowed paths include `story/chapters`
Normal coding implementer stages such as `implement`, `implement_junior`, and `implement_senior` do not match either branch.
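The two gates can be expressed as predicates over stage type, id, and allowed paths. This sketch takes plain arguments instead of a `StageConfig`, and the `src` path used for the coding stage is a hypothetical example.

```python
STATE_PATHS = {
    "story/plot-state.md",
    "story/characters.md",
    "story/timeline.md",
    "story/unresolved-threads.md",
}


def _normalize(paths: tuple[str, ...]) -> set[str]:
    """Use forward slashes and drop trailing slashes so comparisons are stable."""
    return {path.replace("\\", "/").rstrip("/") for path in paths}


def is_state_update_stage(stage_type: str, allowed_paths: tuple[str, ...]) -> bool:
    allowed = _normalize(allowed_paths)
    # An empty allow-list must not count as a subset match.
    return stage_type == "file_writer" and bool(allowed) and allowed <= STATE_PATHS


def is_scene_edit_stage(stage_type: str, stage_id: str, allowed_paths: tuple[str, ...]) -> bool:
    return (
        stage_type == "file_writer"
        and stage_id.startswith("edit_")
        and "story/chapters" in _normalize(allowed_paths)
    )


# A coding implementer stage (hypothetical allowed path) matches neither gate.
print(is_state_update_stage("file_writer", ("src",)))
print(is_scene_edit_stage("file_writer", "implement", ("src",)))
```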
## Template Validation Notes
Validated successfully:
- `basic`
- `tutorial-deaddrop`
- `tutorial-novel`
Validation still fails for these templates because `debugger` is configured but `.nightshift/agents/debugger.md` is missing:
- `real-simple`
- `real-long-running`
- `tutorial-imageboard`
- `tutorial-lisp`
Those failures are not caused by the writer changes; there is no current diff in those template directories.
## Verification
Focused tests:
```powershell
python -m pytest tests/test_pipeline.py tests/test_config.py tests/test_agents.py -q
```
Result:
```text
71 passed, 4 subtests passed
```
Full suite:
```powershell
python -m pytest -q
```
Result:
```text
196 passed, 4 subtests passed
```
## Conclusion
After the first-attempt file-writer context fix, I do not see evidence that the writer workflow changes degrade code generation. The shared changes are either opt-in (`on_pass`), artifact-reading improvements (JSON stdout), or narrowly gated to novel state/editor stages.
Remaining non-writer issue: several coding-oriented templates still reference a missing `debugger.md` prompt. That should be handled separately from this writer/coder compatibility pass.


@ -3,6 +3,7 @@
from __future__ import annotations
from dataclasses import dataclass
from dataclasses import asdict
import json
import os
from pathlib import Path
@ -119,6 +120,11 @@ class AgentExecutor:
output_filename = stage.output or f"{stage.id}.md"
output = format_agent_invocation(stage.id, invocation)
output_path = self.artifacts.write_stage_output(task.id, output_filename, output)
json_output_path = self.artifacts.write_stage_output(
task.id,
_agent_invocation_json_filename(output_filename),
format_agent_invocation_json(stage.id, invocation),
)
self.logger.event(
"artifact.write",
"Wrote agent artifact",
@ -126,6 +132,7 @@ class AgentExecutor:
task_id=task.id,
agent_id=agent.id,
artifact_path=output_path.relative_to(self.project_root),
json_artifact_path=json_output_path.relative_to(self.project_root),
)
if invocation.timed_out:
@ -520,13 +527,30 @@ def _file_writer_block_contract(stage: StageConfig) -> str:
return "\n".join(
[
"Use exactly this delimiter format for the scene file:",
"FILE: story/chapters/chapter-001/scene-001.md",
"FILE: <the exact story/chapters path listed under Writes in the current task>",
"---CONTENT---",
"<complete scene prose>",
"---END---",
"Do not use markdown code fences for prose scene output.",
]
)
state_paths = {
"story/plot-state.md",
"story/characters.md",
"story/timeline.md",
"story/unresolved-threads.md",
}
if set(normalized).issubset(state_paths) and normalized:
return "\n".join(
[
"Use exactly this delimiter format for each state file you update:",
"FILE: story/plot-state.md",
"---CONTENT---",
"<complete updated state file>",
"---END---",
"Do not use markdown code fences for state update output.",
]
)
return "\n".join(
[
"Use one fenced block per file with this exact opening form:",
@ -622,3 +646,18 @@ def format_agent_invocation(stage_id: str, invocation: AgentInvocation) -> str:
"",
]
)
def format_agent_invocation_json(stage_id: str, invocation: AgentInvocation) -> str:
    data = {
        **asdict(invocation),
        "stage_id": stage_id,
    }
    return json.dumps(data, ensure_ascii=False, indent=2) + "\n"


def _agent_invocation_json_filename(output_filename: str) -> str:
    path = Path(output_filename)
    if path.suffix:
        return path.with_suffix(".json").as_posix()
    return path.with_name(path.name + ".json").as_posix()
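For reference, the naming rule above behaves as follows on a few sample filenames. The standalone function `json_sibling_name` is a copy made for illustration, and the sample names are assumptions.

```python
from pathlib import Path


def json_sibling_name(output_filename: str) -> str:
    """Swap the last suffix for .json, or append .json when there is no suffix."""
    path = Path(output_filename)
    if path.suffix:
        return path.with_suffix(".json").as_posix()
    return path.with_name(path.name + ".json").as_posix()


for name in ("draft_scene.md", "reviews/style.v2.md", "update_state", "report.json"):
    print(name, "->", json_sibling_name(name))
```

Only the final suffix is replaced, so `style.v2.md` keeps its `.v2` part. A stage whose `output` already ends in `.json` maps to the same filename, so the JSON artifact would appear to share a path with the markdown artifact; that case may deserve a guard.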


@ -61,6 +61,7 @@ class StageConfig:
commands: tuple[str, ...] = ()
output: str | None = None
on_fail: str | None = None
on_pass: str | None = None
shell: bool = True
timeout_seconds: int | None = None
working_dir: Path | None = None
@ -392,6 +393,7 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
commands=commands,
output=_optional_string(stage_raw.get("output"), f"{stage_context}.output"),
on_fail=_optional_string(stage_raw.get("on_fail"), f"{stage_context}.on_fail"),
on_pass=_optional_string(stage_raw.get("on_pass"), f"{stage_context}.on_pass"),
shell=_optional_bool(stage_raw.get("shell", True), f"{stage_context}.shell"),
timeout_seconds=timeout_seconds,
working_dir=Path(working_dir_raw) if working_dir_raw else None,
@ -416,6 +418,10 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
raise ConfigError(
f"Config error: stage '{stage.id}' on_fail references unknown stage '{stage.on_fail}'."
)
if stage.on_pass and stage.on_pass not in stage_ids:
raise ConfigError(
f"Config error: stage '{stage.id}' on_pass references unknown stage '{stage.on_pass}'."
)
return NightShiftConfig(
path=config_path,


@ -112,10 +112,26 @@ def parse_file_updates(text: str) -> tuple[FileUpdate, ...]:
def _parse_delimited_file_updates(text: str) -> list[FileUpdate]:
    updates: list[FileUpdate] = []
    header_pattern = re.compile(r"(?m)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n")
    matches = list(header_pattern.finditer(text))
    for index, match in enumerate(matches):
        path = match.group("path").strip().strip("`")
        content_start = match.end()
        next_file_start = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        raw_content = text[content_start:next_file_start]
        end_match = re.search(r"(?m)^---END---\s*$", raw_content)
        if end_match:
            raw_content = raw_content[: end_match.start()]
        content = raw_content.rstrip("\r\n") + "\n"
        if path:
            updates.append(FileUpdate(path=path, content=content))
    if updates:
        return updates
pattern = re.compile(
r"(?ms)^FILE:\s*(?P<path>[^\n]+)\n---CONTENT---\n(?P<content>.*?)\n---END---\s*$"
)
updates: list[FileUpdate] = []
for match in pattern.finditer(text):
path = match.group("path").strip().strip("`")
content = match.group("content")
@ -146,6 +162,7 @@ def generate_patch_from_file_updates(
_validate_allowed_patch_path(normalized_path, root, allowed_paths)
file_path = resolve_inside_root(root, normalized_path, f"file update '{normalized_path}'")
old_text = file_path.read_text(encoding="utf-8", errors="replace") if file_path.exists() else ""
_validate_protected_character_canon(normalized_path, old_text, update.content)
if old_text == update.content:
continue
patch_parts.extend(_diff_for_file(normalized_path, old_text, update.content, file_path.exists()))
@ -208,6 +225,51 @@ def _validate_allowed_patch_path(path_text: str, root: Path, allowed_paths: tupl
)
def _validate_protected_character_canon(path_text: str, old_text: str, new_text: str) -> None:
    if path_text.replace("\\", "/") != "story/characters.md" or not old_text:
        return
    old_sections = _pronoun_reference_sections(old_text)
    if not old_sections:
        return
    new_sections = _pronoun_reference_sections(new_text)
    changed = [
        character
        for character, old_section in old_sections.items()
        if new_sections.get(character) != old_section
    ]
    if changed:
        names = ", ".join(changed)
        raise PipelineError(
            "File writer error: protected character pronoun canon changed in "
            f"`story/characters.md` for: {names}."
        )


def _pronoun_reference_sections(text: str) -> dict[str, str]:
    sections: dict[str, str] = {}
    current_character: str | None = None
    lines = text.splitlines()
    index = 0
    while index < len(lines):
        line = lines[index]
        if line.startswith("## "):
            current_character = line[3:].strip()
            index += 1
            continue
        if current_character and line.strip() == "### Pronouns / Reference":
            start = index
            index += 1
            while index < len(lines):
                candidate = lines[index]
                if candidate.startswith("## ") or candidate.startswith("### "):
                    break
                index += 1
            sections[current_character] = "\n".join(lines[start:index]).strip()
            continue
        index += 1
    return sections


def format_validation_result(result: PatchValidationResult) -> str:
return "\n".join(
[


@ -3,6 +3,7 @@
from __future__ import annotations
from dataclasses import dataclass, replace
import json
from pathlib import Path
import re
import subprocess
@ -181,7 +182,7 @@ class PipelineRunner:
stage_results.append(result)
if stage.id in previous_outputs:
del previous_outputs[stage.id]
previous_outputs[stage.id] = self._read_output(result.output_path)
previous_outputs[stage.id] = self._read_context_output(result.output_path)
telemetry_entries.append(self._telemetry_entry(stage, result, retry_count))
self._write_telemetry(task.id, telemetry_entries)
self.logger.event(
@ -198,6 +199,7 @@ class PipelineRunner:
retry_notes.append(f"Context update from '{stage.id}': {result.context_update}")
if result.status == "pass":
pass_target_stage = result.next_stage or stage.on_pass
if stage.type in {"agent_review", "review"} and result.next_stage:
self.logger.event(
"stage.next_ignored",
@ -207,13 +209,12 @@ class PipelineRunner:
stage_id=stage.id,
requested_next_stage=result.next_stage,
)
index += 1
continue
if result.next_stage:
if result.next_stage not in stage_indexes:
pass_target_stage = stage.on_pass
if pass_target_stage:
if pass_target_stage not in stage_indexes:
final_status = "failed"
final_reason = (
f"Stage '{stage.id}' requested unknown next stage '{result.next_stage}'."
f"Stage '{stage.id}' requested unknown next stage '{pass_target_stage}'."
)
break
self.logger.event(
@ -222,18 +223,14 @@ class PipelineRunner:
run_id=self.artifacts.run_id,
task_id=task.id,
stage_id=stage.id,
next_stage=result.next_stage,
next_stage=pass_target_stage,
)
index = stage_indexes[result.next_stage]
index = stage_indexes[pass_target_stage]
continue
index += 1
continue
target_stage = result.next_stage or (
stage.on_fail
if not (stage.type in {"agent_review", "review"} and _is_malformed_review_result(result))
else None
)
target_stage = _failure_target_stage(stage, result)
analysis_note = self._write_failure_diagnostics(stage, task, result, retry_count)
if analysis_note:
retry_notes.append(analysis_note)
@ -629,8 +626,7 @@ class PipelineRunner:
task_context=context.task_context,
retry_context=context.retry_context,
)
raw_output = self._read_output(result.output_path)
stdout = extract_agent_stdout(raw_output)
stdout = self._read_agent_stdout(result.output_path)
lookup_requests = parse_lookup_requests(stdout)
if lookup_requests and "diff --git " not in stdout:
lookup_context = self.repo_tools.execute_requests(
@ -660,8 +656,7 @@ class PipelineRunner:
task_context=context.task_context,
retry_context="\n".join(f"- {note}" for note in rerun_notes),
)
raw_output = self._read_output(result.output_path)
stdout = extract_agent_stdout(raw_output)
stdout = self._read_agent_stdout(result.output_path)
try:
patch = extract_unified_diff(stdout)
except PipelineError as exc:
@ -709,7 +704,18 @@ class PipelineRunner:
) -> StageResult:
if stage.agent is None:
raise PipelineError(f"Pipeline error: file_writer stage '{stage.id}' must reference an agent.")
enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
if _is_state_update_stage(stage):
enriched_outputs = _state_update_previous_outputs(previous_outputs)
allowed_file_contents = self._allowed_file_contents(stage)
if allowed_file_contents:
enriched_outputs["current_allowed_files"] = allowed_file_contents
elif _is_scene_edit_stage(stage):
enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
current_scene = self._task_scene_file_contents(task)
if current_scene:
enriched_outputs["current_scene_file"] = current_scene
else:
enriched_outputs = _file_writer_previous_outputs(previous_outputs, retry_count)
context_pack_path = self._latest_task_artifact(task.id, "context-pack.md")
if context_pack_path is not None:
enriched_outputs["context-pack.md"] = context_pack_path.read_text(encoding="utf-8", errors="replace")
@ -727,8 +733,7 @@ class PipelineRunner:
task_context=context.task_context,
retry_context=context.retry_context,
)
raw_output = self._read_output(result.output_path)
stdout = extract_agent_stdout(raw_output)
stdout = self._read_agent_stdout(result.output_path)
lookup_requests = parse_lookup_requests(stdout)
if lookup_requests and "```file:" not in stdout.lower() and "```path:" not in stdout.lower():
lookup_context = self.repo_tools.execute_requests(
@ -758,8 +763,7 @@ class PipelineRunner:
task_context=context.task_context,
retry_context="\n".join(f"- {note}" for note in rerun_notes),
)
raw_output = self._read_output(result.output_path)
stdout = extract_agent_stdout(raw_output)
stdout = self._read_agent_stdout(result.output_path)
invalid_rerun_done = False
candidate_index_path: Path | None = None
while True:
@ -803,7 +807,7 @@ class PipelineRunner:
strict_notes = [
*retry_notes,
"Previous file_writer output was invalid. Return complete file blocks now. Do not output lookup_requests, prose, or 'lookup failed'.",
"Use complete fenced file blocks with both the opening ```file:path and closing ``` fence.",
_file_writer_repair_format_note(stage),
]
result = self.agent_executor.run_stage(
agent_stage,
@ -814,8 +818,7 @@ class PipelineRunner:
task_context=context.task_context,
retry_context="\n".join(f"- {note}" for note in strict_notes),
)
raw_output = self._read_output(result.output_path)
stdout = extract_agent_stdout(raw_output)
stdout = self._read_agent_stdout(result.output_path)
continue
try:
patch = normalize_patch_text(stdout)
@ -923,6 +926,44 @@ class PipelineRunner:
lines.append("")
return self.artifacts.write_stage_output(task_id, f"{base}/index.md", "\n".join(lines))
def _allowed_file_contents(self, stage: StageConfig, max_chars: int = 2400) -> str:
sections: list[str] = []
for path_text in stage.allowed_paths:
path = self.config.project.root / path_text
if not path.is_file():
continue
content = path.read_text(encoding="utf-8", errors="replace")
sections.extend(
[
f"## {path_text}",
"",
"```text",
_compact_previous_output(content, max_chars=max_chars).rstrip(),
"```",
"",
]
)
return "\n".join(sections).strip()
def _task_scene_file_contents(self, task: Task, max_chars: int = 10000) -> str:
sections: list[str] = []
for path_text in _task_story_chapter_paths(task):
path = self.config.project.root / path_text
if not path.is_file():
continue
content = path.read_text(encoding="utf-8", errors="replace")
sections.extend(
[
f"## {path_text}",
"",
"```text",
_compact_previous_output(content, max_chars=max_chars).rstrip(),
"```",
"",
]
)
return "\n".join(sections).strip()
def _writer_agent_stage(self, stage: StageConfig, retry_count: int) -> StageConfig:
suffix = f"-{retry_count}" if retry_count else ""
return replace(
@ -975,7 +1016,7 @@ class PipelineRunner:
task_context=self.context.read_context(task, retry_notes).task_context,
retry_context=self.context.read_context(task, retry_notes).retry_context,
)
source = extract_agent_stdout(self._read_output(result.output_path))
source = self._read_agent_stdout(result.output_path)
try:
patch = normalize_patch_text(source)
except PipelineError as exc:
@ -1127,8 +1168,7 @@ class PipelineRunner:
task: Task,
result: StageResult,
) -> StageResult | None:
output_text = self._read_output(result.output_path)
requests = parse_resource_requests(extract_agent_stdout(output_text))
requests = parse_resource_requests(self._read_agent_stdout(result.output_path))
if not requests:
return None
paths = satisfy_resource_requests(self.artifacts, task.id, requests)
@ -1338,8 +1378,7 @@ class PipelineRunner:
) -> StageResult:
if result.status != "pass" or result.output_path is None:
return result
output_text = self._read_output(result.output_path)
requests = parse_lookup_requests(extract_agent_stdout(output_text))
requests = parse_lookup_requests(self._read_agent_stdout(result.output_path))
if not requests:
return result
lookup_context = self.repo_tools.execute_requests(
@ -1457,6 +1496,25 @@ class PipelineRunner:
return ""
return path.read_text(encoding="utf-8")
    def _read_context_output(self, output_path: str | None) -> str:
        stdout = self._read_agent_stdout(output_path)
        return stdout if stdout else self._read_output(output_path)

    def _read_agent_stdout(self, output_path: str | None) -> str:
        if output_path is None:
            return ""
        path = self.config.project.root / Path(output_path)
        json_path = _agent_invocation_json_path(path)
        if json_path.exists():
            try:
                data = json.loads(json_path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                data = {}
            stdout = data.get("stdout")
            if isinstance(stdout, str):
                return stdout
        return extract_agent_stdout(self._read_output(output_path))

def _format_retry_note(
self,
retry_count: int,
@ -1468,6 +1526,16 @@ class PipelineRunner:
f"Retry {retry_count}: stage '{stage.id}' returned "
f"{result.status} ({result.reason}); redirecting to '{target_stage}'."
)
if (
target_stage == "update_state"
and "deletion-heavy patch" in result.reason.lower()
):
note = (
f"{note}\n"
"Repair guidance: preserve existing durable state text unless it directly conflicts "
"with the accepted scene. Make minimal additive edits instead of replacing whole "
"sections or compressing character/world files."
)
excerpt = self._failure_excerpt(result.output_path)
if not excerpt:
return note
@ -1683,6 +1751,16 @@ def _is_malformed_review_result(result: StageResult) -> bool:
)
def _failure_target_stage(stage: StageConfig, result: StageResult) -> str | None:
    if stage.type not in {"agent_review", "review"}:
        return result.next_stage or stage.on_fail
    if _is_malformed_review_result(result):
        return None
    if result.next_stage and result.next_stage != stage.id:
        return result.next_stage
    return stage.on_fail


def _review_previous_outputs(previous_outputs: dict[str, str], max_chars: int = 1600) -> dict[str, str]:
compacted: dict[str, str] = {}
priority_names = {
@ -1736,6 +1814,15 @@ def _file_writer_stage_guidance(stage: StageConfig) -> str:
return ""
def _file_writer_repair_format_note(stage: StageConfig) -> str:
    if _is_state_update_stage(stage):
        return (
            "Use delimiter file blocks only: FILE: path, ---CONTENT---, complete file content, "
            "---END---. Do not use markdown code fences for state update output."
        )
    return "Use complete fenced file blocks with both the opening ```file:path and closing ``` fence."


def _candidate_artifact_name(path_text: str) -> str:
name = path_text.replace("\\", "/").strip().strip("/")
name = re.sub(r"[^A-Za-z0-9_.-]+", "_", name)
@ -1748,14 +1835,70 @@ def _file_writer_previous_outputs(
retry_count: int,
max_chars: int = 1200,
) -> dict[str, str]:
if retry_count <= 0:
return dict(previous_outputs)
compacted: dict[str, str] = {}
for name, output in previous_outputs.items():
compacted[name] = _compact_previous_output(output, max_chars=max_chars)
clean_output = _compact_agent_artifact_output(output)
if retry_count <= 0:
compacted[name] = clean_output
continue
compacted[name] = _compact_previous_output(clean_output, max_chars=max_chars)
return compacted
def _is_state_update_stage(stage: StageConfig) -> bool:
    state_paths = {
        "story/plot-state.md",
        "story/characters.md",
        "story/timeline.md",
        "story/unresolved-threads.md",
    }
    allowed = {path.replace("\\", "/").rstrip("/") for path in stage.allowed_paths}
    return stage.type == "file_writer" and bool(allowed) and allowed.issubset(state_paths)


def _is_scene_edit_stage(stage: StageConfig) -> bool:
    allowed = {path.replace("\\", "/").rstrip("/") for path in stage.allowed_paths}
    return stage.type == "file_writer" and stage.id.startswith("edit_") and "story/chapters" in allowed


def _task_story_chapter_paths(task: Task) -> tuple[str, ...]:
    paths: list[str] = []
    seen: set[str] = set()
    for match in re.finditer(r"story/chapters/[^\s`]+?\.md", task.raw_markdown):
        path = match.group(0).strip().strip("`")
        if path not in seen:
            paths.append(path)
            seen.add(path)
    return tuple(paths)


def _state_update_previous_outputs(previous_outputs: dict[str, str]) -> dict[str, str]:
    compacted: dict[str, str] = {}
    for name in ("draft_scene", "apply_draft", "continuity_review", "style_review"):
        output = previous_outputs.get(name)
        if output:
            compacted[name] = _compact_previous_output(_compact_agent_artifact_output(output), max_chars=1800)
    for name, output in previous_outputs.items():
        if name in compacted or name in {"plan", "semantic_context", "context"}:
            continue
        if "draft" in name or "review" in name or "apply" in name:
            compacted[name] = _compact_previous_output(_compact_agent_artifact_output(output), max_chars=1200)
    return compacted


def _compact_agent_artifact_output(output: str) -> str:
    if "# Agent Output:" not in output or "## Prompt" not in output:
        return output
    stdout = extract_agent_stdout(output).strip()
    return stdout if stdout else output


def _agent_invocation_json_path(output_path: Path) -> Path:
    if output_path.suffix:
        return output_path.with_suffix(".json")
    return output_path.with_name(output_path.name + ".json")


def _compact_previous_output(output: str, max_chars: int = 1200) -> str:
if len(output) <= max_chars:
return output


@ -13,11 +13,16 @@ Review the drafted scene against:
Check for:
- contradictions
- wrong character knowledge
- wrong character pronouns or narrative reference, using `Pronouns / Reference` in `story/characters.md` as hard canon
- impossible locations or timing
- accidental resolution of future threads
- missing required beats from the task
- invented lore that should have been added deliberately
Do not fail the scene because durable state files are not updated yet. State files are updated by a later `update_state` stage after review. If the task lists `Updates:`, treat those as future state-update requirements and mention them only as `context_update` guidance.
Wrong pronouns are a continuity failure. If a drafted scene uses non-canonical pronouns for a named character, return `status: fail` and explain which character drifted. Do not pass the scene with only `context_update` guidance.
Output exactly:
status: pass | fail | retry | escalate
@ -25,4 +30,4 @@ reason: <short explanation>
next_stage: <optional stage id>
context_update: <compact useful note>
-When `status: pass`, leave `next_stage` blank. Use `retry` when the scene can be repaired by drafting again.
+When `status: pass`, leave `next_stage` blank. Use `retry` when the scene can be repaired by drafting again. For retryable scene issues, leave `next_stage` blank; NightShift will route back to the configured drafting stage.
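
The four-field review format above is simple enough to parse line by line. The sketch below is an illustration only, not the project's `parse_review_output`; `sketch_parse_review` is a hypothetical helper, and treating a blank `next_stage` as `None` is an assumption about how a caller would want to route.

```python
import re
from typing import Optional, Tuple

VALID_STATUSES = {"pass", "fail", "retry", "escalate"}
# One "key: value" pair per line; the value may be empty (e.g. a blank next_stage).
FIELD_RE = re.compile(
    r"^(status|reason|next_stage|context_update):[ \t]*(.*)$", re.MULTILINE
)


def sketch_parse_review(text: str) -> Tuple[str, str, Optional[str], str]:
    """Return (status, reason, next_stage, context_update) from reviewer output."""
    fields = {}
    for match in FIELD_RE.finditer(text):
        # Keep the first occurrence of each field and ignore later repeats.
        fields.setdefault(match.group(1), match.group(2).strip())
    status = fields.get("status", "").lower()
    if status not in VALID_STATUSES:
        raise ValueError(f"unrecognized review status: {status!r}")
    next_stage = fields.get("next_stage") or None  # blank means "let the runner decide"
    return status, fields.get("reason", ""), next_stage, fields.get("context_update", "")
```

Returning `None` for a blank `next_stage` lets the runner fall back to the stage's configured `on_fail` target, which matches the prompt's instruction to leave the field blank for retryable issues.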

View File

@ -7,12 +7,14 @@ Rules:
- Do not edit `story/worldbuilding.md`, `story/characters.md`, `story/style-guide.md`, `story/plot-state.md`, `story/timeline.md`, `story/unresolved-threads.md`, `story/continuity-rules.md`, or `story/outline.md`.
- Use `story/style-guide.md` for POV, tense, tone, and prose rules.
- Use `story/plot-state.md` and `story/timeline.md` as current state.
- Use the `Pronouns / Reference` sections in `story/characters.md` as hard canon.
- Do not infer, vary, or "smooth out" character pronouns. Use canonical narrative reference exactly.
- Keep the scene bounded to the task acceptance criteria.
- Do not resolve future plot threads unless the task explicitly asks for that.
- Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
Output only one complete file block using this delimiter format:
-FILE: story/chapters/chapter-001/scene-001.md
+FILE: <the exact story/chapters path listed under Writes in the current task>
---CONTENT---
<complete scene prose>
---END---

View File

@ -0,0 +1,26 @@
You are the scene editor for a NightShift novel-writing workflow.
Edit an already drafted scene after a continuity or style review failure.
Rules:
- Preserve the existing scene's structure, voice, events, pacing, and best lines.
- Make the smallest changes needed to satisfy the review failure and task acceptance criteria.
- Do not restart, summarize, replace the scene premise, or change scene direction.
- Use `story/style-guide.md` for POV, tense, tone, and prose rules.
- Use `story/characters.md`, especially `Pronouns / Reference`, as hard canon.
- Wrong pronouns are mandatory fixes.
- Do not edit state files, worldbuilding, outline, continuity rules, or style guide.
- Do not resolve future plot threads unless the task explicitly asks for that.
- Do not include author notes, TODOs, bracket placeholders, or analysis in the scene file.
Use the `current_scene_file` context as the source text to edit.
Use the retry notes and latest review output to identify the required repair.
Output only one complete file block using this delimiter format:
FILE: <the exact story/chapters path listed under Writes in the current task>
---CONTENT---
<complete edited scene prose>
---END---
Do not use markdown code fences for scene prose output.
Do not output a plan, notes, analysis, or any text outside the delimiter block.
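
The `FILE:` / `---CONTENT---` / `---END---` format used by these prompts can be read tolerantly, so that a missing `---END---` is closed by the next `FILE:` line or by end of input. The following is a minimal sketch of that idea under stated assumptions, not the repository's `parse_file_updates`; `sketch_parse_file_blocks` is a hypothetical name, and it does no path or duplicate validation.

```python
from typing import List, Optional, Tuple


def sketch_parse_file_blocks(text: str) -> List[Tuple[str, str]]:
    """Return (path, content) pairs; an unterminated block ends at the next FILE: or EOF."""
    blocks: List[Tuple[str, str]] = []
    path: Optional[str] = None
    body: Optional[List[str]] = None  # stays None until ---CONTENT--- is seen

    def flush() -> None:
        nonlocal path, body
        if path is not None and body is not None:
            # Normalize to exactly one trailing newline.
            blocks.append((path, "\n".join(body).strip("\n") + "\n"))
        path, body = None, None

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("FILE:"):
            flush()  # a new FILE: line implicitly closes the previous block
            path = stripped[len("FILE:"):].strip()
        elif stripped == "---CONTENT---" and path is not None and body is None:
            body = []
        elif stripped == "---END---" and body is not None:
            flush()
        elif body is not None:
            body.append(line)
        # Anything outside a block (intro prose, stray notes) is ignored.
    flush()  # EOF closes the last block
    return blocks
```

One caveat of this approach: a content line that itself begins with `FILE:` would end the block early, so a real parser would still need the strict path validation described in the iteration notes.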

View File

@ -21,8 +21,26 @@ State updates should reflect only what happened in the accepted scene:
Do not invent events that are not in the scene.
Preserve existing durable state. Make minimal additive edits:
- append new scene facts, timeline bullets, character knowledge, and unresolved threads
- update current locations/status only where the accepted scene changes them
- do not remove or compress existing character profiles, faction notes, world notes, or open threads
- do not rewrite whole files for style, brevity, or cleanup
- if a section already contains useful detail, keep it and add only the new facts needed
Protect character canon:
- Never change any `Pronouns / Reference` section.
- Never change a character's canonical pronouns, narrative reference, identity, or core wound.
- Prefer updating `story/plot-state.md`, `story/timeline.md`, and `story/unresolved-threads.md`.
- Edit `story/characters.md` only when the accepted scene adds a small current-status fact or introduces a new named character.
- If editing `story/characters.md`, preserve all existing sections and add only the minimal new status/detail needed.
Output only complete file content blocks.
-Use one fenced block per file:
-```file:story/plot-state.md
+Use this delimiter format for each state file you update:
+FILE: story/plot-state.md
+---CONTENT---
<complete updated state file>
-```
+---END---
+Do not use markdown code fences. Do not include prose outside FILE blocks.
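
Prompt rules alone cannot guarantee that `Pronouns / Reference` sections survive a state update, so a mechanical check on the proposed file is a natural companion. The sketch below compares those sections before and after an update. It assumes the layout seen in this commit's test fixture (`## <Character>` headings, each with a `### Pronouns / Reference` subsection of bullet lines); `pronoun_sections` and `changed_pronoun_canon` are hypothetical helpers, not the project's implementation.

```python
import re
from typing import Dict, List, Optional

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def pronoun_sections(markdown: str) -> Dict[str, str]:
    """Map each level-2 character heading to its 'Pronouns / Reference' bullets."""
    collected: Dict[str, List[str]] = {}
    character: Optional[str] = None
    capturing = False
    for line in markdown.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            level, title = len(heading.group(1)), heading.group(2)
            capturing = False  # any heading ends the current section
            if level == 2:
                character = title
            elif level == 3 and title == "Pronouns / Reference" and character:
                capturing = True
                collected[character] = []
            continue
        if capturing and character and line.strip().startswith("-"):
            collected[character].append(line.strip())
    return {name: "\n".join(lines) for name, lines in collected.items()}


def changed_pronoun_canon(old: str, new: str) -> List[str]:
    """Return characters whose protected section was altered or removed."""
    before, after = pronoun_sections(old), pronoun_sections(new)
    return sorted(name for name, text in before.items() if after.get(name) != text)
```

Only characters present in the old file are checked, so an update that introduces a new named character with its own section would not be flagged.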

View File

@ -16,6 +16,8 @@ Check for:
- placeholders such as TODO, TBD, `[insert]`, or author notes
- scene length far outside the requested range
Do not fail the scene because durable state files are not updated yet. State files are updated by a later `update_state` stage after review.
Output exactly:
status: pass | fail | retry | escalate
@ -23,4 +25,4 @@ reason: <short explanation>
next_stage: <optional stage id>
context_update: <compact useful note>
When `status: pass`, leave `next_stage` blank. Use `retry` when the drafter should revise the scene.
When `status: pass`, leave `next_stage` blank. Use `retry` when the drafter should revise the scene. For retryable scene issues, leave `next_stage` blank; NightShift will route back to the configured drafting stage.

View File

@ -37,6 +37,14 @@ agents:
num_predict: 8192
system_prompt: .nightshift/agents/drafter.md
editor:
backend: ollama
model: nightshift-writer
temperature: 0.3
num_ctx: 16384
num_predict: 8192
system_prompt: .nightshift/agents/editor.md
continuity_reviewer:
backend: ollama
model: nightshift-base
@ -110,13 +118,42 @@ pipeline:
type: agent_review
agent: continuity_reviewer
output: continuity-review.md
-on_fail: draft_scene
+on_fail: edit_scene
- id: style_review
type: agent_review
agent: style_reviewer
output: style-review.md
-on_fail: draft_scene
+on_fail: edit_scene
on_pass: update_state
- id: edit_scene
type: file_writer
agent: editor
output: scene-edit.patch
allowed_paths:
- story/chapters
- id: normalize_edit
type: patch_normalizer
output: normalized-edit.patch
- id: validate_edit
type: patch_validator
output: edit-validation.md
max_files: 2
max_lines: 1200
max_delete_ratio: 0.50
allowed_paths:
- story/chapters
on_fail: edit_scene
- id: apply_edit
type: patch_apply
mode: apply
output: edit-apply-output.txt
on_fail: edit_scene
on_pass: continuity_review
- id: update_state
type: file_writer
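
The pipeline above routes by stage id: `on_fail` sends a failed review to `edit_scene`, and `on_pass` lets `style_review` skip the edit stages and jump to `update_state`. A rough sketch of that routing follows, assuming a pass without `on_pass` falls through to the next listed stage and a fail without `on_fail` ends the task. `Stage`, `check_targets`, and `next_stage_id` are illustrative stand-ins, not NightShift's `StageConfig` or runner.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stage:
    """Hypothetical stand-in for a pipeline stage entry."""
    id: str
    on_pass: Optional[str] = None
    on_fail: Optional[str] = None


def check_targets(stages: Sequence[Stage]) -> None:
    """Reject on_pass/on_fail values that do not name an existing stage."""
    known = {stage.id for stage in stages}
    for stage in stages:
        for field in ("on_pass", "on_fail"):
            target = getattr(stage, field)
            if target is not None and target not in known:
                raise ValueError(f"{stage.id}: {field} references unknown stage {target!r}")


def next_stage_id(stages: Sequence[Stage], current_id: str, passed: bool) -> Optional[str]:
    """Pick the next stage id, or None when the pipeline should stop."""
    ids = [stage.id for stage in stages]
    index = ids.index(current_id)
    stage = stages[index]
    if not passed:
        return stage.on_fail  # assumed: None here means the task fails
    if stage.on_pass:
        return stage.on_pass
    return ids[index + 1] if index + 1 < len(ids) else None
```

Validating targets at load time, as the config tests in this commit do for `on_pass`, keeps a typo in a stage id from surfacing only midway through a long run.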

View File

@ -4,7 +4,7 @@ import unittest
from unittest.mock import MagicMock, patch
from nightshift.agents import AgentExecutor, build_prompt_bundle, parse_review_output, strip_ansi_escape_sequences
-from nightshift.agents import AgentInvocation, format_agent_invocation
+from nightshift.agents import AgentInvocation, format_agent_invocation, format_agent_invocation_json
from nightshift.artifacts import ArtifactStore
from nightshift.config import AgentConfig, StageConfig
from nightshift.tasks import parse_tasks
@ -93,7 +93,7 @@ class AgentExecutorTests(unittest.TestCase):
self.assertIn("Use only paths under these project-relative targets: `story/chapters`.", prompt)
self.assertIn("This is the drafting stage", prompt)
-self.assertIn("FILE: story/chapters/chapter-001/scene-001.md", prompt)
+self.assertIn("FILE: <the exact story/chapters path listed under Writes in the current task>", prompt)
self.assertIn("---CONTENT---", prompt)
self.assertIn("---END---", prompt)
self.assertIn("Do not use markdown code fences", prompt)
@ -125,6 +125,10 @@ class AgentExecutorTests(unittest.TestCase):
output = (root / result.output_path).read_text(encoding="utf-8")
self.assertIn("TASK-001", output)
self.assertIn("Plan carefully.", output)
json_output = (root / ".nightshift" / "runs" / "test-run" / "tasks" / task.id / "plan.json")
self.assertTrue(json_output.exists())
self.assertIn('"stage_id": "plan"', json_output.read_text(encoding="utf-8"))
self.assertIn('"stdout"', json_output.read_text(encoding="utf-8"))
def test_review_output_parser_accepts_structured_status(self) -> None:
status, reason, next_stage, context_update = parse_review_output(
@ -238,6 +242,23 @@ class AgentExecutorTests(unittest.TestCase):
self.assertIn("Agent: `planner`", output)
self.assertIn("## stderr", output)
def test_agent_invocation_json_preserves_raw_streams(self) -> None:
invocation = AgentInvocation(
agent_id="planner",
command="cmd",
prompt="prompt with ``` fences",
exit_code=0,
stdout="stdout with ``` fences",
stderr="stderr",
duration_seconds=0.1,
)
output = format_agent_invocation_json("plan", invocation)
self.assertIn('"stage_id": "plan"', output)
self.assertIn('stdout with ``` fences', output)
self.assertIn('prompt with ``` fences', output)
def test_strip_ansi_escape_sequences(self) -> None:
self.assertEqual(strip_ansi_escape_sequences("\x1b[?25lthinking\x1b[0m"), "thinking")
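
The JSON test above exists because agent output that contains triple backticks is hard to recover from a markdown wrapper that itself uses fences, whereas JSON string encoding keeps the raw streams intact. A small sketch of the sidecar idea, under stated assumptions: `sidecar_json_path` mirrors the suffix logic of `_agent_invocation_json_path` shown earlier in this diff, while `invocation_to_json` and its three-field layout are assumptions rather than the output of `format_agent_invocation_json`.

```python
import json
from pathlib import Path


def sidecar_json_path(output_path: Path) -> Path:
    """Place the JSON file next to the markdown artifact, swapping or adding the suffix."""
    if output_path.suffix:
        return output_path.with_suffix(".json")
    return output_path.with_name(output_path.name + ".json")


def invocation_to_json(stage_id: str, stdout: str, stderr: str) -> str:
    """Serialize raw streams; the field names here are illustrative only."""
    payload = {"stage_id": stage_id, "stdout": stdout, "stderr": stderr}
    return json.dumps(payload, indent=2)
```

Because the streams are stored as JSON strings, `json.loads` returns them exactly as emitted, fences included.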

View File

@ -55,6 +55,40 @@ class ConfigTests(unittest.TestCase):
with self.assertRaisesRegex(ConfigError, "on_fail references unknown stage"):
load_config(config_path)
def test_on_pass_must_reference_existing_stage(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
init_project(root)
config_path = root / "nightshift.yaml"
config_path.write_text(
config_path.read_text(encoding="utf-8").replace(
"on_fail: plan", "on_pass: missing_stage", 1
),
encoding="utf-8",
)
with self.assertRaisesRegex(ConfigError, "on_pass references unknown stage"):
load_config(config_path)
def test_on_pass_loads(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
init_project(root)
config_path = root / "nightshift.yaml"
config_path.write_text(
config_path.read_text(encoding="utf-8").replace(
" output: plan.md",
" output: plan.md\n on_pass: summarize",
1,
),
encoding="utf-8",
)
config = load_config(config_path)
plan_stage = next(stage for stage in config.pipeline.stages if stage.id == "plan")
self.assertEqual(plan_stage.on_pass, "summarize")
def test_validate_requires_prompt_files(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)

View File

@ -260,6 +260,47 @@ Sunlight did not belong here.
self.assertEqual(updates[0].path, "story/chapters/chapter-001/scene-001.md")
self.assertEqual(updates[0].content, "Sunlight did not belong here.\n")
def test_file_updates_parse_delimiters_without_end_before_next_file(self) -> None:
updates = parse_file_updates(
"""Intro prose is ignored.
FILE: story/plot-state.md
---CONTENT---
# Plot State
- Scene two happened.
FILE: story/timeline.md
---CONTENT---
# Timeline
- SCENE-002 complete.
"""
)
self.assertEqual(len(updates), 2)
self.assertEqual(updates[0].path, "story/plot-state.md")
self.assertEqual(updates[0].content, "# Plot State\n\n- Scene two happened.\n")
self.assertEqual(updates[1].path, "story/timeline.md")
self.assertEqual(updates[1].content, "# Timeline\n\n- SCENE-002 complete.\n")
def test_file_updates_parse_mixed_delimiter_end_and_next_file(self) -> None:
updates = parse_file_updates(
"""FILE: story/plot-state.md
---CONTENT---
first
---END---
FILE: story/timeline.md
---CONTENT---
second
"""
)
self.assertEqual(len(updates), 2)
self.assertEqual(updates[0].content, "first\n")
self.assertEqual(updates[1].content, "second\n")
def test_file_updates_reject_duplicate_blocks(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
@ -334,6 +375,48 @@ new
self.assertEqual(patch.count("diff --git a/app.py b/app.py"), 1)
def test_file_updates_reject_character_pronoun_canon_changes(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
(root / "story").mkdir()
(root / "story" / "characters.md").write_text(
"""# Characters
## Cricket
### Pronouns / Reference
- Pronouns: she/her
- Narrative reference: Cricket; she/her
Scavenger.
""",
encoding="utf-8",
)
safety = SafetyConfig(
require_clean_worktree=False,
scoped_paths=("story",),
allowed_commands=(),
forbidden_commands=(),
)
updates = parse_file_updates(
"""FILE: story/characters.md
---CONTENT---
# Characters
## Cricket
### Pronouns / Reference
- Pronouns: they/them
- Narrative reference: Cricket; they/them
Scavenger.
---END---
"""
)
with self.assertRaisesRegex(PipelineError, "protected character pronoun canon changed"):
generate_patch_from_file_updates(updates, root, safety)
if __name__ == "__main__":
unittest.main()

View File

@ -105,6 +105,30 @@ class PipelineRunnerTests(unittest.TestCase):
)
self.assertIn("Modified Files", (root / ".nightshift" / "runs" / "test-run" / "run-summary.md").read_text(encoding="utf-8"))
def test_on_pass_jumps_to_configured_stage(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
stages = (
StageConfig(id="first", type="agent", agent="planner", output="first.md", on_pass="third"),
StageConfig(
id="second",
type="command",
commands=('python -c "print(\'should not run\')"',),
output="second-output.txt",
),
StageConfig(id="third", type="summarize", output="final-notes.md"),
)
config = make_config(root, stages)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
task_dir = root / ".nightshift" / "runs" / "test-run" / "tasks" / "TASK-001"
self.assertEqual(result.status, "complete")
self.assertEqual([item.stage_id for item in result.stage_results], ["first", "third"])
self.assertFalse((task_dir / "second-output.txt").exists())
def test_task_preflight_fails_when_task_specific_test_file_is_missing(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
@ -153,6 +177,46 @@ class PipelineRunnerTests(unittest.TestCase):
self.assertIn("Retry limit reached", result.reason)
self.assertEqual([item.stage_id for item in result.stage_results], ["implement", "review", "implement", "review", "implement", "review"])
def test_failing_review_self_next_stage_routes_to_on_fail(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
config = make_config(root, (), max_retries=1)
config.agents["reviewer"] = AgentConfig(
id="reviewer",
backend="command",
command=(
"python -c \"print('status: fail\\nreason: needs draft repair\\n"
"next_stage: review\\ncontext_update: add concrete details')\""
),
system_prompt=Path("reviewer.md"),
)
config = replace(
config,
pipeline=PipelineConfig(
max_task_retries=1,
stages=(
StageConfig(id="implement", type="agent", agent="planner", output="implementation-log.md"),
StageConfig(
id="review",
type="agent_review",
agent="reviewer",
on_fail="implement",
output="review.md",
),
),
),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
task = parse_tasks(TASK_MD)[0]
result = runner.run_task(task)
self.assertEqual(result.retry_count, 1)
self.assertEqual([item.stage_id for item in result.stage_results], ["implement", "review", "implement", "review"])
log = (root / ".nightshift" / "runs" / "test-run" / "run.log").read_text(encoding="utf-8")
self.assertIn("next_stage=implement", log)
def test_malformed_review_gets_strict_retry_without_redrafting(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
@ -544,6 +608,34 @@ Acceptance Criteria:
self.assertIn("response = self.client.get('/board/general')", note)
self.assertIn("self.assertEqual(response.status_code, 200)", note)
def test_state_update_retry_note_guides_deletion_heavy_repairs(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
config = make_config(root, ())
runner = PipelineRunner(config, artifacts)
output_path = artifacts.write_stage_output(
"TASK-001",
"state-validation.md",
"# Patch Validation\n\nStatus: fail\nReason: Patch validation failed: deletion-heavy patch exceeds max_delete_ratio 0.35.\n",
)
note = runner._format_retry_note(
1,
StageConfig(id="validate_state", type="patch_validator", on_fail="update_state"),
StageResult(
stage_id="validate_state",
status="fail",
reason="Patch validation failed: deletion-heavy patch exceeds max_delete_ratio 0.35.",
output_path=str(output_path.relative_to(root)),
),
"update_state",
)
self.assertIn("preserve existing durable state text", note)
self.assertIn("minimal additive edits", note)
def test_code_writer_normalizer_and_validator_pipeline(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
@ -892,6 +984,60 @@ Acceptance Criteria:
self.assertIn("... <truncated>", retry_prompt)
self.assertLess(len(retry_prompt), 9000)
def test_state_file_writer_invalid_output_retry_uses_delimiter_format(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
story = root / "story"
story.mkdir()
(story / "plot-state.md").write_text("old\n", encoding="utf-8")
(root / "fake_writer.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"if 'Previous file_writer output was invalid' not in prompt:",
" print('lookup failed')",
"else:",
" (open('retry-prompt.txt', 'w', encoding='utf-8').write(prompt))",
" print('FILE: story/plot-state.md')",
" print('---CONTENT---')",
" print('old')",
" print('new')",
" print('---END---')",
]
),
encoding="utf-8",
)
stages = (
StageConfig(
id="update_state",
type="file_writer",
agent="writer",
allowed_paths=(
"story/plot-state.md",
"story/characters.md",
"story/timeline.md",
"story/unresolved-threads.md",
),
),
)
config = make_config(root, stages)
config.agents["writer"] = AgentConfig(
id="writer",
backend="command",
command="python fake_writer.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
retry_prompt = (root / "retry-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("Use delimiter file blocks only", retry_prompt)
self.assertNotIn("Use complete fenced file blocks", retry_prompt)
def test_file_writer_retry_compacts_large_previous_outputs(self) -> None:
outputs = {
"scene-draft.patch": "a" * 5000,
@ -904,6 +1050,162 @@ Acceptance Criteria:
self.assertLess(len(compacted["scene-draft.patch"]), 180)
self.assertEqual(compacted["draft-validation.md"], "Patch validation failed")
def test_file_writer_first_attempt_preserves_large_previous_outputs(self) -> None:
outputs = {"plan": "a" * 5000}
compacted = _file_writer_previous_outputs(outputs, retry_count=0, max_chars=100)
self.assertEqual(compacted["plan"], "a" * 5000)
def test_file_writer_previous_outputs_strip_wrapped_agent_prompts(self) -> None:
output = "\n".join(
[
"# Agent Output: plan",
"",
"## stdout",
"",
"```text",
"useful plan",
"```",
"",
"## stderr",
"",
"```text",
"```",
"",
"## Prompt",
"",
"```markdown",
"huge prompt marker",
"```",
]
)
compacted = _file_writer_previous_outputs({"plan": output}, retry_count=0)
self.assertEqual(compacted["plan"], "useful plan")
self.assertNotIn("huge prompt marker", compacted["plan"])
def test_state_update_file_writer_gets_focused_context_and_current_files(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
(root / "story").mkdir()
(root / "story" / "plot-state.md").write_text("# Plot State\n\n- Before\n", encoding="utf-8")
(root / "fake_state_writer.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"open('state-prompt.txt', 'w', encoding='utf-8').write(prompt)",
"if 'current_allowed_files' in prompt and 'huge-plan-marker' not in prompt:",
" print('FILE: story/plot-state.md')",
" print('---CONTENT---')",
" print('# Plot State')",
" print()",
" print('- Before')",
" print('- After')",
" print('---END---')",
"else:",
" print('')",
]
),
encoding="utf-8",
)
config = make_config(
root,
(
StageConfig(id="plan", type="agent", agent="planner", output="plan.md"),
StageConfig(
id="update_state",
type="file_writer",
agent="state_updater",
allowed_paths=("story/plot-state.md",),
),
),
)
config.agents["planner"] = AgentConfig(
id="planner",
backend="command",
command="python -c \"print('huge-plan-marker' * 1000)\"",
system_prompt=Path("planner.md"),
)
config.agents["state_updater"] = AgentConfig(
id="state_updater",
backend="command",
command="python fake_state_writer.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(TASK_MD)[0])
prompt = (root / "state-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("current_allowed_files", prompt)
self.assertIn("# Plot State", prompt)
self.assertNotIn("huge-plan-marker", prompt)
def test_scene_editor_file_writer_gets_current_scene_file(self) -> None:
task_md = """# Tasks
- [ ] SCENE-001: Edit scene
Description:
Repair the scene.
Acceptance Criteria:
- Writes:
- `story/chapters/chapter-001/scene-001.md`
"""
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)
_write_common_files(root)
(root / "tasks.md").write_text(task_md, encoding="utf-8")
scene_path = root / "story" / "chapters" / "chapter-001" / "scene-001.md"
scene_path.parent.mkdir(parents=True)
scene_path.write_text("Proxy walked home.\n", encoding="utf-8")
(root / "fake_editor.py").write_text(
"\n".join(
[
"import sys",
"prompt = sys.stdin.read()",
"open('editor-prompt.txt', 'w', encoding='utf-8').write(prompt)",
"if 'current_scene_file' in prompt and 'Proxy walked home.' in prompt:",
" print('FILE: story/chapters/chapter-001/scene-001.md')",
" print('---CONTENT---')",
" print('Proxy walked home corrected.')",
" print('---END---')",
"else:",
" print('')",
]
),
encoding="utf-8",
)
stages = (
StageConfig(
id="edit_scene",
type="file_writer",
agent="editor",
allowed_paths=("story/chapters",),
),
)
config = make_config(root, stages)
config.agents["editor"] = AgentConfig(
id="editor",
backend="command",
command="python fake_editor.py",
system_prompt=Path("planner.md"),
)
runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
result = runner.run_task(parse_tasks(task_md)[0])
prompt = (root / "editor-prompt.txt").read_text(encoding="utf-8")
self.assertEqual(result.status, "complete")
self.assertIn("current_scene_file", prompt)
self.assertIn("Proxy walked home.", prompt)
def test_patch_validator_rejects_unsafe_patch(self) -> None:
with tempfile.TemporaryDirectory() as directory:
root = Path(directory)