The ollama backend now uses Ollama’s HTTP API instead of ollama run

K. Hodges 2026-05-17 14:23:31 -07:00
parent db9b24379e
commit 42564c6867
10 changed files with 132 additions and 114 deletions


@ -83,6 +83,8 @@ tasks/TASK-001/context-out.md
tasks/TASK-001/final-notes.md tasks/TASK-001/final-notes.md
``` ```
Retry attempts preserve separate artifacts with numeric suffixes, such as `repair-1.patch`, `normalized-1.patch`, `patch-validation-1.md`, `applied-1.patch`, and `patch-apply-output-1.txt`.
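The suffix rule described above can be sketched in a few lines. This is a minimal illustration that mirrors the `_attempt_filename` helper added later in this commit; the public name `attempt_filename` and the sample loop are for illustration only. Note that code-writer retries are named `repair-N.patch` rather than `proposed-N.patch`, so `proposed.patch` is not part of this rule.

```python
from pathlib import Path


def attempt_filename(filename: str, retry_count: int) -> str:
    """Return the artifact name for a given attempt.

    The first attempt (retry_count <= 0) keeps the plain name; retries
    insert "-N" before the extension(s).
    """
    if retry_count <= 0:
        return filename
    path = Path(filename)
    suffix = "".join(path.suffixes)  # keeps multi-part extensions together
    if suffix:
        stem = path.name[: -len(suffix)]
        name = f"{stem}-{retry_count}{suffix}"
    else:
        name = f"{path.name}-{retry_count}"
    return path.with_name(name).as_posix()


# Show the first-attempt and first-retry names side by side.
for base in (
    "normalized.patch",
    "patch-validation.md",
    "applied.patch",
    "patch-apply-output.txt",
):
    print(attempt_filename(base, 0), "->", attempt_filename(base, 1))
```

Because the first attempt keeps its original filename, existing tooling that reads `normalized.patch` or `patch-validation.md` keeps working; only retries produce new names.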
## Example Templates ## Example Templates
Example run files are available in `examples/templates/`. Example run files are available in `examples/templates/`.
@ -363,4 +365,6 @@ After a run, inspect:
.nightshift/runs/<run-id>/tasks/TASK-001/final-notes.md .nightshift/runs/<run-id>/tasks/TASK-001/final-notes.md
``` ```
If validation or later stages retry implementation, inspect the suffixed retry artifacts too, for example `repair-1.patch` and `patch-validation-1.md`.
The useful signal is whether NightShift selected the right task, respected dependencies, generated context, validated and applied a patch, ran tests, wrote artifacts, updated task completion, and produced a clear summary. The useful signal is whether NightShift selected the right task, respected dependencies, generated context, validated and applied a patch, ran tests, wrote artifacts, updated task completion, and produced a clear summary.


@ -195,10 +195,13 @@ agents:
implementer: implementer:
backend: ollama backend: ollama
model: qwen2.5-coder:14b model: qwen2.5-coder:14b
base_url: http://localhost:11434
temperature: 0.2 temperature: 0.2
system_prompt: agents/implementer.md system_prompt: agents/implementer.md
``` ```
The Ollama backend uses the local HTTP API instead of `ollama run`, which keeps exact patch output away from terminal rendering and line wrapping.
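As a rough sketch of what a non-streaming call to the Ollama HTTP API looks like, the snippet below builds a `POST /api/generate` request and extracts the `response` field from the JSON reply. It follows the shape of the backend change in this commit, but the function names (`build_generate_request`, `extract_response`, `generate`) are illustrative, not NightShift's API.

```python
import json
from urllib import request

DEFAULT_BASE_URL = "http://localhost:11434"


def build_generate_request(model, prompt, base_url=None, temperature=None):
    """Build a non-streaming POST request for Ollama's /api/generate."""
    url = (base_url or DEFAULT_BASE_URL).rstrip("/") + "/api/generate"
    body = {"model": model, "prompt": prompt, "stream": False}
    if temperature is not None:
        body["options"] = {"temperature": temperature}
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_response(raw: str) -> str:
    """Return the "response" string from the reply, or the raw text as a fallback."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return raw
    if isinstance(data, dict) and isinstance(data.get("response"), str):
        return data["response"]
    return raw


def generate(model, prompt, base_url=None, temperature=None, timeout=600):
    """Send the request and return the model text (requires a running Ollama)."""
    req = build_generate_request(model, prompt, base_url, temperature)
    with request.urlopen(req, timeout=timeout) as resp:
        return extract_response(resp.read().decode("utf-8", errors="replace"))
```

With Ollama running, `generate("qwen2.5-coder:14b", "Return a unified diff only.", temperature=0.2)` returns the model text as a single string. Setting `"stream": False` means the reply arrives as one JSON object, so no terminal rendering or line wrapping sits between the model and the saved patch.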
Example OpenAI-compatible agent: Example OpenAI-compatible agent:
```yaml ```yaml
@ -212,7 +215,7 @@ agents:
system_prompt: agents/implementer.md system_prompt: agents/implementer.md
``` ```
NightShift passes prompt bundles to agents and persists stdout, stderr, exit code, duration, and prompt artifacts. Code writer agents should return unified diffs. NightShift passes prompt bundles to agents and persists stdout, stderr, exit code, duration, and prompt artifacts. Code writer agents should return unified diffs. On retries, patch artifacts are versioned by attempt, for example `repair-1.patch`, `normalized-1.patch`, and `patch-validation-1.md`.
Review agents should emit: Review agents should emit:
@ -274,10 +277,15 @@ A run creates human-readable artifacts:
context-pack.md context-pack.md
plan.md plan.md
proposed.patch proposed.patch
repair-1.patch
normalized.patch normalized.patch
normalized-1.patch
patch-validation.md patch-validation.md
patch-validation-1.md
applied.patch applied.patch
applied-1.patch
patch-apply-output.txt patch-apply-output.txt
patch-apply-output-1.txt
test-output.txt test-output.txt
review.md review.md
stage-results.md stage-results.md


@ -27,7 +27,8 @@ NightShift config is YAML.
Supported backends: Supported backends:
- `command`: runs a local command with the prompt on stdin. - `command`: runs a local command with the prompt on stdin.
- `ollama`: runs `ollama run <model>` with the prompt on stdin. - `ollama`: calls the local Ollama HTTP API at `http://localhost:11434/api/generate` by default.
- `openai_compatible`: calls a Chat Completions-compatible HTTP API.
Command agent: Command agent:
@ -44,6 +45,7 @@ Ollama agent:
planner: planner:
backend: ollama backend: ollama
model: qwen2.5-coder:14b model: qwen2.5-coder:14b
base_url: http://localhost:11434
system_prompt: agents/planner.md system_prompt: agents/planner.md
``` ```


@ -865,7 +865,7 @@ NightShift currently provides:
* Command, agent, agent-review, review, summarize, repo-context, code-writer, patch-normalizer, patch-validator, and patch-apply stage handling * Command, agent, agent-review, review, summarize, repo-context, code-writer, patch-normalizer, patch-validator, and patch-apply stage handling
* Retry redirection with a configured task retry limit * Retry redirection with a configured task retry limit
* Command-backed agents * Command-backed agents
* Ollama-backed local model agents * Ollama-backed local model agents through the local HTTP API
* OpenAI-compatible local/server model agents * OpenAI-compatible local/server model agents
* Per-agent temperature settings * Per-agent temperature settings
* Scoped repo lookup tools: `list_files`, `read_file`, and `grep` * Scoped repo lookup tools: `list_files`, `read_file`, and `grep`
@ -874,6 +874,7 @@ NightShift currently provides:
* Context pack generation * Context pack generation
* Unified diff code-writing contract * Unified diff code-writing contract
* Patch normalization, validation, dry-run, and apply modes * Patch normalization, validation, dry-run, and apply modes
* Per-attempt retry patch artifacts such as `repair-1.patch`, `normalized-1.patch`, and `patch-validation-1.md`
* Test/static failure repair loops via bounded stage retries * Test/static failure repair loops via bounded stage retries
* Prompt bundle construction with project, task, retry, and previous-stage context * Prompt bundle construction with project, task, retry, and previous-stage context
* Prompt snapshots and run metadata for experiment comparison * Prompt snapshots and run metadata for experiment comparison
@ -1014,13 +1015,13 @@ The next important additions are:
Move max files, max lines, forbidden paths, allowed file types, binary rejection, and protected files into a reusable project-level write policy. Move max files, max lines, forbidden paths, allowed file types, binary rejection, and protected files into a reusable project-level write policy.
5. Better model backend support 5. Better model backend support
Expand OpenAI-compatible behavior, add request metadata artifacts, support response format hints, and document local server patterns. Prefer non-terminal APIs for machine-readable model output. In particular, avoid relying on interactive CLI streaming paths such as `ollama run` when exact patch text matters; use the Ollama HTTP API or OpenAI-compatible endpoint so terminal rendering, spinners, and line-wrapping behavior cannot corrupt artifacts. Expand OpenAI-compatible behavior, add request metadata artifacts, support response format hints, and document local server patterns. Machine-readable Ollama output now uses the HTTP API instead of the interactive `ollama run` terminal path; keep this non-terminal capture policy for future model backends where exact patch text matters.
6. Deterministic diff generation 6. Deterministic diff generation
Reduce direct reliance on models emitting perfect unified diffs. Add a workflow where the model returns complete file contents or a structured edit description, then NightShift writes the unified diff deterministically from before/after file snapshots. Keep the existing unified-diff contract for advanced agents, but make deterministic diff generation the preferred path for smaller local models. Reduce direct reliance on models emitting perfect unified diffs. Add a workflow where the model returns complete file contents or a structured edit description, then NightShift writes the unified diff deterministically from before/after file snapshots. Keep the existing unified-diff contract for advanced agents, but make deterministic diff generation the preferred path for smaller local models.
7. Retry artifact versioning 7. Retry artifact versioning
Preserve per-attempt artifacts instead of overwriting fixed filenames such as `proposed.patch`, `normalized.patch`, and `patch-validation.md`. Retry artifacts should include attempt numbers, while summary artifacts can point to the latest attempt. This makes repeated validation and repair failures diagnosable. Continue improving per-attempt artifact preservation. Patch retries now preserve files such as `repair-1.patch`, `normalized-1.patch`, and `patch-validation-1.md`; future work should add richer latest-attempt indexes and dashboard navigation.
8. Patch repair stage 8. Patch repair stage
Add an explicit patch repair or strict normalizer stage that receives the invalid patch, validation error, and relevant source excerpts, then returns a complete replacement patch. This stage should remain bounded by strict validation and should not silently guess intent for arbitrary malformed hunks. Add an explicit patch repair or strict normalizer stage that receives the invalid patch, validation error, and relevant source excerpts, then returns a complete replacement patch. This stage should remain bounded by strict validation and should not silently guess intent for arbitrary malformed hunks.
@ -1042,7 +1043,7 @@ The next important additions are:
Implementation note: Implementation note:
Recent local-model patch experiments exposed repeated line-fragment artifacts where long generated lines were split and the tail was duplicated on the following line. This affected prose and unified diffs, producing malformed hunk lines that strict validation correctly rejected. Treat this as a backend/output-capture and patch-contract problem before adding editor or linter agents: remove terminal streaming from model capture, preserve retry artifacts, and prefer deterministic diff generation when exact syntax matters. Recent local-model patch experiments exposed repeated line-fragment artifacts where long generated lines were split and the tail was duplicated on the following line. This affected prose and unified diffs, producing malformed hunk lines that strict validation correctly rejected. Treat this as a backend/output-capture and patch-contract problem before adding editor or linter agents: avoid terminal streaming for machine output, preserve retry artifacts, and prefer deterministic diff generation when exact syntax matters.
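To make the "deterministic diff generation" idea concrete, here is a minimal sketch that derives a unified diff from before/after file contents with Python's standard `difflib`, instead of asking the model to emit the diff itself. The function name `unified_diff_for` and the `diff --git` header handling are assumptions for illustration; this is not NightShift's implementation.

```python
import difflib


def unified_diff_for(path: str, before: str, after: str) -> str:
    """Build a git-style unified diff for one existing file from two snapshots.

    Assumes both snapshots end with a trailing newline; difflib does not emit
    the "\\ No newline at end of file" marker that git expects otherwise.
    """
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    body = "".join(lines)
    if not body:
        return ""  # identical snapshots produce no patch
    return f"diff --git a/{path} b/{path}\n{body}"


print(unified_diff_for("app.py", "old\n", "new\n"), end="")
# prints:
# diff --git a/app.py b/app.py
# --- a/app.py
# +++ b/app.py
# @@ -1 +1 @@
# -old
# +new
```

Because the hunk headers and line prefixes are computed rather than generated, a model that returns complete file contents cannot produce the malformed hunk lines described above. The resulting patch would still go through validation and `git apply --check`.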
--- ---
# Appendix A: Design Decisions and Rationale # Appendix A: Design Decisions and Rationale


@ -18,7 +18,7 @@ If `require_clean_worktree: true`, NightShift blocks dirty repositories before c
## Ollama backend fails ## Ollama backend fails
The `ollama` backend requires the `ollama` executable to be installed and the configured model to be available. Tests do not require Ollama. The `ollama` backend uses Ollama's local HTTP API, normally at `http://localhost:11434/api/generate`. Confirm Ollama is running and the configured model is available with `ollama list` or `ollama pull <model>`. Tests do not require Ollama.
## Flask dashboard fails ## Flask dashboard fails


@ -29,10 +29,10 @@ Install and start Ollama, then make sure the model is available:
```bash ```bash
ollama pull qwen2.5-coder:14b ollama pull qwen2.5-coder:14b
ollama run qwen2.5-coder:14b ollama list
``` ```
Stop the interactive `ollama run` session after confirming the model responds. NightShift will invoke Ollama itself. Keep Ollama running. NightShift uses Ollama's local HTTP API, normally at `http://localhost:11434`, rather than the interactive `ollama run` terminal path.
## 1. Create a Scratch Target Project ## 1. Create a Scratch Target Project
@ -189,6 +189,8 @@ Inspect these artifacts:
.nightshift/runs/<run-id>/tasks/TASK-001/final-notes.md .nightshift/runs/<run-id>/tasks/TASK-001/final-notes.md
``` ```
If a later stage routes back to `implement`, retry artifacts are written with attempt suffixes such as `repair-1.patch`, `normalized-1.patch`, `patch-validation-1.md`, `applied-1.patch`, and `patch-apply-output-1.txt`.
In dry-run mode, the patch should be validated and checked with `git apply --check`, but files should not change. In dry-run mode, the patch should be validated and checked with `git apply --check`, but files should not change.
## 5. Apply The Patch ## 5. Apply The Patch
@ -215,6 +217,7 @@ If the model generates a valid patch, NightShift will:
- apply the patch with `git apply` - apply the patch with `git apply`
- run `python -m unittest discover -v` - run `python -m unittest discover -v`
- retry through the implementer if the test stage fails and `max_task_retries` allows it - retry through the implementer if the test stage fails and `max_task_retries` allows it
- preserve per-attempt retry patch artifacts with numeric suffixes
- mark the task complete only if the pipeline completes - mark the task complete only if the pipeline completes
## 6. Monitor From The Web Dashboard ## 6. Monitor From The Web Dashboard
@ -277,13 +280,13 @@ Once you trust the workflow, consider setting `require_clean_worktree: true` in
## Troubleshooting ## Troubleshooting
If Ollama is not found: If Ollama is unavailable:
```text ```text
Agent exited with code 127 Agent exited with code 1
``` ```
Confirm `ollama` is installed and available on `PATH`. Confirm Ollama is running at the configured `base_url` and the model appears in `ollama list`.
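A quick way to check both conditions from Python is to query Ollama's model listing endpoint. The sketch below assumes the `GET /api/tags` endpoint of the Ollama HTTP API, which returns a JSON object with a `models` list; the helper names (`fetch_tags`, `model_names`, `check_ollama`) are hypothetical and not part of NightShift.

```python
import json
from urllib import request


def fetch_tags(base_url: str, timeout: float = 5.0) -> str:
    """Fetch the raw JSON model listing from Ollama (assumed endpoint: /api/tags)."""
    url = base_url.rstrip("/") + "/api/tags"
    with request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def model_names(tags_json: str) -> list[str]:
    """Extract model names from a /api/tags style payload."""
    data = json.loads(tags_json)
    if not isinstance(data, dict):
        return []
    return [m.get("name", "") for m in data.get("models", []) if isinstance(m, dict)]


def check_ollama(base_url: str, model: str, fetch=fetch_tags) -> str:
    """Return "ok", or a short description of what is wrong."""
    try:
        names = model_names(fetch(base_url))
    except (OSError, ValueError) as exc:
        # URLError is an OSError subclass; JSONDecodeError is a ValueError.
        return f"unreachable or invalid response: {exc}"
    if model in names:
        return "ok"
    return f"model not listed: {model}"
```

Calling `check_ollama("http://localhost:11434", "qwen2.5-coder:14b")` against a running server should return `"ok"` once the model has been pulled. The `fetch` parameter exists only so the check can be exercised without a live server.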
If the model returns prose instead of a patch, tighten `agents/implementer.md`. The implementation stage requires a unified diff. If the model returns prose instead of a patch, tighten `agents/implementer.md`. The implementation stage requires a unified diff.
@ -291,8 +294,11 @@ If patch validation fails, inspect:
```text ```text
patch-validation.md patch-validation.md
patch-validation-1.md
normalized.patch normalized.patch
normalized-1.patch
proposed.patch proposed.patch
repair-1.patch
``` ```
If patch apply fails, inspect: If patch apply fails, inspect:


@ -8,7 +8,6 @@ import os
from pathlib import Path from pathlib import Path
import re import re
import subprocess import subprocess
import tempfile
import time import time
from urllib import request from urllib import request
from urllib.error import URLError from urllib.error import URLError
@ -23,7 +22,6 @@ from .tasks import Task
DEFAULT_AGENT_TIMEOUT_SECONDS = 600 DEFAULT_AGENT_TIMEOUT_SECONDS = 600
OLLAMA_HEARTBEAT_SECONDS = 30.0
@dataclass(frozen=True) @dataclass(frozen=True)
@ -222,96 +220,58 @@ class AgentExecutor:
def _invoke_ollama(self, agent: AgentConfig, prompt: str) -> AgentInvocation: def _invoke_ollama(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
if not agent.model: if not agent.model:
raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.") raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.")
command = f"ollama run {agent.model}" base_url = (agent.base_url or "http://localhost:11434").rstrip("/")
prompt_input = prompt url = base_url + "/api/generate"
command = f"POST {url}"
body: dict[str, object] = {
"model": agent.model,
"prompt": prompt,
"stream": False,
}
if agent.temperature is not None: if agent.temperature is not None:
prompt_input = f"/set parameter temperature {agent.temperature}\n{prompt}" body["options"] = {"temperature": agent.temperature}
headers = {"Content-Type": "application/json"}
started = time.monotonic() started = time.monotonic()
self.logger.event( self.logger.event(
"ollama.start", "ollama.start",
"Starting Ollama model invocation", "Starting Ollama HTTP model invocation",
agent_id=agent.id, agent_id=agent.id,
model=agent.model, model=agent.model,
timeout_seconds=self.timeout_seconds, timeout_seconds=self.timeout_seconds,
) )
try: try:
with tempfile.TemporaryFile("w+", encoding="utf-8", errors="replace") as stdout_file: payload = json.dumps(body).encode("utf-8")
with tempfile.TemporaryFile("w+", encoding="utf-8", errors="replace") as stderr_file: req = request.Request(url, data=payload, headers=headers, method="POST")
process = subprocess.Popen( with request.urlopen(req, timeout=self.timeout_seconds) as response:
["ollama", "run", agent.model], raw = response.read().decode("utf-8", errors="replace")
cwd=self.project_root,
stdin=subprocess.PIPE,
stdout=stdout_file,
stderr=stderr_file,
text=True,
encoding="utf-8",
errors="replace",
)
assert process.stdin is not None
process.stdin.write(prompt_input)
process.stdin.close()
last_heartbeat = started
timed_out = False
while process.poll() is None:
now = time.monotonic()
elapsed = now - started
if elapsed > self.timeout_seconds:
process.kill()
timed_out = True
break
if now - last_heartbeat >= OLLAMA_HEARTBEAT_SECONDS:
self.logger.event(
"ollama.wait",
"Ollama invocation still running",
agent_id=agent.id,
model=agent.model,
elapsed=f"{elapsed:.0f}s",
)
last_heartbeat = now
time.sleep(1.0)
process.wait()
duration = time.monotonic() - started duration = time.monotonic() - started
stdout_file.seek(0)
stderr_file.seek(0)
stdout = stdout_file.read()
stderr = stderr_file.read()
if timed_out:
return AgentInvocation( return AgentInvocation(
agent_id=agent.id, agent_id=agent.id,
command=command, command=command,
prompt=prompt_input, prompt=prompt,
exit_code=0,
stdout=_extract_ollama_response(raw),
stderr="",
duration_seconds=duration,
)
except TimeoutError:
duration = time.monotonic() - started
return AgentInvocation(
agent_id=agent.id,
command=command,
prompt=prompt,
exit_code=-1, exit_code=-1,
stdout=stdout, stdout="",
stderr=stderr, stderr="Request timed out.",
duration_seconds=duration, duration_seconds=duration,
timed_out=True, timed_out=True,
) )
return AgentInvocation( except (OSError, URLError) as exc:
agent_id=agent.id,
command=command,
prompt=prompt_input,
exit_code=process.returncode or 0,
stdout=stdout,
stderr=stderr,
duration_seconds=duration,
)
except FileNotFoundError as exc:
duration = time.monotonic() - started duration = time.monotonic() - started
return AgentInvocation( return AgentInvocation(
agent_id=agent.id, agent_id=agent.id,
command=command, command=command,
prompt=prompt_input, prompt=prompt,
exit_code=127,
stdout="",
stderr=str(exc),
duration_seconds=duration,
)
except OSError as exc:
duration = time.monotonic() - started
return AgentInvocation(
agent_id=agent.id,
command=command,
prompt=prompt_input,
exit_code=1, exit_code=1,
stdout="", stdout="",
stderr=str(exc), stderr=str(exc),
@ -461,11 +421,22 @@ def _extract_openai_content(raw: str) -> str:
return raw return raw
def _extract_ollama_response(raw: str) -> str:
try:
data = json.loads(raw)
response = data.get("response")
if isinstance(response, str):
return response
except (json.JSONDecodeError, AttributeError):
pass
return raw
def output_contract_for(stage: StageConfig) -> str: def output_contract_for(stage: StageConfig) -> str:
if stage.type == "code_writer": if stage.type == "code_writer":
return "\n".join( return "\n".join(
[ [
"Return a unified diff only, suitable for saving as proposed.patch.", "Return a unified diff only, suitable for saving as proposed.patch or repair-N.patch.",
"Do not include prose outside the patch.", "Do not include prose outside the patch.",
"Use diff --git headers and hunk headers.", "Use diff --git headers and hunk headers.",
"For existing files, do not use new file mode or /dev/null headers.", "For existing files, do not use new file mode or /dev/null headers.",


@ -370,11 +370,11 @@ class PipelineRunner:
if stage.type == "code_writer": if stage.type == "code_writer":
return self._run_code_writer_stage(stage, task, previous_outputs, retry_notes, retry_count) return self._run_code_writer_stage(stage, task, previous_outputs, retry_notes, retry_count)
if stage.type == "patch_normalizer": if stage.type == "patch_normalizer":
return self._run_patch_normalizer_stage(stage, task, previous_outputs, retry_notes) return self._run_patch_normalizer_stage(stage, task, previous_outputs, retry_notes, retry_count)
if stage.type == "patch_validator": if stage.type == "patch_validator":
return self._run_patch_validator_stage(stage, task, previous_outputs) return self._run_patch_validator_stage(stage, task, previous_outputs, retry_count)
if stage.type == "patch_apply": if stage.type == "patch_apply":
return self._run_patch_apply_stage(stage, task, previous_outputs) return self._run_patch_apply_stage(stage, task, previous_outputs, retry_count)
if stage.type == "repo_context": if stage.type == "repo_context":
output_path = self.artifacts.write_stage_output( output_path = self.artifacts.write_stage_output(
task.id, task.id,
@ -488,7 +488,7 @@ class PipelineRunner:
f"# Implementation Summary\n\nStatus: fail\nReason: {exc}\n", f"# Implementation Summary\n\nStatus: fail\nReason: {exc}\n",
) )
return StageResult(stage.id, "fail", str(exc), output_path=result.output_path) return StageResult(stage.id, "fail", str(exc), output_path=result.output_path)
patch_filename = stage.output or ("proposed.patch" if retry_count == 0 else f"repair-{retry_count}.patch") patch_filename = "repair-{0}.patch".format(retry_count) if retry_count else (stage.output or "proposed.patch")
summary_filename = "implementation-summary.md" if retry_count == 0 else f"repair-summary-{retry_count}.md" summary_filename = "implementation-summary.md" if retry_count == 0 else f"repair-summary-{retry_count}.md"
proposed_path = self.artifacts.write_stage_output(task.id, patch_filename, patch) proposed_path = self.artifacts.write_stage_output(task.id, patch_filename, patch)
summary_path = self.artifacts.write_stage_output( summary_path = self.artifacts.write_stage_output(
@ -522,6 +522,7 @@ class PipelineRunner:
task: Task, task: Task,
previous_outputs: dict[str, str], previous_outputs: dict[str, str],
retry_notes: list[str], retry_notes: list[str],
retry_count: int = 0,
) -> StageResult: ) -> StageResult:
source = _latest_patch_like_output(previous_outputs) source = _latest_patch_like_output(previous_outputs)
if stage.agent is not None: if stage.agent is not None:
@ -539,7 +540,11 @@ class PipelineRunner:
patch = normalize_patch_text(source) patch = normalize_patch_text(source)
except PipelineError as exc: except PipelineError as exc:
return StageResult(stage.id, "fail", str(exc)) return StageResult(stage.id, "fail", str(exc))
output_path = self.artifacts.write_stage_output(task.id, stage.output or "normalized.patch", patch) output_path = self.artifacts.write_stage_output(
task.id,
_attempt_filename(stage.output or "normalized.patch", retry_count),
patch,
)
self.logger.event( self.logger.event(
"artifact.write", "artifact.write",
"Wrote normalized patch", "Wrote normalized patch",
@ -559,7 +564,9 @@ class PipelineRunner:
stage: StageConfig, stage: StageConfig,
task: Task, task: Task,
previous_outputs: dict[str, str], previous_outputs: dict[str, str],
retry_count: int = 0,
) -> StageResult: ) -> StageResult:
output_filename = _attempt_filename(stage.output or "patch-validation.md", retry_count)
source = _latest_patch_like_output(previous_outputs) source = _latest_patch_like_output(previous_outputs)
try: try:
patch = normalize_patch_text(source) patch = normalize_patch_text(source)
@ -574,7 +581,7 @@ class PipelineRunner:
except PipelineError as exc: except PipelineError as exc:
output_path = self.artifacts.write_stage_output( output_path = self.artifacts.write_stage_output(
task.id, task.id,
stage.output or "patch-validation.md", output_filename,
f"# Patch Validation\n\nStatus: fail\nReason: {exc}\n", f"# Patch Validation\n\nStatus: fail\nReason: {exc}\n",
) )
return StageResult( return StageResult(
@ -585,7 +592,7 @@ class PipelineRunner:
) )
output_path = self.artifacts.write_stage_output( output_path = self.artifacts.write_stage_output(
task.id, task.id,
stage.output or "patch-validation.md", output_filename,
format_validation_result(result), format_validation_result(result),
) )
return StageResult( return StageResult(
@ -600,7 +607,9 @@ class PipelineRunner:
stage: StageConfig, stage: StageConfig,
task: Task, task: Task,
previous_outputs: dict[str, str], previous_outputs: dict[str, str],
retry_count: int = 0,
) -> StageResult: ) -> StageResult:
output_filename = _attempt_filename(stage.output or "patch-apply-output.txt", retry_count)
source = _latest_patch_like_output(previous_outputs) source = _latest_patch_like_output(previous_outputs)
try: try:
patch = normalize_patch_text(source) patch = normalize_patch_text(source)
@ -615,7 +624,7 @@ class PipelineRunner:
except PipelineError as exc: except PipelineError as exc:
output_path = self.artifacts.write_stage_output( output_path = self.artifacts.write_stage_output(
task.id, task.id,
stage.output or "patch-apply-output.txt", output_filename,
f"# Patch Apply\n\nStatus: fail\nReason: {exc}\n", f"# Patch Apply\n\nStatus: fail\nReason: {exc}\n",
) )
return StageResult( return StageResult(
@ -625,14 +634,18 @@ class PipelineRunner:
output_path=str(output_path.relative_to(self.config.project.root)), output_path=str(output_path.relative_to(self.config.project.root)),
) )
applied_path = self.artifacts.write_stage_output(task.id, "applied.patch", patch) applied_path = self.artifacts.write_stage_output(
task.id,
_attempt_filename("applied.patch", retry_count),
patch,
)
write_git_artifacts(self.artifacts, task.id, "before-patch-apply") write_git_artifacts(self.artifacts, task.id, "before-patch-apply")
mode = stage.mode or "dry_run" mode = stage.mode or "dry_run"
apply_result = apply_patch_with_git(applied_path, self.config.project.root, mode=mode) apply_result = apply_patch_with_git(applied_path, self.config.project.root, mode=mode)
write_git_artifacts(self.artifacts, task.id, "after-patch-apply") write_git_artifacts(self.artifacts, task.id, "after-patch-apply")
output_path = self.artifacts.write_stage_output( output_path = self.artifacts.write_stage_output(
task.id, task.id,
stage.output or "patch-apply-output.txt", output_filename,
format_patch_apply_result( format_patch_apply_result(
apply_result, apply_result,
applied_path.relative_to(self.config.project.root).as_posix(), applied_path.relative_to(self.config.project.root).as_posix(),
@ -839,6 +852,19 @@ def _latest_patch_like_output(previous_outputs: dict[str, str]) -> str:
raise PipelineError("Patch error: no previous patch output found.") raise PipelineError("Patch error: no previous patch output found.")
def _attempt_filename(filename: str, retry_count: int) -> str:
if retry_count <= 0:
return filename
path = Path(filename)
suffix = "".join(path.suffixes)
if suffix:
stem = path.name[: -len(suffix)]
name = f"{stem}-{retry_count}{suffix}"
else:
name = f"{path.name}-{retry_count}"
return path.with_name(name).as_posix()
def format_aggregate_run_summary(results: list[PipelineResult], status: str, reason: str) -> str: def format_aggregate_run_summary(results: list[PipelineResult], status: str, reason: str) -> str:
lines = [ lines = [
"# Run Summary", "# Run Summary",


@ -1,5 +1,4 @@
from pathlib import Path from pathlib import Path
import io
import tempfile import tempfile
import unittest import unittest
from unittest.mock import MagicMock, patch from unittest.mock import MagicMock, patch
@ -119,27 +118,20 @@ class AgentExecutorTests(unittest.TestCase):
task = parse_tasks(TASK_MD)[0] task = parse_tasks(TASK_MD)[0]
stage = StageConfig(id="plan", type="agent", agent="planner", output="plan.md") stage = StageConfig(id="plan", type="agent", agent="planner", output="plan.md")
class FakePopen: response = MagicMock()
def __init__(self, args, cwd=None, stdin=None, stdout=None, stderr=None, **kwargs): response.__enter__.return_value.read.return_value = b'{"response":"ollama output"}'
self.args = args
self.stdin = io.StringIO()
self.returncode = 0
stdout.write("ollama output")
def poll(self): with patch("nightshift.agents.request.urlopen", return_value=response) as urlopen:
return self.returncode
def wait(self):
return self.returncode
with patch("nightshift.agents.subprocess.Popen", side_effect=FakePopen) as popen:
result = executor.run_stage(stage, task) result = executor.run_stage(stage, task)
self.assertEqual(result.status, "pass") self.assertEqual(result.status, "pass")
popen.assert_called_once() request_obj = urlopen.call_args.args[0]
self.assertEqual(popen.call_args.args[0], ["ollama", "run", "tiny-model"]) body = request_obj.data.decode("utf-8")
self.assertIn('"model": "tiny-model"', body)
self.assertIn('"stream": false', body)
output = (root / result.output_path).read_text(encoding="utf-8") output = (root / result.output_path).read_text(encoding="utf-8")
self.assertIn("ollama run tiny-model", output) self.assertIn("POST http://localhost:11434/api/generate", output)
self.assertIn("ollama output", output)
def test_openai_compatible_agent_sends_temperature(self) -> None: def test_openai_compatible_agent_sends_temperature(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:


@ -484,7 +484,7 @@ Acceptance Criteria:
encoding="utf-8", encoding="utf-8",
) )
stages = ( stages = (
StageConfig(id="write", type="code_writer", agent="writer"), StageConfig(id="write", type="code_writer", agent="writer", output="proposed.patch"),
StageConfig(id="normalize", type="patch_normalizer"), StageConfig(id="normalize", type="patch_normalizer"),
StageConfig(id="validate", type="patch_validator", on_fail="write"), StageConfig(id="validate", type="patch_validator", on_fail="write"),
) )
@ -506,6 +506,10 @@ Acceptance Criteria:
any("creates existing file" in stage.reason for stage in result.stage_results) any("creates existing file" in stage.reason for stage in result.stage_results)
) )
self.assertTrue((task_dir / "repair-1.patch").exists()) self.assertTrue((task_dir / "repair-1.patch").exists())
self.assertTrue((task_dir / "normalized.patch").exists())
self.assertTrue((task_dir / "normalized-1.patch").exists())
self.assertTrue((task_dir / "patch-validation.md").exists())
self.assertTrue((task_dir / "patch-validation-1.md").exists())
def test_patch_apply_stage_applies_patch(self) -> None: def test_patch_apply_stage_applies_patch(self) -> None:
with tempfile.TemporaryDirectory() as directory: with tempfile.TemporaryDirectory() as directory:
@ -615,6 +619,10 @@ Acceptance Criteria:
self.assertEqual((root / "app.py").read_text(encoding="utf-8"), "new\n") self.assertEqual((root / "app.py").read_text(encoding="utf-8"), "new\n")
self.assertTrue((task_dir / "repair-1.patch").exists()) self.assertTrue((task_dir / "repair-1.patch").exists())
self.assertTrue((task_dir / "repair-summary-1.md").exists()) self.assertTrue((task_dir / "repair-summary-1.md").exists())
self.assertTrue((task_dir / "normalized-1.patch").exists())
self.assertTrue((task_dir / "patch-validation-1.md").exists())
self.assertTrue((task_dir / "applied-1.patch").exists())
self.assertTrue((task_dir / "patch-apply-output-1.txt").exists())
def _write_common_files(root: Path) -> None: def _write_common_files(root: Path) -> None: