Mirror of https://github.com/khodges42/nightShift.git (synced 2026-06-14 18:18:36 +00:00)
Repo Lookup, Request Context, Planner, Context Stage, QoL improvements
- Added operational run logging via nightshift/runlog.py.
- CLI now streams progress during run / run --all.
- Runs write .nightshift/runs/<run-id>/run.log and aggregate .nightshift/nightshift.log.
- Web dashboard now shows the last 100 run log lines.
- Added agent temperature config.
- Added minimal openai_compatible backend and temperature passing for it.
- Added Ollama temperature handling.
- Added scoped repo lookup tools in nightshift/repo_tools.py: list_files, read_file, grep.
- Planner agents can request lookup context with lookup_requests; NightShift saves files-inspected.md and reruns the planner with retrieved context.
- Added repo_context stage type that writes context-pack.md.
- Marked phases 23-27 complete in docs/design.md:990.
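
The "last 100 run log lines" behaviour can be illustrated with a small tail helper. This is a minimal sketch, not the dashboard code from this commit: `tail_lines` and its `limit` parameter are hypothetical names, and it assumes the log is a UTF-8 text file.

```python
from collections import deque
from pathlib import Path
from typing import List


def tail_lines(path: Path, limit: int = 100) -> List[str]:
    """Return at most `limit` trailing lines of a text log (hypothetical helper)."""
    # A missing log file is treated as an empty tail rather than an error.
    if not path.is_file():
        return []
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        # deque(maxlen=...) keeps only the newest lines while streaming the file,
        # so the whole log never has to be held in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=limit)]
```

Streaming through a bounded deque keeps memory use flat even when the full per-run log grows large, which fits the stated goal of keeping the complete file as an artifact while showing only its tail.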
parent 86aa7dd13c
commit 646c655314
|
|
@ -989,18 +989,18 @@ NightShift should make active runs easier to observe from both the CLI and the w
|
||||||
|
|
||||||
Implementation tasks:
|
Implementation tasks:
|
||||||
|
|
||||||
* [ ] Add a small logging module with structured operational events.
|
* [x] Add a small logging module with structured operational events.
|
||||||
* [ ] Stream human-readable progress to the CLI during `run` and `run --all`.
|
* [x] Stream human-readable progress to the CLI during `run` and `run --all`.
|
||||||
* [ ] Include run id, task id, stage id, agent/backend, command index, retry count, status, duration, and artifact path where available.
|
* [x] Include run id, task id, stage id, agent/backend, command index, retry count, status, duration, and artifact path where available.
|
||||||
* [ ] Write a per-run log file such as `.nightshift/runs/<run-id>/run.log`.
|
* [x] Write a per-run log file such as `.nightshift/runs/<run-id>/run.log`.
|
||||||
* [ ] Optionally write or rotate an aggregate `.nightshift/nightshift.log` for cross-run troubleshooting.
|
* [x] Optionally write or rotate an aggregate `.nightshift/nightshift.log` for cross-run troubleshooting.
|
||||||
* [ ] Keep logs operational; do not duplicate full prompts, full model responses, or full command output that already lives in artifacts.
|
* [x] Keep logs operational; do not duplicate full prompts, full model responses, or full command output that already lives in artifacts.
|
||||||
* [ ] Redact or avoid secrets from logged environment/config values.
|
* [x] Redact or avoid secrets from logged environment/config values.
|
||||||
* [ ] Add dashboard support for viewing the latest log tail.
|
* [x] Add dashboard support for viewing the latest log tail.
|
||||||
* [ ] Cap the dashboard log view to the last 100 lines by default.
|
* [x] Cap the dashboard log view to the last 100 lines by default.
|
||||||
* [ ] Keep the full per-run log file available as an artifact unless a later size cap is configured.
|
* [x] Keep the full per-run log file available as an artifact unless a later size cap is configured.
|
||||||
* [ ] Auto-refresh the dashboard log view with the existing dashboard refresh model.
|
* [x] Auto-refresh the dashboard log view with the existing dashboard refresh model.
|
||||||
* [ ] Add tests for log writing, CLI progress hooks, dashboard log rendering, missing log files, and the 100-line cap.
|
* [x] Add tests for log writing, CLI progress hooks, dashboard log rendering, missing log files, and the 100-line cap.
|
||||||
|
|
||||||
Acceptance Criteria:
|
Acceptance Criteria:
|
||||||
|
|
||||||
|
|
@ -1019,35 +1019,35 @@ Notes:
|
||||||
|
|
||||||
## Phase 24: Per-Agent Model Parameters
|
## Phase 24: Per-Agent Model Parameters
|
||||||
|
|
||||||
- [ ] Add `temperature` to agent config.
|
- [x] Add `temperature` to agent config.
|
||||||
- [ ] Pass temperature to Ollama/OpenAI-compatible backends.
|
- [x] Pass temperature to Ollama/OpenAI-compatible backends.
|
||||||
- [ ] Default safely if omitted.
|
- [x] Default safely if omitted.
|
||||||
- [ ] Add config validation tests.
|
- [x] Add config validation tests.
|
||||||
|
|
||||||
## Phase 25: Repo Lookup Tools MVP
|
## Phase 25: Repo Lookup Tools MVP
|
||||||
|
|
||||||
- [ ] Add tool interface for repo operations.
|
- [x] Add tool interface for repo operations.
|
||||||
- [ ] Implement scoped `list_files`.
|
- [x] Implement scoped `list_files`.
|
||||||
- [ ] Implement scoped `read_file`.
|
- [x] Implement scoped `read_file`.
|
||||||
- [ ] Implement scoped `grep`.
|
- [x] Implement scoped `grep`.
|
||||||
- [ ] Enforce existing path safety rules.
|
- [x] Enforce existing path safety rules.
|
||||||
- [ ] Log tool calls as artifacts.
|
- [x] Log tool calls as artifacts.
|
||||||
|
|
||||||
## Phase 26: Planner Code-Discovery Support
|
## Phase 26: Planner Code-Discovery Support
|
||||||
|
|
||||||
- [ ] Teach planner prompt to request needed code context.
|
- [x] Teach planner prompt to request needed code context.
|
||||||
- [ ] Add structured planner output for lookup requests.
|
- [x] Add structured planner output for lookup requests.
|
||||||
- [ ] Execute requested lookup tools.
|
- [x] Execute requested lookup tools.
|
||||||
- [ ] Save `files-inspected.md`.
|
- [x] Save `files-inspected.md`.
|
||||||
- [ ] Re-run planner with retrieved context.
|
- [x] Re-run planner with retrieved context.
|
||||||
|
|
||||||
## Phase 27: Context Pack Builder
|
## Phase 27: Context Pack Builder
|
||||||
|
|
||||||
- [ ] Add `repo_context` stage.
|
- [x] Add `repo_context` stage.
|
||||||
- [ ] Generate `context-pack.md`.
|
- [x] Generate `context-pack.md`.
|
||||||
- [ ] Include task, acceptance criteria, relevant files, snippets, and constraints.
|
- [x] Include task, acceptance criteria, relevant files, snippets, and constraints.
|
||||||
- [ ] Add line-numbered excerpts.
|
- [x] Add line-numbered excerpts.
|
||||||
- [ ] Add context-size caps.
|
- [x] Add context-size caps.
|
||||||
|
|
||||||
## Phase 28: Project Context Chart MVP
|
## Phase 28: Project Context Chart MVP
|
||||||
|
|
||||||
|
|
|
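
The Phase 27 items "Add line-numbered excerpts" and "Add context-size caps" can be pictured with a short sketch. It is illustrative only: `numbered_excerpt`, the `max_chars` budget, and the truncation marker are assumptions, not the format that `context-pack.md` uses in this commit.

```python
from typing import List


def numbered_excerpt(text: str, start: int, end: int, max_chars: int = 2000) -> str:
    """Render lines start..end (1-based, inclusive) with line numbers, under a size cap."""
    lines = text.splitlines()
    rendered: List[str] = []
    used = 0
    for number in range(max(start, 1), min(end, len(lines)) + 1):
        row = f"{number:>5} | {lines[number - 1]}"
        # Count the newline that joins rows so the cap reflects the final size.
        if used + len(row) + 1 > max_chars:
            rendered.append("... [truncated at context-size cap]")
            break
        rendered.append(row)
        used += len(row) + 1
    return "\n".join(rendered)
```

Keeping the original line numbers in each row lets a later agent stage refer back to exact locations in the source file even when the excerpt is truncated.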
||||||
|
|
@ -3,13 +3,18 @@
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from dataclasses import dataclass
|
from dataclasses import dataclass
|
||||||
|
import json
|
||||||
|
import os
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
import subprocess
|
import subprocess
|
||||||
import time
|
import time
|
||||||
|
from urllib import request
|
||||||
|
from urllib.error import URLError
|
||||||
|
|
||||||
from .artifacts import ArtifactStore
|
from .artifacts import ArtifactStore
|
||||||
from .config import AgentConfig, StageConfig
|
from .config import AgentConfig, StageConfig
|
||||||
from .errors import AgentError, SafetyError
|
from .errors import AgentError, SafetyError
|
||||||
|
from .runlog import NullRunLogger, RunLogger
|
||||||
from .safety import resolve_inside_root, resolve_project_root
|
from .safety import resolve_inside_root, resolve_project_root
|
||||||
from .stages import StageResult, StageStatus
|
from .stages import StageResult, StageStatus
|
||||||
from .tasks import Task
|
from .tasks import Task
|
||||||
|
|
@ -43,11 +48,13 @@ class AgentExecutor:
|
||||||
agents: dict[str, AgentConfig],
|
agents: dict[str, AgentConfig],
|
||||||
artifacts: ArtifactStore,
|
artifacts: ArtifactStore,
|
||||||
timeout_seconds: int = DEFAULT_AGENT_TIMEOUT_SECONDS,
|
timeout_seconds: int = DEFAULT_AGENT_TIMEOUT_SECONDS,
|
||||||
|
logger: RunLogger | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
self.project_root = resolve_project_root(project_root)
|
self.project_root = resolve_project_root(project_root)
|
||||||
self.agents = agents
|
self.agents = agents
|
||||||
self.artifacts = artifacts
|
self.artifacts = artifacts
|
||||||
self.timeout_seconds = timeout_seconds
|
self.timeout_seconds = timeout_seconds
|
||||||
|
self.logger = logger or NullRunLogger()
|
||||||
|
|
||||||
def run_stage(
|
def run_stage(
|
||||||
self,
|
self,
|
||||||
|
|
@ -64,7 +71,7 @@ class AgentExecutor:
|
||||||
agent = self.agents.get(stage.agent)
|
agent = self.agents.get(stage.agent)
|
||||||
if agent is None:
|
if agent is None:
|
||||||
raise AgentError(f"Agent error: unknown agent '{stage.agent}' for stage '{stage.id}'.")
|
raise AgentError(f"Agent error: unknown agent '{stage.agent}' for stage '{stage.id}'.")
|
||||||
if agent.backend not in {"command", "ollama"}:
|
if agent.backend not in {"command", "ollama", "openai_compatible"}:
|
||||||
raise AgentError(
|
raise AgentError(
|
||||||
f"Agent error: agent '{agent.id}' uses unsupported backend '{agent.backend}'."
|
f"Agent error: agent '{agent.id}' uses unsupported backend '{agent.backend}'."
|
||||||
)
|
)
|
||||||
|
|
@ -72,6 +79,10 @@ class AgentExecutor:
|
||||||
raise AgentError(f"Agent error: command backend agent '{agent.id}' has no command.")
|
raise AgentError(f"Agent error: command backend agent '{agent.id}' has no command.")
|
||||||
if agent.backend == "ollama" and not agent.model:
|
if agent.backend == "ollama" and not agent.model:
|
||||||
raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.")
|
raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.")
|
||||||
|
if agent.backend == "openai_compatible" and not agent.model:
|
||||||
|
raise AgentError(f"Agent error: openai_compatible backend agent '{agent.id}' has no model.")
|
||||||
|
if agent.backend == "openai_compatible" and not agent.base_url:
|
||||||
|
raise AgentError(f"Agent error: openai_compatible backend agent '{agent.id}' has no base_url.")
|
||||||
|
|
||||||
system_prompt = self._read_system_prompt(agent)
|
system_prompt = self._read_system_prompt(agent)
|
||||||
prompt = build_prompt_bundle(
|
prompt = build_prompt_bundle(
|
||||||
|
|
@ -84,10 +95,37 @@ class AgentExecutor:
|
||||||
retry_notes=retry_notes or [],
|
retry_notes=retry_notes or [],
|
||||||
retry_context=retry_context,
|
retry_context=retry_context,
|
||||||
)
|
)
|
||||||
|
self.logger.event(
|
||||||
|
"agent.start",
|
||||||
|
"Starting agent",
|
||||||
|
stage_id=stage.id,
|
||||||
|
agent_id=agent.id,
|
||||||
|
backend=agent.backend,
|
||||||
|
model=agent.model,
|
||||||
|
temperature=agent.temperature,
|
||||||
|
)
|
||||||
invocation = self._invoke(agent, prompt)
|
invocation = self._invoke(agent, prompt)
|
||||||
|
self.logger.event(
|
||||||
|
"agent.finish",
|
||||||
|
"Finished agent",
|
||||||
|
stage_id=stage.id,
|
||||||
|
agent_id=agent.id,
|
||||||
|
backend=agent.backend,
|
||||||
|
exit_code=invocation.exit_code,
|
||||||
|
duration=f"{invocation.duration_seconds:.3f}s",
|
||||||
|
timed_out=str(invocation.timed_out).lower(),
|
||||||
|
)
|
||||||
output_filename = stage.output or f"{stage.id}.md"
|
output_filename = stage.output or f"{stage.id}.md"
|
||||||
output = format_agent_invocation(stage.id, invocation)
|
output = format_agent_invocation(stage.id, invocation)
|
||||||
output_path = self.artifacts.write_stage_output(task.id, output_filename, output)
|
output_path = self.artifacts.write_stage_output(task.id, output_filename, output)
|
||||||
|
self.logger.event(
|
||||||
|
"artifact.write",
|
||||||
|
"Wrote agent artifact",
|
||||||
|
stage_id=stage.id,
|
||||||
|
task_id=task.id,
|
||||||
|
agent_id=agent.id,
|
||||||
|
artifact_path=output_path.relative_to(self.project_root),
|
||||||
|
)
|
||||||
|
|
||||||
if invocation.timed_out:
|
if invocation.timed_out:
|
||||||
status: StageStatus = "fail"
|
status: StageStatus = "fail"
|
||||||
|
|
@ -135,6 +173,8 @@ class AgentExecutor:
|
||||||
def _invoke(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
|
def _invoke(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
|
||||||
if agent.backend == "ollama":
|
if agent.backend == "ollama":
|
||||||
return self._invoke_ollama(agent, prompt)
|
return self._invoke_ollama(agent, prompt)
|
||||||
|
if agent.backend == "openai_compatible":
|
||||||
|
return self._invoke_openai_compatible(agent, prompt)
|
||||||
return self._invoke_command(agent, prompt)
|
return self._invoke_command(agent, prompt)
|
||||||
|
|
||||||
def _invoke_command(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
|
def _invoke_command(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
|
||||||
|
|
@ -180,12 +220,15 @@ class AgentExecutor:
|
||||||
if not agent.model:
|
if not agent.model:
|
||||||
raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.")
|
raise AgentError(f"Agent error: ollama backend agent '{agent.id}' has no model.")
|
||||||
command = f"ollama run {agent.model}"
|
command = f"ollama run {agent.model}"
|
||||||
|
prompt_input = prompt
|
||||||
|
if agent.temperature is not None:
|
||||||
|
prompt_input = f"/set parameter temperature {agent.temperature}\n{prompt}"
|
||||||
started = time.monotonic()
|
started = time.monotonic()
|
||||||
try:
|
try:
|
||||||
completed = subprocess.run(
|
completed = subprocess.run(
|
||||||
["ollama", "run", agent.model],
|
["ollama", "run", agent.model],
|
||||||
cwd=self.project_root,
|
cwd=self.project_root,
|
||||||
input=prompt,
|
input=prompt_input,
|
||||||
capture_output=True,
|
capture_output=True,
|
||||||
text=True,
|
text=True,
|
||||||
encoding="utf-8",
|
encoding="utf-8",
|
||||||
|
|
@ -196,7 +239,7 @@ class AgentExecutor:
|
||||||
return AgentInvocation(
|
return AgentInvocation(
|
||||||
agent_id=agent.id,
|
agent_id=agent.id,
|
||||||
command=command,
|
command=command,
|
||||||
prompt=prompt,
|
prompt=prompt_input,
|
||||||
exit_code=completed.returncode,
|
exit_code=completed.returncode,
|
||||||
stdout=_coerce_output(completed.stdout),
|
stdout=_coerce_output(completed.stdout),
|
||||||
stderr=_coerce_output(completed.stderr),
|
stderr=_coerce_output(completed.stderr),
|
||||||
|
|
@ -207,7 +250,7 @@ class AgentExecutor:
|
||||||
return AgentInvocation(
|
return AgentInvocation(
|
||||||
agent_id=agent.id,
|
agent_id=agent.id,
|
||||||
command=command,
|
command=command,
|
||||||
prompt=prompt,
|
prompt=prompt_input,
|
||||||
exit_code=127,
|
exit_code=127,
|
||||||
stdout="",
|
stdout="",
|
||||||
stderr=str(exc),
|
stderr=str(exc),
|
||||||
|
|
@ -218,7 +261,7 @@ class AgentExecutor:
|
||||||
return AgentInvocation(
|
return AgentInvocation(
|
||||||
agent_id=agent.id,
|
agent_id=agent.id,
|
||||||
command=command,
|
command=command,
|
||||||
prompt=prompt,
|
prompt=prompt_input,
|
||||||
exit_code=-1,
|
exit_code=-1,
|
||||||
stdout=_coerce_output(exc.stdout),
|
stdout=_coerce_output(exc.stdout),
|
||||||
stderr=_coerce_output(exc.stderr),
|
stderr=_coerce_output(exc.stderr),
|
||||||
|
|
@ -226,6 +269,63 @@ class AgentExecutor:
|
||||||
timed_out=True,
|
timed_out=True,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
def _invoke_openai_compatible(self, agent: AgentConfig, prompt: str) -> AgentInvocation:
|
||||||
|
if not agent.model or not agent.base_url:
|
||||||
|
raise AgentError(f"Agent error: openai_compatible backend agent '{agent.id}' is incomplete.")
|
||||||
|
url = agent.base_url.rstrip("/") + "/chat/completions"
|
||||||
|
command = f"POST {url}"
|
||||||
|
body: dict[str, object] = {
|
||||||
|
"model": agent.model,
|
||||||
|
"messages": [{"role": "user", "content": prompt}],
|
||||||
|
}
|
||||||
|
if agent.temperature is not None:
|
||||||
|
body["temperature"] = agent.temperature
|
||||||
|
headers = {"Content-Type": "application/json"}
|
||||||
|
api_key_env = agent.api_key_env or "OPENAI_API_KEY"
|
||||||
|
api_key = os.environ.get(api_key_env)
|
||||||
|
if api_key:
|
||||||
|
headers["Authorization"] = f"Bearer {api_key}"
|
||||||
|
|
||||||
|
started = time.monotonic()
|
||||||
|
try:
|
||||||
|
payload = json.dumps(body).encode("utf-8")
|
||||||
|
req = request.Request(url, data=payload, headers=headers, method="POST")
|
||||||
|
with request.urlopen(req, timeout=self.timeout_seconds) as response:
|
||||||
|
raw = response.read().decode("utf-8", errors="replace")
|
||||||
|
duration = time.monotonic() - started
|
||||||
|
return AgentInvocation(
|
||||||
|
agent_id=agent.id,
|
||||||
|
command=command,
|
||||||
|
prompt=prompt,
|
||||||
|
exit_code=0,
|
||||||
|
stdout=_extract_openai_content(raw),
|
||||||
|
stderr="",
|
||||||
|
duration_seconds=duration,
|
||||||
|
)
|
||||||
|
except TimeoutError:
|
||||||
|
duration = time.monotonic() - started
|
||||||
|
return AgentInvocation(
|
||||||
|
agent_id=agent.id,
|
||||||
|
command=command,
|
||||||
|
prompt=prompt,
|
||||||
|
exit_code=-1,
|
||||||
|
stdout="",
|
||||||
|
stderr="Request timed out.",
|
||||||
|
duration_seconds=duration,
|
||||||
|
timed_out=True,
|
||||||
|
)
|
||||||
|
except (OSError, URLError) as exc:
|
||||||
|
duration = time.monotonic() - started
|
||||||
|
return AgentInvocation(
|
||||||
|
agent_id=agent.id,
|
||||||
|
command=command,
|
||||||
|
prompt=prompt,
|
||||||
|
exit_code=1,
|
||||||
|
stdout="",
|
||||||
|
stderr=str(exc),
|
||||||
|
duration_seconds=duration,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def build_prompt_bundle(
|
def build_prompt_bundle(
|
||||||
system_prompt: str,
|
system_prompt: str,
|
||||||
|
|
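
One detail of `_invoke_openai_compatible` above is worth a sketch: `urllib.request.urlopen` can surface a timeout either as a bare `TimeoutError` (for example while reading the response) or wrapped in a `URLError` whose `reason` is the timeout (typically while connecting). In the second case the `except (OSError, URLError)` branch would report `exit_code=1` without `timed_out=True`. The helper below is a hypothetical illustration of classifying both shapes; `is_timeout` is not part of this commit.

```python
import socket
from urllib.error import URLError


def is_timeout(exc: BaseException) -> bool:
    """Return True when an exception from urlopen represents a timeout (hypothetical helper)."""
    # socket.timeout is an alias of TimeoutError on Python 3.10+, and a separate
    # OSError subclass on older versions, so both are checked.
    timeout_types = (TimeoutError, socket.timeout)
    if isinstance(exc, timeout_types):
        return True
    # Connection-phase timeouts are usually wrapped: URLError(reason=<timeout>).
    if isinstance(exc, URLError):
        return isinstance(exc.reason, timeout_types)
    return False
```

A caller could catch `(OSError, URLError)` once and set `timed_out=is_timeout(exc)`, which would make the reported status consistent across both failure paths.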
@ -294,6 +394,20 @@ def _coerce_output(value: str | bytes | None) -> str:
|
||||||
return value
|
return value
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_openai_content(raw: str) -> str:
|
||||||
|
try:
|
||||||
|
data = json.loads(raw)
|
||||||
|
choices = data.get("choices", [])
|
||||||
|
if choices:
|
||||||
|
message = choices[0].get("message", {})
|
||||||
|
content = message.get("content")
|
||||||
|
if isinstance(content, str):
|
||||||
|
return content
|
||||||
|
except (json.JSONDecodeError, AttributeError):
|
||||||
|
pass
|
||||||
|
return raw
|
||||||
|
|
||||||
|
|
||||||
def output_contract_for(stage: StageConfig) -> str:
|
def output_contract_for(stage: StageConfig) -> str:
|
||||||
if stage.type in {"agent_review", "review"}:
|
if stage.type in {"agent_review", "review"}:
|
||||||
return "\n".join(
|
return "\n".join(
|
||||||
|
|
@ -305,6 +419,20 @@ def output_contract_for(stage: StageConfig) -> str:
|
||||||
"context_update: <compact useful note>",
|
"context_update: <compact useful note>",
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
|
if stage.type == "agent" and ("plan" in stage.id.lower() or stage.agent == "planner"):
|
||||||
|
return "\n".join(
|
||||||
|
[
|
||||||
|
"Write the requested stage output in concise markdown.",
|
||||||
|
"",
|
||||||
|
"If you need repository context before finalizing the plan, include:",
|
||||||
|
"lookup_requests:",
|
||||||
|
"- tool: list_files | read_file | grep",
|
||||||
|
" path: <relative path>",
|
||||||
|
" pattern: <glob for list_files or regex for grep>",
|
||||||
|
"",
|
||||||
|
"NightShift will run these read-only lookup tools, save files-inspected.md, and re-run this planner stage with the retrieved context.",
|
||||||
|
]
|
||||||
|
)
|
||||||
return "Write the requested stage output in concise markdown."
|
return "Write the requested stage output in concise markdown."
|
||||||
|
|
||||||
|
|
||||||
|
|
|
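
The planner output contract added in `output_contract_for` asks for a `lookup_requests:` block with `tool`, `path`, and `pattern` keys. The real parser lives in `nightshift/repo_tools.py` (`parse_lookup_requests`), which this diff does not show, so the following is only a sketch of how such a block could be read. `LookupRequest`, `sketch_parse_lookup_requests`, and the default path of `"."` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ALLOWED_TOOLS = {"list_files", "read_file", "grep"}


@dataclass(frozen=True)
class LookupRequest:
    tool: str
    path: str = "."
    pattern: Optional[str] = None


def _to_request(fields: Dict[str, str]) -> Optional[LookupRequest]:
    # Unknown tools are dropped so only the read-only lookups can be requested.
    if fields.get("tool") not in ALLOWED_TOOLS:
        return None
    return LookupRequest(fields["tool"], fields.get("path") or ".", fields.get("pattern") or None)


def sketch_parse_lookup_requests(text: str) -> List[LookupRequest]:
    """Parse the first lookup_requests block found in planner output (illustrative only)."""
    found: List[LookupRequest] = []
    current: Optional[Dict[str, str]] = None
    in_block = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_block:
            in_block = stripped == "lookup_requests:"
            continue
        if stripped.startswith("- "):
            # A new list item: store the previous one, then start collecting fields.
            if current is not None:
                request = _to_request(current)
                if request is not None:
                    found.append(request)
            current = {}
            stripped = stripped[2:].strip()
        elif stripped and not line.startswith((" ", "\t")):
            break  # a dedented, non-item line ends the block
        if current is None or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        current[key.strip()] = value.strip()
    if current is not None:
        request = _to_request(current)
        if request is not None:
            found.append(request)
    return found
```

Splitting on the first colon only means a regex pattern that itself contains `:` survives intact in the `pattern` field.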
||||||
|
|
@ -39,6 +39,8 @@ class ArtifactStore:
|
||||||
self.project_context_path = self.artifact_root / "project-context.md"
|
self.project_context_path = self.artifact_root / "project-context.md"
|
||||||
self.run_summary_path = self.run_dir / "run-summary.md"
|
self.run_summary_path = self.run_dir / "run-summary.md"
|
||||||
self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
|
self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
|
||||||
|
self.run_log_path = self.run_dir / "run.log"
|
||||||
|
self.aggregate_log_path = self.artifact_root / "nightshift.log"
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def from_config(cls, config: NightShiftConfig, run_id: str | None = None) -> "ArtifactStore":
|
def from_config(cls, config: NightShiftConfig, run_id: str | None = None) -> "ArtifactStore":
|
||||||
|
|
|
||||||
|
|
@ -10,6 +10,7 @@ from .config import validate_config
|
||||||
from .errors import NightShiftError
|
from .errors import NightShiftError
|
||||||
from .init import init_project
|
from .init import init_project
|
||||||
from .pipeline import PipelineRunner
|
from .pipeline import PipelineRunner
|
||||||
|
from .runlog import RunLogger
|
||||||
from .status import build_status, format_status
|
from .status import build_status, format_status
|
||||||
from .tasks import (
|
from .tasks import (
|
||||||
ensure_dependencies_satisfied,
|
ensure_dependencies_satisfied,
|
||||||
|
|
@ -80,7 +81,7 @@ def main(argv: list[str] | None = None) -> int:
|
||||||
validate_task_dependencies(tasks)
|
validate_task_dependencies(tasks)
|
||||||
if args.all and args.task:
|
if args.all and args.task:
|
||||||
parser.error("run accepts either --all or --task, not both.")
|
parser.error("run accepts either --all or --task, not both.")
|
||||||
runner = PipelineRunner(config)
|
runner = PipelineRunner(config, logger=RunLogger(console=print))
|
||||||
if args.all:
|
if args.all:
|
||||||
selected = [task for task in tasks if not task.completed]
|
selected = [task for task in tasks if not task.completed]
|
||||||
result = runner.run_tasks(selected)
|
result = runner.run_tasks(selected)
|
||||||
|
|
|
||||||
|
|
@ -12,6 +12,7 @@ import time
|
||||||
from .artifacts import ArtifactStore
|
from .artifacts import ArtifactStore
|
||||||
from .config import SafetyConfig, StageConfig
|
from .config import SafetyConfig, StageConfig
|
||||||
from .errors import CommandError, SafetyError
|
from .errors import CommandError, SafetyError
|
||||||
|
from .runlog import NullRunLogger, RunLogger
|
||||||
from .safety import ensure_command_allowed, resolve_inside_root, resolve_project_root
|
from .safety import ensure_command_allowed, resolve_inside_root, resolve_project_root
|
||||||
from .stages import StageResult
|
from .stages import StageResult
|
||||||
|
|
||||||
|
|
@ -38,11 +39,13 @@ class CommandExecutor:
|
||||||
safety: SafetyConfig,
|
safety: SafetyConfig,
|
||||||
artifacts: ArtifactStore,
|
artifacts: ArtifactStore,
|
||||||
timeout_seconds: int = DEFAULT_COMMAND_TIMEOUT_SECONDS,
|
timeout_seconds: int = DEFAULT_COMMAND_TIMEOUT_SECONDS,
|
||||||
|
logger: RunLogger | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
self.project_root = resolve_project_root(project_root)
|
self.project_root = resolve_project_root(project_root)
|
||||||
self.safety = safety
|
self.safety = safety
|
||||||
self.artifacts = artifacts
|
self.artifacts = artifacts
|
||||||
self.timeout_seconds = timeout_seconds
|
self.timeout_seconds = timeout_seconds
|
||||||
|
self.logger = logger or NullRunLogger()
|
||||||
|
|
||||||
def run_stage(self, stage: StageConfig, task_id: str) -> StageResult:
|
def run_stage(self, stage: StageConfig, task_id: str) -> StageResult:
|
||||||
if stage.type != "command":
|
if stage.type != "command":
|
||||||
|
|
@ -56,7 +59,14 @@ class CommandExecutor:
|
||||||
status = "pass"
|
status = "pass"
|
||||||
reason = "All commands passed."
|
reason = "All commands passed."
|
||||||
|
|
||||||
for command in stage.commands:
|
for index, command in enumerate(stage.commands, start=1):
|
||||||
|
self.logger.event(
|
||||||
|
"command.start",
|
||||||
|
"Starting command",
|
||||||
|
stage_id=stage.id,
|
||||||
|
command_index=index,
|
||||||
|
command=command,
|
||||||
|
)
|
||||||
run = self.run_command(
|
run = self.run_command(
|
||||||
command,
|
command,
|
||||||
shell=stage.shell,
|
shell=stage.shell,
|
||||||
|
|
@ -64,6 +74,15 @@ class CommandExecutor:
|
||||||
working_dir=stage.working_dir,
|
working_dir=stage.working_dir,
|
||||||
)
|
)
|
||||||
runs.append(run)
|
runs.append(run)
|
||||||
|
self.logger.event(
|
||||||
|
"command.finish",
|
||||||
|
"Finished command",
|
||||||
|
stage_id=stage.id,
|
||||||
|
command_index=index,
|
||||||
|
exit_code=run.exit_code,
|
||||||
|
duration=f"{run.duration_seconds:.3f}s",
|
||||||
|
timed_out=str(run.timed_out).lower(),
|
||||||
|
)
|
||||||
if run.timed_out:
|
if run.timed_out:
|
||||||
status = "fail"
|
status = "fail"
|
||||||
timeout = stage.timeout_seconds or self.timeout_seconds
|
timeout = stage.timeout_seconds or self.timeout_seconds
|
||||||
|
|
@ -80,6 +99,13 @@ class CommandExecutor:
|
||||||
output_filename,
|
output_filename,
|
||||||
format_command_runs(stage.id, runs),
|
format_command_runs(stage.id, runs),
|
||||||
)
|
)
|
||||||
|
self.logger.event(
|
||||||
|
"artifact.write",
|
||||||
|
"Wrote command artifact",
|
||||||
|
stage_id=stage.id,
|
||||||
|
task_id=task_id,
|
||||||
|
artifact_path=output_path.relative_to(self.project_root),
|
||||||
|
)
|
||||||
return StageResult(
|
return StageResult(
|
||||||
stage_id=stage.id,
|
stage_id=stage.id,
|
||||||
status=status, # type: ignore[arg-type]
|
status=status, # type: ignore[arg-type]
|
||||||
|
|
|
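
The `self.logger.event(name, message, **fields)` calls above pass structured fields such as `stage_id`, `command_index`, and `exit_code`. `nightshift/runlog.py` is not part of the visible diff, so the line format below is an assumption: a minimal sketch of rendering one event as `key=value` pairs while redacting fields whose names look secret-bearing, in the spirit of the "Redact or avoid secrets" task. `sketch_format_event` and the redaction pattern are hypothetical.

```python
import re
from typing import Any

# Assumed heuristic: field names containing these words are treated as sensitive.
_SECRET_NAME = re.compile(r"(key|token|secret|password)", re.IGNORECASE)


def sketch_format_event(name: str, message: str, **fields: Any) -> str:
    """Render one operational event as a single log line (illustrative format)."""
    parts = [name, message]
    for key, value in fields.items():
        if value is None:
            continue  # skip unset fields such as a missing model or temperature
        text = "[redacted]" if _SECRET_NAME.search(key) else str(value)
        parts.append(f"{key}={text}")
    return " ".join(parts)
```

Matching on field names is deliberately conservative: it also redacts harmless values such as an environment variable name, which is an acceptable trade-off for an operational log.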
||||||
|
|
@ -43,6 +43,9 @@ class AgentConfig:
|
||||||
system_prompt: Path
|
system_prompt: Path
|
||||||
model: str | None = None
|
model: str | None = None
|
||||||
role: str | None = None
|
role: str | None = None
|
||||||
|
temperature: float | None = None
|
||||||
|
base_url: str | None = None
|
||||||
|
api_key_env: str | None = None
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
@dataclass(frozen=True)
|
||||||
|
|
@ -83,7 +86,7 @@ class NightShiftConfig:
|
||||||
|
|
||||||
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
|
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
|
||||||
COMMAND_STAGE_TYPES = {"command"}
|
COMMAND_STAGE_TYPES = {"command"}
|
||||||
SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {"summarize"}
|
SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {"repo_context", "summarize"}
|
||||||
|
|
||||||
|
|
||||||
def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
|
def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
|
||||||
|
|
@ -181,10 +184,20 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
backend = _require_string(agent_raw, "backend", f"agents.{agent_id}")
|
backend = _require_string(agent_raw, "backend", f"agents.{agent_id}")
|
||||||
command = _optional_string(agent_raw.get("command"), f"agents.{agent_id}.command")
|
command = _optional_string(agent_raw.get("command"), f"agents.{agent_id}.command")
|
||||||
model = _optional_string(agent_raw.get("model"), f"agents.{agent_id}.model")
|
model = _optional_string(agent_raw.get("model"), f"agents.{agent_id}.model")
|
||||||
if backend not in {"command", "ollama"}:
|
base_url = _optional_string(agent_raw.get("base_url"), f"agents.{agent_id}.base_url")
|
||||||
|
api_key_env = _optional_string(agent_raw.get("api_key_env"), f"agents.{agent_id}.api_key_env")
|
||||||
|
temperature = _optional_float_or_none(
|
||||||
|
agent_raw.get("temperature"),
|
||||||
|
f"agents.{agent_id}.temperature",
|
||||||
|
)
|
||||||
|
if temperature is not None and temperature < 0:
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: agents.{agent_id}.temperature must be zero or greater."
|
||||||
|
)
|
||||||
|
if backend not in {"command", "ollama", "openai_compatible"}:
|
||||||
raise ConfigError(
|
raise ConfigError(
|
||||||
f"Config error: agent '{agent_id}' uses unsupported backend '{backend}'. "
|
f"Config error: agent '{agent_id}' uses unsupported backend '{backend}'. "
|
||||||
"Supported backends: command, ollama."
|
"Supported backends: command, ollama, openai_compatible."
|
||||||
)
|
)
|
||||||
if backend == "command" and command is None:
|
if backend == "command" and command is None:
|
||||||
raise ConfigError(
|
raise ConfigError(
|
||||||
|
|
@ -194,6 +207,14 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
raise ConfigError(
|
raise ConfigError(
|
||||||
f"Config error: ollama backend agent '{agent_id}' must define model."
|
f"Config error: ollama backend agent '{agent_id}' must define model."
|
||||||
)
|
)
|
||||||
|
if backend == "openai_compatible" and model is None:
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: openai_compatible backend agent '{agent_id}' must define model."
|
||||||
|
)
|
||||||
|
if backend == "openai_compatible" and base_url is None:
|
||||||
|
raise ConfigError(
|
||||||
|
f"Config error: openai_compatible backend agent '{agent_id}' must define base_url."
|
||||||
|
)
|
||||||
system_prompt = Path(_require_string(agent_raw, "system_prompt", f"agents.{agent_id}"))
|
system_prompt = Path(_require_string(agent_raw, "system_prompt", f"agents.{agent_id}"))
|
||||||
agents[str(agent_id)] = AgentConfig(
|
agents[str(agent_id)] = AgentConfig(
|
||||||
id=str(agent_id),
|
id=str(agent_id),
|
||||||
|
|
@ -202,6 +223,9 @@ def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
|
||||||
system_prompt=system_prompt,
|
system_prompt=system_prompt,
|
||||||
model=model,
|
model=model,
|
||||||
role=_optional_string(agent_raw.get("role"), f"agents.{agent_id}.role"),
|
role=_optional_string(agent_raw.get("role"), f"agents.{agent_id}.role"),
|
||||||
|
temperature=temperature,
|
||||||
|
base_url=base_url,
|
||||||
|
api_key_env=api_key_env,
|
||||||
)
|
)
|
||||||
|
|
||||||
experiment_raw = raw.get("experiment", {})
|
experiment_raw = raw.get("experiment", {})
|
||||||
|
|
@ -444,6 +468,8 @@ def _parse_scalar(value: str) -> Any:
|
||||||
return None
|
return None
|
||||||
if re.fullmatch(r"-?\d+", value):
|
if re.fullmatch(r"-?\d+", value):
|
||||||
return int(value)
|
return int(value)
|
||||||
|
if re.fullmatch(r"-?(\d+\.\d*|\d*\.\d+)", value):
|
||||||
|
return float(value)
|
||||||
if (value.startswith('"') and value.endswith('"')) or (
|
if (value.startswith('"') and value.endswith('"')) or (
|
||||||
value.startswith("'") and value.endswith("'")
|
value.startswith("'") and value.endswith("'")
|
||||||
):
|
):
|
||||||
|
|
@ -492,6 +518,14 @@ def _optional_int_or_none(value: Any, context: str) -> int | None:
|
||||||
return _optional_int(value, context)
|
return _optional_int(value, context)
|
||||||
|
|
||||||
|
|
||||||
|
def _optional_float_or_none(value: Any, context: str) -> float | None:
|
||||||
|
if value is None:
|
||||||
|
return None
|
||||||
|
if isinstance(value, bool) or not isinstance(value, (int, float)):
|
||||||
|
raise ConfigError(f"Config error: '{context}' must be a number when set.")
|
||||||
|
return float(value)
|
||||||
|
|
||||||
|
|
||||||
def _string_tuple(value: Any, context: str) -> tuple[str, ...]:
|
def _string_tuple(value: Any, context: str) -> tuple[str, ...]:
|
||||||
if value is None:
|
if value is None:
|
||||||
return ()
|
return ()
|
||||||
|
|
|
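
The float pattern added to `_parse_scalar` decides which `temperature` spellings reach `_optional_float_or_none` as numbers. The sketch below copies the two regular expressions from the diff into a standalone function to show which inputs they accept; `classify_scalar` is a hypothetical name, and returning the raw string for anything else is an assumption about the parts of `_parse_scalar` not shown here.

```python
import re
from typing import Union

INT_PATTERN = re.compile(r"-?\d+")
FLOAT_PATTERN = re.compile(r"-?(\d+\.\d*|\d*\.\d+)")


def classify_scalar(value: str) -> Union[int, float, str]:
    """Mirror the int/float branches of the scalar parser (illustrative only)."""
    if INT_PATTERN.fullmatch(value):
        return int(value)
    if FLOAT_PATTERN.fullmatch(value):
        return float(value)
    # Anything else, including scientific notation, is left as text in this sketch.
    return value
```

Under these patterns `0.7`, `.5`, and `1.` parse as floats, while `1e-3` does not match either pattern. If such a value stays a string, the numeric check in `_optional_float_or_none` would reject it with a config error, so plain decimal notation is the safe way to write a temperature.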
||||||
|
|
@ -4,6 +4,7 @@ from __future__ import annotations
|
||||||
|
|
||||||
from dataclasses import dataclass
|
from dataclasses import dataclass
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
import re
|
||||||
|
|
||||||
from .agents import AgentExecutor
|
from .agents import AgentExecutor
|
||||||
from .artifacts import ArtifactStore
|
from .artifacts import ArtifactStore
|
||||||
|
|
@ -14,6 +15,8 @@ from .errors import PipelineError
|
||||||
from .errors import NightShiftError
|
from .errors import NightShiftError
|
||||||
from .git import ensure_clean_worktree, write_diff_artifact, write_git_artifacts
|
from .git import ensure_clean_worktree, write_diff_artifact, write_git_artifacts
|
||||||
from .reports import ReportGenerator
|
from .reports import ReportGenerator
|
||||||
|
from .repo_tools import RepoTools, extract_agent_stdout, parse_lookup_requests
|
||||||
|
from .runlog import RunLogger
|
||||||
from .stages import StageResult
|
from .stages import StageResult
|
||||||
from .tasks import Task, mark_task_completed
|
from .tasks import Task, mark_task_completed
|
||||||
|
|
||||||
|
|
@ -46,9 +49,11 @@ class PipelineRunner:
|
||||||
artifacts: ArtifactStore | None = None,
|
artifacts: ArtifactStore | None = None,
|
||||||
agent_timeout_seconds: int = 600,
|
agent_timeout_seconds: int = 600,
|
||||||
command_timeout_seconds: int = 300,
|
command_timeout_seconds: int = 300,
|
||||||
|
logger: RunLogger | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
self.config = config
|
self.config = config
|
||||||
self.artifacts = artifacts or ArtifactStore.from_config(config)
|
self.artifacts = artifacts or ArtifactStore.from_config(config)
|
||||||
|
self.logger = logger or RunLogger()
|
||||||
self.context = ContextManager(self.artifacts)
|
self.context = ContextManager(self.artifacts)
|
||||||
self.reports = ReportGenerator(
|
self.reports = ReportGenerator(
|
||||||
config.project.root,
|
config.project.root,
|
||||||
|
|
@ -61,17 +66,33 @@ class PipelineRunner:
|
||||||
config.agents,
|
config.agents,
|
||||||
self.artifacts,
|
self.artifacts,
|
||||||
timeout_seconds=agent_timeout_seconds,
|
timeout_seconds=agent_timeout_seconds,
|
||||||
|
logger=self.logger,
|
||||||
)
|
)
|
||||||
self.command_executor = CommandExecutor(
|
self.command_executor = CommandExecutor(
|
||||||
config.project.root,
|
config.project.root,
|
||||||
config.safety,
|
config.safety,
|
||||||
self.artifacts,
|
self.artifacts,
|
||||||
timeout_seconds=command_timeout_seconds,
|
timeout_seconds=command_timeout_seconds,
|
||||||
|
logger=self.logger,
|
||||||
|
)
|
||||||
|
self.repo_tools = RepoTools(
|
||||||
|
config.project.root,
|
||||||
|
config.safety,
|
||||||
|
self.artifacts,
|
||||||
|
logger=self.logger,
|
||||||
)
|
)
|
||||||
|
|
||||||
def run_task(self, task: Task) -> PipelineResult:
|
def run_task(self, task: Task) -> PipelineResult:
|
||||||
ensure_clean_worktree(self.config.project.root, self.config.safety.require_clean_worktree)
|
ensure_clean_worktree(self.config.project.root, self.config.safety.require_clean_worktree)
|
||||||
self.artifacts.initialize_run()
|
self.artifacts.initialize_run()
|
||||||
|
self.logger.bind(self.artifacts)
|
||||||
|
self.logger.event(
|
||||||
|
"task.start",
|
||||||
|
"Starting task",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
task_title=task.title,
|
||||||
|
)
|
||||||
self.artifacts.write_config_snapshot(self.config.path)
|
self.artifacts.write_config_snapshot(self.config.path)
|
||||||
self.artifacts.write_prompt_snapshots(
|
self.artifacts.write_prompt_snapshots(
|
||||||
{
|
{
|
||||||
|
|
@ -97,6 +118,15 @@ class PipelineRunner:
|
||||||
|
|
||||||
while index < len(stages):
|
while index < len(stages):
|
||||||
stage = stages[index]
|
stage = stages[index]
|
||||||
|
self.logger.event(
|
||||||
|
"stage.start",
|
||||||
|
"Starting stage",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
stage_id=stage.id,
|
||||||
|
stage_type=stage.type,
|
||||||
|
retry_count=retry_count,
|
||||||
|
)
|
||||||
try:
|
try:
|
||||||
result = self._run_stage(stage, task, previous_outputs, retry_notes)
|
result = self._run_stage(stage, task, previous_outputs, retry_notes)
|
||||||
except NightShiftError as exc:
|
except NightShiftError as exc:
|
||||||
|
|
@ -113,6 +143,16 @@ class PipelineRunner:
|
||||||
)
|
)
|
||||||
stage_results.append(result)
|
stage_results.append(result)
|
||||||
previous_outputs[stage.id] = self._read_output(result.output_path)
|
previous_outputs[stage.id] = self._read_output(result.output_path)
|
||||||
|
self.logger.event(
|
||||||
|
"stage.finish",
|
||||||
|
"Finished stage",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
stage_id=stage.id,
|
||||||
|
status=result.status,
|
||||||
|
reason=result.reason,
|
||||||
|
artifact_path=result.output_path,
|
||||||
|
)
|
||||||
if result.context_update:
|
if result.context_update:
|
||||||
retry_notes.append(f"Context update from '{stage.id}': {result.context_update}")
|
retry_notes.append(f"Context update from '{stage.id}': {result.context_update}")
|
||||||
|
|
||||||
|
|
@ -135,6 +175,16 @@ class PipelineRunner:
|
||||||
)
|
)
|
||||||
break
|
break
|
||||||
retry_count += 1
|
retry_count += 1
|
||||||
|
self.logger.event(
|
||||||
|
"stage.retry",
|
||||||
|
"Redirecting after stage result",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
stage_id=stage.id,
|
||||||
|
status=result.status,
|
||||||
|
retry_count=retry_count,
|
||||||
|
next_stage=target_stage,
|
||||||
|
)
|
||||||
retry_notes.append(
|
retry_notes.append(
|
||||||
f"Retry {retry_count}: stage '{stage.id}' returned "
|
f"Retry {retry_count}: stage '{stage.id}' returned "
|
||||||
f"{result.status} ({result.reason}); redirecting to '{target_stage}'."
|
f"{result.status} ({result.reason}); redirecting to '{target_stage}'."
|
||||||
|
|
@ -179,6 +229,16 @@ class PipelineRunner:
|
||||||
stage_results,
|
stage_results,
|
||||||
context_out_path=context_out_path,
|
context_out_path=context_out_path,
|
||||||
)
|
)
|
||||||
|
self.logger.event(
|
||||||
|
"task.finish",
|
||||||
|
"Finished task",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
status=final_status,
|
||||||
|
retry_count=retry_count,
|
||||||
|
reason=final_reason,
|
||||||
|
artifact_path=self.artifacts.create_task_dir(task.id).directory.relative_to(self.config.project.root),
|
||||||
|
)
|
||||||
|
|
||||||
return PipelineResult(
|
return PipelineResult(
|
||||||
task_id=task.id,
|
task_id=task.id,
|
||||||
|
|
@ -191,6 +251,8 @@ class PipelineRunner:
|
||||||
|
|
||||||
def run_tasks(self, tasks: list[Task] | tuple[Task, ...]) -> MultiTaskResult:
|
def run_tasks(self, tasks: list[Task] | tuple[Task, ...]) -> MultiTaskResult:
|
||||||
self.artifacts.initialize_run()
|
self.artifacts.initialize_run()
|
||||||
|
self.logger.bind(self.artifacts)
|
||||||
|
self.logger.event("run.start", "Starting multi-task run", run_id=self.artifacts.run_id)
|
||||||
results: list[PipelineResult] = []
|
results: list[PipelineResult] = []
|
||||||
known_ids = {task.id for task in tasks}
|
known_ids = {task.id for task in tasks}
|
||||||
completed_ids = {task.id for task in tasks if task.completed}
|
completed_ids = {task.id for task in tasks if task.completed}
|
||||||
|
|
@ -216,6 +278,13 @@ class PipelineRunner:
|
||||||
reason="Task blocked by " + "; ".join(reason_parts),
|
reason="Task blocked by " + "; ".join(reason_parts),
|
||||||
)
|
)
|
||||||
results.append(blocked)
|
results.append(blocked)
|
||||||
|
self.logger.event(
|
||||||
|
"task.blocked",
|
||||||
|
"Task blocked by dependencies",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
task_id=task.id,
|
||||||
|
reason=blocked.reason,
|
||||||
|
)
|
||||||
if not self.config.pipeline.continue_on_task_failure:
|
if not self.config.pipeline.continue_on_task_failure:
|
||||||
break
|
break
|
||||||
continue
|
continue
|
||||||
|
|
@ -234,6 +303,14 @@ class PipelineRunner:
|
||||||
format_aggregate_run_summary(results, status, reason),
|
format_aggregate_run_summary(results, status, reason),
|
||||||
encoding="utf-8",
|
encoding="utf-8",
|
||||||
)
|
)
|
||||||
|
self.logger.event(
|
||||||
|
"run.finish",
|
||||||
|
"Finished multi-task run",
|
||||||
|
run_id=self.artifacts.run_id,
|
||||||
|
status=status,
|
||||||
|
completed_count=completed_count,
|
||||||
|
failed_count=failed_count,
|
||||||
|
)
|
||||||
return MultiTaskResult(
|
return MultiTaskResult(
|
||||||
status=status,
|
status=status,
|
||||||
task_results=tuple(results),
|
task_results=tuple(results),
|
||||||
|
|
@ -251,7 +328,7 @@ class PipelineRunner:
|
||||||
) -> StageResult:
|
) -> StageResult:
|
||||||
if stage.type in {"agent", "agent_review", "review"}:
|
if stage.type in {"agent", "agent_review", "review"}:
|
||||||
context = self.context.read_context(task, retry_notes)
|
context = self.context.read_context(task, retry_notes)
|
||||||
return self.agent_executor.run_stage(
|
result = self.agent_executor.run_stage(
|
||||||
stage,
|
stage,
|
||||||
task,
|
task,
|
||||||
previous_outputs,
|
previous_outputs,
|
||||||
|
|
@ -260,8 +337,39 @@ class PipelineRunner:
|
||||||
task_context=context.task_context,
|
task_context=context.task_context,
|
||||||
retry_context=context.retry_context,
|
retry_context=context.retry_context,
|
||||||
)
|
)
|
||||||
|
if stage.type == "agent":
|
||||||
|
return self._maybe_rerun_agent_with_repo_lookup(
|
||||||
|
stage,
|
||||||
|
task,
|
||||||
|
result,
|
||||||
|
previous_outputs,
|
||||||
|
retry_notes,
|
||||||
|
context.project_context,
|
||||||
|
context.task_context,
|
||||||
|
context.retry_context,
|
||||||
|
)
|
||||||
|
return result
|
||||||
if stage.type in COMMAND_STAGE_TYPES:
|
if stage.type in COMMAND_STAGE_TYPES:
|
||||||
return self.command_executor.run_stage(stage, task.id)
|
return self.command_executor.run_stage(stage, task.id)
|
||||||
|
if stage.type == "repo_context":
|
||||||
|
output_path = self.artifacts.write_stage_output(
|
||||||
|
task.id,
|
||||||
|
stage.output or "context-pack.md",
|
||||||
|
self._build_context_pack(task),
|
||||||
|
)
|
||||||
|
self.logger.event(
|
||||||
|
"artifact.write",
|
||||||
|
"Wrote context pack",
|
||||||
|
stage_id=stage.id,
|
||||||
|
task_id=task.id,
|
||||||
|
artifact_path=output_path.relative_to(self.config.project.root),
|
||||||
|
)
|
||||||
|
return StageResult(
|
||||||
|
stage_id=stage.id,
|
||||||
|
status="pass",
|
||||||
|
reason="Context pack written.",
|
||||||
|
output_path=str(output_path.relative_to(self.config.project.root)),
|
||||||
|
)
|
||||||
if stage.type == "summarize":
|
if stage.type == "summarize":
|
||||||
output_path = self.artifacts.write_stage_output(
|
output_path = self.artifacts.write_stage_output(
|
||||||
task.id,
|
task.id,
|
||||||
|
|
@ -276,6 +384,103 @@ class PipelineRunner:
|
||||||
)
|
)
|
||||||
raise PipelineError(f"Pipeline error: unsupported stage type '{stage.type}'.")
|
raise PipelineError(f"Pipeline error: unsupported stage type '{stage.type}'.")
|

    def _maybe_rerun_agent_with_repo_lookup(
        self,
        stage: StageConfig,
        task: Task,
        result: StageResult,
        previous_outputs: dict[str, str],
        retry_notes: list[str],
        project_context: str,
        task_context: str,
        retry_context: str | None,
    ) -> StageResult:
        if result.status != "pass" or result.output_path is None:
            return result
        output_text = self._read_output(result.output_path)
        requests = parse_lookup_requests(extract_agent_stdout(output_text))
        if not requests:
            return result
        lookup_context = self.repo_tools.execute_requests(
            task.id,
            requests,
            filename="files-inspected.md",
        )
        self.logger.event(
            "agent.rerun",
            "Re-running agent with repo lookup context",
            stage_id=stage.id,
            task_id=task.id,
            lookup_count=len(requests),
        )
        rerun_outputs = dict(previous_outputs)
        rerun_outputs["repo_lookup_results"] = lookup_context
        rerun_result = self.agent_executor.run_stage(
            stage,
            task,
            rerun_outputs,
            retry_notes,
            project_context=project_context,
            task_context=task_context,
            retry_context=retry_context,
        )
        return StageResult(
            stage_id=rerun_result.stage_id,
            status=rerun_result.status,
            reason=(
                "Agent completed after repo lookup."
                if rerun_result.status == "pass"
                else rerun_result.reason
            ),
            output_path=rerun_result.output_path,
            next_stage=rerun_result.next_stage,
            context_update=rerun_result.context_update,
        )

    def _build_context_pack(self, task: Task) -> str:
        terms = _task_search_terms(task)
        files = self.repo_tools.list_files(".", pattern="*.py", max_files=80)
        grep_sections: list[str] = []
        for term in terms[:5]:
            grep_sections.extend(
                [
                    f"### Search: {term}",
                    "",
                    "```text",
                    self.repo_tools.grep(re.escape(term), ".", max_matches=20),
                    "```",
                    "",
                ]
            )
        return "\n".join(
            [
                "# Context Pack",
                "",
                f"Task: `{task.id}`",
                f"Title: {task.title}",
                "",
                "## Acceptance Criteria",
                "",
                "\n".join(f"- {item}" for item in task.acceptance_criteria) or "- None",
                "",
                "## Constraints",
                "",
                f"- Scoped paths: {', '.join(self.config.safety.scoped_paths) or '.'}",
                "- Repository lookups are read-only.",
                "- Excerpts are line-numbered where files are read directly.",
                "",
                "## Relevant Files",
                "",
                "```text",
                files,
                "```",
                "",
                "## Search Results",
                "",
                *grep_sections,
            ]
        )

||||||
def _read_output(self, output_path: str | None) -> str:
|
def _read_output(self, output_path: str | None) -> str:
|
||||||
if output_path is None:
|
if output_path is None:
|
||||||
return ""
|
return ""
|
||||||
|
|
@ -365,9 +570,37 @@ def format_run_metadata(config: NightShiftConfig) -> str:
|
||||||
"",
|
"",
|
||||||
f"- Backend: {agent.backend}",
|
f"- Backend: {agent.backend}",
|
||||||
f"- Model: {agent.model or ''}",
|
f"- Model: {agent.model or ''}",
|
||||||
|
f"- Temperature: {agent.temperature if agent.temperature is not None else ''}",
|
||||||
|
f"- Base URL: {agent.base_url or ''}",
|
||||||
f"- Command: {agent.command or ''}",
|
f"- Command: {agent.command or ''}",
|
||||||
f"- System prompt: {agent.system_prompt}",
|
f"- System prompt: {agent.system_prompt}",
|
||||||
"",
|
"",
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
return "\n".join(lines)
|
return "\n".join(lines)
|


def _task_search_terms(task: Task) -> list[str]:
    source = " ".join([task.id, task.title, *task.acceptance_criteria])
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]{2,}", source)
    ignored = {
        "the",
        "and",
        "for",
        "with",
        "that",
        "this",
        "task",
        "add",
        "use",
        "can",
        "should",
        "must",
    }
    terms: list[str] = []
    for word in words:
        lowered = word.lower()
        if lowered in ignored or lowered in terms:
            continue
        terms.append(lowered)
    return terms or [task.id]
||||||
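To make the behaviour of `_task_search_terms` concrete, here is a standalone sketch that applies the same regex and stop-word filter to plain strings. `search_terms` and the sample task text are hypothetical stand-ins; the real function takes a `Task`, and `_build_context_pack` only greps for the first five terms it returns.

```python
import re

# Same stop words as the `ignored` set in _task_search_terms.
IGNORED = {"the", "and", "for", "with", "that", "this", "task", "add", "use", "can", "should", "must"}


def search_terms(task_id: str, title: str, criteria: list[str]) -> list[str]:
    source = " ".join([task_id, title, *criteria])
    # Identifier-like words of three or more characters; "T-12" yields nothing
    # because neither "T" nor "12" satisfies the pattern on its own.
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]{2,}", source)
    terms: list[str] = []
    for word in words:
        lowered = word.lower()
        if lowered in IGNORED or lowered in terms:
            continue
        terms.append(lowered)
    # Fall back to the raw task id when nothing usable was found.
    return terms or [task_id]


print(search_terms("T-12", "Add retry logging", ["The CLI must log retry_count"]))
# ['retry', 'logging', 'cli', 'log', 'retry_count']
print(search_terms("T-1", "", []))
# ['T-1']
```

Terms are lowercased before they reach `grep`, and the search itself is case-sensitive, so an identifier such as `RunLogger` in the task text becomes `runlogger` and will not match the class definition.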
250
nightshift/repo_tools.py
Normal file
|
|
@ -0,0 +1,250 @@
"""Scoped repository lookup tools."""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
import fnmatch
import re

from .artifacts import ArtifactStore
from .config import SafetyConfig
from .errors import SafetyError
from .runlog import NullRunLogger, RunLogger
from .safety import resolve_inside_root, resolve_project_root, validate_scoped_paths


DEFAULT_MAX_BYTES = 20_000
DEFAULT_MAX_MATCHES = 100


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: dict[str, str]
    output: str


class RepoTools:
    """Read-only repo tools constrained to configured project scope."""

    def __init__(
        self,
        project_root: str | Path,
        safety: SafetyConfig,
        artifacts: ArtifactStore,
        logger: RunLogger | None = None,
    ) -> None:
        self.project_root = resolve_project_root(project_root)
        self.safety = safety
        self.artifacts = artifacts
        self.logger = logger or NullRunLogger()
        self.scoped_roots = validate_scoped_paths(
            self.project_root,
            safety.scoped_paths or (".",),
        )

    def list_files(self, path: str = ".", pattern: str = "*", max_files: int = 200) -> str:
        root = self._resolve_scoped(path, "list_files path")
        if not root.exists():
            return f"Path not found: {path}"
        if root.is_file():
            candidates = [root]
        else:
            candidates = [item for item in root.rglob("*") if item.is_file()]
        relative_files = [
            _relative(item, self.project_root)
            for item in sorted(candidates)
            if fnmatch.fnmatch(item.name, pattern)
        ]
        lines = relative_files[:max_files]
        if len(relative_files) > max_files:
            lines.append(f"... truncated {len(relative_files) - max_files} files")
        return "\n".join(lines) or "No files found."

    def read_file(self, path: str, max_bytes: int = DEFAULT_MAX_BYTES) -> str:
        file_path = self._resolve_scoped(path, "read_file path")
        if not file_path.exists() or not file_path.is_file():
            return f"File not found: {path}"
        data = file_path.read_bytes()[:max_bytes + 1]
        truncated = len(data) > max_bytes
        text = data[:max_bytes].decode("utf-8", errors="replace")
        numbered = _line_number(text)
        if truncated:
            numbered += "\n... truncated"
        return numbered

    def grep(
        self,
        pattern: str,
        path: str = ".",
        max_matches: int = DEFAULT_MAX_MATCHES,
    ) -> str:
        root = self._resolve_scoped(path, "grep path")
        regex = re.compile(pattern)
        files = [root] if root.is_file() else [item for item in root.rglob("*") if item.is_file()]
        matches: list[str] = []
        for file_path in sorted(files):
            try:
                text = file_path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue
            for line_number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    matches.append(f"{_relative(file_path, self.project_root)}:{line_number}: {line}")
                    if len(matches) >= max_matches:
                        matches.append("... truncated")
                        return "\n".join(matches)
        return "\n".join(matches) or "No matches found."

    def write_tool_artifact(self, task_id: str, calls: list[ToolCall], filename: str = "repo-tools.md") -> Path:
        content = format_tool_calls(calls)
        path = self.artifacts.write_stage_output(task_id, filename, content)
        self.logger.event(
            "artifact.write",
            "Wrote repo tool artifact",
            task_id=task_id,
            artifact_path=path.relative_to(self.project_root),
        )
        return path

    def execute_requests(self, task_id: str, requests: list[ToolCall], filename: str = "repo-tools.md") -> str:
        completed: list[ToolCall] = []
        for request in requests:
            self.logger.event(
                "tool.call",
                "Running repo lookup tool",
                task_id=task_id,
                tool=request.name,
                **request.arguments,
            )
            try:
                output = self._execute_request(request)
            except (SafetyError, re.error) as exc:
                output = str(exc)
            completed.append(ToolCall(request.name, request.arguments, output))
        self.write_tool_artifact(task_id, completed, filename=filename)
        return format_tool_calls(completed)

    def _execute_request(self, request: ToolCall) -> str:
        if request.name == "list_files":
            return self.list_files(
                path=request.arguments.get("path", "."),
                pattern=request.arguments.get("pattern", "*"),
            )
        if request.name == "read_file":
            path = request.arguments.get("path")
            if not path:
                return "Missing required argument: path"
            return self.read_file(path)
        if request.name == "grep":
            pattern = request.arguments.get("pattern")
            if not pattern:
                return "Missing required argument: pattern"
            return self.grep(pattern, path=request.arguments.get("path", "."))
        return f"Unsupported repo lookup tool: {request.name}"

    def _resolve_scoped(self, path: str, context: str) -> Path:
        resolved = resolve_inside_root(self.project_root, path, context)
        for scoped_root in self.scoped_roots:
            try:
                resolved.relative_to(scoped_root)
                return resolved
            except ValueError:
                continue
        scopes = ", ".join(_relative(item, self.project_root) for item in self.scoped_roots)
        raise SafetyError(f"Safety error: {context} is outside configured scoped paths: {path}. Scopes: {scopes}")


def format_tool_calls(calls: list[ToolCall]) -> str:
    lines = ["# Repo Tool Calls", ""]
    if not calls:
        lines.append("No tool calls.")
        return "\n".join(lines)
    for index, call in enumerate(calls, start=1):
        lines.extend(
            [
                f"## {index}. {call.name}",
                "",
                "Arguments:",
            ]
        )
        for key, value in sorted(call.arguments.items()):
            lines.append(f"- {key}: `{value}`")
        lines.extend(["", "Output:", "", "```text", call.output.rstrip(), "```", ""])
    return "\n".join(lines)


def parse_lookup_requests(text: str) -> list[ToolCall]:
    """Parse a small YAML-like lookup request list from model output."""

    lines = text.splitlines()
    in_section = False
    current: dict[str, str] = {}
    requests: list[ToolCall] = []

    def flush() -> None:
        nonlocal current
        if not current:
            return
        name = current.pop("tool", "").strip()
        if name:
            requests.append(ToolCall(name=name, arguments=dict(current), output=""))
        current = {}

    for raw_line in lines:
        stripped = raw_line.strip()
        if stripped in {"lookup_requests:", "repo_lookup:", "repo_lookups:"}:
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            continue
        if not raw_line.startswith((" ", "-", "\t")) and not stripped.endswith(":"):
            break
        if stripped.startswith("- "):
            flush()
            stripped = stripped[2:].strip()
        if ":" not in stripped:
            continue
        key, value = stripped.split(":", 1)
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key == "tool" and current:
            flush()
        current[key] = value
    flush()
    return requests


def extract_agent_stdout(artifact_text: str) -> str:
    lines = artifact_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() != "## stdout":
            continue
        start = None
        for cursor in range(index + 1, len(lines)):
            if lines[cursor].strip().startswith("```"):
                start = cursor + 1
                break
        if start is None:
            return ""
        end = len(lines)
        for cursor in range(start, len(lines)):
            if lines[cursor].strip().startswith("```"):
                end = cursor
                break
        return "\n".join(lines[start:end])
    return artifact_text


def _line_number(text: str) -> str:
    return "\n".join(f"{index}: {line}" for index, line in enumerate(text.splitlines(), start=1))


def _relative(path: Path, root: Path) -> str:
    try:
        return path.relative_to(root).as_posix()
    except ValueError:
        return path.as_posix()
||||||
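The `lookup_requests` format that `parse_lookup_requests` accepts is easiest to see on a sample planner reply. The sketch below is a condensed standalone copy of the parsing loop that returns `(tool, arguments)` tuples instead of `ToolCall` objects, so it runs without the package installed. `parse_requests_sketch` and the sample reply are hypothetical; only the section headers and the `tool` / `path` / `pattern` keys come from the code in this commit.

```python
from __future__ import annotations

SECTION_HEADERS = {"lookup_requests:", "repo_lookup:", "repo_lookups:"}


def parse_requests_sketch(text: str) -> list[tuple[str, dict[str, str]]]:
    requests: list[tuple[str, dict[str, str]]] = []
    current: dict[str, str] = {}
    in_section = False

    def flush() -> None:
        nonlocal current
        if not current:
            return
        name = current.pop("tool", "").strip()
        if name:
            requests.append((name, dict(current)))
        current = {}

    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        if stripped in SECTION_HEADERS:
            in_section = True
            continue
        if not in_section or not stripped:
            continue
        # An unindented line that is not a "key:" header ends the section.
        if not raw_line.startswith((" ", "-", "\t")) and not stripped.endswith(":"):
            break
        if stripped.startswith("- "):
            flush()  # a new list item starts a new request
            stripped = stripped[2:].strip()
        if ":" not in stripped:
            continue
        key, value = stripped.split(":", 1)
        key = key.strip()
        if key == "tool" and current:
            flush()
        current[key] = value.strip().strip('"').strip("'")
    flush()
    return requests


SAMPLE_REPLY = """\
I need more context before planning.

lookup_requests:
  - tool: read_file
    path: nightshift/runlog.py
  - tool: grep
    pattern: "RunLogger"
    path: nightshift
Then I will write the plan.
"""

for name, arguments in parse_requests_sketch(SAMPLE_REPLY):
    print(name, arguments)
# read_file {'path': 'nightshift/runlog.py'}
# grep {'pattern': 'RunLogger', 'path': 'nightshift'}
```

Text before the section header is ignored, and the first unindented prose line after the list stops parsing, so a reply can mix explanation with requests.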
91
nightshift/runlog.py
Normal file
|
|
@ -0,0 +1,91 @@
"""Operational run logging for NightShift."""

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable

from .artifacts import ArtifactStore


ConsoleWriter = Callable[[str], None]


@dataclass(frozen=True)
class LogEvent:
    event: str
    message: str
    fields: dict[str, object]


class RunLogger:
    """Write concise operational events to CLI and run log artifacts."""

    def __init__(self, console: ConsoleWriter | None = None) -> None:
        self.console = console
        self._run_log_path: Path | None = None
        self._aggregate_log_path: Path | None = None

    def bind(self, artifacts: ArtifactStore) -> None:
        artifacts.initialize_run()
        self._run_log_path = artifacts.run_log_path
        self._aggregate_log_path = artifacts.aggregate_log_path

    def event(self, event: str, message: str, **fields: object) -> None:
        safe_fields = _redact_fields(fields)
        line = format_log_line(LogEvent(event=event, message=message, fields=safe_fields))
        if self.console is not None:
            self.console(line)
        for path in (self._run_log_path, self._aggregate_log_path):
            if path is None:
                continue
            path.parent.mkdir(parents=True, exist_ok=True)
            with path.open("a", encoding="utf-8") as handle:
                handle.write(line + "\n")


class NullRunLogger(RunLogger):
    def __init__(self) -> None:
        super().__init__(console=None)

    def bind(self, artifacts: ArtifactStore) -> None:
        return None

    def event(self, event: str, message: str, **fields: object) -> None:
        return None


def format_log_line(log_event: LogEvent) -> str:
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    parts = [timestamp, log_event.event, log_event.message]
    for key, value in sorted(log_event.fields.items()):
        if value is None or value == "":
            continue
        parts.append(f"{key}={_format_value(value)}")
    return " | ".join(parts)


def tail_lines(path: Path, limit: int = 100) -> list[str]:
    if limit <= 0:
        return []
    if not path.exists() or not path.is_file():
        return []
    return path.read_text(encoding="utf-8", errors="replace").splitlines()[-limit:]


def _format_value(value: object) -> str:
    text = str(value).replace("\n", " ").replace("\r", " ")
    return text if text else ""


def _redact_fields(fields: dict[str, object]) -> dict[str, object]:
    redacted: dict[str, object] = {}
    for key, value in fields.items():
        lowered = key.lower()
        if any(marker in lowered for marker in ("secret", "token", "password", "key")):
            redacted[key] = "<redacted>"
        else:
            redacted[key] = value
    return redacted
|
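For readers who want to see what a line in `run.log` looks like, the following standalone sketch reproduces the formatting and redaction rules of `format_log_line` and `_redact_fields`. `format_line` and `redact_fields` are hypothetical names, the field values are invented, and the timestamp is passed in explicitly (the real code calls `datetime.now`) so the output is reproducible.

```python
from __future__ import annotations

from datetime import datetime, timezone

SENSITIVE_MARKERS = ("secret", "token", "password", "key")


def redact_fields(fields: dict[str, object]) -> dict[str, object]:
    # Substring match on the lowercased field name, as in _redact_fields.
    return {
        key: "<redacted>" if any(marker in key.lower() for marker in SENSITIVE_MARKERS) else value
        for key, value in fields.items()
    }


def format_line(event: str, message: str, fields: dict[str, object], now: datetime) -> str:
    parts = [now.strftime("%Y-%m-%dT%H:%M:%SZ"), event, message]
    for key, value in sorted(redact_fields(fields).items()):
        if value is None or value == "":
            continue  # empty fields are dropped from the line
        text = str(value).replace("\n", " ").replace("\r", " ")
        parts.append(f"{key}={text}")
    return " | ".join(parts)


line = format_line(
    "stage.finish",
    "Finished stage",
    {"task_id": "T-1", "status": "pass", "api_key": "sk-example", "reason": None},
    now=datetime(2026, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(line)
# 2026-01-02T03:04:05Z | stage.finish | Finished stage | api_key=<redacted> | status=pass | task_id=T-1
```

Because redaction matches substrings of the field name, a harmless field called `keyword` would be redacted as well. Fields are sorted by name, so the order in the log line does not depend on the order of the keyword arguments at the call site.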
@ -7,6 +7,7 @@ from html import escape
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from .errors import NightShiftError
|
from .errors import NightShiftError
|
||||||
|
from .runlog import tail_lines
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
@dataclass(frozen=True)
|
||||||
|
|
@ -14,6 +15,7 @@ class RunInfo:
|
||||||
name: str
|
name: str
|
||||||
path: Path
|
path: Path
|
||||||
summary: str
|
summary: str
|
||||||
|
log_tail: tuple[str, ...] = ()
|
||||||
|
|
||||||
|
|
||||||
def list_runs(artifact_dir: str | Path) -> list[RunInfo]:
|
def list_runs(artifact_dir: str | Path) -> list[RunInfo]:
|
||||||
|
|
@ -24,7 +26,14 @@ def list_runs(artifact_dir: str | Path) -> list[RunInfo]:
|
||||||
for path in sorted((item for item in runs_dir.iterdir() if item.is_dir()), reverse=True):
|
for path in sorted((item for item in runs_dir.iterdir() if item.is_dir()), reverse=True):
|
||||||
summary_path = path / "run-summary.md"
|
summary_path = path / "run-summary.md"
|
||||||
summary = summary_path.read_text(encoding="utf-8") if summary_path.exists() else "No run summary yet."
|
summary = summary_path.read_text(encoding="utf-8") if summary_path.exists() else "No run summary yet."
|
||||||
runs.append(RunInfo(name=path.name, path=path, summary=summary))
|
runs.append(
|
||||||
|
RunInfo(
|
||||||
|
name=path.name,
|
||||||
|
path=path,
|
||||||
|
summary=summary,
|
||||||
|
log_tail=tuple(tail_lines(path / "run.log", limit=100)),
|
||||||
|
)
|
||||||
|
)
|
||||||
return runs
|
return runs
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -51,6 +60,10 @@ def render_dashboard(artifact_dir: str | Path) -> str:
|
||||||
"<pre>",
|
"<pre>",
|
||||||
escape(run.summary),
|
escape(run.summary),
|
||||||
"</pre>",
|
"</pre>",
|
||||||
|
"<h3>Log Tail</h3>",
|
||||||
|
"<pre>",
|
||||||
|
escape("\n".join(run.log_tail) if run.log_tail else "No run log yet."),
|
||||||
|
"</pre>",
|
||||||
"</section>",
|
"</section>",
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
|
|
|
||||||
|
|
@ -1,7 +1,7 @@
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
import tempfile
|
import tempfile
|
||||||
import unittest
|
import unittest
|
||||||
from unittest.mock import patch
|
from unittest.mock import MagicMock, patch
|
||||||
|
|
||||||
from nightshift.agents import AgentExecutor, build_prompt_bundle, parse_review_output
|
from nightshift.agents import AgentExecutor, build_prompt_bundle, parse_review_output
|
||||||
from nightshift.agents import AgentInvocation, format_agent_invocation
|
from nightshift.agents import AgentInvocation, format_agent_invocation
|
||||||
|
|
@ -132,6 +132,43 @@ class AgentExecutorTests(unittest.TestCase):
|
||||||
output = (root / result.output_path).read_text(encoding="utf-8")
|
output = (root / result.output_path).read_text(encoding="utf-8")
|
||||||
self.assertIn("ollama run tiny-model", output)
|
self.assertIn("ollama run tiny-model", output)
|

    def test_openai_compatible_agent_sends_temperature(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            prompt_path = root / "planner.md"
            prompt_path.write_text("Plan carefully.", encoding="utf-8")
            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
            executor = AgentExecutor(
                root,
                {
                    "planner": AgentConfig(
                        id="planner",
                        backend="openai_compatible",
                        command=None,
                        model="tiny-model",
                        base_url="http://localhost:11434/v1",
                        temperature=0.2,
                        system_prompt=Path("planner.md"),
                    )
                },
                artifacts,
            )
            task = parse_tasks(TASK_MD)[0]
            stage = StageConfig(id="plan", type="agent", agent="planner", output="plan.md")
            response = MagicMock()
            response.__enter__.return_value.read.return_value = (
                b'{"choices":[{"message":{"content":"api output"}}]}'
            )

            with patch("nightshift.agents.request.urlopen", return_value=response) as urlopen:
                result = executor.run_stage(stage, task)

            self.assertEqual(result.status, "pass")
            request_obj = urlopen.call_args.args[0]
            body = request_obj.data.decode("utf-8")
            self.assertIn('"temperature": 0.2', body)
            self.assertIn("api output", (root / result.output_path).read_text(encoding="utf-8"))

||||||
def test_agent_artifact_format_tolerates_missing_streams(self) -> None:
|
def test_agent_artifact_format_tolerates_missing_streams(self) -> None:
|
||||||
invocation = AgentInvocation(
|
invocation = AgentInvocation(
|
||||||
agent_id="planner",
|
agent_id="planner",
|
||||||
|
|
|
||||||
|
|
@@ -167,6 +167,24 @@ class ConfigTests(unittest.TestCase):
             self.assertEqual(config.agents["planner"].model, "qwen2.5-coder:14b")
             self.assertEqual(config.experiment.label, "local-test")
 
+    def test_openai_compatible_backend_loads(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            text = config_path.read_text(encoding="utf-8").replace(
+                "backend: command\n command: echo",
+                "backend: openai_compatible\n model: local-model\n base_url: http://localhost:11434/v1\n temperature: 0.1",
+                1,
+            )
+            config_path.write_text(text, encoding="utf-8")
+
+            config = load_config(config_path)
+
+            self.assertEqual(config.agents["planner"].backend, "openai_compatible")
+            self.assertEqual(config.agents["planner"].base_url, "http://localhost:11434/v1")
+            self.assertEqual(config.agents["planner"].temperature, 0.1)
+
     def test_command_stage_options_load(self) -> None:
         with tempfile.TemporaryDirectory() as directory:
             root = Path(directory)
@@ -188,6 +206,41 @@ class ConfigTests(unittest.TestCase):
             self.assertEqual(test_stage.timeout_seconds, 30)
             self.assertEqual(test_stage.working_dir, Path("."))
 
+    def test_agent_temperature_loads(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_path.write_text(
+                config_path.read_text(encoding="utf-8").replace(
+                    " system_prompt: agents/planner.md",
+                    " system_prompt: agents/planner.md\n temperature: 0.2",
+                    1,
+                ),
+                encoding="utf-8",
+            )
+
+            config = load_config(config_path)
+
+            self.assertEqual(config.agents["planner"].temperature, 0.2)
+
+    def test_agent_temperature_must_be_number(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_path.write_text(
+                config_path.read_text(encoding="utf-8").replace(
+                    " system_prompt: agents/planner.md",
+                    " system_prompt: agents/planner.md\n temperature: low",
+                    1,
+                ),
+                encoding="utf-8",
+            )
+
+            with self.assertRaisesRegex(ConfigError, "temperature"):
+                load_config(config_path)
+
     def test_non_command_stage_cannot_define_commands(self) -> None:
         with tempfile.TemporaryDirectory() as directory:
             root = Path(directory)
@@ -230,6 +230,79 @@ Acceptance Criteria:
             self.assertEqual(result.status, "failed")
             self.assertEqual(result.task_results[0].status, "blocked")
 
+    def test_run_writes_operational_log(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            stages = (StageConfig(id="plan", type="agent", agent="planner", output="plan.md"),)
+            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
+            config = make_config(root, stages)
+            runner = PipelineRunner(config, artifacts)
+            task = parse_tasks(TASK_MD)[0]
+
+            runner.run_task(task)
+
+            log = (root / ".nightshift" / "runs" / "test-run" / "run.log").read_text(encoding="utf-8")
+            self.assertIn("task.start", log)
+            self.assertIn("stage.start", log)
+            self.assertIn("agent.finish", log)
+
+    def test_planner_lookup_requests_write_files_inspected_and_rerun(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            (root / "target.py").write_text("VALUE = 1\n", encoding="utf-8")
+            (root / "fake_planner.py").write_text(
+                "\n".join(
+                    [
+                        "import sys",
+                        "prompt = sys.stdin.read()",
+                        "if 'repo_lookup_results' in prompt:",
+                        " print('final plan with context')",
+                        "else:",
+                        " print('lookup_requests:')",
+                        " print('- tool: read_file')",
+                        " print(' path: target.py')",
+                    ]
+                ),
+                encoding="utf-8",
+            )
+            stages = (StageConfig(id="plan", type="agent", agent="planner", output="plan.md"),)
+            config = make_config(root, stages)
+            config.agents["planner"] = AgentConfig(
+                id="planner",
+                backend="command",
+                command="python fake_planner.py",
+                system_prompt=Path("planner.md"),
+            )
+            runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
+            task = parse_tasks(TASK_MD)[0]
+
+            result = runner.run_task(task)
+
+            task_dir = root / ".nightshift" / "runs" / "test-run" / "tasks" / task.id
+            self.assertEqual(result.status, "complete")
+            self.assertTrue((task_dir / "files-inspected.md").exists())
+            self.assertIn("1: VALUE = 1", (task_dir / "files-inspected.md").read_text(encoding="utf-8"))
+            self.assertIn("final plan with context", (task_dir / "plan.md").read_text(encoding="utf-8"))
+
+    def test_repo_context_stage_writes_context_pack(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            _write_common_files(root)
+            (root / "app.py").write_text("def run_pipeline():\n return True\n", encoding="utf-8")
+            stages = (StageConfig(id="context", type="repo_context", output="context-pack.md"),)
+            config = make_config(root, stages)
+            runner = PipelineRunner(config, ArtifactStore(root, ".nightshift", run_id="test-run"))
+            task = parse_tasks(TASK_MD)[0]
+
+            result = runner.run_task(task)
+
+            pack = root / ".nightshift" / "runs" / "test-run" / "tasks" / task.id / "context-pack.md"
+            self.assertEqual(result.status, "complete")
+            self.assertIn("Context Pack", pack.read_text(encoding="utf-8"))
+            self.assertIn("app.py", pack.read_text(encoding="utf-8"))
+
 
 def _write_common_files(root: Path) -> None:
     (root / "nightshift.yaml").write_text("project:\n name: test\n", encoding="utf-8")
47
tests/test_repo_tools.py
Normal file
@@ -0,0 +1,47 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.artifacts import ArtifactStore
+from nightshift.config import SafetyConfig
+from nightshift.repo_tools import RepoTools, parse_lookup_requests
+
+
+class RepoToolsTests(unittest.TestCase):
+    def test_repo_tools_are_scoped_and_line_numbered(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            (root / "src").mkdir()
+            (root / "src" / "app.py").write_text("def hello():\n return 'hi'\n", encoding="utf-8")
+            safety = SafetyConfig(
+                require_clean_worktree=False,
+                scoped_paths=("src",),
+                allowed_commands=(),
+                forbidden_commands=(),
+            )
+            tools = RepoTools(root, safety, ArtifactStore(root, ".nightshift", run_id="test-run"))
+
+            self.assertIn("src/app.py", tools.list_files("src", "*.py"))
+            self.assertIn("1: def hello():", tools.read_file("src/app.py"))
+            self.assertIn("src/app.py:1", tools.grep("hello", "src"))
+
+    def test_parse_lookup_requests(self) -> None:
+        output = """Plan needs context.
+
+lookup_requests:
+- tool: read_file
+  path: nightshift/pipeline.py
+- tool: grep
+  path: nightshift
+  pattern: PipelineRunner
+"""
+
+        requests = parse_lookup_requests(output)
+
+        self.assertEqual([request.name for request in requests], ["read_file", "grep"])
+        self.assertEqual(requests[0].arguments["path"], "nightshift/pipeline.py")
+        self.assertEqual(requests[1].arguments["pattern"], "PipelineRunner")
+
+
+if __name__ == "__main__":
+    unittest.main()
@@ -19,14 +19,23 @@ class WebDashboardTests(unittest.TestCase):
             artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
             artifacts.initialize_run()
             artifacts.run_summary_path.write_text("# Summary\n\nok", encoding="utf-8")
+            artifacts.run_log_path.write_text(
+                "\n".join(f"line {index}" for index in range(120)),
+                encoding="utf-8",
+            )
 
             runs = list_runs(root / ".nightshift")
             content = read_artifact(root / ".nightshift" / "runs" / "test-run", "run-summary.md")
             escaped = read_artifact(root / ".nightshift" / "runs" / "test-run", "../project-context.md")
+            dashboard = render_dashboard(root / ".nightshift")
 
             self.assertEqual(len(runs), 1)
+            self.assertEqual(len(runs[0].log_tail), 100)
             self.assertIn("ok", content)
             self.assertIn("escapes", escaped)
+            self.assertIn("Log Tail", dashboard)
+            self.assertIn("line 119", dashboard)
+            self.assertNotIn("line 19\n", dashboard)
 
 
 if __name__ == "__main__":