Implement NightShift MVP phases 1-6

Includes starter project generation, validation for configs/tasks/commands, artifact snapshot writing, structured stage results, command output capture, devlogs for phases 1-6, and unit coverage for the implemented MVP layers.
K. Hodges 2026-05-17 00:17:13 -07:00
parent 5958c82cf9
commit c1baf9b7d8
26 changed files with 1873 additions and 1 deletions

docs/devlog/phase1.md Normal file

@ -0,0 +1,25 @@
# Phase 1 Devlog: Skeleton
## Implemented
- Created the `nightshift` Python package.
- Added a CLI module with `nightshift init`, `nightshift validate`, and placeholder `run` / `status` commands.
- Added `pyproject.toml` with a console entry point.
- Added starter file generation for:
  - `nightshift.yaml`
  - `tasks.md`
  - `agents/planner.md`
  - `agents/implementer.md`
  - `agents/reviewer.md`
- Added unit tests for initialization behavior.
## Decisions Made
- Used `argparse` instead of a CLI dependency so the MVP works from a clean Python checkout.
- Implemented overwrite protection with a `--force` flag. Interactive confirmation was deferred to keep the command deterministic and scriptable.
- Added `run` and `status` as CLI placeholders only. The phase required an entry point, but actual execution belongs to later phases.
- Kept starter prompts short and human-readable so they can be revised easily as agent execution is implemented.
## Notes
- Phase 1 establishes the file layout expected by later phases without introducing model or pipeline execution behavior early.
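
The overwrite-protection decision above can be illustrated with a minimal sketch. It assumes a hypothetical `STARTER` mapping and helper names (`write_starter_files`, `parse_args`) that are not the package's real identifiers; only the `init` subcommand and the `--root` / `--force` flags come from this devlog.

```python
import argparse
from pathlib import Path

# Hypothetical starter content for illustration; the real templates live in the package.
STARTER = {"nightshift.yaml": "project:\n  name: demo\n", "tasks.md": "# Tasks\n"}


def write_starter_files(root: Path, force: bool = False) -> list:
    """Write starter files, refusing to overwrite existing ones unless force is set."""
    existing = [name for name in STARTER if (root / name).exists()]
    if existing and not force:
        # No interactive prompt: the command stays deterministic and scriptable.
        raise FileExistsError(f"Use --force to replace: {', '.join(existing)}")
    written = []
    for name, content in STARTER.items():
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written


def parse_args(argv: list) -> argparse.Namespace:
    """Parse the `init` subcommand using only the standard library."""
    parser = argparse.ArgumentParser(prog="nightshift")
    subparsers = parser.add_subparsers(dest="command", required=True)
    init_parser = subparsers.add_parser("init")
    init_parser.add_argument("--root", default=".")
    init_parser.add_argument("--force", action="store_true")
    return parser.parse_args(argv)
```

Checking every target before writing anything means a refused run leaves the directory untouched, which is likely the property a scripted caller cares about most.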

docs/devlog/phase2.md Normal file

@ -0,0 +1,29 @@
# Phase 2 Devlog: Config Loading
## Implemented
- Added typed configuration objects for project, safety, agents, pipeline, and stages.
- Added `load_config()` for parsing `nightshift.yaml`.
- Added `validate_config()` for checking referenced task and prompt files.
- Added validation for:
  - required top-level sections
  - required project fields
  - non-empty agents
  - supported stage types
  - agent stage references
  - command stage command lists
  - duplicate stage IDs
  - `on_fail` references
- Added unit tests for valid config loading and key invalid config cases.
## Decisions Made
- Used PyYAML automatically when available, but added a small standard-library fallback parser for the YAML subset emitted by `nightshift init`.
- Deferred full YAML edge-case support to a future dependency/install pass. The fallback is intentionally documented as a starter-config parser, not a general YAML implementation.
- Validation currently confirms that scoped paths resolve inside the project root, but it does not require every scoped path to already exist. That allows users to scaffold configs before creating all source/test directories.
- Kept config validation focused on structural correctness and references. Command safety enforcement is left for Phase 3.
## Notes
- The config layer now catches missing agent references with explicit messages such as `pipeline stage 'plan' references unknown agent 'critic'`.
- Tests use `unittest` from the standard library so they can run before development dependencies are introduced.
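
The reference checks listed above can be sketched roughly as follows. This is a simplified illustration, not the code in `nightshift/config.py`: the function name `check_stages` and the plain-dict stage shape are assumptions, and unlike the real validator (which raises on the first problem) it collects every message so they can be inspected together.

```python
def check_stages(stages: list, agent_ids: set) -> list:
    """Return error messages for duplicate ids, unknown agents, and bad on_fail targets."""
    errors = []
    seen = set()
    for stage in stages:
        stage_id = stage["id"]
        if stage_id in seen:
            errors.append(f"duplicate pipeline stage id '{stage_id}'")
        seen.add(stage_id)
        # Only agent-type stages must point at a defined agent in this sketch.
        if stage.get("type") == "agent":
            agent = stage.get("agent")
            if agent not in agent_ids:
                errors.append(
                    f"pipeline stage '{stage_id}' references unknown agent '{agent}'"
                )
    # on_fail is checked in a second pass so it may refer to a later stage.
    for stage in stages:
        target = stage.get("on_fail")
        if target and target not in seen:
            errors.append(
                f"stage '{stage['id']}' on_fail references unknown stage '{target}'"
            )
    return errors


stages = [
    {"id": "plan", "type": "agent", "agent": "critic"},
    {"id": "test", "type": "command", "on_fail": "implement"},
]
for message in check_stages(stages, {"planner", "implementer"}):
    print(message)
```

Running the `on_fail` check after all stage ids are known is what lets a stage fall back to one defined further down the list.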

docs/devlog/phase3.md Normal file

@ -0,0 +1,24 @@
# Phase 3 Devlog: Safety Layer
## Implemented
- Added `nightshift/safety.py`.
- Implemented project root resolution.
- Implemented path resolution that rejects traversal outside the configured project root.
- Implemented scoped path validation.
- Implemented safe artifact path construction that rejects escapes from the artifact directory.
- Implemented command allowlist checks.
- Implemented forbidden command fragment checks.
- Wired command and path safety checks into `validate_config()`.
- Added tests for path traversal, artifact escapes, allowlist behavior, and forbidden command fragments.
## Decisions Made
- Command matching uses normalized whitespace and exact allowlist entries. This keeps v1 predictable while still handling harmless spacing differences.
- Forbidden fragments are checked before allowlist acceptance, so a dangerous command cannot be made valid by adding it to `allowed_commands`.
- Scoped paths are validated for containment inside the project root, but they are not required to exist yet. This preserves the Phase 2 decision that configs can be scaffolded before all source directories exist.
- The safety layer raises `SafetyError`; config validation wraps those failures as config errors when they come from `nightshift validate`.
## Notes
- This phase does not execute commands. It only validates whether a command would be permitted. Process execution belongs to Phase 6.
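
As a rough illustration of the decisions above, the sketch below shows whitespace-normalized exact matching, the forbidden-before-allowed ordering, and root containment. The helper names (`normalize_command`, `is_command_permitted`, `is_inside_root`) are hypothetical and return booleans for brevity; the real layer raises `SafetyError`, and its exact matching rules may differ.

```python
from pathlib import Path


def normalize_command(command: str) -> str:
    """Collapse runs of whitespace so harmless spacing differences still match."""
    return " ".join(command.split())


def is_command_permitted(command: str, allowed: list, forbidden: list) -> bool:
    """Check forbidden fragments first, then require an exact allowlist entry."""
    normalized = normalize_command(command)
    if any(fragment in normalized for fragment in forbidden):
        # Being on the allowlist cannot rescue a command with a forbidden fragment.
        return False
    return normalized in {normalize_command(entry) for entry in allowed}


def is_inside_root(root: Path, candidate: str) -> bool:
    """Return True when candidate resolves to a location inside root."""
    resolved_root = root.resolve()
    resolved = (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError:
        return False
    return True


print(is_command_permitted("pytest   -q", ["pytest -q"], ["rm -rf"]))
print(is_command_permitted("rm -rf build", ["rm -rf build"], ["rm -rf"]))
```

Note that `is_inside_root` does not require the path to exist, which matches the decision that scoped paths may be declared before their directories are created.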

docs/devlog/phase4.md Normal file

@ -0,0 +1,22 @@
# Phase 4 Devlog: Task Parser
## Implemented
- Added `nightshift/tasks.py`.
- Implemented parsing for documented markdown checklist tasks.
- Extracted task id, title, completion state, description, acceptance criteria, dependency bullets, raw task markdown, and source line number.
- Added selection of the next incomplete task.
- Added selection of a specific task id.
- Added descriptive errors for malformed task headers, duplicate ids, missing acceptance criteria, missing files, traversal attempts, and unknown task ids.
- Added parser and selection tests.
## Decisions Made
- The parser intentionally supports the documented v1 format rather than broad Markdown. This keeps failure behavior explicit and testable.
- Acceptance criteria are required for each task because downstream pipeline stages need concrete review targets.
- Dependencies are parsed as simple bullets under a `Dependencies:` section, but no dependency solver is implemented in this phase.
- Completed tasks use `[x]` or `[X]`; incomplete tasks use `[ ]`.
## Notes
- Task mutation, completion updates, and dependency enforcement are deferred until later pipeline phases.
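
A strict, format-specific parser of this kind might look like the sketch below. The header shape `- [ ] ID: Title` is an assumption made for illustration, since the documented v1 format is not reproduced in this devlog; `TaskHeader`, `parse_headers`, and `next_incomplete` are likewise hypothetical names, and the sketch covers only headers, not descriptions or acceptance criteria.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed header shape: "- [ ] TASK-ID: Title" with [x] or [X] marking completion.
HEADER = re.compile(r"^- \[(?P<mark>[ xX])\] (?P<id>[A-Za-z0-9_-]+): (?P<title>.+)$")


@dataclass(frozen=True)
class TaskHeader:
    id: str
    title: str
    completed: bool
    line_number: int


def parse_headers(markdown: str) -> list:
    """Extract task headers, rejecting duplicate ids with the offending line number."""
    tasks = []
    seen = set()
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADER.match(line)
        if match is None:
            continue
        task_id = match.group("id")
        if task_id in seen:
            raise ValueError(f"duplicate task id '{task_id}' on line {line_number}")
        seen.add(task_id)
        completed = match.group("mark") in ("x", "X")
        tasks.append(TaskHeader(task_id, match.group("title").strip(), completed, line_number))
    return tasks


def next_incomplete(tasks: list) -> Optional[TaskHeader]:
    """Return the first task not yet marked complete, or None when all are done."""
    return next((task for task in tasks if not task.completed), None)


sample = "# Tasks\n- [x] T1: Set up repo\n- [ ] T2: Add parser\n- [ ] T3: Add tests\n"
print(next_incomplete(parse_headers(sample)))
```

Selection here is purely positional, consistent with the note that no dependency solver exists in this phase.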

docs/devlog/phase5.md Normal file

@ -0,0 +1,24 @@
# Phase 5 Devlog: Artifact Store
## Implemented
- Added `nightshift/artifacts.py`.
- Created `.nightshift/`, per-run directories, and per-task directories.
- Created `project-context.md` and `run-summary.md` placeholders when a run is initialized.
- Added config snapshot copying to `config.snapshot.yaml`.
- Added task snapshot writing to `task.md`.
- Added generic stage output writing.
- Added command output writing.
- Added final task notes writing.
- Added tests for artifact tree creation, snapshot writing, and task-directory escape rejection.
## Decisions Made
- `ArtifactStore` accepts an optional `run_id` so tests and future pipeline code can produce deterministic artifact paths.
- Default run ids use UTC timestamps in `YYYYMMDDTHHMMSSZ` format.
- Stage output filenames are relative to the task artifact directory and may include subdirectories, but they cannot escape that task directory.
- Project context and run summary files are initialized with simple markdown headers. Later phases can append richer content.
## Notes
- The artifact store is intentionally independent from pipeline execution so command, agent, context, and report phases can reuse it.
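
The run id format and the task-directory containment rule can be sketched as below. The helper names `run_id_for` and `task_output_path` are illustrative, not the `ArtifactStore` API, and the sketch raises `ValueError` where the real store raises `ArtifactError`.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def run_id_for(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC run id (YYYYMMDDTHHMMSSZ)."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def task_output_path(task_dir: Path, filename: str) -> Path:
    """Resolve filename under task_dir, rejecting absolute paths and escapes."""
    candidate = Path(filename)
    if candidate.is_absolute():
        raise ValueError(f"stage output filename must be relative: {filename}")
    resolved = (task_dir / candidate).resolve()
    try:
        resolved.relative_to(task_dir.resolve())
    except ValueError as exc:
        raise ValueError(f"stage output escapes task directory: {filename}") from exc
    return resolved


# A local timestamp at UTC-7 expressed as a run id.
local_time = datetime(2026, 5, 17, 0, 17, 13, tzinfo=timezone(timedelta(hours=-7)))
print(run_id_for(local_time))  # 20260517T071713Z
```

Passing an explicit `run_id` in tests avoids depending on the clock at all, which is the reason the constructor accepts one.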

docs/devlog/phase6.md Normal file

@ -0,0 +1,21 @@
# Phase 6 Devlog: Command Executor
## Implemented
- Added `nightshift/commands.py`.
- Added command-stage execution for configured `command` stages.
- Captured stdout, stderr, exit code, duration, and timeout state.
- Persisted command transcripts through the artifact store.
- Returned structured `StageResult` objects.
- Added tests for passing commands, failing commands, output persistence, and allowlist rejection.
## Decisions Made
- Commands are validated through the Phase 3 safety layer immediately before execution, even though config validation also checks them. This keeps command execution safe if called directly in later code.
- Command stages stop at the first failing or timed-out command and persist the output of every command that ran up to that point.
- Commands run with `shell=True` because v1 config stores commands as shell-style strings. This is constrained by exact allowlist matching and forbidden fragment checks.
- The default timeout is 300 seconds. Tests can override it later if timeout-specific behavior needs coverage.
## Notes
- This phase does not wire command execution into a full pipeline runner. That belongs to Phase 8.
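
The capture-and-stop behavior described above can be sketched as follows. This is a simplified illustration with hypothetical names (`run_command`, `run_until_failure`) that returns plain dicts rather than `StageResult` objects, skips the safety check, and discards partial output on timeout; the example commands assume a POSIX shell.

```python
import shlex
import subprocess
import sys
import time


def run_command(command: str, timeout_seconds: float = 300) -> dict:
    """Run one shell-style command and capture its output, exit code, and duration."""
    started = time.monotonic()
    try:
        completed = subprocess.run(
            command,
            shell=True,  # commands are stored as shell-style strings
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # Simplification: partial output is dropped in this sketch.
        return {"command": command, "exit_code": -1, "stdout": "", "stderr": "",
                "timed_out": True, "duration": time.monotonic() - started}
    return {"command": command, "exit_code": completed.returncode,
            "stdout": completed.stdout, "stderr": completed.stderr,
            "timed_out": False, "duration": time.monotonic() - started}


def run_until_failure(commands: list) -> list:
    """Run commands in order, stopping after the first failure or timeout."""
    runs = []
    for command in commands:
        run = run_command(command)
        runs.append(run)
        if run["timed_out"] or run["exit_code"] != 0:
            break
    return runs


python = shlex.quote(sys.executable)
passing = f"{python} -c \"print('hello')\""
failing = f"{python} -c \"import sys; sys.exit(3)\""
for run in run_until_failure([passing, failing, passing]):
    print(run["exit_code"], run["stdout"].strip())
```

The third command never runs because the second exits non-zero, so the returned list holds exactly the runs whose output would be persisted.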


@ -1,6 +1,6 @@
# NIGHTSHIFT_CODEX.md
-You are Codex working on **NightShift**, a local-first AI coding pipeline runner.
+You are Codex working on **NightShift**, a local-first AI coding pipeline runner in python.
This file is the implementation-driving context document. Treat it as the project brief, architectural guide, and task checklist.

nightshift/__init__.py Normal file

@ -0,0 +1,3 @@
"""NightShift package."""
__version__ = "0.1.0"

nightshift/artifacts.py Normal file

@ -0,0 +1,124 @@
"""Artifact storage for NightShift runs."""
from __future__ import annotations
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
import shutil
from .config import NightShiftConfig
from .errors import ArtifactError, SafetyError
from .safety import resolve_inside_root, resolve_project_root, safe_artifact_path
from .tasks import Task
@dataclass(frozen=True)
class TaskArtifactPaths:
task_id: str
directory: Path
task_snapshot: Path
class ArtifactStore:
"""Create and write the durable artifact tree for one run."""
def __init__(self, project_root: str | Path, artifact_dir: str | Path, run_id: str | None = None) -> None:
try:
self.project_root = resolve_project_root(project_root)
self.artifact_root = resolve_inside_root(
self.project_root, artifact_dir, "artifact directory"
)
except SafetyError as exc:
raise ArtifactError(str(exc)) from exc
self.run_id = run_id or default_run_id()
self.run_dir = self._artifact_path("runs", self.run_id)
self.tasks_dir = self.run_dir / "tasks"
self.project_context_path = self.artifact_root / "project-context.md"
self.run_summary_path = self.run_dir / "run-summary.md"
self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
@classmethod
def from_config(cls, config: NightShiftConfig, run_id: str | None = None) -> "ArtifactStore":
return cls(config.project.root, config.project.artifact_dir, run_id=run_id)
def initialize_run(self) -> None:
"""Create the base artifact tree for this run."""
self.artifact_root.mkdir(parents=True, exist_ok=True)
self.tasks_dir.mkdir(parents=True, exist_ok=True)
if not self.project_context_path.exists():
self.project_context_path.write_text("# Project Context\n\n", encoding="utf-8")
if not self.run_summary_path.exists():
self.run_summary_path.write_text("# Run Summary\n\n", encoding="utf-8")
def write_config_snapshot(self, config_path: str | Path) -> Path:
"""Copy the input config into the run artifact directory."""
self.initialize_run()
source = Path(config_path).resolve()
try:
source.relative_to(self.project_root)
except ValueError as exc:
raise ArtifactError(f"Artifact error: config path is outside project root: {source}") from exc
if not source.exists():
raise ArtifactError(f"Artifact error: config path does not exist: {source}")
shutil.copyfile(source, self.config_snapshot_path)
return self.config_snapshot_path
def create_task_dir(self, task_id: str) -> TaskArtifactPaths:
"""Create the artifact directory for one task."""
self.initialize_run()
task_dir = self._artifact_path("runs", self.run_id, "tasks", task_id)
task_dir.mkdir(parents=True, exist_ok=True)
return TaskArtifactPaths(
task_id=task_id,
directory=task_dir,
task_snapshot=task_dir / "task.md",
)
def write_task_snapshot(self, task: Task) -> Path:
paths = self.create_task_dir(task.id)
paths.task_snapshot.write_text(task.raw_markdown, encoding="utf-8")
return paths.task_snapshot
def write_stage_output(self, task_id: str, filename: str, content: str) -> Path:
"""Write a stage artifact under a task directory."""
task_dir = self.create_task_dir(task_id).directory
output_path = self._task_artifact_path(task_dir, filename)
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(content, encoding="utf-8")
return output_path
def write_command_output(self, task_id: str, filename: str, content: str) -> Path:
return self.write_stage_output(task_id, filename, content)
def write_final_task_notes(self, task_id: str, content: str, filename: str = "final-notes.md") -> Path:
return self.write_stage_output(task_id, filename, content)
def _artifact_path(self, *parts: str | Path) -> Path:
try:
return safe_artifact_path(self.project_root, self.artifact_root, *parts)
except SafetyError as exc:
raise ArtifactError(str(exc)) from exc
def _task_artifact_path(self, task_dir: Path, filename: str) -> Path:
candidate = Path(filename)
if candidate.is_absolute():
raise ArtifactError(f"Artifact error: stage output filename must be relative: {filename}")
resolved = (task_dir / candidate).resolve()
try:
resolved.relative_to(task_dir.resolve())
except ValueError as exc:
raise ArtifactError(f"Artifact error: stage output escapes task directory: {filename}") from exc
return resolved
def default_run_id(now: datetime | None = None) -> str:
"""Return a filesystem-friendly UTC run id."""
value = now or datetime.now(timezone.utc)
return value.strftime("%Y%m%dT%H%M%SZ")

nightshift/cli.py Normal file

@ -0,0 +1,68 @@
"""Command line interface for NightShift."""

from __future__ import annotations

import argparse
from pathlib import Path
import sys

from .config import validate_config
from .errors import NightShiftError
from .init import init_project
from .tasks import parse_task_file


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="nightshift", description="Auditable AI pipeline runner.")
    parser.add_argument("--version", action="version", version="nightshift 0.1.0")
    subparsers = parser.add_subparsers(dest="command", required=True)

    init_parser = subparsers.add_parser("init", help="Create starter NightShift files.")
    init_parser.add_argument("--root", default=".", help="Directory to initialize.")
    init_parser.add_argument("--force", action="store_true", help="Overwrite existing starter files.")

    validate_parser = subparsers.add_parser("validate", help="Validate nightshift.yaml.")
    validate_parser.add_argument("--config", default="nightshift.yaml", help="Config file to validate.")

    subparsers.add_parser("run", help="Pipeline execution is planned for a later phase.")
    subparsers.add_parser("status", help="Status reporting is planned for a later phase.")
    return parser


def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)

    try:
        if args.command == "init":
            written = init_project(Path(args.root), force=args.force)
            print("Created NightShift starter files:")
            for path in written:
                print(f"- {path}")
            return 0

        if args.command == "validate":
            config = validate_config(args.config)
            tasks = parse_task_file(config.project.root, config.project.task_file)
            incomplete = sum(1 for task in tasks if not task.completed)
            print(f"Config valid: {config.path}")
            print(f"Project: {config.project.name}")
            print(f"Stages: {len(config.pipeline.stages)}")
            print(f"Tasks: {len(tasks)}")
            print(f"Incomplete tasks: {incomplete}")
            return 0

        if args.command in {"run", "status"}:
            parser.error(f"'{args.command}' is not implemented yet.")
    except NightShiftError as exc:
        print(str(exc), file=sys.stderr)
        return 1

    return 0


if __name__ == "__main__":
    raise SystemExit(main())

nightshift/commands.py Normal file

@ -0,0 +1,148 @@
"""Command stage execution."""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
import subprocess
import time
from .artifacts import ArtifactStore
from .config import SafetyConfig, StageConfig
from .errors import CommandError, SafetyError
from .safety import ensure_command_allowed, resolve_project_root
from .stages import StageResult
DEFAULT_COMMAND_TIMEOUT_SECONDS = 300
@dataclass(frozen=True)
class CommandRun:
command: str
exit_code: int
stdout: str
stderr: str
duration_seconds: float
timed_out: bool = False
class CommandExecutor:
"""Run configured command stages and persist their output."""
def __init__(
self,
project_root: str | Path,
safety: SafetyConfig,
artifacts: ArtifactStore,
timeout_seconds: int = DEFAULT_COMMAND_TIMEOUT_SECONDS,
) -> None:
self.project_root = resolve_project_root(project_root)
self.safety = safety
self.artifacts = artifacts
self.timeout_seconds = timeout_seconds
def run_stage(self, stage: StageConfig, task_id: str) -> StageResult:
if stage.type != "command":
raise CommandError(
f"Command error: stage '{stage.id}' has type '{stage.type}', expected 'command'."
)
if not stage.commands:
raise CommandError(f"Command error: stage '{stage.id}' has no commands.")
runs: list[CommandRun] = []
status = "pass"
reason = "All commands passed."
for command in stage.commands:
run = self.run_command(command)
runs.append(run)
if run.timed_out:
status = "fail"
reason = f"Command timed out after {self.timeout_seconds}s: {run.command}"
break
if run.exit_code != 0:
status = "fail"
reason = f"Command exited with code {run.exit_code}: {run.command}"
break
output_filename = stage.output or f"{stage.id}-output.txt"
output_path = self.artifacts.write_command_output(
task_id,
output_filename,
format_command_runs(stage.id, runs),
)
return StageResult(
stage_id=stage.id,
status=status, # type: ignore[arg-type]
reason=reason,
output_path=str(output_path.relative_to(self.project_root)),
)
def run_command(self, command: str) -> CommandRun:
try:
normalized = ensure_command_allowed(
command,
self.safety.allowed_commands,
self.safety.forbidden_commands,
)
except SafetyError as exc:
raise CommandError(str(exc)) from exc
started = time.monotonic()
try:
completed = subprocess.run(
normalized,
cwd=self.project_root,
shell=True,
capture_output=True,
text=True,
timeout=self.timeout_seconds,
)
duration = time.monotonic() - started
return CommandRun(
command=normalized,
exit_code=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
duration_seconds=duration,
)
except subprocess.TimeoutExpired as exc:
duration = time.monotonic() - started
return CommandRun(
command=normalized,
exit_code=-1,
stdout=exc.stdout or "",
stderr=exc.stderr or "",
duration_seconds=duration,
timed_out=True,
)
def format_command_runs(stage_id: str, runs: list[CommandRun]) -> str:
lines = [f"# Command Output: {stage_id}", ""]
for index, run in enumerate(runs, start=1):
lines.extend(
[
f"## Command {index}",
"",
f"Command: `{run.command}`",
f"Exit code: {run.exit_code}",
f"Duration seconds: {run.duration_seconds:.3f}",
f"Timed out: {str(run.timed_out).lower()}",
"",
"### stdout",
"",
"```text",
run.stdout.rstrip(),
"```",
"",
"### stderr",
"",
"```text",
run.stderr.rstrip(),
"```",
"",
]
)
return "\n".join(lines)

nightshift/config.py Normal file

@ -0,0 +1,407 @@
"""Typed NightShift configuration loading and validation."""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
import re
from typing import Any
from .errors import ConfigError
from .errors import SafetyError
from .safety import (
ensure_command_allowed,
resolve_inside_root,
resolve_project_root,
safe_artifact_path,
validate_scoped_paths,
)
@dataclass(frozen=True)
class ProjectConfig:
name: str
root: Path
task_file: Path
artifact_dir: Path
@dataclass(frozen=True)
class SafetyConfig:
require_clean_worktree: bool
scoped_paths: tuple[str, ...]
allowed_commands: tuple[str, ...]
forbidden_commands: tuple[str, ...]
@dataclass(frozen=True)
class AgentConfig:
id: str
backend: str
command: str | None
system_prompt: Path
model: str | None = None
role: str | None = None
@dataclass(frozen=True)
class StageConfig:
id: str
type: str
agent: str | None = None
commands: tuple[str, ...] = ()
output: str | None = None
on_fail: str | None = None
@dataclass(frozen=True)
class PipelineConfig:
max_task_retries: int
stages: tuple[StageConfig, ...]
@dataclass(frozen=True)
class NightShiftConfig:
path: Path
project: ProjectConfig
safety: SafetyConfig
agents: dict[str, AgentConfig]
pipeline: PipelineConfig
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
COMMAND_STAGE_TYPES = {"command"}
SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {"summarize"}
def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
"""Load and validate a NightShift YAML config file."""
config_path = Path(path).resolve()
if not config_path.exists():
raise ConfigError(f"Config file not found: {config_path}")
raw = _load_yaml_mapping(config_path)
return parse_config(raw, config_path)
def validate_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
"""Load a config and validate referenced local files."""
config = load_config(path)
try:
root = resolve_project_root(config.project.root)
safe_artifact_path(root, config.project.artifact_dir)
validate_scoped_paths(root, config.safety.scoped_paths)
except SafetyError as exc:
raise ConfigError(str(exc)) from exc
task_file = resolve_inside_root(root, config.project.task_file, "project.task_file")
if not task_file.exists():
raise ConfigError(f"Config error: task file does not exist: {task_file}")
for agent in config.agents.values():
prompt = resolve_inside_root(root, agent.system_prompt, f"agents.{agent.id}.system_prompt")
if not prompt.exists():
raise ConfigError(
"Config error: agent "
f"'{agent.id}' system prompt does not exist: {agent.system_prompt}"
)
for stage in config.pipeline.stages:
for command in stage.commands:
try:
ensure_command_allowed(
command,
config.safety.allowed_commands,
config.safety.forbidden_commands,
)
except SafetyError as exc:
raise ConfigError(f"Config error: stage '{stage.id}' {exc}") from exc
return config
def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
"""Convert a raw mapping into typed config objects."""
_require_mapping(raw, "config")
for section in ("project", "safety", "agents", "pipeline"):
if section not in raw:
raise ConfigError(f"Config error: missing required section '{section}'.")
project_raw = _require_mapping(raw["project"], "project")
project_name = _require_string(project_raw, "name", "project")
project_root_value = _require_string(project_raw, "root", "project")
project_root = (config_path.parent / project_root_value).resolve()
project = ProjectConfig(
name=project_name,
root=project_root,
task_file=Path(_require_string(project_raw, "task_file", "project")),
artifact_dir=Path(_require_string(project_raw, "artifact_dir", "project")),
)
safety_raw = _require_mapping(raw["safety"], "safety")
safety = SafetyConfig(
require_clean_worktree=bool(safety_raw.get("require_clean_worktree", False)),
scoped_paths=_string_tuple(safety_raw.get("scoped_paths", []), "safety.scoped_paths"),
allowed_commands=_string_tuple(safety_raw.get("allowed_commands", []), "safety.allowed_commands"),
forbidden_commands=_string_tuple(
safety_raw.get("forbidden_commands", []), "safety.forbidden_commands"
),
)
agents_raw = _require_mapping(raw["agents"], "agents")
if not agents_raw:
raise ConfigError("Config error: at least one agent must be defined.")
agents: dict[str, AgentConfig] = {}
for agent_id, agent_raw_value in agents_raw.items():
agent_raw = _require_mapping(agent_raw_value, f"agents.{agent_id}")
backend = _require_string(agent_raw, "backend", f"agents.{agent_id}")
command = _optional_string(agent_raw.get("command"), f"agents.{agent_id}.command")
system_prompt = Path(_require_string(agent_raw, "system_prompt", f"agents.{agent_id}"))
agents[str(agent_id)] = AgentConfig(
id=str(agent_id),
backend=backend,
command=command,
system_prompt=system_prompt,
model=_optional_string(agent_raw.get("model"), f"agents.{agent_id}.model"),
role=_optional_string(agent_raw.get("role"), f"agents.{agent_id}.role"),
)
pipeline_raw = _require_mapping(raw["pipeline"], "pipeline")
max_task_retries = int(pipeline_raw.get("max_task_retries", 0))
if max_task_retries < 0:
raise ConfigError("Config error: pipeline.max_task_retries must be zero or greater.")
stages_raw = pipeline_raw.get("stages")
if not isinstance(stages_raw, list) or not stages_raw:
raise ConfigError("Config error: pipeline.stages must be a non-empty list.")
stages: list[StageConfig] = []
seen_stage_ids: set[str] = set()
for index, stage_raw_value in enumerate(stages_raw):
stage_context = f"pipeline.stages[{index}]"
stage_raw = _require_mapping(stage_raw_value, stage_context)
stage_id = _require_string(stage_raw, "id", stage_context)
if stage_id in seen_stage_ids:
raise ConfigError(f"Config error: duplicate pipeline stage id '{stage_id}'.")
seen_stage_ids.add(stage_id)
stage_type = _require_string(stage_raw, "type", stage_context)
if stage_type not in SUPPORTED_STAGE_TYPES:
supported = ", ".join(sorted(SUPPORTED_STAGE_TYPES))
raise ConfigError(
f"Config error: stage '{stage_id}' has unsupported type '{stage_type}'. "
f"Supported types: {supported}."
)
agent = _optional_string(stage_raw.get("agent"), f"{stage_context}.agent")
commands = _string_tuple(stage_raw.get("commands", []), f"{stage_context}.commands")
if stage_type in AGENT_STAGE_TYPES:
if agent is None:
raise ConfigError(f"Config error: agent stage '{stage_id}' must reference an agent.")
if agent not in agents:
defined = ", ".join(sorted(agents))
raise ConfigError(
f"Config error: pipeline stage '{stage_id}' references unknown agent "
f"'{agent}'. Defined agents: {defined}."
)
if stage_type in COMMAND_STAGE_TYPES and not commands:
raise ConfigError(f"Config error: command stage '{stage_id}' must define commands.")
stages.append(
StageConfig(
id=stage_id,
type=stage_type,
agent=agent,
commands=commands,
output=_optional_string(stage_raw.get("output"), f"{stage_context}.output"),
on_fail=_optional_string(stage_raw.get("on_fail"), f"{stage_context}.on_fail"),
)
)
stage_ids = {stage.id for stage in stages}
for stage in stages:
if stage.on_fail and stage.on_fail not in stage_ids:
raise ConfigError(
f"Config error: stage '{stage.id}' on_fail references unknown stage '{stage.on_fail}'."
)
return NightShiftConfig(
path=config_path,
project=project,
safety=safety,
agents=agents,
pipeline=PipelineConfig(max_task_retries=max_task_retries, stages=tuple(stages)),
)
def _load_yaml_mapping(path: Path) -> dict[str, Any]:
text = path.read_text(encoding="utf-8")
try:
import yaml # type: ignore[import-not-found]
except ModuleNotFoundError:
data = _parse_simple_yaml(text)
else:
data = yaml.safe_load(text)
if data is None:
data = {}
if not isinstance(data, dict):
raise ConfigError("Config error: top-level YAML value must be a mapping.")
return data
def _parse_simple_yaml(text: str) -> dict[str, Any]:
"""Parse the small YAML subset used by NightShift starter configs.
PyYAML is used when available. This fallback keeps `nightshift init` and
`nightshift validate` usable in a fresh checkout with only the stdlib.
"""
lines = []
for line_number, raw_line in enumerate(text.splitlines(), start=1):
without_comment = raw_line.split("#", 1)[0].rstrip()
if without_comment.strip():
indent = len(without_comment) - len(without_comment.lstrip(" "))
lines.append((line_number, indent, without_comment.strip()))
index = 0
def parse_block(expected_indent: int) -> Any:
nonlocal index
if index >= len(lines):
return {}
_, current_indent, content = lines[index]
if current_indent < expected_indent:
return {}
if current_indent != expected_indent:
line_number = lines[index][0]
raise ConfigError(f"Config error: invalid indentation near line {line_number}.")
if content.startswith("- "):
sequence: list[Any] = []
while index < len(lines):
line_number, indent, item = lines[index]
if indent < expected_indent:
break
if indent != expected_indent or not item.startswith("- "):
break
item_content = item[2:].strip()
index += 1
if not item_content:
sequence.append(parse_block(expected_indent + 2))
elif _looks_like_key_value(item_content):
key, value = _split_key_value(item_content, line_number)
mapping: dict[str, Any] = {}
mapping[key] = (
parse_block(expected_indent + 2)
if value == ""
else _parse_scalar(value)
)
while index < len(lines):
_, child_indent, child_content = lines[index]
if child_indent <= expected_indent:
break
if child_indent != expected_indent + 2:
child_line = lines[index][0]
raise ConfigError(
f"Config error: invalid indentation near line {child_line}."
)
if child_content.startswith("- "):
break
child_key, child_value = _split_key_value(child_content, lines[index][0])
index += 1
mapping[child_key] = (
parse_block(expected_indent + 4)
if child_value == ""
else _parse_scalar(child_value)
)
sequence.append(mapping)
else:
sequence.append(_parse_scalar(item_content))
return sequence
mapping: dict[str, Any] = {}
while index < len(lines):
line_number, indent, item = lines[index]
if indent < expected_indent:
break
if indent != expected_indent:
break
if item.startswith("- "):
break
key, value = _split_key_value(item, line_number)
index += 1
mapping[key] = parse_block(expected_indent + 2) if value == "" else _parse_scalar(value)
return mapping
parsed = parse_block(0)
if not isinstance(parsed, dict):
raise ConfigError("Config error: top-level YAML value must be a mapping.")
return parsed
def _looks_like_key_value(value: str) -> bool:
return bool(re.match(r"^[A-Za-z0-9_-]+:", value))
def _split_key_value(value: str, line_number: int) -> tuple[str, str]:
if ":" not in value:
raise ConfigError(f"Config error: expected key/value pair near line {line_number}.")
key, raw_value = value.split(":", 1)
key = key.strip()
if not key:
raise ConfigError(f"Config error: empty key near line {line_number}.")
return key, raw_value.strip()
def _parse_scalar(value: str) -> Any:
if value in {"true", "True"}:
return True
if value in {"false", "False"}:
return False
if value in {"null", "Null", "~"}:
return None
if re.fullmatch(r"-?\d+", value):
return int(value)
if (value.startswith('"') and value.endswith('"')) or (
value.startswith("'") and value.endswith("'")
):
return value[1:-1]
return value
def _require_mapping(value: Any, context: str) -> dict[str, Any]:
if not isinstance(value, dict):
raise ConfigError(f"Config error: '{context}' must be a mapping.")
return value
def _require_string(mapping: dict[str, Any], key: str, context: str) -> str:
if key not in mapping:
raise ConfigError(f"Config error: missing required key '{context}.{key}'.")
value = mapping[key]
if not isinstance(value, str) or not value:
raise ConfigError(f"Config error: '{context}.{key}' must be a non-empty string.")
return value
def _optional_string(value: Any, context: str) -> str | None:
if value is None:
return None
if not isinstance(value, str) or not value:
raise ConfigError(f"Config error: '{context}' must be a non-empty string when set.")
return value
def _string_tuple(value: Any, context: str) -> tuple[str, ...]:
if value is None:
return ()
if not isinstance(value, list) or not all(isinstance(item, str) and item for item in value):
raise ConfigError(f"Config error: '{context}' must be a list of non-empty strings.")
return tuple(value)

nightshift/errors.py Normal file

@ -0,0 +1,29 @@
"""Project-specific exceptions."""


class NightShiftError(Exception):
    """Base exception for NightShift failures."""


class ConfigError(NightShiftError):
    """Raised when a NightShift config is missing or invalid."""


class InitError(NightShiftError):
    """Raised when project initialization cannot proceed."""


class SafetyError(NightShiftError):
    """Raised when a path or command violates configured safety rules."""


class TaskError(NightShiftError):
    """Raised when task parsing or selection fails."""


class ArtifactError(NightShiftError):
    """Raised when artifact storage cannot proceed safely."""


class CommandError(NightShiftError):
    """Raised when command stage execution cannot proceed."""

nightshift/init.py Normal file

@ -0,0 +1,43 @@
"""Project initialization helpers."""

from __future__ import annotations

from pathlib import Path

from .errors import InitError
from . import templates


STARTER_FILES = {
    "nightshift.yaml": templates.NIGHTSHIFT_YAML,
    "tasks.md": templates.TASKS_MD,
    "agents/planner.md": templates.PLANNER_PROMPT,
    "agents/implementer.md": templates.IMPLEMENTER_PROMPT,
    "agents/reviewer.md": templates.REVIEWER_PROMPT,
}


def init_project(root: Path, force: bool = False) -> list[Path]:
    """Create starter NightShift files under root.

    Existing files are left untouched unless force is true.
    """
    root = root.resolve()
    targets = [root / relative for relative in STARTER_FILES]
    existing = [path for path in targets if path.exists()]
    if existing and not force:
        formatted = ", ".join(str(path.relative_to(root)) for path in existing)
        raise InitError(
            "Initialization would overwrite existing files. "
            f"Use --force to replace: {formatted}"
        )

    written: list[Path] = []
    for relative, content in STARTER_FILES.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written

119
nightshift/safety.py Normal file

@ -0,0 +1,119 @@
"""Safety helpers for paths and commands."""

from __future__ import annotations

from pathlib import Path

from .errors import SafetyError


def resolve_project_root(root: str | Path) -> Path:
    """Resolve and validate a project root directory."""
    resolved = Path(root).resolve()
    if not resolved.exists():
        raise SafetyError(f"Safety error: project root does not exist: {resolved}")
    if not resolved.is_dir():
        raise SafetyError(f"Safety error: project root is not a directory: {resolved}")
    return resolved


def resolve_inside_root(root: str | Path, path: str | Path, context: str = "path") -> Path:
    """Resolve a path and reject values outside the project root."""
    resolved_root = resolve_project_root(root)
    candidate = Path(path)
    resolved = candidate.resolve() if candidate.is_absolute() else (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError as exc:
        raise SafetyError(
            f"Safety error: {context} resolves outside project root: {path}"
        ) from exc
    return resolved


def validate_scoped_paths(root: str | Path, scoped_paths: list[str] | tuple[str, ...]) -> tuple[Path, ...]:
    """Validate that every configured scoped path remains inside the root."""
    return tuple(
        resolve_inside_root(root, scoped_path, f"scoped path '{scoped_path}'")
        for scoped_path in scoped_paths
    )


def safe_artifact_path(
    root: str | Path,
    artifact_dir: str | Path,
    *parts: str | Path,
    create_parent: bool = False,
) -> Path:
    """Build an artifact path that cannot escape the configured artifact tree."""
    artifact_root = resolve_inside_root(root, artifact_dir, "artifact directory")
    path = artifact_root
    for part in parts:
        candidate = Path(part)
        if candidate.is_absolute():
            raise SafetyError(f"Safety error: artifact path segment must be relative: {part}")
        path = path / candidate

    resolved = path.resolve()
    try:
        resolved.relative_to(artifact_root)
    except ValueError as exc:
        raise SafetyError(f"Safety error: artifact path escapes artifact directory: {path}") from exc

    if create_parent:
        resolved.parent.mkdir(parents=True, exist_ok=True)
    return resolved


def normalize_command(command: str) -> str:
    """Normalize command whitespace for safety comparisons."""
    return " ".join(command.strip().split())


def ensure_command_allowed(
    command: str,
    allowed_commands: list[str] | tuple[str, ...],
    forbidden_commands: list[str] | tuple[str, ...],
) -> str:
    """Validate one command against forbidden fragments and an exact allowlist."""
    if not isinstance(command, str) or not command.strip():
        raise SafetyError("Safety error: command must be a non-empty string.")

    normalized = normalize_command(command)
    lowered = normalized.lower()
    for fragment in forbidden_commands:
        normalized_fragment = normalize_command(fragment).lower()
        if normalized_fragment and normalized_fragment in lowered:
            raise SafetyError(
                f"Safety error: command contains forbidden fragment '{fragment}': {command}"
            )

    allowed = {normalize_command(item) for item in allowed_commands}
    if normalized not in allowed:
        allowed_display = ", ".join(sorted(allowed)) or "<none>"
        raise SafetyError(
            f"Safety error: command is not allowlisted: {command}. "
            f"Allowed commands: {allowed_display}."
        )
    return normalized


def validate_stage_commands(
    commands: list[str] | tuple[str, ...],
    allowed_commands: list[str] | tuple[str, ...],
    forbidden_commands: list[str] | tuple[str, ...],
) -> tuple[str, ...]:
    """Validate each command in a command stage."""
    return tuple(
        ensure_command_allowed(command, allowed_commands, forbidden_commands)
        for command in commands
    )
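
The two checks in this module, path containment and command allowlisting, can be tried out without installing the package. The sketch below is a simplified restatement, not NightShift code: the helper names `inside_root`, `normalize`, and `is_allowed` are hypothetical, and they return booleans where the real functions raise `SafetyError`.

```python
from pathlib import Path
import tempfile


def inside_root(root: Path, candidate: str) -> bool:
    """Return True when candidate, resolved against root, stays inside root."""
    resolved_root = root.resolve()
    resolved = (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError:
        return False
    return True


def normalize(command: str) -> str:
    """Collapse runs of whitespace so comparisons ignore spacing."""
    return " ".join(command.split())


def is_allowed(command: str, allowed: tuple, forbidden: tuple) -> bool:
    """Reject forbidden fragments first, then require an exact allowlist match."""
    normalized = normalize(command)
    lowered = normalized.lower()
    for fragment in forbidden:
        needle = normalize(fragment).lower()
        if needle and needle in lowered:
            return False
    return normalized in {normalize(item) for item in allowed}


with tempfile.TemporaryDirectory() as directory:
    root = Path(directory)
    print(inside_root(root, "src/module.py"))
    print(inside_root(root, "../outside.txt"))

print(is_allowed("python  -m   unittest", ("python -m unittest",), ("rm -rf",)))
print(is_allowed("echo ok && rm -rf build", ("echo ok && rm -rf build",), ("rm -rf",)))
```

The last call shows why the order of checks matters: a command that appears verbatim in the allowlist is still rejected when it contains a forbidden fragment.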

19
nightshift/stages.py Normal file

@ -0,0 +1,19 @@
"""Shared stage result types."""

from __future__ import annotations

from dataclasses import dataclass
from typing import Literal


StageStatus = Literal["pass", "fail", "retry", "escalate"]


@dataclass(frozen=True)
class StageResult:
    stage_id: str
    status: StageStatus
    reason: str
    output_path: str | None = None
    next_stage: str | None = None
    context_update: str | None = None
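
Because `StageResult` is a frozen dataclass, a result cannot be edited after it is created; a changed copy has to be built instead. A minimal sketch of that behavior follows, using a local copy of the class (written with `Optional` so it also runs on older interpreters). The stage ids and reason text are made-up sample values.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Literal, Optional

StageStatus = Literal["pass", "fail", "retry", "escalate"]


@dataclass(frozen=True)
class StageResult:
    stage_id: str
    status: StageStatus
    reason: str
    output_path: Optional[str] = None
    next_stage: Optional[str] = None
    context_update: Optional[str] = None


result = StageResult(stage_id="test", status="fail", reason="exit code 7")

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    result.status = "pass"  # type: ignore[misc]
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# dataclasses.replace builds a new instance with selected fields changed.
retry = replace(result, status="retry", next_stage="implement")
print(mutated, result.status, retry.status, retry.next_stage)
```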

163
nightshift/tasks.py Normal file

@ -0,0 +1,163 @@
"""Markdown task parsing and selection."""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
import re

from .errors import SafetyError, TaskError
from .safety import resolve_inside_root


TASK_HEADER_RE = re.compile(r"^\s*-\s+\[(?P<mark>[ xX])\]\s+(?P<id>[A-Z]+-\d+):\s+(?P<title>.+?)\s*$")
CHECKBOX_RE = re.compile(r"^\s*-\s+\[[^\]]*\]")
SECTION_RE = re.compile(r"^(?P<name>[A-Za-z][A-Za-z ]+):\s*$")


@dataclass(frozen=True)
class Task:
    id: str
    title: str
    completed: bool
    description: str
    acceptance_criteria: tuple[str, ...]
    dependencies: tuple[str, ...]
    raw_markdown: str
    line_number: int


def parse_task_file(project_root: str | Path, task_file: str | Path) -> list[Task]:
    """Load and parse a task markdown file inside the project root."""
    try:
        path = resolve_inside_root(project_root, task_file, "task file")
    except SafetyError as exc:
        raise TaskError(str(exc)) from exc

    if not path.exists():
        raise TaskError(f"Task error: task file does not exist: {path}")
    return parse_tasks(path.read_text(encoding="utf-8"))


def parse_tasks(markdown: str) -> list[Task]:
    """Parse NightShift's documented markdown checklist task format."""
    lines = markdown.splitlines()
    tasks: list[Task] = []
    seen_ids: set[str] = set()
    index = 0

    while index < len(lines):
        line = lines[index]
        header = TASK_HEADER_RE.match(line)
        if not header:
            if CHECKBOX_RE.match(line):
                raise TaskError(
                    f"Task error: malformed task header on line {index + 1}. "
                    "Expected '- [ ] TASK-001: Task title'."
                )
            index += 1
            continue

        task_id = header.group("id")
        if task_id in seen_ids:
            raise TaskError(f"Task error: duplicate task id '{task_id}' on line {index + 1}.")
        seen_ids.add(task_id)

        start = index
        index += 1
        while index < len(lines) and not TASK_HEADER_RE.match(lines[index]):
            if CHECKBOX_RE.match(lines[index]):
                raise TaskError(
                    f"Task error: malformed task header on line {index + 1}. "
                    "Expected '- [ ] TASK-001: Task title'."
                )
            index += 1

        block = lines[start:index]
        description = _extract_section(block, "Description")
        acceptance_criteria = tuple(_extract_bullets(block, "Acceptance Criteria"))
        dependencies = tuple(_extract_bullets(block, "Dependencies"))
        if not acceptance_criteria:
            raise TaskError(
                f"Task error: task '{task_id}' is missing Acceptance Criteria bullets."
            )

        tasks.append(
            Task(
                id=task_id,
                title=header.group("title"),
                completed=header.group("mark").lower() == "x",
                description=description,
                acceptance_criteria=acceptance_criteria,
                dependencies=dependencies,
                raw_markdown="\n".join(block).strip() + "\n",
                line_number=start + 1,
            )
        )

    if not tasks:
        raise TaskError("Task error: no tasks found. Expected '- [ ] TASK-001: Task title'.")
    return tasks


def select_next_incomplete_task(tasks: list[Task] | tuple[Task, ...]) -> Task:
    """Return the first incomplete task in file order."""
    for task in tasks:
        if not task.completed:
            return task
    raise TaskError("Task error: no incomplete tasks found.")


def select_task_by_id(tasks: list[Task] | tuple[Task, ...], task_id: str) -> Task:
    """Return a task by id."""
    for task in tasks:
        if task.id == task_id:
            return task
    available = ", ".join(task.id for task in tasks) or "<none>"
    raise TaskError(f"Task error: unknown task id '{task_id}'. Available tasks: {available}.")


def _extract_section(block: list[str], section_name: str) -> str:
    start = _find_section_index(block, section_name)
    if start is None:
        return ""

    collected: list[str] = []
    for line in block[start + 1 :]:
        if SECTION_RE.match(line.strip()):
            break
        collected.append(line)
    return "\n".join(collected).strip()


def _extract_bullets(block: list[str], section_name: str) -> list[str]:
    start = _find_section_index(block, section_name)
    if start is None:
        return []

    bullets: list[str] = []
    for line in block[start + 1 :]:
        stripped = line.strip()
        if SECTION_RE.match(stripped):
            break
        if stripped.startswith("- "):
            value = stripped[2:].strip()
            if value:
                bullets.append(value)
    return bullets


def _find_section_index(block: list[str], section_name: str) -> int | None:
    expected = f"{section_name}:".lower()
    for index, line in enumerate(block):
        if line.strip().lower() == expected:
            return index
    return None
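
The parser decides what each line is by trying `TASK_HEADER_RE` first and falling back to `CHECKBOX_RE`, which is how a checkbox without a valid id ends up reported as a malformed header. The sketch below copies the two patterns from this module and wraps them in a hypothetical `classify` helper (not part of NightShift) to show the three outcomes on sample lines.

```python
import re

# Patterns copied from nightshift/tasks.py.
TASK_HEADER_RE = re.compile(
    r"^\s*-\s+\[(?P<mark>[ xX])\]\s+(?P<id>[A-Z]+-\d+):\s+(?P<title>.+?)\s*$"
)
CHECKBOX_RE = re.compile(r"^\s*-\s+\[[^\]]*\]")


def classify(line: str) -> str:
    """Label a line as a task header, a malformed checkbox, or other text."""
    if TASK_HEADER_RE.match(line):
        return "header"
    if CHECKBOX_RE.match(line):
        return "malformed"
    return "other"


for sample in (
    "- [ ] TASK-001: Add your first NightShift task",
    "- [X] BUG-42: Fix crash",
    "- [ ] Add YAML config loading",
    "  - The expected behavior is clear",
):
    print(f"{classify(sample):10} {sample!r}")
```

Note that any uppercase prefix followed by a dash and digits is accepted as an id, so `BUG-42` is a valid header even though the error message only mentions `TASK-001`.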

124
nightshift/templates.py Normal file

@ -0,0 +1,124 @@
"""Built-in starter file templates for `nightshift init`."""

NIGHTSHIFT_YAML = """project:
  name: example-project
  root: .
  task_file: tasks.md
  artifact_dir: .nightshift

safety:
  require_clean_worktree: false
  scoped_paths:
    - .
  allowed_commands:
    - python -m unittest
  forbidden_commands:
    - rm -rf
    - git push
    - curl | bash

agents:
  planner:
    backend: command
    command: echo
    system_prompt: agents/planner.md

  implementer:
    backend: command
    command: echo
    system_prompt: agents/implementer.md

  reviewer:
    backend: command
    command: echo
    system_prompt: agents/reviewer.md

pipeline:
  max_task_retries: 3
  stages:
    - id: plan
      type: agent
      agent: planner
      output: plan.md

    - id: review_plan
      type: agent_review
      agent: reviewer
      on_fail: plan
      output: plan-review.md

    - id: implement
      type: agent
      agent: implementer
      output: implementation-log.md

    - id: test
      type: command
      commands:
        - python -m unittest
      output: test-output.txt

    - id: review
      type: agent_review
      agent: reviewer
      on_fail: implement
      output: review.md

    - id: summarize
      type: summarize
      output: final-notes.md
"""

TASKS_MD = """# Tasks

- [ ] TASK-001: Add your first NightShift task

  Description:
  Describe the coding task NightShift should work on.

  Acceptance Criteria:
  - The expected behavior is clear
  - The task can be reviewed from generated artifacts
"""

PLANNER_PROMPT = """# Planner

You are the planning agent for NightShift.

Create a conservative implementation plan for one coding task.

Rules:
- Do not write code.
- Identify relevant files.
- Preserve existing behavior.
- Prefer small changes.
- Include test strategy.
- Include risks.
"""

IMPLEMENTER_PROMPT = """# Implementer

You are the implementation agent for NightShift.

Implement the approved plan inside the scoped project directory.

Rules:
- Make the smallest correct change.
- Do not edit files outside scope.
- Preserve existing style.
- Write useful implementation notes.
"""

REVIEWER_PROMPT = """# Reviewer

You are the review agent for NightShift.

Decide whether the current task should pass, fail, retry (implementation or planning), or escalate.

Output exactly:

status: pass | fail | retry | escalate
reason: <short explanation>
next_stage: <optional stage id>
context_update: <compact useful note>
"""

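The reviewer prompt asks for four `key: value` lines, which makes the reply easy to read back into structured fields. This commit does not show how NightShift consumes that reply, so the following is only a sketch under that assumption: `parse_review` and `ALLOWED_STATUSES` are hypothetical names, and the sample reply is invented.

```python
# Invented sample reply in the format the reviewer prompt requests.
REVIEWER_OUTPUT = """status: retry
reason: tests fail on empty input
next_stage: implement
context_update: handle the empty-list case
"""

ALLOWED_STATUSES = {"pass", "fail", "retry", "escalate"}


def parse_review(text: str) -> dict:
    """Hypothetical helper: read 'key: value' lines and validate the status."""
    fields = {}
    for line in text.splitlines():
        # partition splits on the first colon only, so values may contain colons.
        key, separator, value = line.partition(":")
        if separator and key.strip():
            fields[key.strip()] = value.strip()
    if fields.get("status") not in ALLOWED_STATUSES:
        raise ValueError(f"unexpected status: {fields.get('status')!r}")
    return fields


parsed = parse_review(REVIEWER_OUTPUT)
print(parsed["status"], parsed["next_stage"])
```

Rejecting an unknown status up front keeps a malformed reply from being treated as a pass.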
20
pyproject.toml Normal file

@ -0,0 +1,20 @@
[build-system]
# The SPDX string form of `license` below (PEP 639) needs setuptools >= 77.
requires = ["setuptools>=77"]
build-backend = "setuptools.build_meta"

[project]
name = "nightshift"
version = "0.1.0"
description = "Auditable local-first AI coding pipelines."
readme = "README.md"
requires-python = ">=3.11"
license = "GPL-3.0-only"
authors = [
    { name = "K455" }
]

[project.scripts]
nightshift = "nightshift.cli:main"

[tool.setuptools.packages.find]
include = ["nightshift*"]

1
tests/__init__.py Normal file

@ -0,0 +1 @@
"""NightShift test suite."""

56
tests/test_artifacts.py Normal file

@ -0,0 +1,56 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.artifacts import ArtifactStore
from nightshift.errors import ArtifactError
from nightshift.init import init_project
from nightshift.tasks import parse_task_file


class ArtifactStoreTests(unittest.TestCase):
    def test_initialize_run_creates_base_artifact_tree(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            store = ArtifactStore(root, ".nightshift", run_id="test-run")

            store.initialize_run()

            self.assertTrue((root / ".nightshift").is_dir())
            self.assertTrue((root / ".nightshift" / "project-context.md").exists())
            self.assertTrue((root / ".nightshift" / "runs" / "test-run").is_dir())
            self.assertTrue((root / ".nightshift" / "runs" / "test-run" / "tasks").is_dir())
            self.assertTrue((root / ".nightshift" / "runs" / "test-run" / "run-summary.md").exists())

    def test_writes_config_task_stage_and_final_artifacts(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            task = parse_task_file(root, "tasks.md")[0]
            store = ArtifactStore(root, ".nightshift", run_id="test-run")

            config_path = store.write_config_snapshot(root / "nightshift.yaml")
            task_path = store.write_task_snapshot(task)
            stage_path = store.write_stage_output(task.id, "plan.md", "# Plan\n")
            command_path = store.write_command_output(task.id, "test-output.txt", "ok\n")
            notes_path = store.write_final_task_notes(task.id, "# Notes\n")

            self.assertTrue(config_path.exists())
            self.assertIn("project:", config_path.read_text(encoding="utf-8"))
            self.assertTrue(task_path.exists())
            self.assertIn(task.id, task_path.read_text(encoding="utf-8"))
            self.assertEqual(stage_path.read_text(encoding="utf-8"), "# Plan\n")
            self.assertEqual(command_path.read_text(encoding="utf-8"), "ok\n")
            self.assertEqual(notes_path.read_text(encoding="utf-8"), "# Notes\n")

    def test_stage_output_cannot_escape_task_directory(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            store = ArtifactStore(root, ".nightshift", run_id="test-run")

            with self.assertRaisesRegex(ArtifactError, "escapes task directory"):
                store.write_stage_output("TASK-001", "../leak.txt", "nope")


if __name__ == "__main__":
    unittest.main()

94
tests/test_commands.py Normal file

@ -0,0 +1,94 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.artifacts import ArtifactStore
from nightshift.commands import CommandExecutor
from nightshift.config import SafetyConfig, StageConfig
from nightshift.errors import CommandError


PASSING_COMMAND = 'python -c "print(\'ok\')"'
FAILING_COMMAND = 'python -c "import sys; print(\'bad\'); sys.exit(7)"'


class CommandExecutorTests(unittest.TestCase):
    def test_passing_command_stage_returns_pass_and_writes_output(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
            executor = CommandExecutor(
                root,
                SafetyConfig(
                    require_clean_worktree=False,
                    scoped_paths=(".",),
                    allowed_commands=(PASSING_COMMAND,),
                    forbidden_commands=("rm -rf",),
                ),
                artifacts,
            )
            stage = StageConfig(
                id="test",
                type="command",
                commands=(PASSING_COMMAND,),
                output="test-output.txt",
            )

            result = executor.run_stage(stage, "TASK-001")

            self.assertEqual(result.status, "pass")
            output_path = root / result.output_path
            self.assertTrue(output_path.exists())
            output = output_path.read_text(encoding="utf-8")
            self.assertIn("Exit code: 0", output)
            self.assertIn("ok", output)

    def test_failing_command_stage_returns_fail_and_writes_output(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
            executor = CommandExecutor(
                root,
                SafetyConfig(
                    require_clean_worktree=False,
                    scoped_paths=(".",),
                    allowed_commands=(FAILING_COMMAND,),
                    forbidden_commands=("rm -rf",),
                ),
                artifacts,
            )
            stage = StageConfig(
                id="test",
                type="command",
                commands=(FAILING_COMMAND,),
                output="test-output.txt",
            )

            result = executor.run_stage(stage, "TASK-001")

            self.assertEqual(result.status, "fail")
            self.assertIn("code 7", result.reason)
            output = (root / result.output_path).read_text(encoding="utf-8")
            self.assertIn("Exit code: 7", output)
            self.assertIn("bad", output)

    def test_unallowlisted_command_is_rejected_before_execution(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            executor = CommandExecutor(
                root,
                SafetyConfig(
                    require_clean_worktree=False,
                    scoped_paths=(".",),
                    allowed_commands=(PASSING_COMMAND,),
                    forbidden_commands=("rm -rf",),
                ),
                ArtifactStore(root, ".nightshift", run_id="test-run"),
            )

            with self.assertRaisesRegex(CommandError, "not allowlisted"):
                executor.run_command(FAILING_COMMAND)


if __name__ == "__main__":
    unittest.main()

84
tests/test_config.py Normal file

@ -0,0 +1,84 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.config import load_config, validate_config
from nightshift.errors import ConfigError
from nightshift.init import init_project


class ConfigTests(unittest.TestCase):
    def test_valid_config_loads(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)

            config = validate_config(root / "nightshift.yaml")

            self.assertEqual(config.project.name, "example-project")
            self.assertIn("planner", config.agents)
            self.assertEqual(config.pipeline.max_task_retries, 3)
            self.assertEqual(config.pipeline.stages[0].id, "plan")

    def test_missing_required_section_fails_clearly(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            config_path = root / "nightshift.yaml"
            config_path.write_text("project:\n name: broken\n", encoding="utf-8")

            with self.assertRaisesRegex(ConfigError, "missing required section 'safety'"):
                load_config(config_path)

    def test_pipeline_stage_cannot_reference_missing_agent(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            config_path = root / "nightshift.yaml"
            config_text = config_path.read_text(encoding="utf-8").replace(
                "agent: planner", "agent: critic", 1
            )
            config_path.write_text(config_text, encoding="utf-8")

            with self.assertRaisesRegex(ConfigError, "references unknown agent 'critic'"):
                load_config(config_path)

    def test_on_fail_must_reference_existing_stage(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            config_path = root / "nightshift.yaml"
            config_text = config_path.read_text(encoding="utf-8").replace(
                "on_fail: plan", "on_fail: missing_stage", 1
            )
            config_path.write_text(config_text, encoding="utf-8")

            with self.assertRaisesRegex(ConfigError, "on_fail references unknown stage"):
                load_config(config_path)

    def test_validate_requires_prompt_files(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            (root / "agents" / "planner.md").unlink()

            with self.assertRaisesRegex(ConfigError, "system prompt does not exist"):
                validate_config(root / "nightshift.yaml")

    def test_validate_rejects_unallowlisted_stage_command(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            config_path = root / "nightshift.yaml"
            config_text = config_path.read_text(encoding="utf-8").replace(
                "- python -m unittest",
                "- python -m pytest",
                1,
            )
            config_path.write_text(config_text, encoding="utf-8")

            with self.assertRaisesRegex(ConfigError, "not allowlisted"):
                validate_config(config_path)


if __name__ == "__main__":
    unittest.main()

43
tests/test_init.py Normal file

@ -0,0 +1,43 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.errors import InitError
from nightshift.init import init_project


class InitProjectTests(unittest.TestCase):
    def test_init_creates_expected_files(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)

            written = init_project(root)

            self.assertIn(root / "nightshift.yaml", written)
            self.assertTrue((root / "nightshift.yaml").exists())
            self.assertTrue((root / "tasks.md").exists())
            self.assertTrue((root / "agents" / "planner.md").exists())
            self.assertTrue((root / "agents" / "implementer.md").exists())
            self.assertTrue((root / "agents" / "reviewer.md").exists())

    def test_init_refuses_to_overwrite_without_force(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)

            with self.assertRaises(InitError):
                init_project(root)

    def test_init_can_overwrite_with_force(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            init_project(root)
            (root / "tasks.md").write_text("changed", encoding="utf-8")

            init_project(root, force=True)

            self.assertIn("TASK-001", (root / "tasks.md").read_text(encoding="utf-8"))


if __name__ == "__main__":
    unittest.main()

70
tests/test_safety.py Normal file

@ -0,0 +1,70 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.errors import SafetyError
from nightshift.safety import (
    ensure_command_allowed,
    resolve_inside_root,
    resolve_project_root,
    safe_artifact_path,
    validate_scoped_paths,
)


class SafetyTests(unittest.TestCase):
    def test_resolve_project_root_requires_directory(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            self.assertEqual(resolve_project_root(root), root.resolve())

    def test_resolve_inside_root_accepts_relative_path(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            resolved = resolve_inside_root(root, "src/module.py")
            self.assertEqual(resolved, (root / "src" / "module.py").resolve())

    def test_resolve_inside_root_rejects_traversal(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            with self.assertRaisesRegex(SafetyError, "outside project root"):
                resolve_inside_root(root, "../outside.txt")

    def test_validate_scoped_paths_rejects_escape(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            with self.assertRaisesRegex(SafetyError, "outside project root"):
                validate_scoped_paths(root, ("src", "../elsewhere"))

    def test_safe_artifact_path_rejects_escape(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)
            with self.assertRaisesRegex(SafetyError, "escapes artifact directory"):
                safe_artifact_path(root, ".nightshift", "runs", "..", "..", "leak.txt")

    def test_command_allowlist_accepts_exact_allowed_command(self) -> None:
        command = ensure_command_allowed(
            "python -m unittest",
            ("python -m unittest",),
            ("rm -rf", "git push"),
        )
        self.assertEqual(command, "python -m unittest")

    def test_command_allowlist_rejects_unlisted_command(self) -> None:
        with self.assertRaisesRegex(SafetyError, "not allowlisted"):
            ensure_command_allowed("python -m pytest", ("python -m unittest",), ())

    def test_forbidden_fragment_rejects_dangerous_command(self) -> None:
        with self.assertRaisesRegex(SafetyError, "forbidden fragment"):
            ensure_command_allowed("echo ok && rm -rf build", ("echo ok && rm -rf build",), ("rm -rf",))


if __name__ == "__main__":
    unittest.main()

112
tests/test_tasks.py Normal file

@ -0,0 +1,112 @@
from pathlib import Path
import tempfile
import unittest

from nightshift.errors import TaskError
from nightshift.tasks import (
    parse_task_file,
    parse_tasks,
    select_next_incomplete_task,
    select_task_by_id,
)


TASKS_MD = """# Tasks

- [x] TASK-001: Completed task

  Description:
  Already done.

  Acceptance Criteria:
  - It is complete

- [ ] TASK-002: Add artifact directory creation

  Description:
  Create per-run and per-task artifact directories.

  Dependencies:
  - TASK-001

  Acceptance Criteria:
  - Creates `.nightshift/runs/<timestamp>/`
  - Creates task-specific folder
  - Writes task snapshot
"""


class TaskParserTests(unittest.TestCase):
    def test_parse_documented_task_format(self) -> None:
        tasks = parse_tasks(TASKS_MD)

        self.assertEqual(len(tasks), 2)
        self.assertEqual(tasks[1].id, "TASK-002")
        self.assertEqual(tasks[1].title, "Add artifact directory creation")
        self.assertFalse(tasks[1].completed)
        self.assertEqual(
            tasks[1].description,
            "Create per-run and per-task artifact directories.",
        )
        self.assertEqual(tasks[1].dependencies, ("TASK-001",))
        self.assertEqual(len(tasks[1].acceptance_criteria), 3)
        self.assertIn("TASK-002", tasks[1].raw_markdown)

    def test_select_next_incomplete_task(self) -> None:
        tasks = parse_tasks(TASKS_MD)

        selected = select_next_incomplete_task(tasks)

        self.assertEqual(selected.id, "TASK-002")

    def test_select_task_by_id(self) -> None:
        tasks = parse_tasks(TASKS_MD)

        selected = select_task_by_id(tasks, "TASK-001")

        self.assertTrue(selected.completed)

    def test_select_task_by_id_reports_available_tasks(self) -> None:
        tasks = parse_tasks(TASKS_MD)

        with self.assertRaisesRegex(TaskError, "Available tasks: TASK-001, TASK-002"):
            select_task_by_id(tasks, "TASK-999")

    def test_parse_task_file_rejects_path_traversal(self) -> None:
        with tempfile.TemporaryDirectory() as directory:
            root = Path(directory)

            with self.assertRaisesRegex(TaskError, "outside project root"):
                parse_task_file(root, "../tasks.md")

    def test_malformed_task_header_has_useful_error(self) -> None:
        markdown = """# Tasks

- [ ] Add YAML config loading

  Acceptance Criteria:
  - Loads config
"""

        with self.assertRaisesRegex(TaskError, "malformed task header"):
            parse_tasks(markdown)

    def test_missing_acceptance_criteria_fails(self) -> None:
        markdown = """# Tasks

- [ ] TASK-001: Missing criteria

  Description:
  No acceptance criteria.
"""

        with self.assertRaisesRegex(TaskError, "missing Acceptance Criteria"):
            parse_tasks(markdown)

    def test_no_tasks_fails(self) -> None:
        with self.assertRaisesRegex(TaskError, "no tasks found"):
            parse_tasks("# Tasks\n\nNothing here.\n")


if __name__ == "__main__":
    unittest.main()