Implement NightShift MVP phases 1-6

Includes starter project generation, validation for configs/tasks/commands, artifact snapshot writing, structured stage results, command output capture, devlogs for phases 1-6, and unit coverage for the implemented MVP layers.
2026-06-14 10:08:37 +00:00 · 2026-05-17 00:17:13 -07:00 · 2026-05-17 00:17:13 -07:00 · c1baf9b7d8
commit c1baf9b7d8
parent 5958c82cf9
26 changed files with 1873 additions and 1 deletions
--- a/docs/devlog/phase1.md
+++ b/docs/devlog/phase1.md
@@ -0,0 +1,25 @@
+# Phase 1 Devlog: Skeleton
+
+## Implemented
+
+- Created the `nightshift` Python package.
+- Added a CLI module with `nightshift init`, `nightshift validate`, and placeholder `run` / `status` commands.
+- Added `pyproject.toml` with a console entry point.
+- Added starter file generation for:
+  - `nightshift.yaml`
+  - `tasks.md`
+  - `agents/planner.md`
+  - `agents/implementer.md`
+  - `agents/reviewer.md`
+- Added unit tests for initialization behavior.
+
+## Decisions Made
+
+- Used `argparse` instead of a CLI dependency so the MVP works from a clean Python checkout.
+- Implemented overwrite protection with a `--force` flag. Interactive confirmation was deferred to keep the command deterministic and scriptable.
+- Added `run` and `status` as CLI placeholders only. The phase required an entry point, but actual execution belongs to later phases.
+- Kept starter prompts short and human-readable so they can be revised easily as agent execution is implemented.
+
+## Notes
+
+- Phase 1 establishes the file layout expected by later phases without introducing model or pipeline execution behavior early.
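
The overwrite protection described in the Phase 1 devlog can be sketched as follows. This is only an illustration: `init_project` itself is not shown in this part of the diff, so `write_starter_files` and the `STARTER_FILES` contents are hypothetical stand-ins. The sketch checks for existing files before writing anything, so a refused run leaves the directory untouched.

```python
from pathlib import Path
import tempfile

# Hypothetical starter contents; the real templates are not shown in this diff.
STARTER_FILES = {
    "nightshift.yaml": "project:\n  name: example\n",
    "tasks.md": "# Tasks\n",
    "agents/planner.md": "You are the planner.\n",
}


def write_starter_files(root: Path, force: bool = False) -> list[Path]:
    """Write starter files under root, refusing to overwrite unless force is set."""
    targets = [root / relative for relative in STARTER_FILES]
    existing = [path for path in targets if path.exists()]
    if existing and not force:
        names = ", ".join(str(path.relative_to(root)) for path in existing)
        raise FileExistsError(f"Refusing to overwrite existing files: {names}")

    written = []
    for relative, content in STARTER_FILES.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    created = write_starter_files(Path(tmp))
    print(f"created {len(created)} files")
    try:
        write_starter_files(Path(tmp))  # second run without force is refused
    except FileExistsError as exc:
        print("refused:", exc)
    write_starter_files(Path(tmp), force=True)  # explicit opt-in overwrites
```

Raising instead of prompting keeps the command deterministic for scripts, which matches the stated reason for deferring interactive confirmation.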
--- a/docs/devlog/phase2.md
+++ b/docs/devlog/phase2.md
@@ -0,0 +1,29 @@
+# Phase 2 Devlog: Config Loading
+
+## Implemented
+
+- Added typed configuration objects for project, safety, agents, pipeline, and stages.
+- Added `load_config()` for parsing `nightshift.yaml`.
+- Added `validate_config()` for checking referenced task and prompt files.
+- Added validation for:
+  - required top-level sections
+  - required project fields
+  - non-empty agents
+  - supported stage types
+  - agent stage references
+  - command stage command lists
+  - duplicate stage IDs
+  - `on_fail` references
+- Added unit tests for valid config loading and key invalid config cases.
+
+## Decisions Made
+
+- Used PyYAML automatically when available, but added a small standard-library fallback parser for the YAML subset emitted by `nightshift init`.
+- Deferred full YAML edge-case support to a future dependency/install pass. The fallback is intentionally documented as a starter-config parser, not a general YAML implementation.
+- Validation currently confirms that scoped paths resolve inside the project root, but it does not require every scoped path to already exist. That allows users to scaffold configs before creating all source/test directories.
+- Kept config validation focused on structural correctness and references. Command safety enforcement is left for Phase 3.
+
+## Notes
+
+- The config layer now catches missing agent references with explicit messages such as `pipeline stage 'plan' references unknown agent 'critic'`.
+- Tests use `unittest` from the standard library so they can run before development dependencies are introduced.
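
To make the reference checks concrete, here is a small sketch of the mapping shape that `parse_config()` reads, together with a simplified checker. The key names follow `parse_config()` in this commit; the values (for example the `cli` backend) and the `find_reference_errors` helper are illustrative assumptions. Unlike the real validator, which raises on the first problem, this helper collects every message.

```python
AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}

# Example mapping; every value here is made up for illustration.
raw_config = {
    "project": {
        "name": "demo",
        "root": ".",
        "task_file": "tasks.md",
        "artifact_dir": ".nightshift",
    },
    "safety": {
        "allowed_commands": ["python -m unittest"],
        "forbidden_commands": ["rm -rf"],
    },
    "agents": {
        "planner": {"backend": "cli", "system_prompt": "agents/planner.md"},
    },
    "pipeline": {
        "stages": [
            {"id": "plan", "type": "agent", "agent": "critic"},
            {
                "id": "test",
                "type": "command",
                "commands": ["python -m unittest"],
                "on_fail": "fix",
            },
        ]
    },
}


def find_reference_errors(raw: dict) -> list[str]:
    """Collect dangling agent and on_fail references (hypothetical helper)."""
    agents = raw["agents"]
    stages = raw["pipeline"]["stages"]
    stage_ids = {stage["id"] for stage in stages}
    errors = []
    for stage in stages:
        agent = stage.get("agent")
        if stage["type"] in AGENT_STAGE_TYPES and agent not in agents:
            errors.append(
                f"pipeline stage '{stage['id']}' references unknown agent '{agent}'"
            )
        on_fail = stage.get("on_fail")
        if on_fail and on_fail not in stage_ids:
            errors.append(
                f"stage '{stage['id']}' on_fail references unknown stage '{on_fail}'"
            )
    return errors


for message in find_reference_errors(raw_config):
    print(message)
```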
--- a/docs/devlog/phase3.md
+++ b/docs/devlog/phase3.md
@@ -0,0 +1,24 @@
+# Phase 3 Devlog: Safety Layer
+
+## Implemented
+
+- Added `nightshift/safety.py`.
+- Implemented project root resolution.
+- Implemented path resolution that rejects traversal outside the configured project root.
+- Implemented scoped path validation.
+- Implemented safe artifact path construction that rejects escapes from the artifact directory.
+- Implemented command allowlist checks.
+- Implemented forbidden command fragment checks.
+- Wired command and path safety checks into `validate_config()`.
+- Added tests for path traversal, artifact escapes, allowlist behavior, and forbidden command fragments.
+
+## Decisions Made
+
+- Command matching uses normalized whitespace and exact allowlist entries. This keeps v1 predictable while still handling harmless spacing differences.
+- Forbidden fragments are checked before allowlist acceptance, so a dangerous command cannot be made valid by adding it to `allowed_commands`.
+- Scoped paths are validated for containment inside the project root, but they are not required to exist yet. This preserves the Phase 2 decision that configs can be scaffolded before all source directories exist.
+- The safety layer raises `SafetyError`; config validation wraps those failures as config errors when they come from `nightshift validate`.
+
+## Notes
+
+- This phase does not execute commands. It only validates whether a command would be permitted. Process execution belongs to Phase 6.
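
`nightshift/safety.py` is not part of this excerpt, so the following is a sketch of the two checks the devlog describes rather than the actual implementation. The function names and the local `SafetyError` class are assumptions. It shows whitespace-normalized exact matching, forbidden fragments taking precedence over the allowlist, and containment checks done on resolved paths.

```python
from pathlib import Path
import tempfile


class SafetyError(Exception):
    """Stand-in for the project's SafetyError (assumed to be a plain exception)."""


def normalize(command: str) -> str:
    """Collapse runs of whitespace so spacing differences do not matter."""
    return " ".join(command.split())


def check_command(command: str, allowed: list[str], forbidden: list[str]) -> str:
    normalized = normalize(command)
    # Forbidden fragments are checked first, so an allowlist entry cannot override them.
    for fragment in forbidden:
        needle = normalize(fragment)
        if needle and needle in normalized:
            raise SafetyError(f"forbidden fragment {fragment!r} in command: {normalized}")
    if normalized not in {normalize(entry) for entry in allowed}:
        raise SafetyError(f"command is not in allowed_commands: {normalized}")
    return normalized


def resolve_inside(root: Path, candidate: str) -> Path:
    """Resolve candidate under root and reject anything that lands outside it."""
    resolved_root = root.resolve()
    resolved = (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError as exc:
        raise SafetyError(f"path escapes project root: {candidate}") from exc
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    print(check_command("python   -m  unittest", ["python -m unittest"], ["rm -rf"]))
    print(resolve_inside(project, "src/app.py").name)
    for bad in ("../outside", "src/../../outside"):
        try:
            resolve_inside(project, bad)
        except SafetyError as exc:
            print(exc)
```

Because the path does not need to exist for `Path.resolve()` to normalize it, the same check works for scoped paths that have not been created yet.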
--- a/docs/devlog/phase4.md
+++ b/docs/devlog/phase4.md
@@ -0,0 +1,22 @@
+# Phase 4 Devlog: Task Parser
+
+## Implemented
+
+- Added `nightshift/tasks.py`.
+- Implemented parsing for documented markdown checklist tasks.
+- Extracted task id, title, completion state, description, acceptance criteria, dependency bullets, raw task markdown, and source line number.
+- Added selection of the next incomplete task.
+- Added selection of a specific task id.
+- Added useful errors for malformed task headers, duplicate ids, missing acceptance criteria, missing files, traversal attempts, and unknown task ids.
+- Added parser and selection tests.
+
+## Decisions Made
+
+- The parser intentionally supports the documented v1 format rather than broad Markdown. This keeps failure behavior explicit and testable.
+- Acceptance criteria are required for each task because downstream pipeline stages need concrete review targets.
+- Dependencies are parsed as simple bullets under a `Dependencies:` section, but no dependency solver is implemented in this phase.
+- Completed tasks use `[x]` or `[X]`; incomplete tasks use `[ ]`.
+
+## Notes
+
+- Task mutation, completion updates, and dependency enforcement are deferred until later pipeline phases.
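
The documented v1 task format is not shown in this excerpt, so the sketch below assumes a header of the form `- [ ] T1: Title`. That shape, the `TaskHeader` class, and both function names are hypothetical. It only illustrates strict header parsing with explicit errors, duplicate-id detection, and selection of the next incomplete task.

```python
from dataclasses import dataclass
import re
from typing import Optional

# Assumed header shape: "- [ ] T1: Title". The real v1 format may differ.
HEADER = re.compile(r"^- \[(?P<mark>[ xX])\] (?P<id>[A-Za-z0-9_-]+): (?P<title>.+)$")


@dataclass(frozen=True)
class TaskHeader:
    id: str
    title: str
    completed: bool
    line_number: int


def parse_headers(markdown: str) -> list[TaskHeader]:
    """Parse checklist headers, failing loudly on anything malformed."""
    tasks = []
    seen = set()
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        if not line.startswith("- ["):
            continue  # description, criteria, and dependency lines are skipped here
        match = HEADER.match(line)
        if match is None:
            raise ValueError(f"malformed task header on line {line_number}: {line!r}")
        task_id = match["id"]
        if task_id in seen:
            raise ValueError(f"duplicate task id '{task_id}' on line {line_number}")
        seen.add(task_id)
        tasks.append(
            TaskHeader(task_id, match["title"].strip(), match["mark"] in "xX", line_number)
        )
    return tasks


def next_incomplete(tasks: list[TaskHeader]) -> Optional[TaskHeader]:
    """Return the first task whose checkbox is not ticked, if any."""
    return next((task for task in tasks if not task.completed), None)


sample = "- [x] T1: Set up package\n  Done in phase 1.\n- [ ] T2: Add parser\n"
print(next_incomplete(parse_headers(sample)))
```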
--- a/docs/devlog/phase5.md
+++ b/docs/devlog/phase5.md
@@ -0,0 +1,24 @@
+# Phase 5 Devlog: Artifact Store
+
+## Implemented
+
+- Added `nightshift/artifacts.py`.
+- Created `.nightshift/`, per-run directories, and per-task directories.
+- Created `project-context.md` and `run-summary.md` placeholders when a run is initialized.
+- Added config snapshot copying to `config.snapshot.yaml`.
+- Added task snapshot writing to `task.md`.
+- Added generic stage output writing.
+- Added command output writing.
+- Added final task notes writing.
+- Added tests for artifact tree creation, snapshot writing, and task-directory escape rejection.
+
+## Decisions Made
+
+- `ArtifactStore` accepts an optional `run_id` so tests and future pipeline code can produce deterministic artifact paths.
+- Default run ids use UTC timestamps in `YYYYMMDDTHHMMSSZ` format.
+- Stage output filenames are relative to the task artifact directory and may include subdirectories, but they cannot escape that task directory.
+- Project context and run summary files are initialized with simple markdown headers. Later phases can append richer content.
+
+## Notes
+
+- The artifact store is intentionally independent from pipeline execution so command, agent, context, and report phases can reuse it.
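
As a small illustration of the `YYYYMMDDTHHMMSSZ` run id format, the sketch below formats a fixed timestamp. `run_id_for` is a hypothetical variant of `default_run_id` that converts to UTC before formatting and rejects naive datetimes, so the trailing `Z` always holds. That stricter behavior is an assumption of this example, not something the commit does.

```python
from datetime import datetime, timedelta, timezone


def run_id_for(moment: datetime) -> str:
    """Format an aware datetime as YYYYMMDDTHHMMSSZ after converting to UTC."""
    if moment.tzinfo is None:
        raise ValueError("run ids need a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


# 00:17:13 at UTC-7 is 07:17:13 UTC on the same day.
pacific = timezone(timedelta(hours=-7))
print(run_id_for(datetime(2026, 5, 17, 0, 17, 13, tzinfo=pacific)))  # 20260517T071713Z
```

Passing a fixed id like this into `ArtifactStore(..., run_id=...)` is what makes artifact paths deterministic in tests.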
--- a/docs/devlog/phase6.md
+++ b/docs/devlog/phase6.md
@@ -0,0 +1,21 @@
+# Phase 6 Devlog: Command Executor
+
+## Implemented
+
+- Added `nightshift/commands.py`.
+- Added command-stage execution for configured `command` stages.
+- Captured stdout, stderr, exit code, duration, and timeout state.
+- Persisted command transcripts through the artifact store.
+- Returned structured `StageResult` objects.
+- Added tests for passing commands, failing commands, output persistence, and allowlist rejection.
+
+## Decisions Made
+
+- Commands are validated through the Phase 3 safety layer immediately before execution, even though config validation also checks them. This keeps command execution safe if called directly in later code.
+- Command stages stop at the first failing or timed-out command and persist the commands that ran.
+- Commands run with `shell=True` because v1 config stores commands as shell-style strings. This is constrained by exact allowlist matching and forbidden fragment checks.
+- The default timeout is 300 seconds. Tests can override it later if timeout-specific behavior needs coverage.
+
+## Notes
+
+- This phase does not wire command execution into a full pipeline runner. That belongs to Phase 8.
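
The stop-at-first-failure behavior can be tried in isolation with the sketch below. It is not the `CommandExecutor` from this commit: `run_until_failure` is a hypothetical helper, results are plain dicts, and commands are passed as argument lists without `shell=True` so the example runs the same way on any platform.

```python
import subprocess
import sys
import time


def run_until_failure(commands, timeout_seconds=300.0):
    """Run commands in order, recording each one, and stop after the first failure."""
    results = []
    for argv in commands:
        started = time.monotonic()
        try:
            completed = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout_seconds
            )
            record = {
                "exit_code": completed.returncode,
                "stdout": completed.stdout,
                "stderr": completed.stderr,
                "timed_out": False,
            }
        except subprocess.TimeoutExpired:
            # Partial output is dropped in this sketch to keep it short.
            record = {"exit_code": -1, "stdout": "", "stderr": "", "timed_out": True}
        record["duration_seconds"] = time.monotonic() - started
        results.append(record)
        if record["timed_out"] or record["exit_code"] != 0:
            break  # later commands are never started
    return results


commands = [
    [sys.executable, "-c", "print('ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never runs')"],
]
for result in run_until_failure(commands):
    print(result["exit_code"], result["timed_out"])
```

The failing command is still recorded, which mirrors the devlog's note that the stage persists the commands that ran.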
--- a/docs/vibe.md
+++ b/docs/vibe.md
@@ -1,6 +1,6 @@
 # NIGHTSHIFT_CODEX.md

-You are Codex working on **NightShift**, a local-first AI coding pipeline runner.
+You are Codex working on **NightShift**, a local-first AI coding pipeline runner written in Python.

 This file is the implementation-driving context document. Treat it as the project brief, architectural guide, and task checklist.

--- a/nightshift/__init__.py
+++ b/nightshift/__init__.py
@@ -0,0 +1,3 @@
+"""NightShift package."""
+
+__version__ = "0.1.0"
--- a/nightshift/artifacts.py
+++ b/nightshift/artifacts.py
@@ -0,0 +1,124 @@
+"""Artifact storage for NightShift runs."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from datetime import datetime, timezone
+from pathlib import Path
+import shutil
+
+from .config import NightShiftConfig
+from .errors import ArtifactError, SafetyError
+from .safety import resolve_inside_root, resolve_project_root, safe_artifact_path
+from .tasks import Task
+
+
+@dataclass(frozen=True)
+class TaskArtifactPaths:
+    task_id: str
+    directory: Path
+    task_snapshot: Path
+
+
+class ArtifactStore:
+    """Create and write the durable artifact tree for one run."""
+
+    def __init__(self, project_root: str | Path, artifact_dir: str | Path, run_id: str | None = None) -> None:
+        try:
+            self.project_root = resolve_project_root(project_root)
+            self.artifact_root = resolve_inside_root(
+                self.project_root, artifact_dir, "artifact directory"
+            )
+        except SafetyError as exc:
+            raise ArtifactError(str(exc)) from exc
+
+        self.run_id = run_id or default_run_id()
+        self.run_dir = self._artifact_path("runs", self.run_id)
+        self.tasks_dir = self.run_dir / "tasks"
+        self.project_context_path = self.artifact_root / "project-context.md"
+        self.run_summary_path = self.run_dir / "run-summary.md"
+        self.config_snapshot_path = self.run_dir / "config.snapshot.yaml"
+
+    @classmethod
+    def from_config(cls, config: NightShiftConfig, run_id: str | None = None) -> "ArtifactStore":
+        return cls(config.project.root, config.project.artifact_dir, run_id=run_id)
+
+    def initialize_run(self) -> None:
+        """Create the base artifact tree for this run."""
+
+        self.artifact_root.mkdir(parents=True, exist_ok=True)
+        self.tasks_dir.mkdir(parents=True, exist_ok=True)
+        if not self.project_context_path.exists():
+            self.project_context_path.write_text("# Project Context\n\n", encoding="utf-8")
+        if not self.run_summary_path.exists():
+            self.run_summary_path.write_text("# Run Summary\n\n", encoding="utf-8")
+
+    def write_config_snapshot(self, config_path: str | Path) -> Path:
+        """Copy the input config into the run artifact directory."""
+
+        self.initialize_run()
+        source = Path(config_path).resolve()
+        try:
+            source.relative_to(self.project_root)
+        except ValueError as exc:
+            raise ArtifactError(f"Artifact error: config path is outside project root: {source}") from exc
+        if not source.exists():
+            raise ArtifactError(f"Artifact error: config path does not exist: {source}")
+        shutil.copyfile(source, self.config_snapshot_path)
+        return self.config_snapshot_path
+
+    def create_task_dir(self, task_id: str) -> TaskArtifactPaths:
+        """Create the artifact directory for one task."""
+
+        self.initialize_run()
+        task_dir = self._artifact_path("runs", self.run_id, "tasks", task_id)
+        task_dir.mkdir(parents=True, exist_ok=True)
+        return TaskArtifactPaths(
+            task_id=task_id,
+            directory=task_dir,
+            task_snapshot=task_dir / "task.md",
+        )
+
+    def write_task_snapshot(self, task: Task) -> Path:
+        paths = self.create_task_dir(task.id)
+        paths.task_snapshot.write_text(task.raw_markdown, encoding="utf-8")
+        return paths.task_snapshot
+
+    def write_stage_output(self, task_id: str, filename: str, content: str) -> Path:
+        """Write a stage artifact under a task directory."""
+
+        task_dir = self.create_task_dir(task_id).directory
+        output_path = self._task_artifact_path(task_dir, filename)
+        output_path.parent.mkdir(parents=True, exist_ok=True)
+        output_path.write_text(content, encoding="utf-8")
+        return output_path
+
+    def write_command_output(self, task_id: str, filename: str, content: str) -> Path:
+        return self.write_stage_output(task_id, filename, content)
+
+    def write_final_task_notes(self, task_id: str, content: str, filename: str = "final-notes.md") -> Path:
+        return self.write_stage_output(task_id, filename, content)
+
+    def _artifact_path(self, *parts: str | Path) -> Path:
+        try:
+            return safe_artifact_path(self.project_root, self.artifact_root, *parts)
+        except SafetyError as exc:
+            raise ArtifactError(str(exc)) from exc
+
+    def _task_artifact_path(self, task_dir: Path, filename: str) -> Path:
+        candidate = Path(filename)
+        if candidate.is_absolute():
+            raise ArtifactError(f"Artifact error: stage output filename must be relative: {filename}")
+        resolved = (task_dir / candidate).resolve()
+        try:
+            resolved.relative_to(task_dir.resolve())
+        except ValueError as exc:
+            raise ArtifactError(f"Artifact error: stage output escapes task directory: {filename}") from exc
+        return resolved
+
+
+def default_run_id(now: datetime | None = None) -> str:
+    """Return a filesystem-friendly UTC run id."""
+
+    value = now or datetime.now(timezone.utc)
+    return value.strftime("%Y%m%dT%H%M%SZ")
--- a/nightshift/cli.py
+++ b/nightshift/cli.py
@@ -0,0 +1,68 @@
+"""Command line interface for NightShift."""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+import sys
+
+from .config import validate_config
+from .errors import NightShiftError
+from .init import init_project
+from .tasks import parse_task_file
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(prog="nightshift", description="Auditable AI pipeline runner.")
+    parser.add_argument("--version", action="version", version="nightshift 0.1.0")
+
+    subparsers = parser.add_subparsers(dest="command", required=True)
+
+    init_parser = subparsers.add_parser("init", help="Create starter NightShift files.")
+    init_parser.add_argument("--root", default=".", help="Directory to initialize.")
+    init_parser.add_argument("--force", action="store_true", help="Overwrite existing starter files.")
+
+    validate_parser = subparsers.add_parser("validate", help="Validate nightshift.yaml.")
+    validate_parser.add_argument("--config", default="nightshift.yaml", help="Config file to validate.")
+
+    subparsers.add_parser("run", help="Pipeline execution is planned for a later phase.")
+    subparsers.add_parser("status", help="Status reporting is planned for a later phase.")
+
+    return parser
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = build_parser()
+    args = parser.parse_args(argv)
+
+    try:
+        if args.command == "init":
+            written = init_project(Path(args.root), force=args.force)
+            print("Created NightShift starter files:")
+            for path in written:
+                print(f"- {path}")
+            return 0
+
+        if args.command == "validate":
+            config = validate_config(args.config)
+            tasks = parse_task_file(config.project.root, config.project.task_file)
+            incomplete = sum(1 for task in tasks if not task.completed)
+            print(f"Config valid: {config.path}")
+            print(f"Project: {config.project.name}")
+            print(f"Stages: {len(config.pipeline.stages)}")
+            print(f"Tasks: {len(tasks)}")
+            print(f"Incomplete tasks: {incomplete}")
+            return 0
+
+        if args.command in {"run", "status"}:
+            parser.error(f"'{args.command}' is not implemented yet.")
+
+    except NightShiftError as exc:
+        print(str(exc), file=sys.stderr)
+        return 1
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
--- a/nightshift/commands.py
+++ b/nightshift/commands.py
@@ -0,0 +1,148 @@
+"""Command stage execution."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+import subprocess
+import time
+
+from .artifacts import ArtifactStore
+from .config import SafetyConfig, StageConfig
+from .errors import CommandError, SafetyError
+from .safety import ensure_command_allowed, resolve_project_root
+from .stages import StageResult
+
+
+DEFAULT_COMMAND_TIMEOUT_SECONDS = 300
+
+
+@dataclass(frozen=True)
+class CommandRun:
+    command: str
+    exit_code: int
+    stdout: str
+    stderr: str
+    duration_seconds: float
+    timed_out: bool = False
+
+
+class CommandExecutor:
+    """Run configured command stages and persist their output."""
+
+    def __init__(
+        self,
+        project_root: str | Path,
+        safety: SafetyConfig,
+        artifacts: ArtifactStore,
+        timeout_seconds: int = DEFAULT_COMMAND_TIMEOUT_SECONDS,
+    ) -> None:
+        self.project_root = resolve_project_root(project_root)
+        self.safety = safety
+        self.artifacts = artifacts
+        self.timeout_seconds = timeout_seconds
+
+    def run_stage(self, stage: StageConfig, task_id: str) -> StageResult:
+        if stage.type != "command":
+            raise CommandError(
+                f"Command error: stage '{stage.id}' has type '{stage.type}', expected 'command'."
+            )
+        if not stage.commands:
+            raise CommandError(f"Command error: stage '{stage.id}' has no commands.")
+
+        runs: list[CommandRun] = []
+        status = "pass"
+        reason = "All commands passed."
+
+        for command in stage.commands:
+            run = self.run_command(command)
+            runs.append(run)
+            if run.timed_out:
+                status = "fail"
+                reason = f"Command timed out after {self.timeout_seconds}s: {run.command}"
+                break
+            if run.exit_code != 0:
+                status = "fail"
+                reason = f"Command exited with code {run.exit_code}: {run.command}"
+                break
+
+        output_filename = stage.output or f"{stage.id}-output.txt"
+        output_path = self.artifacts.write_command_output(
+            task_id,
+            output_filename,
+            format_command_runs(stage.id, runs),
+        )
+        return StageResult(
+            stage_id=stage.id,
+            status=status,  # type: ignore[arg-type]
+            reason=reason,
+            output_path=str(output_path.relative_to(self.project_root)),
+        )
+
+    def run_command(self, command: str) -> CommandRun:
+        try:
+            normalized = ensure_command_allowed(
+                command,
+                self.safety.allowed_commands,
+                self.safety.forbidden_commands,
+            )
+        except SafetyError as exc:
+            raise CommandError(str(exc)) from exc
+
+        started = time.monotonic()
+        try:
+            completed = subprocess.run(
+                normalized,
+                cwd=self.project_root,
+                shell=True,
+                capture_output=True,
+                text=True,
+                timeout=self.timeout_seconds,
+            )
+            duration = time.monotonic() - started
+            return CommandRun(
+                command=normalized,
+                exit_code=completed.returncode,
+                stdout=completed.stdout,
+                stderr=completed.stderr,
+                duration_seconds=duration,
+            )
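+        except subprocess.TimeoutExpired as exc:
+            # Partial output may be bytes here even with text=True; coerce it to str.
+            stdout, stderr = (
+                part.decode(errors="replace") if isinstance(part, bytes) else part or ""
+                for part in (exc.stdout, exc.stderr)
+            )
+            duration = time.monotonic() - started
+            return CommandRun(
+                normalized, -1, stdout, stderr, duration_seconds=duration, timed_out=True
+            )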
+
+
+def format_command_runs(stage_id: str, runs: list[CommandRun]) -> str:
+    lines = [f"# Command Output: {stage_id}", ""]
+    for index, run in enumerate(runs, start=1):
+        lines.extend(
+            [
+                f"## Command {index}",
+                "",
+                f"Command: `{run.command}`",
+                f"Exit code: {run.exit_code}",
+                f"Duration seconds: {run.duration_seconds:.3f}",
+                f"Timed out: {str(run.timed_out).lower()}",
+                "",
+                "### stdout",
+                "",
+                "```text",
+                run.stdout.rstrip(),
+                "```",
+                "",
+                "### stderr",
+                "",
+                "```text",
+                run.stderr.rstrip(),
+                "```",
+                "",
+            ]
+        )
+    return "\n".join(lines)
--- a/nightshift/config.py
+++ b/nightshift/config.py
@@ -0,0 +1,407 @@
+"""Typed NightShift configuration loading and validation."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+import re
+from typing import Any
+
+from .errors import ConfigError
+from .errors import SafetyError
+from .safety import (
+    ensure_command_allowed,
+    resolve_inside_root,
+    resolve_project_root,
+    safe_artifact_path,
+    validate_scoped_paths,
+)
+
+
+@dataclass(frozen=True)
+class ProjectConfig:
+    name: str
+    root: Path
+    task_file: Path
+    artifact_dir: Path
+
+
+@dataclass(frozen=True)
+class SafetyConfig:
+    require_clean_worktree: bool
+    scoped_paths: tuple[str, ...]
+    allowed_commands: tuple[str, ...]
+    forbidden_commands: tuple[str, ...]
+
+
+@dataclass(frozen=True)
+class AgentConfig:
+    id: str
+    backend: str
+    command: str | None
+    system_prompt: Path
+    model: str | None = None
+    role: str | None = None
+
+
+@dataclass(frozen=True)
+class StageConfig:
+    id: str
+    type: str
+    agent: str | None = None
+    commands: tuple[str, ...] = ()
+    output: str | None = None
+    on_fail: str | None = None
+
+
+@dataclass(frozen=True)
+class PipelineConfig:
+    max_task_retries: int
+    stages: tuple[StageConfig, ...]
+
+
+@dataclass(frozen=True)
+class NightShiftConfig:
+    path: Path
+    project: ProjectConfig
+    safety: SafetyConfig
+    agents: dict[str, AgentConfig]
+    pipeline: PipelineConfig
+
+
+AGENT_STAGE_TYPES = {"agent", "agent_review", "review"}
+COMMAND_STAGE_TYPES = {"command"}
+SUPPORTED_STAGE_TYPES = AGENT_STAGE_TYPES | COMMAND_STAGE_TYPES | {"summarize"}
+
+
+def load_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
+    """Load and validate a NightShift YAML config file."""
+
+    config_path = Path(path).resolve()
+    if not config_path.exists():
+        raise ConfigError(f"Config file not found: {config_path}")
+
+    raw = _load_yaml_mapping(config_path)
+    return parse_config(raw, config_path)
+
+
+def validate_config(path: str | Path = "nightshift.yaml") -> NightShiftConfig:
+    """Load a config and validate referenced local files."""
+
+    config = load_config(path)
+    try:
+        root = resolve_project_root(config.project.root)
+        safe_artifact_path(root, config.project.artifact_dir)
+        validate_scoped_paths(root, config.safety.scoped_paths)
+    except SafetyError as exc:
+        raise ConfigError(str(exc)) from exc
+
+    task_file = resolve_inside_root(root, config.project.task_file, "project.task_file")
+    if not task_file.exists():
+        raise ConfigError(f"Config error: task file does not exist: {task_file}")
+
+    for agent in config.agents.values():
+        prompt = resolve_inside_root(root, agent.system_prompt, f"agents.{agent.id}.system_prompt")
+        if not prompt.exists():
+            raise ConfigError(
+                "Config error: agent "
+                f"'{agent.id}' system prompt does not exist: {agent.system_prompt}"
+            )
+
+    for stage in config.pipeline.stages:
+        for command in stage.commands:
+            try:
+                ensure_command_allowed(
+                    command,
+                    config.safety.allowed_commands,
+                    config.safety.forbidden_commands,
+                )
+            except SafetyError as exc:
+                raise ConfigError(f"Config error: stage '{stage.id}' {exc}") from exc
+
+    return config
+
+
+def parse_config(raw: dict[str, Any], config_path: Path) -> NightShiftConfig:
+    """Convert a raw mapping into typed config objects."""
+
+    _require_mapping(raw, "config")
+    for section in ("project", "safety", "agents", "pipeline"):
+        if section not in raw:
+            raise ConfigError(f"Config error: missing required section '{section}'.")
+
+    project_raw = _require_mapping(raw["project"], "project")
+    project_name = _require_string(project_raw, "name", "project")
+    project_root_value = _require_string(project_raw, "root", "project")
+    project_root = (config_path.parent / project_root_value).resolve()
+    project = ProjectConfig(
+        name=project_name,
+        root=project_root,
+        task_file=Path(_require_string(project_raw, "task_file", "project")),
+        artifact_dir=Path(_require_string(project_raw, "artifact_dir", "project")),
+    )
+
+    safety_raw = _require_mapping(raw["safety"], "safety")
+    safety = SafetyConfig(
+        require_clean_worktree=bool(safety_raw.get("require_clean_worktree", False)),
+        scoped_paths=_string_tuple(safety_raw.get("scoped_paths", []), "safety.scoped_paths"),
+        allowed_commands=_string_tuple(safety_raw.get("allowed_commands", []), "safety.allowed_commands"),
+        forbidden_commands=_string_tuple(
+            safety_raw.get("forbidden_commands", []), "safety.forbidden_commands"
+        ),
+    )
+
+    agents_raw = _require_mapping(raw["agents"], "agents")
+    if not agents_raw:
+        raise ConfigError("Config error: at least one agent must be defined.")
+    agents: dict[str, AgentConfig] = {}
+    for agent_id, agent_raw_value in agents_raw.items():
+        agent_raw = _require_mapping(agent_raw_value, f"agents.{agent_id}")
+        backend = _require_string(agent_raw, "backend", f"agents.{agent_id}")
+        command = _optional_string(agent_raw.get("command"), f"agents.{agent_id}.command")
+        system_prompt = Path(_require_string(agent_raw, "system_prompt", f"agents.{agent_id}"))
+        agents[str(agent_id)] = AgentConfig(
+            id=str(agent_id),
+            backend=backend,
+            command=command,
+            system_prompt=system_prompt,
+            model=_optional_string(agent_raw.get("model"), f"agents.{agent_id}.model"),
+            role=_optional_string(agent_raw.get("role"), f"agents.{agent_id}.role"),
+        )
+
+    pipeline_raw = _require_mapping(raw["pipeline"], "pipeline")
+    max_task_retries = int(pipeline_raw.get("max_task_retries", 0))
+    if max_task_retries < 0:
+        raise ConfigError("Config error: pipeline.max_task_retries must be zero or greater.")
+
+    stages_raw = pipeline_raw.get("stages")
+    if not isinstance(stages_raw, list) or not stages_raw:
+        raise ConfigError("Config error: pipeline.stages must be a non-empty list.")
+
+    stages: list[StageConfig] = []
+    seen_stage_ids: set[str] = set()
+    for index, stage_raw_value in enumerate(stages_raw):
+        stage_context = f"pipeline.stages[{index}]"
+        stage_raw = _require_mapping(stage_raw_value, stage_context)
+        stage_id = _require_string(stage_raw, "id", stage_context)
+        if stage_id in seen_stage_ids:
+            raise ConfigError(f"Config error: duplicate pipeline stage id '{stage_id}'.")
+        seen_stage_ids.add(stage_id)
+
+        stage_type = _require_string(stage_raw, "type", stage_context)
+        if stage_type not in SUPPORTED_STAGE_TYPES:
+            supported = ", ".join(sorted(SUPPORTED_STAGE_TYPES))
+            raise ConfigError(
+                f"Config error: stage '{stage_id}' has unsupported type '{stage_type}'. "
+                f"Supported types: {supported}."
+            )
+
+        agent = _optional_string(stage_raw.get("agent"), f"{stage_context}.agent")
+        commands = _string_tuple(stage_raw.get("commands", []), f"{stage_context}.commands")
+
+        if stage_type in AGENT_STAGE_TYPES:
+            if agent is None:
+                raise ConfigError(f"Config error: agent stage '{stage_id}' must reference an agent.")
+            if agent not in agents:
+                defined = ", ".join(sorted(agents))
+                raise ConfigError(
+                    f"Config error: pipeline stage '{stage_id}' references unknown agent "
+                    f"'{agent}'. Defined agents: {defined}."
+                )
+
+        if stage_type in COMMAND_STAGE_TYPES and not commands:
+            raise ConfigError(f"Config error: command stage '{stage_id}' must define commands.")
+
+        stages.append(
+            StageConfig(
+                id=stage_id,
+                type=stage_type,
+                agent=agent,
+                commands=commands,
+                output=_optional_string(stage_raw.get("output"), f"{stage_context}.output"),
+                on_fail=_optional_string(stage_raw.get("on_fail"), f"{stage_context}.on_fail"),
+            )
+        )
+
+    stage_ids = {stage.id for stage in stages}
+    for stage in stages:
+        if stage.on_fail and stage.on_fail not in stage_ids:
+            raise ConfigError(
+                f"Config error: stage '{stage.id}' on_fail references unknown stage '{stage.on_fail}'."
+            )
+
+    return NightShiftConfig(
+        path=config_path,
+        project=project,
+        safety=safety,
+        agents=agents,
+        pipeline=PipelineConfig(max_task_retries=max_task_retries, stages=tuple(stages)),
+    )
+
+
+def _load_yaml_mapping(path: Path) -> dict[str, Any]:
+    text = path.read_text(encoding="utf-8")
+    try:
+        import yaml  # type: ignore[import-not-found]
+    except ModuleNotFoundError:
+        data = _parse_simple_yaml(text)
+    else:
+        data = yaml.safe_load(text)
+
+    if data is None:
+        data = {}
+    if not isinstance(data, dict):
+        raise ConfigError("Config error: top-level YAML value must be a mapping.")
+    return data
+
+
+def _parse_simple_yaml(text: str) -> dict[str, Any]:
+    """Parse the small YAML subset used by NightShift starter configs.
+
+    PyYAML is used when available. This fallback keeps `nightshift init` and
+    `nightshift validate` usable in a fresh checkout with only the stdlib.
+    """
+
+    lines = []
+    for line_number, raw_line in enumerate(text.splitlines(), start=1):
+        without_comment = raw_line.split("#", 1)[0].rstrip()
+        if without_comment.strip():
+            indent = len(without_comment) - len(without_comment.lstrip(" "))
+            lines.append((line_number, indent, without_comment.strip()))
+
+    index = 0
+
+    def parse_block(expected_indent: int) -> Any:
+        nonlocal index
+        if index >= len(lines):
+            return {}
+
+        _, current_indent, content = lines[index]
+        if current_indent < expected_indent:
+            return {}
+        if current_indent != expected_indent:
+            line_number = lines[index][0]
+            raise ConfigError(f"Config error: invalid indentation near line {line_number}.")
+
+        if content.startswith("- "):
+            sequence: list[Any] = []
+            while index < len(lines):
+                line_number, indent, item = lines[index]
+                if indent < expected_indent:
+                    break
+                if indent != expected_indent or not item.startswith("- "):
+                    break
+                item_content = item[2:].strip()
+                index += 1
+                if not item_content:
+                    sequence.append(parse_block(expected_indent + 2))
+                elif _looks_like_key_value(item_content):
+                    key, value = _split_key_value(item_content, line_number)
+                    mapping: dict[str, Any] = {}
+                    mapping[key] = (
+                        parse_block(expected_indent + 2)
+                        if value == ""
+                        else _parse_scalar(value)
+                    )
+                    while index < len(lines):
+                        _, child_indent, child_content = lines[index]
+                        if child_indent <= expected_indent:
+                            break
+                        if child_indent != expected_indent + 2:
+                            child_line = lines[index][0]
+                            raise ConfigError(
+                                f"Config error: invalid indentation near line {child_line}."
+                            )
+                        if child_content.startswith("- "):
+                            break
+                        child_key, child_value = _split_key_value(child_content, lines[index][0])
+                        index += 1
+                        mapping[child_key] = (
+                            parse_block(expected_indent + 4)
+                            if child_value == ""
+                            else _parse_scalar(child_value)
+                        )
+                    sequence.append(mapping)
+                else:
+                    sequence.append(_parse_scalar(item_content))
+            return sequence
+
+        mapping: dict[str, Any] = {}
+        while index < len(lines):
+            line_number, indent, item = lines[index]
+            if indent < expected_indent:
+                break
+            if indent != expected_indent:
+                break
+            if item.startswith("- "):
+                break
+            key, value = _split_key_value(item, line_number)
+            index += 1
+            mapping[key] = parse_block(expected_indent + 2) if value == "" else _parse_scalar(value)
+        return mapping
+
+    parsed = parse_block(0)
+    if not isinstance(parsed, dict):
+        raise ConfigError("Config error: top-level YAML value must be a mapping.")
+    return parsed
+
+
+def _looks_like_key_value(value: str) -> bool:
+    return bool(re.match(r"^[A-Za-z0-9_-]+:(\s|$)", value))
+
+
+def _split_key_value(value: str, line_number: int) -> tuple[str, str]:
+    if ":" not in value:
+        raise ConfigError(f"Config error: expected key/value pair near line {line_number}.")
+    key, raw_value = value.split(":", 1)
+    key = key.strip()
+    if not key:
+        raise ConfigError(f"Config error: empty key near line {line_number}.")
+    return key, raw_value.strip()
+
+
+def _parse_scalar(value: str) -> Any:
+    if value in {"true", "True"}:
+        return True
+    if value in {"false", "False"}:
+        return False
+    if value in {"null", "Null", "~"}:
+        return None
+    if re.fullmatch(r"-?\d+", value):
+        return int(value)
+    if len(value) >= 2 and (
+        (value[0] == '"' and value[-1] == '"') or (value[0] == "'" and value[-1] == "'")
+    ):
+        return value[1:-1]
+    return value
+
+
+def _require_mapping(value: Any, context: str) -> dict[str, Any]:
+    if not isinstance(value, dict):
+        raise ConfigError(f"Config error: '{context}' must be a mapping.")
+    return value
+
+
+def _require_string(mapping: dict[str, Any], key: str, context: str) -> str:
+    if key not in mapping:
+        raise ConfigError(f"Config error: missing required key '{context}.{key}'.")
+    value = mapping[key]
+    if not isinstance(value, str) or not value:
+        raise ConfigError(f"Config error: '{context}.{key}' must be a non-empty string.")
+    return value
+
+
+def _optional_string(value: Any, context: str) -> str | None:
+    if value is None:
+        return None
+    if not isinstance(value, str) or not value:
+        raise ConfigError(f"Config error: '{context}' must be a non-empty string when set.")
+    return value
+
+
+def _string_tuple(value: Any, context: str) -> tuple[str, ...]:
+    if value is None:
+        return ()
+    if not isinstance(value, list) or not all(isinstance(item, str) and item for item in value):
+        raise ConfigError(f"Config error: '{context}' must be a list of non-empty strings.")
+    return tuple(value)
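The scalar rules in `_parse_scalar` above are narrower than PyYAML's, so the same config can load slightly differently depending on whether PyYAML is installed. The following is a minimal standalone sketch that mirrors those rules for a few sample values; the name `parse_scalar` and the samples are illustrative and not part of the commit.

```python
import re
from typing import Any


def parse_scalar(value: str) -> Any:
    """Illustrative copy of the fallback parser's scalar rules."""
    if value in {"true", "True"}:
        return True
    if value in {"false", "False"}:
        return False
    if value in {"null", "Null", "~"}:
        return None
    if re.fullmatch(r"-?\d+", value):
        return int(value)
    if (value.startswith('"') and value.endswith('"')) or (
        value.startswith("'") and value.endswith("'")
    ):
        return value[1:-1]
    # Anything else, including floats and YAML 1.1 words like "yes",
    # stays a plain string in the fallback parser.
    return value


samples = ["true", "3", "'3'", "~", "python -m unittest", "1.5", "yes"]
parsed = [parse_scalar(sample) for sample in samples]
```

With PyYAML installed, `safe_load` would typically return a float for `1.5` and a boolean for `yes`, so configs that rely on such values may behave differently between the two code paths. The starter config only uses strings, integers, and `true`/`false`, which both paths appear to handle the same way.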
--- a/nightshift/errors.py
+++ b/nightshift/errors.py
@ -0,0 +1,29 @@
+"""Project-specific exceptions."""
+
+
+class NightShiftError(Exception):
+    """Base exception for NightShift failures."""
+
+
+class ConfigError(NightShiftError):
+    """Raised when a NightShift config is missing or invalid."""
+
+
+class InitError(NightShiftError):
+    """Raised when project initialization cannot proceed."""
+
+
+class SafetyError(NightShiftError):
+    """Raised when a path or command violates configured safety rules."""
+
+
+class TaskError(NightShiftError):
+    """Raised when task parsing or selection fails."""
+
+
+class ArtifactError(NightShiftError):
+    """Raised when artifact storage cannot proceed safely."""
+
+
+class CommandError(NightShiftError):
+    """Raised when command stage execution cannot proceed."""
--- a/nightshift/init.py
+++ b/nightshift/init.py
@ -0,0 +1,43 @@
+"""Project initialization helpers."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from .errors import InitError
+from . import templates
+
+
+STARTER_FILES = {
+    "nightshift.yaml": templates.NIGHTSHIFT_YAML,
+    "tasks.md": templates.TASKS_MD,
+    "agents/planner.md": templates.PLANNER_PROMPT,
+    "agents/implementer.md": templates.IMPLEMENTER_PROMPT,
+    "agents/reviewer.md": templates.REVIEWER_PROMPT,
+}
+
+
+def init_project(root: Path, force: bool = False) -> list[Path]:
+    """Create starter NightShift files under root.
+
+    Raises InitError if any starter file exists, unless force is true (then files are overwritten).
+    """
+
+    root = root.resolve()
+    targets = [root / relative for relative in STARTER_FILES]
+    existing = [path for path in targets if path.exists()]
+    if existing and not force:
+        formatted = ", ".join(str(path.relative_to(root)) for path in existing)
+        raise InitError(
+            "Initialization would overwrite existing files. "
+            f"Use --force to replace: {formatted}"
+        )
+
+    written: list[Path] = []
+    for relative, content in STARTER_FILES.items():
+        path = root / relative
+        path.parent.mkdir(parents=True, exist_ok=True)
+        path.write_text(content, encoding="utf-8")
+        written.append(path)
+
+    return written
--- a/nightshift/safety.py
+++ b/nightshift/safety.py
@ -0,0 +1,119 @@
+"""Safety helpers for paths and commands."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from .errors import SafetyError
+
+
+def resolve_project_root(root: str | Path) -> Path:
+    """Resolve and validate a project root directory."""
+
+    resolved = Path(root).resolve()
+    if not resolved.exists():
+        raise SafetyError(f"Safety error: project root does not exist: {resolved}")
+    if not resolved.is_dir():
+        raise SafetyError(f"Safety error: project root is not a directory: {resolved}")
+    return resolved
+
+
+def resolve_inside_root(root: str | Path, path: str | Path, context: str = "path") -> Path:
+    """Resolve a path and reject values outside the project root."""
+
+    resolved_root = resolve_project_root(root)
+    candidate = Path(path)
+    resolved = candidate.resolve() if candidate.is_absolute() else (resolved_root / candidate).resolve()
+    try:
+        resolved.relative_to(resolved_root)
+    except ValueError as exc:
+        raise SafetyError(
+            f"Safety error: {context} resolves outside project root: {path}"
+        ) from exc
+    return resolved
+
+
+def validate_scoped_paths(root: str | Path, scoped_paths: list[str] | tuple[str, ...]) -> tuple[Path, ...]:
+    """Validate that every configured scoped path remains inside the root."""
+
+    return tuple(
+        resolve_inside_root(root, scoped_path, f"scoped path '{scoped_path}'")
+        for scoped_path in scoped_paths
+    )
+
+
+def safe_artifact_path(
+    root: str | Path,
+    artifact_dir: str | Path,
+    *parts: str | Path,
+    create_parent: bool = False,
+) -> Path:
+    """Build an artifact path that cannot escape the configured artifact tree."""
+
+    artifact_root = resolve_inside_root(root, artifact_dir, "artifact directory")
+    path = artifact_root
+    for part in parts:
+        candidate = Path(part)
+        if candidate.is_absolute():
+            raise SafetyError(f"Safety error: artifact path segment must be relative: {part}")
+        path = path / candidate
+
+    resolved = path.resolve()
+    try:
+        resolved.relative_to(artifact_root)
+    except ValueError as exc:
+        raise SafetyError(f"Safety error: artifact path escapes artifact directory: {path}") from exc
+
+    if create_parent:
+        resolved.parent.mkdir(parents=True, exist_ok=True)
+    return resolved
+
+
+def normalize_command(command: str) -> str:
+    """Normalize command whitespace for safety comparisons."""
+
+    return " ".join(command.strip().split())
+
+
+def ensure_command_allowed(
+    command: str,
+    allowed_commands: list[str] | tuple[str, ...],
+    forbidden_commands: list[str] | tuple[str, ...],
+) -> str:
+    """Validate one command against forbidden fragments and an exact allowlist."""
+
+    if not isinstance(command, str) or not command.strip():
+        raise SafetyError("Safety error: command must be a non-empty string.")
+
+    normalized = normalize_command(command)
+    lowered = normalized.lower()
+
+    for fragment in forbidden_commands:
+        normalized_fragment = normalize_command(fragment).lower()
+        if normalized_fragment and normalized_fragment in lowered:
+            raise SafetyError(
+                f"Safety error: command contains forbidden fragment '{fragment}': {command}"
+            )
+
+    allowed = {normalize_command(item) for item in allowed_commands}
+    if normalized not in allowed:
+        allowed_display = ", ".join(sorted(allowed)) or "<none>"
+        raise SafetyError(
+            f"Safety error: command is not allowlisted: {command}. "
+            f"Allowed commands: {allowed_display}."
+        )
+
+    return normalized
+
+
+def validate_stage_commands(
+    commands: list[str] | tuple[str, ...],
+    allowed_commands: list[str] | tuple[str, ...],
+    forbidden_commands: list[str] | tuple[str, ...],
+) -> tuple[str, ...]:
+    """Validate each command in a command stage."""
+
+    return tuple(
+        ensure_command_allowed(command, allowed_commands, forbidden_commands)
+        for command in commands
+    )
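The two checks in `nightshift/safety.py` can be tried in isolation. The sketch below is a simplified, standalone approximation of the containment test in `resolve_inside_root` and the fragment comparison in `ensure_command_allowed`; the helper names `is_inside` and `normalize` and the example command are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def is_inside(root: Path, candidate: str) -> bool:
    """Return True when candidate, resolved against root, stays under root."""
    resolved_root = root.resolve()
    resolved = (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError:
        return False
    return True


def normalize(command: str) -> str:
    """Collapse runs of whitespace, as normalize_command does in the diff."""
    return " ".join(command.strip().split())


with tempfile.TemporaryDirectory() as directory:
    root = Path(directory)
    inside = is_inside(root, "src/module.py")
    outside = is_inside(root, "../outside.txt")

# Forbidden fragments are matched as literal substrings after normalization.
fragment = normalize("curl | bash").lower()
piped = normalize("curl https://example.invalid/install.sh | bash").lower()
fragment_hit = fragment in piped
exact_hit = fragment in normalize("curl   |   bash").lower()
```

Here `fragment_hit` is false because the URL sits between `curl` and `| bash`, so the literal fragment never appears. In the commit this gap is covered by the exact-match allowlist, which rejects any command that is not listed verbatim; the forbidden list acts as a second, coarser filter.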
--- a/nightshift/stages.py
+++ b/nightshift/stages.py
@ -0,0 +1,19 @@
+"""Shared stage result types."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Literal
+
+
+StageStatus = Literal["pass", "fail", "retry", "escalate"]
+
+
+@dataclass(frozen=True)
+class StageResult:
+    stage_id: str
+    status: StageStatus
+    reason: str
+    output_path: str | None = None
+    next_stage: str | None = None
+    context_update: str | None = None
--- a/nightshift/tasks.py
+++ b/nightshift/tasks.py
@ -0,0 +1,163 @@
+"""Markdown task parsing and selection."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+import re
+
+from .errors import SafetyError, TaskError
+from .safety import resolve_inside_root
+
+
+TASK_HEADER_RE = re.compile(r"^\s*-\s+\[(?P<mark>[ xX])\]\s+(?P<id>[A-Z]+-\d+):\s+(?P<title>.+?)\s*$")
+CHECKBOX_RE = re.compile(r"^\s*-\s+\[[^\]]*\]")
+SECTION_RE = re.compile(r"^(?P<name>[A-Za-z][A-Za-z ]+):\s*$")
+
+
+@dataclass(frozen=True)
+class Task:
+    id: str
+    title: str
+    completed: bool
+    description: str
+    acceptance_criteria: tuple[str, ...]
+    dependencies: tuple[str, ...]
+    raw_markdown: str
+    line_number: int
+
+
+def parse_task_file(project_root: str | Path, task_file: str | Path) -> list[Task]:
+    """Load and parse a task markdown file inside the project root."""
+
+    try:
+        path = resolve_inside_root(project_root, task_file, "task file")
+    except SafetyError as exc:
+        raise TaskError(str(exc)) from exc
+
+    if not path.exists():
+        raise TaskError(f"Task error: task file does not exist: {path}")
+
+    return parse_tasks(path.read_text(encoding="utf-8"))
+
+
+def parse_tasks(markdown: str) -> list[Task]:
+    """Parse NightShift's documented markdown checklist task format."""
+
+    lines = markdown.splitlines()
+    tasks: list[Task] = []
+    seen_ids: set[str] = set()
+    index = 0
+
+    while index < len(lines):
+        line = lines[index]
+        header = TASK_HEADER_RE.match(line)
+        if not header:
+            if CHECKBOX_RE.match(line):
+                raise TaskError(
+                    f"Task error: malformed task header on line {index + 1}. "
+                    "Expected '- [ ] TASK-001: Task title'."
+                )
+            index += 1
+            continue
+
+        task_id = header.group("id")
+        if task_id in seen_ids:
+            raise TaskError(f"Task error: duplicate task id '{task_id}' on line {index + 1}.")
+        seen_ids.add(task_id)
+
+        start = index
+        index += 1
+        while index < len(lines) and not TASK_HEADER_RE.match(lines[index]):
+            if CHECKBOX_RE.match(lines[index]):
+                raise TaskError(
+                    f"Task error: malformed task header on line {index + 1}. "
+                    "Expected '- [ ] TASK-001: Task title'."
+                )
+            index += 1
+
+        block = lines[start:index]
+        description = _extract_section(block, "Description")
+        acceptance_criteria = tuple(_extract_bullets(block, "Acceptance Criteria"))
+        dependencies = tuple(_extract_bullets(block, "Dependencies"))
+
+        if not acceptance_criteria:
+            raise TaskError(
+                f"Task error: task '{task_id}' is missing Acceptance Criteria bullets."
+            )
+
+        tasks.append(
+            Task(
+                id=task_id,
+                title=header.group("title"),
+                completed=header.group("mark").lower() == "x",
+                description=description,
+                acceptance_criteria=acceptance_criteria,
+                dependencies=dependencies,
+                raw_markdown="\n".join(block).strip() + "\n",
+                line_number=start + 1,
+            )
+        )
+
+    if not tasks:
+        raise TaskError("Task error: no tasks found. Expected '- [ ] TASK-001: Task title'.")
+
+    return tasks
+
+
+def select_next_incomplete_task(tasks: list[Task] | tuple[Task, ...]) -> Task:
+    """Return the first incomplete task in file order."""
+
+    for task in tasks:
+        if not task.completed:
+            return task
+    raise TaskError("Task error: no incomplete tasks found.")
+
+
+def select_task_by_id(tasks: list[Task] | tuple[Task, ...], task_id: str) -> Task:
+    """Return a task by id."""
+
+    for task in tasks:
+        if task.id == task_id:
+            return task
+    available = ", ".join(task.id for task in tasks) or "<none>"
+    raise TaskError(f"Task error: unknown task id '{task_id}'. Available tasks: {available}.")
+
+
+def _extract_section(block: list[str], section_name: str) -> str:
+    start = _find_section_index(block, section_name)
+    if start is None:
+        return ""
+
+    collected: list[str] = []
+    for line in block[start + 1 :]:
+        if SECTION_RE.match(line.strip()):
+            break
+        collected.append(line)
+
+    return "\n".join(collected).strip()
+
+
+def _extract_bullets(block: list[str], section_name: str) -> list[str]:
+    start = _find_section_index(block, section_name)
+    if start is None:
+        return []
+
+    bullets: list[str] = []
+    for line in block[start + 1 :]:
+        stripped = line.strip()
+        if SECTION_RE.match(stripped):
+            break
+        if stripped.startswith("- "):
+            value = stripped[2:].strip()
+            if value:
+                bullets.append(value)
+    return bullets
+
+
+def _find_section_index(block: list[str], section_name: str) -> int | None:
+    expected = f"{section_name}:".lower()
+    for index, line in enumerate(block):
+        if line.strip().lower() == expected:
+            return index
+    return None
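How `parse_tasks` decides between a task header, a malformed header, and ordinary text follows from the two regular expressions at the top of `nightshift/tasks.py`. The sketch below copies those patterns and classifies a few sample lines; the `classify` helper and the sample lines are hypothetical and only serve the illustration.

```python
import re

# Patterns copied from nightshift/tasks.py in this diff.
TASK_HEADER_RE = re.compile(
    r"^\s*-\s+\[(?P<mark>[ xX])\]\s+(?P<id>[A-Z]+-\d+):\s+(?P<title>.+?)\s*$"
)
CHECKBOX_RE = re.compile(r"^\s*-\s+\[[^\]]*\]")


def classify(line: str) -> tuple:
    """Label a line the way the parser's branches would treat it."""
    header = TASK_HEADER_RE.match(line)
    if header:
        completed = header.group("mark").lower() == "x"
        return ("task", header.group("id"), completed, header.group("title"))
    if CHECKBOX_RE.match(line):
        # parse_tasks raises TaskError for lines in this category.
        return ("malformed",)
    return ("other",)


sample_lines = [
    "- [ ] TASK-001: Add your first NightShift task",
    "- [X] TASK-002: Already done   ",
    "- [ ] task-3: lowercase id",
    "- [ ] Write a sub-step",
    "- The expected behavior is clear",
]
results = [classify(line) for line in sample_lines]
```

One consequence worth noting: any checkbox bullet that is not a valid task header, including a nested checklist item inside a task body, falls into the malformed category, so task bodies should use plain `- ` bullets.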
--- a/nightshift/templates.py
+++ b/nightshift/templates.py
@ -0,0 +1,124 @@
+"""Built-in starter file templates for `nightshift init`."""
+
+NIGHTSHIFT_YAML = """project:
+  name: example-project
+  root: .
+  task_file: tasks.md
+  artifact_dir: .nightshift
+
+safety:
+  require_clean_worktree: false
+  scoped_paths:
+    - .
+  allowed_commands:
+    - python -m unittest
+  forbidden_commands:
+    - rm -rf
+    - git push
+    - curl | bash
+
+agents:
+  planner:
+    backend: command
+    command: echo
+    system_prompt: agents/planner.md
+
+  implementer:
+    backend: command
+    command: echo
+    system_prompt: agents/implementer.md
+
+  reviewer:
+    backend: command
+    command: echo
+    system_prompt: agents/reviewer.md
+
+pipeline:
+  max_task_retries: 3
+  stages:
+    - id: plan
+      type: agent
+      agent: planner
+      output: plan.md
+
+    - id: review_plan
+      type: agent_review
+      agent: reviewer
+      on_fail: plan
+      output: plan-review.md
+
+    - id: implement
+      type: agent
+      agent: implementer
+      output: implementation-log.md
+
+    - id: test
+      type: command
+      commands:
+        - python -m unittest
+      output: test-output.txt
+
+    - id: review
+      type: agent_review
+      agent: reviewer
+      on_fail: implement
+      output: review.md
+
+    - id: summarize
+      type: summarize
+      output: final-notes.md
+"""
+
+TASKS_MD = """# Tasks
+
+- [ ] TASK-001: Add your first NightShift task
+
+Description:
+Describe the coding task NightShift should work on.
+
+Acceptance Criteria:
+- The expected behavior is clear
+- The task can be reviewed from generated artifacts
+"""
+
+PLANNER_PROMPT = """# Planner
+
+You are the planning agent for NightShift.
+
+Create a conservative implementation plan for one coding task.
+
+Rules:
+- Do not write code.
+- Identify relevant files.
+- Preserve existing behavior.
+- Prefer small changes.
+- Include test strategy.
+- Include risks.
+"""
+
+IMPLEMENTER_PROMPT = """# Implementer
+
+You are the implementation agent for NightShift.
+
+Implement the approved plan inside the scoped project directory.
+
+Rules:
+- Make the smallest correct change.
+- Do not edit files outside scope.
+- Preserve existing style.
+- Write useful implementation notes.
+"""
+
+REVIEWER_PROMPT = """# Reviewer
+
+You are the review agent for NightShift.
+
+Decide whether the current task should pass, fail, retry (implementation or planning), or escalate.
+
+Output exactly:
+
+status: pass | fail | retry | escalate
+reason: <short explanation>
+next_stage: <optional stage id>
+context_update: <compact useful note>
+"""
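The reviewer prompt above asks for four `key: value` lines whose names match the fields of `StageResult`. This commit does not show how that output is consumed, so the following is only a sketch of one way it could be parsed; `ReviewVerdict`, `parse_review`, and the sample text are hypothetical names introduced here, not part of NightShift.

```python
from __future__ import annotations

from dataclasses import dataclass

VALID_STATUSES = {"pass", "fail", "retry", "escalate"}
KNOWN_KEYS = {"status", "reason", "next_stage", "context_update"}


@dataclass(frozen=True)
class ReviewVerdict:
    """Hypothetical container mirroring the fields the prompt requests."""

    status: str
    reason: str
    next_stage: str | None = None
    context_update: str | None = None


def parse_review(text: str) -> ReviewVerdict:
    """Parse 'key: value' lines; the first occurrence of each key wins."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        key = key.strip().lower()
        if separator and key in KNOWN_KEYS:
            fields.setdefault(key, value.strip())

    status = fields.get("status", "")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown review status: {status!r}")
    return ReviewVerdict(
        status=status,
        reason=fields.get("reason", ""),
        # Empty optional fields are treated as absent.
        next_stage=fields.get("next_stage") or None,
        context_update=fields.get("context_update") or None,
    )


verdict = parse_review(
    "status: retry\nreason: tests missing\nnext_stage: implement\ncontext_update:\n"
)
```

Rejecting unknown statuses early keeps a free-form model reply from being treated as a pass. A real implementation would probably also check `next_stage` against the configured stage ids, as `load_config` does for `on_fail`.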
--- a/pyproject.toml
+++ b/pyproject.toml
@ -0,0 +1,20 @@
+[build-system]
+requires = ["setuptools>=69"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "nightshift"
+version = "0.1.0"
+description = "Auditable local-first AI coding pipelines."
+readme = "README.md"
+requires-python = ">=3.11"
+license = "GPL-3.0-only"
+authors = [
+  { name = "K455" }
+]
+
+[project.scripts]
+nightshift = "nightshift.cli:main"
+
+[tool.setuptools.packages.find]
+include = ["nightshift*"]
--- a/tests/__init__.py
+++ b/tests/__init__.py
@ -0,0 +1 @@
+"""NightShift test suite."""
--- a/tests/test_artifacts.py
+++ b/tests/test_artifacts.py
@ -0,0 +1,56 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.artifacts import ArtifactStore
+from nightshift.errors import ArtifactError
+from nightshift.init import init_project
+from nightshift.tasks import parse_task_file
+
+
+class ArtifactStoreTests(unittest.TestCase):
+    def test_initialize_run_creates_base_artifact_tree(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            store = ArtifactStore(root, ".nightshift", run_id="test-run")
+
+            store.initialize_run()
+
+            self.assertTrue((root / ".nightshift").is_dir())
+            self.assertTrue((root / ".nightshift" / "project-context.md").exists())
+            self.assertTrue((root / ".nightshift" / "runs" / "test-run").is_dir())
+            self.assertTrue((root / ".nightshift" / "runs" / "test-run" / "tasks").is_dir())
+            self.assertTrue((root / ".nightshift" / "runs" / "test-run" / "run-summary.md").exists())
+
+    def test_writes_config_task_stage_and_final_artifacts(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            task = parse_task_file(root, "tasks.md")[0]
+            store = ArtifactStore(root, ".nightshift", run_id="test-run")
+
+            config_path = store.write_config_snapshot(root / "nightshift.yaml")
+            task_path = store.write_task_snapshot(task)
+            stage_path = store.write_stage_output(task.id, "plan.md", "# Plan\n")
+            command_path = store.write_command_output(task.id, "test-output.txt", "ok\n")
+            notes_path = store.write_final_task_notes(task.id, "# Notes\n")
+
+            self.assertTrue(config_path.exists())
+            self.assertIn("project:", config_path.read_text(encoding="utf-8"))
+            self.assertTrue(task_path.exists())
+            self.assertIn(task.id, task_path.read_text(encoding="utf-8"))
+            self.assertEqual(stage_path.read_text(encoding="utf-8"), "# Plan\n")
+            self.assertEqual(command_path.read_text(encoding="utf-8"), "ok\n")
+            self.assertEqual(notes_path.read_text(encoding="utf-8"), "# Notes\n")
+
+    def test_stage_output_cannot_escape_task_directory(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            store = ArtifactStore(root, ".nightshift", run_id="test-run")
+
+            with self.assertRaisesRegex(ArtifactError, "escapes task directory"):
+                store.write_stage_output("TASK-001", "../leak.txt", "nope")
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_commands.py
+++ b/tests/test_commands.py
@ -0,0 +1,94 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.artifacts import ArtifactStore
+from nightshift.commands import CommandExecutor
+from nightshift.config import SafetyConfig, StageConfig
+from nightshift.errors import CommandError
+
+
+PASSING_COMMAND = 'python -c "print(\'ok\')"'
+FAILING_COMMAND = 'python -c "import sys; print(\'bad\'); sys.exit(7)"'
+
+
+class CommandExecutorTests(unittest.TestCase):
+    def test_passing_command_stage_returns_pass_and_writes_output(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
+            executor = CommandExecutor(
+                root,
+                SafetyConfig(
+                    require_clean_worktree=False,
+                    scoped_paths=(".",),
+                    allowed_commands=(PASSING_COMMAND,),
+                    forbidden_commands=("rm -rf",),
+                ),
+                artifacts,
+            )
+            stage = StageConfig(
+                id="test",
+                type="command",
+                commands=(PASSING_COMMAND,),
+                output="test-output.txt",
+            )
+
+            result = executor.run_stage(stage, "TASK-001")
+
+            self.assertEqual(result.status, "pass")
+            output_path = root / result.output_path
+            self.assertTrue(output_path.exists())
+            output = output_path.read_text(encoding="utf-8")
+            self.assertIn("Exit code: 0", output)
+            self.assertIn("ok", output)
+
+    def test_failing_command_stage_returns_fail_and_writes_output(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            artifacts = ArtifactStore(root, ".nightshift", run_id="test-run")
+            executor = CommandExecutor(
+                root,
+                SafetyConfig(
+                    require_clean_worktree=False,
+                    scoped_paths=(".",),
+                    allowed_commands=(FAILING_COMMAND,),
+                    forbidden_commands=("rm -rf",),
+                ),
+                artifacts,
+            )
+            stage = StageConfig(
+                id="test",
+                type="command",
+                commands=(FAILING_COMMAND,),
+                output="test-output.txt",
+            )
+
+            result = executor.run_stage(stage, "TASK-001")
+
+            self.assertEqual(result.status, "fail")
+            self.assertIn("code 7", result.reason)
+            output = (root / result.output_path).read_text(encoding="utf-8")
+            self.assertIn("Exit code: 7", output)
+            self.assertIn("bad", output)
+
+    def test_unallowlisted_command_is_rejected_before_execution(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            executor = CommandExecutor(
+                root,
+                SafetyConfig(
+                    require_clean_worktree=False,
+                    scoped_paths=(".",),
+                    allowed_commands=(PASSING_COMMAND,),
+                    forbidden_commands=("rm -rf",),
+                ),
+                ArtifactStore(root, ".nightshift", run_id="test-run"),
+            )
+
+            with self.assertRaisesRegex(CommandError, "not allowlisted"):
+                executor.run_command(FAILING_COMMAND)
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_config.py
+++ b/tests/test_config.py
@ -0,0 +1,84 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.config import load_config, validate_config
+from nightshift.errors import ConfigError
+from nightshift.init import init_project
+
+
+class ConfigTests(unittest.TestCase):
+    def test_valid_config_loads(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+
+            config = validate_config(root / "nightshift.yaml")
+
+            self.assertEqual(config.project.name, "example-project")
+            self.assertIn("planner", config.agents)
+            self.assertEqual(config.pipeline.max_task_retries, 3)
+            self.assertEqual(config.pipeline.stages[0].id, "plan")
+
+    def test_missing_required_section_fails_clearly(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            config_path = root / "nightshift.yaml"
+            config_path.write_text("project:\n  name: broken\n", encoding="utf-8")
+
+            with self.assertRaisesRegex(ConfigError, "missing required section 'safety'"):
+                load_config(config_path)
+
+    def test_pipeline_stage_cannot_reference_missing_agent(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_text = config_path.read_text(encoding="utf-8").replace(
+                "agent: planner", "agent: critic", 1
+            )
+            config_path.write_text(config_text, encoding="utf-8")
+
+            with self.assertRaisesRegex(ConfigError, "references unknown agent 'critic'"):
+                load_config(config_path)
+
+    def test_on_fail_must_reference_existing_stage(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_text = config_path.read_text(encoding="utf-8").replace(
+                "on_fail: plan", "on_fail: missing_stage", 1
+            )
+            config_path.write_text(config_text, encoding="utf-8")
+
+            with self.assertRaisesRegex(ConfigError, "on_fail references unknown stage"):
+                load_config(config_path)
+
+    def test_validate_requires_prompt_files(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            (root / "agents" / "planner.md").unlink()
+
+            with self.assertRaisesRegex(ConfigError, "system prompt does not exist"):
+                validate_config(root / "nightshift.yaml")
+
+    def test_validate_rejects_unallowlisted_stage_command(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            config_path = root / "nightshift.yaml"
+            config_text = config_path.read_text(encoding="utf-8").replace(
+                "- python -m unittest",
+                "- python -m pytest",
+                1,
+            )
+            config_path.write_text(config_text, encoding="utf-8")
+
+            with self.assertRaisesRegex(ConfigError, "not allowlisted"):
+                validate_config(config_path)
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_init.py
+++ b/tests/test_init.py
@ -0,0 +1,43 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.errors import InitError
+from nightshift.init import init_project
+
+
+class InitProjectTests(unittest.TestCase):
+    def test_init_creates_expected_files(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            written = init_project(root)
+
+            self.assertIn(root / "nightshift.yaml", written)
+            self.assertTrue((root / "nightshift.yaml").exists())
+            self.assertTrue((root / "tasks.md").exists())
+            self.assertTrue((root / "agents" / "planner.md").exists())
+            self.assertTrue((root / "agents" / "implementer.md").exists())
+            self.assertTrue((root / "agents" / "reviewer.md").exists())
+
+    def test_init_refuses_to_overwrite_without_force(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+
+            with self.assertRaises(InitError):
+                init_project(root)
+
+    def test_init_can_overwrite_with_force(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+            init_project(root)
+            (root / "tasks.md").write_text("changed", encoding="utf-8")
+
+            init_project(root, force=True)
+
+            self.assertIn("TASK-001", (root / "tasks.md").read_text(encoding="utf-8"))
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_safety.py
+++ b/tests/test_safety.py
@ -0,0 +1,70 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.errors import SafetyError
+from nightshift.safety import (
+    ensure_command_allowed,
+    resolve_inside_root,
+    resolve_project_root,
+    safe_artifact_path,
+    validate_scoped_paths,
+)
+
+
+class SafetyTests(unittest.TestCase):
+    def test_resolve_project_root_returns_resolved_directory(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            self.assertEqual(resolve_project_root(root), root.resolve())
+
+    def test_resolve_inside_root_accepts_relative_path(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            resolved = resolve_inside_root(root, "src/module.py")
+
+            self.assertEqual(resolved, (root / "src" / "module.py").resolve())
+
+    def test_resolve_inside_root_rejects_traversal(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            with self.assertRaisesRegex(SafetyError, "outside project root"):
+                resolve_inside_root(root, "../outside.txt")
+
+    def test_validate_scoped_paths_rejects_escape(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            with self.assertRaisesRegex(SafetyError, "outside project root"):
+                validate_scoped_paths(root, ("src", "../elsewhere"))
+
+    def test_safe_artifact_path_rejects_escape(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            with self.assertRaisesRegex(SafetyError, "escapes artifact directory"):
+                safe_artifact_path(root, ".nightshift", "runs", "..", "..", "leak.txt")
+
+    def test_command_allowlist_normalizes_whitespace_before_matching(self) -> None:
+        command = ensure_command_allowed(
+            "python   -m   unittest",
+            ("python -m unittest",),
+            ("rm -rf", "git push"),
+        )
+
+        self.assertEqual(command, "python -m unittest")
+
+    def test_command_allowlist_rejects_unlisted_command(self) -> None:
+        with self.assertRaisesRegex(SafetyError, "not allowlisted"):
+            ensure_command_allowed("python -m pytest", ("python -m unittest",), ())
+
+    def test_forbidden_fragment_rejects_dangerous_command(self) -> None:
+        with self.assertRaisesRegex(SafetyError, "forbidden fragment"):
+            ensure_command_allowed("echo ok && rm   -rf build", ("echo ok && rm -rf build",), ("rm -rf",))
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_tasks.py
+++ b/tests/test_tasks.py
@ -0,0 +1,112 @@
+from pathlib import Path
+import tempfile
+import unittest
+
+from nightshift.errors import TaskError
+from nightshift.tasks import (
+    parse_task_file,
+    parse_tasks,
+    select_next_incomplete_task,
+    select_task_by_id,
+)
+
+
+TASKS_MD = """# Tasks
+
+- [x] TASK-001: Completed task
+
+Description:
+Already done.
+
+Acceptance Criteria:
+- It is complete
+
+- [ ] TASK-002: Add artifact directory creation
+
+Description:
+Create per-run and per-task artifact directories.
+
+Dependencies:
+- TASK-001
+
+Acceptance Criteria:
+- Creates `.nightshift/runs/<timestamp>/`
+- Creates task-specific folder
+- Writes task snapshot
+"""
+
+
+class TaskParserTests(unittest.TestCase):
+    def test_parse_documented_task_format(self) -> None:
+        tasks = parse_tasks(TASKS_MD)
+
+        self.assertEqual(len(tasks), 2)
+        self.assertEqual(tasks[1].id, "TASK-002")
+        self.assertEqual(tasks[1].title, "Add artifact directory creation")
+        self.assertFalse(tasks[1].completed)
+        self.assertEqual(
+            tasks[1].description,
+            "Create per-run and per-task artifact directories.",
+        )
+        self.assertEqual(tasks[1].dependencies, ("TASK-001",))
+        self.assertEqual(len(tasks[1].acceptance_criteria), 3)
+        self.assertIn("TASK-002", tasks[1].raw_markdown)
+
+    def test_select_next_incomplete_task(self) -> None:
+        tasks = parse_tasks(TASKS_MD)
+
+        selected = select_next_incomplete_task(tasks)
+
+        self.assertEqual(selected.id, "TASK-002")
+
+    def test_select_task_by_id(self) -> None:
+        tasks = parse_tasks(TASKS_MD)
+
+        selected = select_task_by_id(tasks, "TASK-001")
+
+        self.assertTrue(selected.completed)
+
+    def test_select_task_by_id_reports_available_tasks(self) -> None:
+        tasks = parse_tasks(TASKS_MD)
+
+        with self.assertRaisesRegex(TaskError, "Available tasks: TASK-001, TASK-002"):
+            select_task_by_id(tasks, "TASK-999")
+
+    def test_parse_task_file_rejects_path_traversal(self) -> None:
+        with tempfile.TemporaryDirectory() as directory:
+            root = Path(directory)
+
+            with self.assertRaisesRegex(TaskError, "outside project root"):
+                parse_task_file(root, "../tasks.md")
+
+    def test_malformed_task_header_has_useful_error(self) -> None:
+        markdown = """# Tasks
+
+- [ ] Add YAML config loading
+
+Acceptance Criteria:
+- Loads config
+"""
+
+        with self.assertRaisesRegex(TaskError, "malformed task header"):
+            parse_tasks(markdown)
+
+    def test_missing_acceptance_criteria_fails(self) -> None:
+        markdown = """# Tasks
+
+- [ ] TASK-001: Missing criteria
+
+Description:
+No acceptance criteria.
+"""
+
+        with self.assertRaisesRegex(TaskError, "missing Acceptance Criteria"):
+            parse_tasks(markdown)
+
+    def test_no_tasks_fails(self) -> None:
+        with self.assertRaisesRegex(TaskError, "no tasks found"):
+            parse_tasks("# Tasks\n\nNothing here.\n")
+
+
+if __name__ == "__main__":
+    unittest.main()