nightshift/docs/config-reference.md
RJS 67ea77baa6 Replace on_pass with on_status dict for per-status stage routing
on_status replaces the single on_pass field with a mapping that routes
each review status (pass/fail/retry/escalate) to a different target
stage. The lookup order for non-pass statuses is:
  on_status[status] -> on_fail -> next_stage (agent output)

Config parsing validates that on_status keys are valid status names
and all referenced stages exist. Includes test coverage for parsing,
validation errors, pass/fail/escalate routing, and on_fail fallback.
2026-05-23 14:43:50 +00:00

NightShift Config Reference

NightShift config is YAML.

project

  • name: project display name.
  • root: project root, resolved relative to the config file.
  • task_file: markdown task file inside the project root.
  • artifact_dir: artifact directory inside the project root.

safety

  • require_clean_worktree: when true, block runs if git status --short reports uncommitted changes or cannot be run.
  • scoped_paths: paths that must resolve inside the project root.
  • allowed_commands: allowlist for command stages. A command must match an entry exactly after whitespace normalization.
  • forbidden_commands: dangerous command fragments. A command containing one is rejected before the allowlist is consulted.
  • allowed_env: optional environment variable names to pass to command stages.
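
The interaction between allowed_commands and forbidden_commands can be sketched as follows. This is a minimal illustration, not NightShift's implementation: the function names are hypothetical, and it assumes that whitespace normalization means collapsing runs of whitespace and trimming the ends.

```python
import re


def normalize(command: str) -> str:
    """Collapse runs of whitespace and trim the ends (assumed normalization)."""
    return re.sub(r"\s+", " ", command).strip()


def check_command(command, allowed_commands, forbidden_commands):
    """Return (ok, reason) for one command-stage command string."""
    normalized = normalize(command)
    # Forbidden fragments are checked first, so an allowlist entry cannot
    # admit a command that contains a dangerous fragment.
    for fragment in forbidden_commands:
        if normalize(fragment) in normalized:
            return False, f"forbidden fragment: {fragment}"
    # Allowlist entries match the whole command, not a prefix of it.
    allowed = {normalize(entry) for entry in allowed_commands}
    if normalized not in allowed:
        return False, "not in allowed_commands"
    return True, "allowed"
```

Under these assumptions, "pytest   -q" matches the entry "pytest -q", while "pytest -q -x" does not, because matching is exact rather than prefix based.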

experiment

  • label: optional run experiment label.
  • prompt_variant: optional prompt variant label.

agents

Supported backends:

  • command: runs a local command with the prompt on stdin.
  • ollama: calls the local Ollama HTTP API at http://localhost:11434/api/generate by default.
  • openai_compatible: calls a Chat Completions-compatible HTTP API.

Command agent:

planner:
  backend: command
  command: echo
  system_prompt: agents/planner.md

Agent roles:

  • role: debugger marks an agent as diagnosis-only. When a stage fails and a debugger is configured, NightShift sends the task, failed stage output, and retry history to that agent before the next retry.

Stage model routing:

agent_pool:
  - small-implementer
  - larger-implementer

When agent_pool is set, NightShift uses the first agent initially and advances through the list as retry count increases. Each agent still owns its own backend, model, and temperature.
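
A rough sketch of the pool selection described above. The function name is hypothetical, and two details are assumptions rather than documented behavior: the pool advances one agent per retry, and it stays on the last agent once the list is exhausted.

```python
def pick_agent(agent_pool, retry_count):
    """Pick the agent id for the current attempt of a stage."""
    if not agent_pool:
        raise ValueError("agent_pool must list at least one agent")
    # Assumption: one step per retry, clamped to the last agent in the pool.
    index = min(retry_count, len(agent_pool) - 1)
    return agent_pool[index]


pool = ["small-implementer", "larger-implementer"]
first_attempt = pick_agent(pool, 0)   # "small-implementer"
after_retry = pick_agent(pool, 1)     # "larger-implementer"
```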

Telemetry:

NightShift writes telemetry-summary.md at both run and task scope. The summary estimates prompt/output tokens from captured prompts and responses, records stage runtime, retry count, status, agent id, and model, and groups success/failure statistics per model.

Ollama agent:

planner:
  backend: ollama
  model: qwen3-coder:30b
  base_url: http://localhost:11434
  system_prompt: agents/planner.md
  temperature: 0.2
  num_ctx: 8192
  num_predict: 4096
  seed: 1
  stop:
    - STOP

NightShift currently supports these optional Ollama generation options: temperature, num_ctx, num_predict, seed, and stop.

pipeline

  • max_task_retries: task retry limit.
  • continue_on_task_failure: with run --all, continue to the next task after a task fails or is blocked.
  • stages: ordered state-machine stages.

Command stage options:

  • commands: command strings.
  • Command strings may use task placeholders: {task_id}, {task_id_lower}, {task_id_slug}, and {task_id_compact}.
  • shell: defaults to true. Set false for argv-style execution.
  • timeout_seconds: per-stage timeout override.
  • working_dir: command working directory inside project root.
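
Placeholder expansion in command strings might look like the sketch below. The helper is hypothetical, and how NightShift derives each value from the task id is not specified here, so the example values are illustrative only. Plain string replacement is used instead of str.format so that other braces in shell commands are left alone.

```python
PLACEHOLDERS = ("task_id", "task_id_lower", "task_id_slug", "task_id_compact")


def expand_placeholders(command, values):
    """Replace the four task placeholders and leave every other brace untouched."""
    for name in PLACEHOLDERS:
        command = command.replace("{" + name + "}", values[name])
    return command


# Illustrative values only; the real derivation rules are an assumption.
values = {
    "task_id": "TASK-001",
    "task_id_lower": "task-001",
    "task_id_slug": "task-001",
    "task_id_compact": "task001",
}
expanded = expand_placeholders("pytest tests/test_{task_id_compact}.py", values)
```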

Patch validator stage options:

  • max_files: max files changed.
  • max_lines: max changed lines.
  • max_delete_ratio: reject deletion-heavy patches above this deleted-line share, from 0.0 to 1.0.
  • allowed_paths: optional stage-specific allowlist. If set, every changed path must be inside one of these paths.
  • forbidden_paths: paths the patch must not touch.
  • Unified diff hunk line prefixes and hunk line counts are validated before patch apply.
  • The patch normalizer recomputes hunk line counts from hunk bodies for direct unified diff output.
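
Recomputing hunk line counts from hunk bodies can be sketched as follows. This is an illustration of the general unified diff rule, not NightShift's normalizer: the old-side count is the number of context and removed lines, and the new-side count is the number of context and added lines. The function name is hypothetical.

```python
import re

# Matches "@@ -start[,count] +start[,count] @@ optional section text".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def recount_hunks(diff_text: str) -> str:
    """Rewrite every hunk header so its counts match the hunk body."""
    lines = diff_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        header = HUNK_HEADER.match(lines[i])
        if header is None:
            out.append(lines[i])  # file headers and anything outside hunks
            i += 1
            continue
        body = []
        j = i + 1
        while j < len(lines) and lines[j][:1] in (" ", "+", "-", "\\"):
            # A "--- " line followed by "+++ " starts the next file.
            if (lines[j].startswith("--- ") and j + 1 < len(lines)
                    and lines[j + 1].startswith("+++ ")):
                break
            body.append(lines[j])
            j += 1
        # "\ No newline at end of file" markers count for neither side.
        old_count = sum(1 for line in body if line[0] in " -")
        new_count = sum(1 for line in body if line[0] in " +")
        old_start, new_start, rest = header.groups()
        out.append(f"@@ -{old_start},{old_count} +{new_start},{new_count} @@{rest}")
        out.extend(body)
        i = j
    return "\n".join(out) + "\n"
```

The file-boundary check is a heuristic: a removed line that itself begins with "-- " directly followed by an added line beginning with "++ " would be misread as a new file header.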

Writer stages:

  • code_writer: agent returns a unified diff directly.
  • file_writer: agent returns complete file content blocks; NightShift generates the unified diff deterministically. Prefer this for local models that wrap or miscount long patch hunks.

file_writer blocks use this form:

```file:relative/path.py
<complete file content>
```
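
A minimal sketch of how complete-file blocks could be turned into a unified diff deterministically, using difflib from the Python standard library. The function names are hypothetical and the parsing rules are assumptions based on the block form above. The fence marker is built at runtime only so that the example does not contain a literal fence line.

```python
import difflib
import re

FENCE = "`" * 3  # three backticks

# One block: an opening "file:<path>" fence line, the body, a closing fence line.
FILE_BLOCK = re.compile(
    r"^" + FENCE + r"file:(?P<path>\S+)\n(?P<body>.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def parse_file_blocks(response: str) -> dict:
    """Map each relative path to the complete file content the agent returned."""
    return {m["path"]: m["body"] for m in FILE_BLOCK.finditer(response)}


def to_unified_diff(path: str, old_text: str, new_text: str) -> str:
    """Build the diff from old and new content, so hunk counts are always exact."""
    return "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))


response = FENCE + "file:app.py\nprint('hi')\n" + FENCE + "\n"
blocks = parse_file_blocks(response)
diff = to_unified_diff("app.py", "print('hello')\n", blocks["app.py"])
```

Because the diff is computed rather than written by the model, wrapped lines and miscounted hunks cannot occur. One limitation of this sketch is that file content containing a bare fence line would end the block early.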

Semantic context stage:

- id: semantic_context
  type: semantic_context
  output: semantic-context.md

This stage builds a lightweight repository index of files, Python symbols, imports, and tests, then writes compact relevant snippets for the current task. It is keyword based with symbol-aware scoring, so it works without a vector database or network dependency.
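
Keyword scoring with a symbol-aware boost could look roughly like this. It is a sketch only: the function names and the weights (3 for symbol matches, 2 for path matches) are invented for illustration and are not NightShift's actual scoring.

```python
import ast
import re


def tokenize(text: str) -> set:
    """Lowercased identifier-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def python_symbols(source: str) -> set:
    """Names of functions and classes defined in a Python source string."""
    tree = ast.parse(source)
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return {node.name for node in ast.walk(tree) if isinstance(node, kinds)}


def score_file(task_text: str, path: str, source: str) -> int:
    """Score one file against the task text; higher means more relevant."""
    keywords = tokenize(task_text)
    body_hits = len(keywords & tokenize(source))
    symbol_hits = len(keywords & {name.lower() for name in python_symbols(source)})
    path_hits = len(keywords & tokenize(path))
    # Hypothetical weights: a matching symbol or path counts more than a body hit.
    return body_hits + 3 * symbol_hits + 2 * path_hits


task = "Add expiry to create_snippet in the store"
store_score = score_file(task, "store.py", "def create_snippet(text):\n    return text\n")
views_score = score_file(task, "views.py", "def index():\n    return 'ok'\n")
```

Everything here runs on the standard library, which matches the stated goal of working without a vector database or network dependency.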

on_status Stage Routing

Instead of a single on_fail catch-all, use on_status to route each review status to a different stage:

- id: review
  type: agent_review
  agent: reviewer
  output: review.md
  on_status:
    pass: summarize
    retry: implement
    fail: plan
    escalate: human

on_status supports pass, fail, retry, and escalate keys. For pass, it overrides sequential progression and any agent-supplied next_stage. For non-pass statuses, the lookup order is: on_status[status] -> on_fail -> next_stage (agent output).
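
The lookup order can be sketched as a small resolver. The function names and the dict-based stage representation are hypothetical. The behavior for a pass status without an on_status entry (agent next_stage first, then the next stage in sequence) is an assumption, since only the override case is described above.

```python
VALID_STATUSES = {"pass", "fail", "retry", "escalate"}


def validate_on_status(stage: dict, stage_ids: set) -> list:
    """Return config errors: unknown status keys or unknown target stages."""
    errors = []
    for status, target in stage.get("on_status", {}).items():
        if status not in VALID_STATUSES:
            errors.append(f"{stage['id']}: unknown status {status!r}")
        if target not in stage_ids:
            errors.append(f"{stage['id']}: unknown stage {target!r}")
    return errors


def resolve_next_stage(stage, status, agent_next_stage=None, sequential_next=None):
    """Pick the next stage id for a review status."""
    on_status = stage.get("on_status", {})
    if status in on_status:
        # An explicit mapping wins for every status, including pass.
        return on_status[status]
    if status == "pass":
        # Assumption: without on_status.pass, prefer the agent's next_stage,
        # then fall back to sequential progression.
        return agent_next_stage or sequential_next
    # Non-pass fallback chain: on_fail, then the agent-supplied next_stage.
    return stage.get("on_fail") or agent_next_stage


review = {
    "id": "review",
    "on_status": {"pass": "summarize", "retry": "implement"},
    "on_fail": "plan",
}
```

With this stage, a fail status resolves to plan through on_fail, because on_status has no fail key.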

Failure, Retry, and Resource Artifacts

Failed command and validation stages write deterministic diagnostics under the task artifact directory:

  • diagnostics/<stage>-failure.md: failure category, probable root cause, confidence, recommended next action, retry recommendation, modified files, and failing tests.
  • diagnostics/dependency-diagnostic.md: Python missing-import and manifest guidance when the classifier detects dependency failures.
  • retry-memory.md: compact summaries of previous attempts.
  • escalation-policy.md: churn detection result and recommended escalation action.
  • resource-requests.md plus resources/: generated run-local fixtures for supported blocked requests.

Agents can request generated run-local fixtures with a line like:

blocked_request: json fixtures/input.json missing fixture for test

Supported fixture types are png, jpg, json, sqlite, text, and blob.
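
Parsing such a line might look like the sketch below. The field layout (fixture type, relative path, free-text reason) is inferred from the single example above and is an assumption, as is the function name.

```python
SUPPORTED_FIXTURE_TYPES = ("png", "jpg", "json", "sqlite", "text", "blob")


def parse_blocked_request(line: str):
    """Return a dict for a valid blocked_request line, otherwise None."""
    prefix = "blocked_request:"
    stripped = line.strip()
    if not stripped.startswith(prefix):
        return None
    # Assumed layout: <type> <path> <free-text reason>
    parts = stripped[len(prefix):].split(None, 2)
    if len(parts) < 2:
        return None
    kind, path = parts[0], parts[1]
    if kind not in SUPPORTED_FIXTURE_TYPES:
        return None
    reason = parts[2] if len(parts) > 2 else ""
    return {"type": kind, "path": path, "reason": reason}


request = parse_blocked_request(
    "blocked_request: json fixtures/input.json missing fixture for test"
)
```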

Integration Runs

nightshift integ-run creates a timestamped directory under integ_runs/ with an isolated virtual environment, initialized template project, logs, transcript, patch, and artifact directories. integ_runs/ is ignored by git.

Create a local integration sandbox from the NightShift repository root:

python -m nightshift.cli integ-run --template tutorial-deaddrop

Create, set up, validate, and run one task from the generated project directory:

python -m nightshift.cli integ-test --template tutorial-deaddrop --task TASK-001

Set up the generated Python project:

python -m nightshift.cli integ-setup --project integ_runs/<timestamp>/project

The setup helper:

  • finds or creates the integration virtual environment
  • installs this NightShift checkout into that venv
  • installs the target project with pip install -e
  • installs extra packages, defaulting to pytest
  • runs nightshift validate unless --skip-validate is set

Preview commands without running them:

python -m nightshift.cli integ-setup --project integ_runs/<timestamp>/project --dry-run

Summarize the latest integration artifact run:

python -m nightshift.cli integ-report --latest

To prune old sandboxes before creating a new one, pass --keep. This example keeps only the three newest existing runs:

python -m nightshift.cli integ-run --template tutorial-deaddrop --keep 3

DeadDrop Tutorial

nightshift init --template tutorial-deaddrop creates a small Flask snippet-sharing utility with deterministic tests and incremental NightShift tasks. Its pipeline includes semantic context retrieval, telemetry, debugger support, fixed task-specific tests, and a single default model path using qwen3-coder:30b.